arXiv:2109.04607
IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization
10 September 2021
Fajri Koto
Jey Han Lau
Timothy Baldwin
VLM
Papers citing "IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization"
21 / 21 papers shown
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia
Fajri Koto
ELM
131
3
0
13 Sep 2024
Liputan6: A Large-scale Indonesian Dataset for Text Summarization
Fajri Koto
Jey Han Lau
Timothy Baldwin
63
46
0
02 Nov 2020
IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP
Fajri Koto
Afshin Rahimi
Jey Han Lau
Timothy Baldwin
51
263
0
02 Nov 2020
LEGAL-BERT: The Muppets straight out of Law School
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Nikolaos Aletras
Ion Androutsopoulos
AILaw
63
259
0
06 Oct 2020
Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank
Ethan C. Chau
Lucy H. Lin
Noah A. Smith
65
15
0
29 Sep 2020
Improving Bi-LSTM Performance for Indonesian Sentiment Analysis Using Paragraph Vector
Ayu Purwarianti
Ida Ayu Putu Ari Crisdayanti
55
39
0
12 Sep 2020
BERTweet: A pre-trained language model for English Tweets
Dat Quoc Nguyen
Thanh Tien Vu
A. Nguyen
VLM
99
919
0
20 May 2020
Extending Multilingual BERT to Low-Resource Languages
Zihan Wang
Karthikeyan K
Stephen D. Mayhew
Dan Roth
VLM
63
132
0
28 Apr 2020
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
164
2,435
0
23 Apr 2020
Inexpensive Domain Adaptation of Pretrained Language Models: Case Studies on Biomedical NER and Covid-19 QA
Nina Poerner
Ulli Waltinger
Hinrich Schütze
OOD
57
50
0
07 Apr 2020
SemEval-2017 Task 4: Sentiment Analysis in Twitter
Sara Rosenthal
N. Farra
Preslav Nakov
VLM
92
799
0
02 Dec 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
680
24,541
0
26 Jul 2019
Publicly Available Clinical BERT Embeddings
Emily Alsentzer
John R. Murphy
Willie Boag
W. Weng
Di Jin
Tristan Naumann
Matthew B. A. McDermott
AI4MH
192
1,983
0
06 Apr 2019
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Jinhyuk Lee
Wonjin Yoon
Sungdong Kim
Donghyeon Kim
Sunkyu Kim
Chan Ho So
Jaewoo Kang
OOD
180
5,672
0
25 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,175
0
11 Oct 2018
Parsing Tweets into Universal Dependencies
Yijia Liu
Yi Zhu
Wanxiang Che
Bing Qin
Nathan Schneider
Noah A. Smith
61
74
0
23 Apr 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
786
132,363
0
12 Jun 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhiwen Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
911
6,796
0
26 Sep 2016
Enriching Word Vectors with Subword Information
Piotr Bojanowski
Edouard Grave
Armand Joulin
Tomas Mikolov
NAI
SSL
VLM
232
9,980
0
15 Jul 2016
Neural Machine Translation of Rare Words with Subword Units
Rico Sennrich
Barry Haddow
Alexandra Birch
228
7,757
0
31 Aug 2015
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
686
31,544
0
16 Jan 2013