RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019

Luke Zettlemoyer

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 10,878 papers shown

Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks Masahiro Kaneko Danushka Bollegala Naoaki Okazaki 102 46 0 06 Oct 2022
Binding Language Models in Symbolic Languages Zhoujun Cheng Tianbao Xie Peng Shi Chengzu Li Rahul Nadkarni ... Dragomir R. Radev Mari Ostendorf Luke Zettlemoyer Noah A. Smith Tao Yu LMTD 263 215 0 06 Oct 2022
Time Will Change Things: An Empirical Study on Dynamic Language Understanding in Social Media Classification Yuji Zhang Jing Li 85 5 0 06 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding Jingye Chen Tengchao Lv Lei Cui Changrong Zhang Furu Wei 108 14 0 06 Oct 2022
Just ClozE! A Novel Framework for Evaluating the Factual Consistency Faster in Abstractive Summarization Yiyang Li Lei Li Marina Litvak N. Vanetik Dingxing Hu Yuze Li Yanquan Zhou HILM 95 0 0 06 Oct 2022
Revisiting Structured Dropout Yiren Zhao Oluwatomisin Dada Xitong Gao Robert D. Mullins BDL 77 2 0 05 Oct 2022
"No, they did not": Dialogue response dynamics in pre-trained language models Sanghee Kim Lang-Chi Yu Allyson Ettinger 57 1 0 05 Oct 2022
GAPX: Generalized Autoregressive Paraphrase-Identification X Yi Zhou Renyu Li Hayden Housen Ser-Nam Lim BDL 79 0 0 05 Oct 2022
Recitation-Augmented Language Models Zhiqing Sun Xuezhi Wang Yi Tay Yiming Yang Denny Zhou RALM 281 65 0 04 Oct 2022
Understanding Prior Bias and Choice Paralysis in Transformer-based Language Representation Models through Four Experimental Probes Ke Shen Mayank Kejriwal 89 4 0 03 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization Rajkumar Ramamurthy Prithviraj Ammanabrolu Kianté Brantley Jack Hessel R. Sifa Christian Bauckhage Hannaneh Hajishirzi Yejin Choi OffRL 136 250 0 03 Oct 2022
Community Learning: Understanding A Community Through NLP for Positive Impact Towhid Chowdhury Naveen Sharma 48 0 0 02 Oct 2022
ReAct: A Review Comment Dataset for Actionability (and more) G. Choudhary Natwar Modani Nitish Maurya 54 3 0 02 Oct 2022
Differentially Private Optimization on Large Model at Small Cost Zhiqi Bu Yu Wang Sheng Zha George Karypis 130 55 0 30 Sep 2022
Differentially Private Bias-Term Fine-tuning of Foundation Models Zhiqi Bu Yu Wang Sheng Zha George Karypis 146 48 0 30 Sep 2022
Relative representations enable zero-shot latent space communication Luca Moschella Valentino Maiorca Marco Fumero Antonio Norelli Francesco Locatello Emanuele Rodolà 130 109 0 30 Sep 2022
Construction and Applications of Billion-Scale Pre-Trained Multimodal Business Knowledge Graph Shumin Deng Chengming Wang Zhoubo Li Ningyu Zhang Zelin Dai ... Mosha Chen Jiaoyan Chen Jeff Z. Pan Bryan Hooi Huajun Chen VLM 117 22 0 30 Sep 2022
What Makes Pre-trained Language Models Better Zero-shot Learners? Jinghui Lu Dongsheng Zhu Weidong Han Rui Zhao Brian Mac Namee Fei Tan 104 24 0 30 Sep 2022
Self-Distillation for Further Pre-training of Transformers Seanie Lee Minki Kang Juho Lee Sung Ju Hwang Kenji Kawaguchi 118 8 0 30 Sep 2022
Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification Muhammad N. ElNokrashy Badr AlKhamissi Mona T. Diab MoMe 90 5 0 30 Sep 2022
PART: Pre-trained Authorship Representation Transformer Javier Huertas-Tato Álvaro Huertas-García Alejandro Martín 137 9 0 30 Sep 2022
How to tackle an emerging topic? Combining strong and weak labels for Covid news NER Aleksander Ficek Fangyu Liu Nigel Collier 79 1 0 29 Sep 2022
Improving Molecular Pretraining with Complementary Featurizations Yanqiao Zhu Dingshuo Chen Yuanqi Du Yingze Wang Qiang Liu Shu Wu AI4CE 82 7 0 29 Sep 2022
Few-shot Text Classification with Dual Contrastive Consistency Liwen Sun Jiawei Han 59 0 0 29 Sep 2022
NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer Valentin Leplat D. Merkulov Aleksandr Katrutsa Daniel Bershatsky Olga Tsymboi Ivan Oseledets 137 3 0 29 Sep 2022
polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics Christopher Kuenneth R. Ramprasad 109 113 0 29 Sep 2022
Make-A-Video: Text-to-Video Generation without Text-Video Data Uriel Singer Adam Polyak Thomas Hayes Xiaoyue Yin Jie An ... Oron Ashual Oran Gafni Devi Parikh Sonal Gupta Yaniv Taigman DiffM VGen 147 1,441 0 29 Sep 2022
Perturbations and Subpopulations for Testing Robustness in Token-Based Argument Unit Recognition Jonathan Kamp Lisa Beinborn Antske Fokkens 52 0 0 29 Sep 2022
Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts Timo Spinde Manuel Plank Jan-David Krieger Terry Ruas Bela Gipp Akiko Aizawa 72 76 0 29 Sep 2022
Bidirectional Language Models Are Also Few-shot Learners Ajay Patel Bryan Li Mohammad Sadegh Rasooli Noah Constant Colin Raffel Chris Callison-Burch LRM 140 47 0 29 Sep 2022
Multi-stage Information Retrieval for Vietnamese Legal Texts Nhat-Minh Pham Nguyen Ha Thanh Trong-Hop Do AILaw 46 3 0 29 Sep 2022
Downstream Datasets Make Surprisingly Good Pretraining Corpora Kundan Krishna Saurabh Garg Jeffrey P. Bigham Zachary Chase Lipton 108 33 0 28 Sep 2022
Audio Barlow Twins: Self-Supervised Audio Representation Learning Jonah Anton H. Coppock Pancham Shukla Bjorn W. Schuller BDL SSL 85 8 0 28 Sep 2022
Causal Proxy Models for Concept-Based Model Explanations Zhengxuan Wu Karel D'Oosterlinck Atticus Geiger Amir Zur Christopher Potts MILM 134 37 0 28 Sep 2022
Audio Retrieval with WavText5K and CLAP Training Soham Deshmukh Benjamin Elizalde Huaming Wang 3DV CLIP 187 53 0 28 Sep 2022
TVLT: Textless Vision-Language Transformer Zineng Tang Jaemin Cho Yixin Nie Joey Tianyi Zhou VLM 137 31 0 28 Sep 2022
CEFER: A Four Facets Framework based on Context and Emotion embedded features for Implicit and Explicit Emotion Recognition Fereshte Khoshnam Ahmad Baraani-Dastjerdi M. J. Liaghatdar 71 0 0 28 Sep 2022
YATO: Yet Another deep learning based Text analysis Open toolkit Zeqiang Wang Yile Wang Jiageng Wu Zhiyang Teng Jie Yang 115 3 0 28 Sep 2022
Using contradictions improves question answering systems Étienne Fortier-Dubois Domenic Rosati 102 0 0 28 Sep 2022
METS-CoV: A Dataset of Medical Entity and Targeted Sentiment on COVID-19 Related Tweets Peilin Zhou Zeqiang Wang Dading Chong Zhijiang Guo Yining Hua Zichang Su Zhiyang Teng Jiageng Wu Jie Yang 85 23 0 28 Sep 2022
Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset Hong Liu Hao Peng Zhijian Ou Juan-Zi Li Yi Huang Junlan Feng 101 7 0 27 Sep 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models Xiuying Wei Yunchen Zhang Xiangguo Zhang Ruihao Gong Shanghang Zhang Qi Zhang F. Yu Xianglong Liu MQ 164 153 0 27 Sep 2022
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding Yang Jin Yongzhi Li Zehuan Yuan Yadong Mu 83 34 0 27 Sep 2022
Regularized Contrastive Learning of Semantic Search Mingxi Tan Alexis Rolland Andong Tian 65 0 0 27 Sep 2022
A general-purpose material property data extraction pipeline from large polymer corpora using Natural Language Processing Pranav Shetty Arunkumar Chitteth Rajan Christopher Kuenneth Sonakshi Gupta L. P. Panchumarti Lauren Holm Chaoran Zhang R. Ramprasad 73 71 0 27 Sep 2022
Lex2Sent: A bagging approach to unsupervised sentiment analysis Kai-Robin Lange Jonas Rieger Carsten Jentsch SSL 42 2 0 26 Sep 2022
Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned Ahmed Sabir 137 0 0 26 Sep 2022
Fast-FNet: Accelerating Transformer Encoder Models via Efficient Fourier Layers Nurullah Sevim Ege Ozan Özyedek Furkan Şahinuç Aykut Koç 95 12 0 26 Sep 2022
Do ever larger octopi still amplify reporting biases? Evidence from judgments of typical colour Fangyu Liu Julian Martin Eisenschlos Jeremy R. Cole Nigel Collier 96 4 0 26 Sep 2022
Text Summarization with Oracle Expectation Yumo Xu Mirella Lapata VLM 79 4 0 26 Sep 2022