RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019

Luke Zettlemoyer

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 9,296 papers shown

Title
Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder Huiwon Jang Jihoon Tack Daewon Choi Jongheon Jeong Jinwoo Shin 42 3 0 25 Oct 2023
URL-BERT: Training Webpage Representations via Social Media Engagements A. Qamar Chetan Verma Ahmed El-Kishky Sumit Binnani Sneha Mehta Taylor Berg-Kirkpatrick 47 0 0 25 Oct 2023
The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining Ting-Rui Chiang Dani Yogatama 30 1 0 25 Oct 2023
Speakerly: A Voice-based Writing Assistant for Text Composition Dhruv Kumar Vipul Raheja Alice Kaiser-Schatzlein Robyn Perry Apurva Joshi Justin Hugues-Nuger Samuel Lou Navid Chowdhury 49 1 0 24 Oct 2023
Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting Pre-trained Language Models Raymond Li Gabriel Murray Giuseppe Carenini MoE 46 2 0 24 Oct 2023
Knowledge Editing for Large Language Models: A Survey Song Wang Yaochen Zhu Haochen Liu Zaiyi Zheng Chen Chen Wenlin Yao KELM 85 143 0 24 Oct 2023
BLP-2023 Task 2: Sentiment Analysis Md. Arid Hasan Firoj Alam Anika Anjum Shudipta Das Afiyat Anjum 34 19 0 24 Oct 2023
PreWoMe: Exploiting Presuppositions as Working Memory for Long Form Question Answering Wookje Han Jinsol Park Kyungjae Lee 41 4 0 24 Oct 2023
From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning Zheyuan Zhang Shane Storks Fengyuan Hu Sungryull Sohn Moontae Lee Honglak Lee Joyce Chai LRM 48 3 0 24 Oct 2023
GenKIE: Robust Generative Multimodal Document Key Information Extraction Panfeng Cao Ye Wang Qiang Zhang Zaiqiao Meng SyDa 34 6 0 24 Oct 2023
Locally Differentially Private Document Generation Using Zero Shot Prompting Saiteja Utpala Sara Hooker Pin-Yu Chen 26 38 0 24 Oct 2023
Contrastive Learning-based Sentence Encoders Implicitly Weight Informative Words Hiroto Kurita Goro Kobayashi Sho Yokoi Kentaro Inui 40 2 0 24 Oct 2023
Is Probing All You Need? Indicator Tasks as an Alternative to Probing Embedding Spaces Tal Levy Omer Goldman Reut Tsarfaty 32 3 0 24 Oct 2023
Density of States Prediction of Crystalline Materials via Prompt-guided Multi-Modal Transformer Namkyeong Lee Heewoong Noh Sungwon Kim Dongmin Hyun Gyoung S. Na Chanyoung Park 34 6 0 24 Oct 2023
Rosetta Stone at KSAA-RD Shared Task: A Hop From Language Modeling To Word–Definition Alignment Ahmed ElBakry Mohamed Gabr Muhammad N. ElNokrashy Badr AlKhamissi 28 3 0 24 Oct 2023
Towards Automated Recipe Genre Classification using Semi-Supervised Learning Nazmus Sakib G. M. Shahariar Mohsinul Kabir Md. Kamrul Hasan H. Mahmud 18 1 0 24 Oct 2023
Expression Syntax Information Bottleneck for Math Word Problems Jing Xiong Chengming Li Min Yang Xiping Hu Bin Hu 40 5 0 24 Oct 2023
Confounder Balancing in Adversarial Domain Adaptation for Pre-Trained Large Models Fine-Tuning Shuoran Jiang Qingcai Chen Yang Xiang Youcheng Pan Xiangping Wu AI4CE 56 0 0 24 Oct 2023
A Survey on Detection of LLMs-Generated Content Xianjun Yang Liangming Pan Xuandong Zhao Haifeng Chen Linda R. Petzold William Y. Wang Wei Cheng DeLMO 68 53 0 24 Oct 2023
Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression Jiduan Liu Jiahao Liu Qifan Wang Jingang Wang Xunliang Cai Dongyan Zhao Ran Wang Rui Yan 43 4 0 24 Oct 2023
Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary Myeongjun Jang Thomas Lukasiewicz 40 4 0 24 Oct 2023
Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation Jason Samuel Lucas Adaku Uchendu Michiharu Yamashita Jooyoung Lee Shaurya Rohatgi Dongwon Lee 54 42 0 24 Oct 2023
A Joint Matrix Factorization Analysis of Multilingual Representations Zheng Zhao Yftah Ziser Bonnie Webber Shay B. Cohen 39 2 0 24 Oct 2023
TRAMS: Training-free Memory Selection for Long-range Language Modeling Haofei Yu Cunxiang Wang Yue Zhang Wei Bi RALM 46 6 0 24 Oct 2023
Interpreting Answers to Yes-No Questions in User-Generated Content Shivam Mathur Keun Hee Park Dhivya Chinnappa Saketh Kotamraju Eduardo Blanco 40 0 0 24 Oct 2023
Toward a Critical Toponymy Framework for Named Entity Recognition: A Case Study of Airbnb in New York City Mikael Brunila J. LaViolette Sky CH-Wang Priyanka Verma Clara Féré Grant McKenzie 17 1 0 23 Oct 2023
Adaptive End-to-End Metric Learning for Zero-Shot Cross-Domain Slot Filling Yuanjun Shi Linzhi Wu Minglai Shao 40 3 0 23 Oct 2023
On the Dimensionality of Sentence Embeddings Hongwei Wang Hongming Zhang Dong Yu AI4TS DML 38 3 0 23 Oct 2023
Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey Soumya Suvra Ghosal Souradip Chakraborty Jonas Geiping Furong Huang Dinesh Manocha Amrit Singh Bedi DeLMO 47 36 0 23 Oct 2023
GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs Yichuan Li Kaize Ding Kyumin Lee SSL 38 25 0 23 Oct 2023
Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization Tianshi Che Ji Liu Yang Zhou Jiaxiang Ren Jiwen Zhou Victor S. Sheng H. Dai Dejing Dou 38 51 0 23 Oct 2023
Affective and Dynamic Beam Search for Story Generation Tenghao Huang Ehsan Qasemi Bangzheng Li He Wang Faeze Brahman Muhao Chen Snigdha Chaturvedi 44 11 0 23 Oct 2023
'Don't Get Too Technical with Me': A Discourse Structure-Based Framework for Science Journalism Ronald Cardenas Bingsheng Yao Dakuo Wang Yufang Hou 56 0 0 23 Oct 2023
Leveraging Deep Learning for Abstractive Code Summarization of Unofficial Documentation AmirHossein Naghshzan Latifa Guerrouj Olga Baysal 31 0 0 23 Oct 2023
Did the Neurons Read your Book? Document-level Membership Inference for Large Language Models Matthieu Meeus Shubham Jain Marek Rei Yves-Alexandre de Montjoye MIALM 34 30 0 23 Oct 2023
System Combination via Quality Estimation for Grammatical Error Correction Muhammad Reza Qorib Hwee Tou Ng 25 5 0 23 Oct 2023
Linking Surface Facts to Large-Scale Knowledge Graphs Gorjan Radevski Kiril Gashteovski Chia-Chien Hung Carolin (Haas) Lawrence Goran Glavaš HILM 37 3 0 23 Oct 2023
Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation Tianqi Zhong Quan Wang Jingxuan Han Yongdong Zhang Zhendong Mao 55 9 0 23 Oct 2023
Paraphrase Types for Generation and Detection Jan Philip Wahle Bela Gipp Terry Ruas 39 4 0 23 Oct 2023
Adaptive Policy with Wait-$k$ Model for Simultaneous Translation Libo Zhao Kai Fan Wei Luo Jing Wu Shushu Wang Ziqian Zeng Zhongqiang Huang 63 9 0 23 Oct 2023
Transparency at the Source: Evaluating and Interpreting Language Models With Access to the True Distribution Jaap Jumelet Willem H. Zuidema 49 5 0 23 Oct 2023
Harnessing Attention Mechanisms: Efficient Sequence Reduction using Attention-based Autoencoders Daniel Biermann Fabrizio Palumbo Morten Goodwin Ole-Christoffer Granmo 56 0 0 23 Oct 2023
Large Language Models can Share Images, Too! Young-Jun Lee Dokyong Lee Joo Won Sung Jonghwan Hyeon Ho-Jin Choi MLLM 50 2 0 23 Oct 2023
What do Deck Chairs and Sun Hats Have in Common? Uncovering Shared Properties in Large Concept Vocabularies Amit Gajbhiye Zied Bouraoui Na Li Usashi Chatterjee Luis Espinosa Anke Steven Schockaert 75 1 0 23 Oct 2023
Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning Hao Wang Xiahua Chen Rui Wang Chenhui Chu 41 0 0 23 Oct 2023
SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research Dimosthenis Antypas Asahi Ushio Francesco Barbieri Leonardo Neves Kiamehr Rezaee Luis Espinosa-Anke Jiaxin Pei Jose Camacho-Collados 43 9 0 23 Oct 2023
A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions Junchao Wu Shu Yang Runzhe Zhan Yulin Yuan Derek F. Wong Lidia S. Chao DeLMO 37 25 0 23 Oct 2023
Once Upon a Time in Graph: Relative-Time Pretraining for Complex Temporal Reasoning Sen Yang Xin Li Li Bing Wai Lam AI4CE 45 10 0 23 Oct 2023
Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models Gangwoo Kim Sungdong Kim Byeongguk Jeon Joonsuk Park Jaewoo Kang UQLM 28 26 0 23 Oct 2023
SpEL: Structured Prediction for Entity Linking Hassan S. Shavarani Anoop Sarkar 56 10 0 23 Oct 2023