Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2006.03654
Cited By
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
5 June 2020
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DeBERTa: Decoding-enhanced BERT with Disentangled Attention"
50 / 125 papers shown
Title
A Look Into News Avoidance Through AWRS: An Avoidance-Aware Recommender System
Igor L.R. Azevedo
Toyotaro Suzumura
Yuichiro Yasui
116
0
0
12 Jul 2024
Efficient Nearest Neighbor based Uncertainty Estimation for Natural Language Processing Tasks
Wataru Hashimoto
Hidetaka Kamigaito
Taro Watanabe
95
0
0
02 Jul 2024
Revisiting Random Walks for Learning on Graphs
Jinwoo Kim
Olga Zaghen
Ayhan Suleymanzade
Youngmin Ryou
Seunghoon Hong
114
1
0
01 Jul 2024
Cross-Lingual Transfer Learning for Speech Translation
Rao Ma
Yassir Fathullah
Mengjie Qian
Siyuan Tang
Mark Gales
Kate Knill
106
3
0
01 Jul 2024
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
Roman Vashurin
Ekaterina Fadeeva
Artem Vazhentsev
Akim Tsvigun
Daniil Vasilev
...
Timothy Baldwin
Timothy Baldwin
Maxim Panov
Artem Shelmanov
Artem Shelmanov
HILM
95
28
0
21 Jun 2024
Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
Han Jiang
Xiaoyuan Yi
Zhihua Wei
Ziang Xiao
Shu Wang
Xing Xie
ELM
ALM
132
8
0
20 Jun 2024
Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks
Dan S. Nielsen
Kenneth Enevoldsen
Peter Schneider-Kamp
ELM
88
8
0
19 Jun 2024
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages
Holy Lovenia
Rahmad Mahendra
Salsabil Maulana Akbar
Lester James V. Miranda
Jennifer Santoso
...
Genta Indra Winata
Ruochen Zhang
Fajri Koto
Zheng-Xin Yong
Samuel Cahyawijaya
164
14
0
14 Jun 2024
Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning
AmirMohammad Azadi
Baktash Ansari
Sina Zamani
Sauleh Eetemadi
32
1
0
11 Jun 2024
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Yipeng Zhang
Haitao Mi
Helen Meng
CLL
KELM
111
7
0
10 Jun 2024
Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection
Jash Dalvi
Ali Dabouei
Gunjan Dhanuka
Min Xu
51
0
0
05 Jun 2024
Amortizing intractable inference in diffusion models for vision, language, and control
S. Venkatraman
Moksh Jain
Luca Scimeca
Minsu Kim
Marcin Sendera
...
Alexandre Adam
Jarrid Rector-Brooks
Yoshua Bengio
Glen Berseth
Nikolay Malkin
134
31
0
31 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
Kaidi Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Liu
Haoyu Wang
104
40
0
08 May 2024
Automating Customer Needs Analysis: A Comparative Study of Large Language Models in the Travel Industry
Simone Barandoni
F. Chiarello
Lorenzo Cascone
Emiliano Marrale
Salvatore Puccio
102
6
0
27 Apr 2024
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models
Xiujie Song
Mengyue Wu
Ke Zhu
Chunhao Zhang
Yanyi Chen
LRM
ELM
73
3
0
28 Feb 2024
Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space
Leo Schwinn
David Dobre
Sophie Xhonneux
Gauthier Gidel
Stephan Gunnemann
AAML
114
43
0
14 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
191
406
0
09 Feb 2024
Large Language Models as Topological Structure Enhancers for Text-Attributed Graphs
Shengyin Sun
Yuxiang Ren
Chen Ma
Xuecang Zhang
185
21
0
24 Nov 2023
AutoMix: Automatically Mixing Language Models
Pranjal Aggarwal
Aman Madaan
Ankit Anand
Srividya Pranavi Potharaju
Swaroop Mishra
...
Karthik Kappaganthu
Yiming Yang
Shyam Upadhyay
Manaal Faruqui
Mausam
90
23
0
19 Oct 2023
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot
I. Redko
Anton Mallasto
Charlotte Laclau
Karol Arndt
Oliver Struckmeier
Markus Heinonen
Ville Kyrki
Samuel Kaski
121
2
0
17 Oct 2023
TouchUp-G: Improving Feature Representation through Graph-Centric Finetuning
Jing Zhu
Xiang Song
V. Ioannidis
Danai Koutra
Christos Faloutsos
105
15
0
25 Sep 2023
Semantic Consistency for Assuring Reliability of Large Language Models
Harsh Raj
Vipul Gupta
Domenic Rosati
S. Majumdar
HILM
132
14
0
17 Aug 2023
Summaries, Highlights, and Action items: Design, implementation and evaluation of an LLM-powered meeting recap system
Sumit Asthana
Sagi Hilleli
Pengcheng He
Aaron L Halfaker
82
12
0
28 Jul 2023
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Zeyuan Allen-Zhu
Yuanzhi Li
93
20
0
23 May 2023
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes
Simran Arora
Brandon Yang
Sabri Eyuboglu
A. Narayan
Andrew Hojel
Immanuel Trummer
Christopher Ré
SyDa
89
77
0
19 Apr 2023
Small-Bench NLP: Benchmark for small single GPU trained models in Natural Language Processing
K. Kanakarajan
Bhuvana Kundumani
Malaikannan Sankarasubbu
ALM
MoE
37
5
0
22 Sep 2021
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Yu Meng
Chenyan Xiong
Payal Bajaj
Saurabh Tiwary
Paul N. Bennett
Jiawei Han
Xia Song
159
205
0
16 Feb 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
85
2,181
0
11 Jan 2021
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
743
41,932
0
28 May 2020
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning
Tao Shen
Yi Mao
Pengcheng He
Guodong Long
Adam Trischler
Weizhu Chen
53
63
0
29 Apr 2020
Adversarial Training for Large Neural Language Models
Xiaodong Liu
Hao Cheng
Pengcheng He
Weizhu Chen
Yu Wang
Hoifung Poon
Jianfeng Gao
AAML
76
185
0
20 Apr 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
162
4,062
0
10 Apr 2020
Deep Learning Based Text Classification: A Comprehensive Review
Shervin Minaee
Nal Kalchbrenner
Min Zhang
Narjes Nikzad
M. Asgari-Chenaghlu
Jianfeng Gao
AILaw
VLM
AI4TS
97
1,103
0
06 Apr 2020
Reformer: The Efficient Transformer
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
VLM
186
2,313
0
13 Jan 2020
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
T. Zhao
86
561
0
08 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
419
20,127
0
23 Oct 2019
Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving
Imanol Schlag
P. Smolensky
Roland Fernandez
Nebojsa Jojic
Jürgen Schmidhuber
Jianfeng Gao
64
52
0
15 Oct 2019
Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations
Kezhen Chen
Qiuyuan Huang
Hamid Palangi
P. Smolensky
Kenneth D. Forbus
Jianfeng Gao
NAI
23
3
0
05 Oct 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
358
6,449
0
26 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
326
1,899
0
17 Sep 2019
X-SQL: reinforce schema representation with context
Pengcheng He
Yi Mao
K. Chakrabarti
Weizhu Chen
41
89
0
21 Aug 2019
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
Wei Wang
Bin Bi
Ming Yan
Chen Henry Wu
Zuyi Bao
Jiangnan Xia
Liwei Peng
Luo Si
56
262
0
13 Aug 2019
On the Variance of the Adaptive Learning Rate and Beyond
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
284
1,903
0
08 Aug 2019
A Hybrid Neural Network Model for Commonsense Reasoning
Pengcheng He
Xiaodong Liu
Weizhu Chen
Jianfeng Gao
LRM
58
29
0
27 Jul 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
621
24,431
0
26 Jul 2019
SpanBERT: Improving Pre-training by Representing and Predicting Spans
Mandar Joshi
Danqi Chen
Yinhan Liu
Daniel S. Weld
Luke Zettlemoyer
Omer Levy
138
1,964
0
24 Jul 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
230
8,426
0
19 Jun 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
217
1,517
0
24 May 2019
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong
Nan Yang
Wenhui Wang
Furu Wei
Xiaodong Liu
Yu Wang
Jianfeng Gao
M. Zhou
H. Hon
ELM
AI4CE
220
1,555
0
08 May 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Jinpeng Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
256
2,312
0
02 May 2019
Previous
1
2
3
Next