Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1905.10044
Cited By
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
24 May 2019
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions"
50 / 1,143 papers shown
Title
AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts
Zefang Liu
Jiahua Luo
MoE
KELM
82
13
0
01 May 2024
Suvach -- Generated Hindi QA benchmark
Vaishak Narayanan
KP PrabinRaj
Saifudheen Nouphal
36
0
0
30 Apr 2024
SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning
Jinghan Jia
Yihua Zhang
Yimeng Zhang
Jiancheng Liu
Bharat Runwal
James Diffenderfer
B. Kailkhura
Sijia Liu
MU
188
50
0
28 Apr 2024
Temporal Scaling Law for Large Language Models
Yizhe Xiong
Xiansheng Chen
Xin Ye
Hui Chen
Zijia Lin
...
Zhenpeng Su
Wei Huang
Jianwei Niu
Jiawei Han
Guiguang Ding
120
10
0
27 Apr 2024
Make Your LLM Fully Utilize the Context
Shengnan An
Zexiong Ma
Zeqi Lin
Nanning Zheng
Jian-Guang Lou
SyDa
131
67
0
25 Apr 2024
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Mostafa Elhoushi
Akshat Shrivastava
Diana Liskovich
Basil Hosmer
Bram Wasti
...
Saurabh Agarwal
Ahmed Roman
Ahmed Aly
Beidi Chen
Carole-Jean Wu
LRM
110
110
0
25 Apr 2024
Evaluating Consistency and Reasoning Capabilities of Large Language Models
Yash Saxena
Sarthak Chopra
Arunendra Mani Tripathi
ELM
LRM
77
7
0
25 Apr 2024
Interpreting Answers to Yes-No Questions in Dialogues from Multiple Domains
Zijie Wang
Farzana Rashid
Eduardo Blanco
65
0
0
25 Apr 2024
Nyonic Technical Report
Junfeng Tian
Rui Wang
Cong Li
Yudong Zhou
Jun Liu
Jun Wang
55
1
0
24 Apr 2024
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
Amir Saeidi
Shivanshu Verma
Chitta Baral
Chitta Baral
ALM
110
26
0
23 Apr 2024
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
Sachin Mehta
Mohammad Hossein Sekhavat
Qingqing Cao
Maxwell Horton
Yanzi Jin
...
Iman Mirzadeh
Mahyar Najibi
Dmitry Belenko
Peter Zatloukal
Mohammad Rastegari
OSLM
AIFin
108
61
0
22 Apr 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Marah Abdin
Sam Ade Jacobs
A. A. Awan
J. Aneja
Ahmed Hassan Awadallah
...
Li Zhang
Yi Zhang
Yue Zhang
Yunan Zhang
Xiren Zhou
LRM
ALM
197
1,273
0
22 Apr 2024
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts
Dengchun Li
Yingzi Ma
Naizheng Wang
Zhengmao Ye
Zhiyuan Cheng
...
Yan Zhang
Lei Duan
Jie Zuo
Cal Yang
Mingjie Tang
MoE
128
59
0
22 Apr 2024
Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization
Costas Mavromatis
Petros Karypis
George Karypis
MoMe
71
30
0
17 Apr 2024
Shears: Unstructured Sparsity with Neural Low-rank Adapter Search
J. P. Muñoz
Jinjie Yuan
Nilesh Jain
54
7
0
16 Apr 2024
Fewer Truncations Improve Language Modeling
Hantian Ding
Zijian Wang
Giovanni Paolini
Varun Kumar
Anoop Deoras
Dan Roth
Stefano Soatto
111
14
0
16 Apr 2024
HLAT: High-quality Large Language Model Pre-trained on AWS Trainium
Haozheng Fan
Hao Zhou
Guangtai Huang
Parameswaran Raman
Xinwei Fu
Gaurav Gupta
Dhananjay Ram
Yida Wang
Jun Huan
81
6
0
16 Apr 2024
Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model
Hyunsoo Cho
ALM
29
0
0
15 Apr 2024
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
OffRL
127
35
0
15 Apr 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Xuezhe Ma
Xiaomeng Yang
Wenhan Xiong
Beidi Chen
Lili Yu
Hao Zhang
Jonathan May
Luke Zettlemoyer
Omer Levy
Chunting Zhou
90
33
0
12 Apr 2024
Rho-1: Not All Tokens Are What You Need
Zheng-Wen Lin
Zhibin Gou
Yeyun Gong
Xiao Liu
Yelong Shen
...
Chen Lin
Yujiu Yang
Jian Jiao
Nan Duan
Weizhu Chen
CLL
158
75
0
11 Apr 2024
JetMoE: Reaching Llama2 Performance with 0.1M Dollars
Yikang Shen
Zhen Guo
Tianle Cai
Zengyi Qin
MoE
ALM
96
31
0
11 Apr 2024
ONNXPruner: ONNX-Based General Model Pruning Adapter
Dongdong Ren
Wenbin Li
Tianyu Ding
Lei Wang
Qi Fan
Jing Huo
Hongbing Pan
Yang Gao
92
3
0
10 Apr 2024
CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
Longwei Zou
Qingyang Wang
Han Zhao
Jiangang Kong
Yi Yang
Yangdong Deng
98
0
0
10 Apr 2024
Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge
Weikai Lu
Huiping Zhuang
Jianwei Wang
Zhengdong Lu
Zelin Chen
Huiping Zhuang
Cen Chen
MU
AAML
KELM
82
30
0
08 Apr 2024
PORTULAN ExtraGLUE Datasets and Models: Kick-starting a Benchmark for the Neural Processing of Portuguese
T. Osório
Bernardo Leite
Henrique Lopes Cardoso
Luís Gomes
João Rodrigues
Rodrigo Santos
António Branco
78
3
0
08 Apr 2024
DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model
Chao Gao
Sai Qian Zhang
ALM
155
7
0
08 Apr 2024
Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts
Weilin Cai
Juyong Jiang
Le Qin
Junwei Cui
Sunghun Kim
Jiayi Huang
180
10
0
07 Apr 2024
Your Finetuned Large Language Model is Already a Powerful Out-of-distribution Detector
Andi Zhang
Tim Z. Xiao
Weiyang Liu
Robert Bamler
Damon J. Wischik
OODD
116
6
0
07 Apr 2024
ReFT: Representation Finetuning for Language Models
Zhengxuan Wu
Aryaman Arora
Zheng Wang
Atticus Geiger
Daniel Jurafsky
Christopher D. Manning
Christopher Potts
OffRL
119
72
0
04 Apr 2024
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models
Taiqiang Wu
Chaofan Tao
Jiahao Wang
Zhe Zhao
Ngai Wong
ALM
97
18
0
03 Apr 2024
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
Yuxin Wen
Leo Marchyok
Sanghyun Hong
Jonas Geiping
Tom Goldstein
Nicholas Carlini
SILM
AAML
82
16
0
01 Apr 2024
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis
Chen Yang
Junzhuo Li
Xinyao Niu
Xinrun Du
Songyang Gao
...
Stephen W. Huang
Shawn Yue
Wenhu Chen
Jie Fu
Ge Zhang
78
2
0
01 Apr 2024
Communication Efficient Distributed Training with Distributed Lion
Bo Liu
Lemeng Wu
Lizhang Chen
Kaizhao Liang
Jiaxu Zhu
Chen Liang
Raghuraman Krishnamoorthi
Qiang Liu
103
7
0
30 Mar 2024
Conceptual and Unbiased Reasoning in Language Models
Ben Zhou
Hongming Zhang
Sihao Chen
Dian Yu
Hongwei Wang
Baolin Peng
Dan Roth
Dong Yu
ReLM
LRM
ELM
95
16
0
30 Mar 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
120
227
0
28 Mar 2024
A Review of Multi-Modal Large Language and Vision Models
Kilian Carolan
Laura Fennelly
Alan F. Smeaton
VLM
186
28
0
28 Mar 2024
A Two-Phase Recall-and-Select Framework for Fast Model Selection
Jianwei Cui
Wenhang Shi
Honglin Tao
Wei Lu
Xiaoyong Du
89
0
0
28 Mar 2024
Large Language Models as Financial Data Annotators: A Study on Effectiveness and Efficiency
Toyin Aguda
S. Siddagangappa
Elena Kochkina
Simerjot Kaur
Dongsheng Wang
Charese Smiley
Sameena Shah
89
12
0
26 Mar 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
149
106
0
26 Mar 2024
ChatGPT Incorrectness Detection in Software Reviews
M. Tanzil
Junaed Younus Khan
Gias Uddin
89
4
0
25 Mar 2024
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han
Chao Gao
Jinyang Liu
Jeff Zhang
Sai Qian Zhang
295
403
0
21 Mar 2024
Reverse Training to Nurse the Reversal Curse
O. Yu. Golovneva
Zeyuan Allen-Zhu
Jason Weston
Sainbayar Sukhbaatar
114
38
0
20 Mar 2024
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Junyuan Hong
Jinhao Duan
Chenhui Zhang
Zhangheng Li
Chulin Xie
...
B. Kailkhura
Dan Hendrycks
Dawn Song
Zhangyang Wang
Yue Liu
110
28
0
18 Mar 2024
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot
Adrian Lañcucki
Marcin Chochowski
David Tarjan
Edoardo Ponti
94
56
0
14 Mar 2024
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Adam Ibrahim
Benjamin Thérien
Kshitij Gupta
Mats L. Richter
Quentin Anthony
Timothée Lesort
Eugene Belilovsky
Irina Rish
KELM
CLL
109
63
0
13 Mar 2024
Language models scale reliably with over-training and on downstream tasks
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM
ELM
LRM
183
48
0
13 Mar 2024
Knowledge Conflicts for LLMs: A Survey
Rongwu Xu
Zehan Qi
Zhijiang Guo
Cunxiang Wang
Hongru Wang
Yue Zhang
Wei Xu
299
122
0
13 Mar 2024
Gemma: Open Models Based on Gemini Research and Technology
Gemma Team
Gemma Team Thomas Mesnard
Cassidy Hardin
Robert Dadashi
Surya Bhupatiraju
...
Armand Joulin
Noah Fiedel
Evan Senter
Alek Andreev
Kathleen Kenealy
VLM
LLMAG
242
513
0
13 Mar 2024
CHAI: Clustered Head Attention for Efficient LLM Inference
Saurabh Agarwal
Bilge Acun
Basil Homer
Mostafa Elhoushi
Yejin Lee
Shivaram Venkataraman
Dimitris Papailiopoulos
Carole-Jean Wu
100
11
0
12 Mar 2024
Previous
1
2
3
...
11
12
13
...
21
22
23
Next