Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1803.05457
Cited By
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
14 March 2018
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge"
50 / 292 papers shown
Title
More Expressive Attention with Negative Weights
Ang Lv
Ruobing Xie
Shuaipeng Li
Jiayi Liao
Xingwu Sun
Zhanhui Kang
Di Wang
Rui Yan
89
1
0
11 Nov 2024
Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training
Elia Cunegatti
Leonardo Lucio Custode
Giovanni Iacca
136
0
0
11 Nov 2024
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
Runming Yang
Taiqiang Wu
Jiahao Wang
Pengfei Hu
Ngai Wong
Yujiu Yang
Yujiu Yang
427
1
0
11 Nov 2024
TODO: Enhancing LLM Alignment with Ternary Preferences
Yuxiang Guo
Lu Yin
Bo Jiang
Jiaqi Zhang
116
3
0
02 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
129
13
0
31 Oct 2024
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Yunjia Qi
Hao Peng
Xinyu Wang
Bin Xu
Lei Hou
Juanzi Li
102
4
0
31 Oct 2024
EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation
Shih-yang Liu
Huck Yang
Nai Chit Fung
Charbel Sakr
Hongxu Yin
...
Jan Kautz
Yu-Chun Wang
Pavlo Molchanov
Min-Hung Chen
Min-Hung Chen
MQ
99
0
0
28 Oct 2024
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Sangmin Bae
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Seungyeon Kim
Tal Schuster
KELM
129
7
0
28 Oct 2024
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
Vipul Gupta
Candace Ross
David Pantoja
R. Passonneau
Megan Ung
Adina Williams
285
2
0
26 Oct 2024
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
Haocheng Xi
Han Cai
Ligeng Zhu
Yaojie Lu
Kurt Keutzer
Jianfei Chen
Song Han
MQ
153
10
0
25 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Liwen Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
157
7
0
24 Oct 2024
Scaling up Masked Diffusion Models on Text
Shen Nie
Fengqi Zhu
Chao Du
Tianyu Pang
Qian Liu
Guangtao Zeng
Min Lin
Chongxuan Li
AI4CE
179
30
0
24 Oct 2024
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
80
5
0
24 Oct 2024
Controlled Low-Rank Adaptation with Subspace Regularization for Continued Training on Large Language Models
Yuheng Lu
Bingshuo Qian
Caixia Yuan
Huixing Jiang
Xiaojie Wang
CLL
76
0
0
22 Oct 2024
Self-calibration for Language Model Quantization and Pruning
Miles Williams
G. Chrysostomou
Nikolaos Aletras
MQ
468
0
0
22 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
170
7
0
22 Oct 2024
ToW: Thoughts of Words Improve Reasoning in Large Language Models
Zhikun Xu
Ming shen
Jacob Dineen
Zhaonan Li
Xiao Ye
Shijie Lu
Aswin Rrv
Chitta Baral
Ben Zhou
LRM
454
1
0
21 Oct 2024
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference
You Wu
Haoyi Wu
Kewei Tu
70
3
0
18 Oct 2024
SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment
Qin Liu
Fei Wang
Chaowei Xiao
Muhao Chen
438
2
0
18 Oct 2024
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu
Xuezhe Ma
LRM
145
5
0
18 Oct 2024
On the Role of Attention Heads in Large Language Model Safety
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Kun Wang
Yang Liu
Sihang Li
Yongbin Li
135
10
0
17 Oct 2024
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li
Sen Mei
Zhenghao Liu
Yukun Yan
Shuo Wang
...
Haotian Chen
Ge Yu
Zhiyuan Liu
Maosong Sun
Chenyan Xiong
94
11
0
17 Oct 2024
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers
Shwai He
Tao Ge
Guoheng Sun
Bowei Tian
Xiaoyang Wang
Ang Li
MoE
104
1
0
17 Oct 2024
MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
Bokai Lin
Zihao Zeng
Zipeng Xiao
Siqi Kou
Tianqi Hou
Xiaofeng Gao
Hao Zhang
Zhijie Deng
73
6
0
16 Oct 2024
In-context KV-Cache Eviction for LLMs via Attention-Gate
Zihao Zeng
Bokai Lin
Tianqi Hou
Hao Zhang
Zhijie Deng
102
2
0
15 Oct 2024
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
Syeda Nahida Akter
Shrimai Prabhumoye
John Kamalu
S. Satheesh
Eric Nyberg
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
LRM
SyDa
ReLM
154
2
0
15 Oct 2024
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
128
2
0
14 Oct 2024
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Guorui Zheng
Xidong Wang
Juhao Liang
Nuo Chen
Yuping Zheng
Benyou Wang
MoE
120
5
0
14 Oct 2024
Self-Data Distillation for Recovering Quality in Pruned Large Language Models
Vithursan Thangarasa
Ganesh Venkatesh
Mike Lasby
Nish Sinnadurai
Sean Lie
SyDa
146
2
0
13 Oct 2024
FlatQuant: Flatness Matters for LLM Quantization
Yuxuan Sun
Ruikang Liu
Haoli Bai
Han Bao
Kang Zhao
...
Lu Hou
Chun Yuan
Xin Jiang
Wen Liu
Jun Yao
MQ
162
11
0
12 Oct 2024
ELICIT: LLM Augmentation via External In-Context Capability
Futing Wang
Jianhao Yan
Yue Zhang
Tao Lin
120
1
0
12 Oct 2024
Language Imbalance Driven Rewarding for Multilingual Self-improving
Wen Yang
Junhong Wu
Chen Wang
Chengqing Zong
J.N. Zhang
ALM
LRM
195
7
0
11 Oct 2024
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
116
12
0
11 Oct 2024
SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture
Jiayi Han
Liang Du
Hongwei Du
Xiangguo Zhou
Yiwen Wu
Weibo Zheng
Donghong Han
CLL
MoMe
MoE
79
4
0
10 Oct 2024
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act
Philipp Guldimann
Alexander Spiridonov
Robin Staab
Nikola Jovanović
Mark Vero
...
Mislav Balunović
Nikola Konstantinov
Pavol Bielik
Petar Tsankov
Martin Vechev
ELM
101
8
0
10 Oct 2024
A Closer Look at Machine Unlearning for Large Language Models
Xiaojian Yuan
Tianyu Pang
Chao Du
Kejiang Chen
Weiming Zhang
Min Lin
MU
225
13
0
10 Oct 2024
Data Selection via Optimal Control for Language Models
Yuxian Gu
Li Dong
Hongning Wang
Y. Hao
Qingxiu Dong
Furu Wei
Minlie Huang
AI4CE
147
8
0
09 Oct 2024
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs
Ruijia Niu
D. Wu
Rose Yu
Yi-An Ma
83
2
0
09 Oct 2024
No Need to Talk: Asynchronous Mixture of Language Models
Anastasiia Filippova
Angelos Katharopoulos
David Grangier
Ronan Collobert
MoE
87
0
0
04 Oct 2024
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
Xinpeng Wang
Chengzhi Hu
Paul Röttger
Barbara Plank
147
11
0
04 Oct 2024
How Much Can We Forget about Data Contamination?
Sebastian Bordt
Suraj Srinivas
Valentyn Boreiko
U. V. Luxburg
133
2
0
04 Oct 2024
Selective Attention Improves Transformer
Yaniv Leviathan
Matan Kalman
Yossi Matias
105
12
0
03 Oct 2024
How to Train Long-Context Language Models (Effectively)
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
166
48
0
03 Oct 2024
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Yuxuan Yao
Han Wu
Mingyang Liu
Sichun Luo
Xiongwei Han
Jie Liu
Zhijiang Guo
Linqi Song
96
7
0
03 Oct 2024
ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
Xiangyu Peng
Congying Xia
Xinyi Yang
Caiming Xiong
Chien-Sheng Wu
Chen Xing
LRM
106
8
0
03 Oct 2024
Better Instruction-Following Through Minimum Bayes Risk
Ian Wu
Patrick Fernandes
Amanda Bertsch
Seungone Kim
Sina Pakazad
Graham Neubig
128
11
0
03 Oct 2024
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Tung-Yu Wu
Pei-Yu Lo
ReLM
LRM
115
2
0
02 Oct 2024
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Jiyeon Kim
Hyunji Lee
Hyowon Cho
Joel Jang
Hyeonbin Hwang
Seungpil Won
Youbin Ahn
Dohaeng Lee
Minjoon Seo
KELM
394
5
0
02 Oct 2024
Geometric Signatures of Compositionality Across a Language Model's Lifetime
Jin Hwa Lee
Thomas Jiralerspong
Lei Yu
Yoshua Bengio
Emily Cheng
CoGe
182
4
0
02 Oct 2024
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
Yuxuan Zhang
Ruizhe Li
MoMe
170
2
0
02 Oct 2024
Previous
1
2
3
4
5
6
Next