1911.11641
Cited By
PIQA: Reasoning about Physical Commonsense in Natural Language
26 November 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
Papers citing
"PIQA: Reasoning about Physical Commonsense in Natural Language"
50 / 1,393 papers shown
Title
Bilingual Adaptation of Monolingual Foundation Models
Gurpreet Gosal
Yishi Xu
Gokul Ramakrishnan
Rituraj Joshi
Avraham Sheinin
...
Rahul Pal
Parvez Mullah
Soundar Doraiswamy
Mohamed El Karim Chami
Preslav Nakov
CLL
105
3
0
13 Jul 2024
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts
Zhenpeng Su
Zijia Lin
Xue Bai
Xing Wu
Yizhe Xiong
...
Guangyuan Ma
Hui Chen
Guiguang Ding
Wei Zhou
Songlin Hu
MoE
93
5
0
13 Jul 2024
sPhinX: Sample Efficient Multilingual Instruction Fine-Tuning Through N-shot Guided Prompting
Sanchit Ahuja
Kumar Tanmay
Hardik Hansrajbhai Chauhan
Barun Patra
Kriti Aggarwal
...
Tejas I. Dhamecha
Ahmed Awadallah
Monojit Choudhary
Vishrav Chaudhary
Sunayana Sitaram
85
0
0
13 Jul 2024
Low-Rank Interconnected Adaptation across Layers
Yibo Zhong
Jinman Zhao
Yao Zhou
OffRL
MoE
114
1
0
13 Jul 2024
MUSCLE: A Model Update Strategy for Compatible LLM Evolution
Jessica Echterhoff
Fartash Faghri
Raviteja Vemulapalli
Ting-Yao Hu
Chun-Liang Li
Oncel Tuzel
Hadi Pouransari
KELM
97
2
0
12 Jul 2024
Accuracy is Not All You Need
Abhinav Dutta
Sanjeev Krishnan
Nipun Kwatra
Ramachandran Ramjee
98
4
0
12 Jul 2024
SoupLM: Model Integration in Large Language and Multi-Modal Models
Yue Bai
Zichen Zhang
Jiasen Lu
Yun Fu
MoMe
59
1
0
11 Jul 2024
RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
Xijie Huang
Zechun Liu
Shih-yang Liu
Kwang-Ting Cheng
MQ
91
9
0
10 Jul 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen
Wenqi Shao
Peng Xu
Jiahao Wang
Peng Gao
Kaipeng Zhang
Ping Luo
MQ
158
35
0
10 Jul 2024
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
Liqun Ma
Mingjie Sun
Zhiqiang Shen
86
9
0
09 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
153
14
0
09 Jul 2024
Data, Data Everywhere: A Guide for Pretraining Dataset Construction
Jupinder Parmar
Shrimai Prabhumoye
Pritam Gundecha
Bo Liu
Aastha Jhunjhunwala
Zhilin Wang
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
122
10
0
08 Jul 2024
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
Luca Zancato
Arjun Seshadri
Yonatan Dukler
Aditya Golatkar
Yantao Shen
Benjamin Bowman
Matthew Trager
Alessandro Achille
Stefano Soatto
77
10
0
08 Jul 2024
Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations
Bowen Shen
Zheng Lin
Daren Zha
Wei Liu
Jian Luan
Bin Wang
Weiping Wang
126
2
0
08 Jul 2024
On Speeding Up Language Model Evaluation
Jin Peng Zhou
Christian K. Belardi
Ruihan Wu
Travis Zhang
Carla P. Gomes
Wen Sun
Kilian Q. Weinberger
161
2
0
08 Jul 2024
LoCo: Low-Bit Communication Adaptor for Large-scale Model Training
Xingyu Xie
Zhijie Lin
Kim-Chuan Toh
Pan Zhou
96
3
0
05 Jul 2024
Waterfall: Framework for Robust and Scalable Text Watermarking
Gregory Kang Ruey Lau
Xinyuan Niu
Hieu Dao
Jiangwei Chen
Chuan-Sheng Foo
Bryan Kian Hsiang Low
WaLM
82
6
0
05 Jul 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
Li Du
Guoqi Li
Jiajun Zhang
157
6
0
05 Jul 2024
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations
Md Tahmid Rahman Laskar
Sawsan Alqahtani
M Saiful Bari
Mizanur Rahman
Mohammad Abdullah Matin Khan
...
Chee Wei Tan
Md. Rizwan Parvez
Enamul Hoque
Shafiq Joty
Jimmy Huang
ELM
ALM
105
41
0
04 Jul 2024
DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs
Zhen Tan
Daize Dong
Xinyu Zhao
Jie Peng
Yu Cheng
Tianlong Chen
MoE
91
4
0
03 Jul 2024
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment
Janghwan Lee
Seongmin Park
S. Hong
Minsoo Kim
Du-Seong Chang
Jungwook Choi
44
6
0
03 Jul 2024
ObfuscaTune: Obfuscated Offsite Fine-tuning and Inference of Proprietary LLMs on Private Datasets
Ahmed Frikha
Nassim Walha
Ricardo Mendes
Krishna Kanth Nakka
Xue Jiang
Xuebing Zhou
160
3
0
03 Jul 2024
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
Xavier Suau
Pieter Delobelle
Katherine Metcalf
Armand Joulin
N. Apostoloff
Luca Zappella
P. Rodríguez
MU
AAML
99
14
0
02 Jul 2024
WTU-EVAL: A Whether-or-Not Tool Usage Evaluation Benchmark for Large Language Models
Kangyun Ning
Yisong Su
Xueqiang Lv
Yuanzhe Zhang
Jian Liu
Kang Liu
Jinan Xu
ELM
LLMAG
80
3
0
02 Jul 2024
GemmAr: Enhancing LLMs Through Arabic Instruction-Tuning
Hasna Chouikhi
Manel Aloui
Cyrine Ben Hammou
Ghaith Chaabane
Haithem Kchaou
Chehir Dhaouadi
76
0
0
02 Jul 2024
Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale
Wenzhen Zheng
Wenbo Pan
Xu Xu
Libo Qin
Li Yue
Ming Zhou
CLL
80
7
0
02 Jul 2024
Why does in-context learning fail sometimes? Evaluating in-context learning on open and closed questions
Xiang Li
Haoran Tang
Siyu Chen
Ziwei Wang
Ryan Chen
Marcin Abram
LRM
106
4
0
02 Jul 2024
Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?
Nishant Balepur
Rachel Rudinger
94
8
0
02 Jul 2024
Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
Chuanpeng Yang
Wang Lu
Yao Zhu
Yidong Wang
Qian Chen
Chenlong Gao
Bingjie Yan
Yiqiang Chen
ALM
KELM
101
32
0
02 Jul 2024
Normalization and effective learning rates in reinforcement learning
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
James Martens
H. V. Hasselt
Razvan Pascanu
Will Dabney
97
13
0
01 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min Lin
MoE
172
54
1
01 Jul 2024
FoldGPT: Simple and Effective Large Language Model Compression Scheme
Songwei Liu
Chao Zeng
Lianqiang Li
Chenqian Yan
Lean Fu
Xing Mei
Fangmin Chen
86
5
0
01 Jul 2024
BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models
Gihun Lee
Minchan Jeong
Yujin Kim
Hojung Jung
Jaehoon Oh
Sangmook Kim
Se-Young Yun
73
3
0
30 Jun 2024
VcLLM: Video Codecs are Secretly Tensor Codecs
Ceyu Xu
Yongji Wu
Xinyu Yang
Beidi Chen
Matthew Lentz
Danyang Zhuo
Lisa Wu Wills
102
0
0
29 Jun 2024
Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
Jiazheng Li
Hainiu Xu
Zhaoyue Sun
Yuxiang Zhou
David West
Cesare Aloisi
Yulan He
LRM
74
4
0
28 Jun 2024
YuLan: An Open-source Large Language Model
Yutao Zhu
Kun Zhou
Kelong Mao
Wentong Chen
Yiding Sun
...
Wenbing Huang
Ze-Feng Gao
Yueguo Chen
Weizheng Lu
Ji-Rong Wen
ALM
ELM
65
1
0
28 Jun 2024
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management
Wonbeom Lee
Jungi Lee
Junghwan Seo
Jaewoong Sim
RALM
87
96
0
28 Jun 2024
Aligning Teacher with Student Preferences for Tailored Training Data Generation
Yantao Liu
Zhao Zhang
Zijun Yao
S. Cao
Lei Hou
Juanzi Li
92
2
0
27 Jun 2024
OutlierTune: Efficient Channel-Wise Quantization for Large Language Models
Jinguang Wang
Yuexi Yin
Haifeng Sun
Qi Qi
Jingyu Wang
Zirui Zhuang
Tingting Yang
Jianxin Liao
76
2
0
27 Jun 2024
Length Optimization in Conformal Prediction
Shayan Kiyani
George Pappas
Hamed Hassani
113
17
0
27 Jun 2024
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Guilherme Penedo
Hynek Kydlíček
Loubna Ben Allal
Anton Lozhkov
Margaret Mitchell
Colin Raffel
Leandro von Werra
Thomas Wolf
141
265
0
25 Jun 2024
Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels
Razvan-Gabriel Dumitru
Vikas Yadav
Rishabh Maheshwary
Paul-Ioan Clotan
Sathwik Tejaswi Madhusudhan
Mihai Surdeanu
MQ
127
2
0
25 Jun 2024
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda
Berivan Isik
Xiangyu Qi
Sanmi Koyejo
Tsachy Weissman
Prateek Mittal
MoMe
139
16
0
24 Jun 2024
Scaling Laws for Linear Complexity Language Models
Xuyang Shen
Dong Li
Ruitao Leng
Zhen Qin
Weigao Sun
Yiran Zhong
LRM
83
8
0
24 Jun 2024
ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models
Yash Akhauri
Ahmed F. AbouElhamayed
Jordan Dotzel
Zhiru Zhang
Alexander M Rush
Safeen Huda
Mohamed S. Abdelfattah
54
5
0
24 Jun 2024
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
Tong Zhu
Xiaoye Qu
Daize Dong
Jiacheng Ruan
Jingqi Tong
Conghui He
Yu Cheng
MoE
ALM
106
89
0
24 Jun 2024
OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser
Jingze Shi
Ting Xie
Bingheng Wu
Chunjun Zheng
Kai Wang
38
2
0
24 Jun 2024
Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Yifei Gao
Jie Ou
Lei Wang
Yuting Xiao
Zhiyuan Xiang
Ruiting Dai
Jun Cheng
MQ
61
3
0
24 Jun 2024
Large Vocabulary Size Improves Large Language Models
Sho Takase
Ryokan Ri
Shun Kiyono
Takuya Kato
133
4
0
24 Jun 2024
Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning
Somnath Basu Roy Chowdhury
Krzysztof Choromanski
Arijit Sehanobish
Avinava Dubey
Snigdha Chaturvedi
MU
108
10
0
24 Jun 2024