1911.11641
PIQA: Reasoning about Physical Commonsense in Natural Language
26 November 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
Papers citing "PIQA: Reasoning about Physical Commonsense in Natural Language"
50 / 1,393 papers shown
Title
Large Language Model-guided Document Selection
Xiang Kong
Tom Gunter
Ruoming Pang
65
4
0
07 Jun 2024
mCSQA: Multilingual Commonsense Reasoning Dataset with Unified Creation Strategy by Language Models and Humans
Yusuke Sakai
Hidetaka Kamigaito
Taro Watanabe
LRM
94
5
0
06 Jun 2024
Scaling and evaluating sparse autoencoders
Leo Gao
Tom Dupré la Tour
Henk Tillman
Gabriel Goh
Rajan Troll
Alec Radford
Ilya Sutskever
Jan Leike
Jeffrey Wu
100
163
0
06 Jun 2024
HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew
Tzuf Paz-Argaman
Itai Mondshine
Asaf Achi Mordechai
Reut Tsarfaty
76
3
0
06 Jun 2024
Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
Naibin Gu
Peng Fu
Xiyu Liu
Bowen Shen
Zheng Lin
Weiping Wang
69
10
0
06 Jun 2024
Wings: Learning Multimodal LLMs without Text-only Forgetting
Yi-Kai Zhang
Shiyin Lu
Yang Li
Yanqing Ma
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
De-Chuan Zhan
Han-Jia Ye
VLM
128
10
0
05 Jun 2024
VideoPhy: Evaluating Physical Commonsense for Video Generation
Hritik Bansal
Zongyu Lin
Tianyi Xie
Zeshun Zong
Michal Yarom
Yonatan Bitton
Chenfanfu Jiang
Ningyu Zhang
Kai-Wei Chang
Aditya Grover
EGVM
VGen
112
45
0
05 Jun 2024
Does your data spark joy? Performance gains from domain upsampling at the end of training
Cody Blakeney
Mansheej Paul
Brett W. Larsen
Sean Owen
Jonathan Frankle
86
20
0
05 Jun 2024
Xmodel-LM Technical Report
Yichuan Wang
Yang Liu
Yu Yan
Qun Wang
Xucheng Huang
Ling Jiang
OSLM
ALM
57
1
0
05 Jun 2024
Scalable MatMul-free Language Modeling
Rui-Jie Zhu
Yu Zhang
Ethan Sifferman
Tyler Sheaves
Yiqiao Wang
Dustin Richmond
P. Zhou
Jason K. Eshraghian
94
22
0
04 Jun 2024
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Namgyu Ho
Sangmin Bae
Taehyeon Kim
Hyunjik Jo
Yireun Kim
Tal Schuster
Adam Fisch
James Thorne
Se-Young Yun
107
9
0
04 Jun 2024
GrootVL: Tree Topology is All You Need in State Space Model
Yicheng Xiao
Lin Song
Shaoli Huang
Jiangshan Wang
Siyu Song
Yixiao Ge
Xiu Li
Ying Shan
Mamba
114
13
0
04 Jun 2024
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
Marianna Nezhurina
Lucia Cipolina-Kun
Mehdi Cherti
J. Jitsev
LLMAG
LRM
ELM
ReLM
191
37
0
04 Jun 2024
OLoRA: Orthonormal Low-Rank Adaptation of Large Language Models
Kerim Büyükakyüz
AI4CE
74
7
0
03 Jun 2024
Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models
Cheng-Hsun Hsueh
Paul Kuo-Ming Huang
Tzu-Han Lin
Che-Wei Liao
Hung-Chieh Fang
Chao-Wei Huang
Yun-Nung Chen
KELM
82
6
0
03 Jun 2024
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen
Linhao Zhang
Junyuan Shang
Zhenyu Zhang
Tingwen Liu
Shuohuan Wang
Yu Sun
69
1
0
03 Jun 2024
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures
Jinjie Ni
Fuzhao Xue
Xiang Yue
Yuntian Deng
Mahir Shah
Kabir Jain
Graham Neubig
Yang You
ELM
82
48
0
03 Jun 2024
A Survey of Useful LLM Evaluation
Ji-Lun Peng
Sijia Cheng
Egil Diau
Yung-Yu Shih
Po-Heng Chen
Yen-Ting Lin
Yun-Nung Chen
LLMAG
ELM
86
16
0
03 Jun 2024
MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization
Aozhong Zhang
Naigang Wang
Yanxia Deng
Xin Li
Zi Yang
Penghang Yin
MQ
71
8
0
02 Jun 2024
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
Zhuo Chen
Rumen Dangovski
Charlotte Loh
Owen Dugan
Di Luo
Marin Soljačić
MQ
93
9
0
31 May 2024
Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs
Davide Paglieri
Saurabh Dash
Tim Rocktäschel
Jack Parker-Holder
MQ
75
6
0
31 May 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai
Wu-Jun Li
MQ
117
0
0
31 May 2024
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
Zachary Ankner
Cody Blakeney
Kartik K. Sreenivasan
Max Marion
Matthew L. Leavitt
Mansheej Paul
115
34
0
30 May 2024
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments
Ke Yi
Yuhui Xu
Heng Chang
Chen Tang
Yuan Meng
Tong Zhang
Jia Li
MQ
83
2
0
30 May 2024
Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads
Avelina Asada Hadji-Kyriacou
Ognjen Arandjelović
32
1
0
30 May 2024
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
Chaochen Gao
Xing Wu
Qingfang Fu
Songlin Hu
SyDa
110
7
0
30 May 2024
SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors
Vijay Lingam
Atula Tejaswi
Aditya Vavre
Aneesh Shetty
Gautham Krishna Gudur
Joydeep Ghosh
Alexandros G. Dimakis
Eunsol Choi
Aleksandar Bojchevski
Sujay Sanghavi
122
18
0
30 May 2024
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Ge Zhang
Scott Qu
Jiaheng Liu
Chenchen Zhang
Chenghua Lin
...
Zi-Kai Zhao
Jiajun Zhang
Wanli Ouyang
Wenhao Huang
Wenhu Chen
ELM
124
46
0
29 May 2024
Compressing Large Language Models using Low Rank and Low Precision Decomposition
R. Saha
Naomi Sagan
Varun Srivastava
Andrea J. Goldsmith
Mert Pilanci
MQ
65
22
0
29 May 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele
Elie Bakouch
Atli Kosson
Loubna Ben Allal
Leandro von Werra
Martin Jaggi
125
45
0
28 May 2024
FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models
Yang Zhang
Yawei Li
Xinpeng Wang
Qianli Shen
Barbara Plank
Bernd Bischl
Mina Rezaei
Kenji Kawaguchi
110
12
0
28 May 2024
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Xing Hu
Yuan Cheng
Dawei Yang
Zhihang Yuan
Jiangyong Yu
Chen Xu
Sifan Zhou
MQ
81
8
0
28 May 2024
Exploring Activation Patterns of Parameters in Language Models
Yudong Wang
Damai Dai
Zhifang Sui
54
2
0
28 May 2024
Outlier-weighed Layerwise Sampling for LLM Fine-tuning
Pengxiang Li
L. Yin
Xiaowei Gao
Shiwei Liu
77
10
0
28 May 2024
LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters
Klaudia Bałazy
Mohammadreza Banaei
Karl Aberer
Jacek Tabor
93
34
0
27 May 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
76
12
0
27 May 2024
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs
Haoyu Wang
Bei Liu
Hang Shao
Bo Xiao
Ke Zeng
Guanglu Wan
Yanmin Qian
MQ
52
1
0
27 May 2024
MoEUT: Mixture-of-Experts Universal Transformers
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
Christopher Potts
Christopher D. Manning
MoE
88
11
0
25 May 2024
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach
Huy V. Vo
Vasil Khalidov
Timothée Darcet
Théo Moutakanni
Nikita Smetanin
...
Maxime Oquab
Armand Joulin
Hervé Jégou
Patrick Labatut
Piotr Bojanowski
SSL
164
23
0
24 May 2024
BiSup: Bidirectional Quantization Error Suppression for Large Language Models
Minghui Zou
Ronghui Guo
Sai Zhang
Xiaowang Zhang
Zhiyong Feng
MQ
82
1
0
24 May 2024
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
Wenyu Du
Tongxu Luo
Zihan Qiu
Zeyu Huang
Songlin Yang
Reynold Cheng
Yike Guo
Jie Fu
82
15
0
24 May 2024
Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks
Jialin Zhao
Yingtao Zhang
Xinghang Li
Huaping Liu
C. Cannistraci
61
1
0
24 May 2024
Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training
Xianzhi Du
Tom Gunter
Xiang Kong
Mark Lee
Zirui Wang
Aonan Zhang
Nan Du
Ruoming Pang
MoE
41
1
0
23 May 2024
Bitune: Bidirectional Instruction-Tuning
D. J. Kopiczko
Tijmen Blankevoort
Yuki Markus Asano
47
3
0
23 May 2024
Lessons from the Trenches on Reproducible Evaluation of Language Models
Stella Biderman
Hailey Schoelkopf
Lintang Sutawika
Leo Gao
J. Tow
...
Xiangru Tang
Kevin A. Wang
Genta Indra Winata
François Yvon
Andy Zou
ELM
ALM
198
63
3
23 May 2024
Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs
Jaewoo Yang
Hayun Kim
Younghoon Kim
87
15
0
23 May 2024
Instruction Tuning With Loss Over Instructions
Zhengyan Shi
Adam X. Yang
Bin Wu
Laurence Aitchison
Emine Yilmaz
Aldo Lipani
ALM
83
23
0
23 May 2024
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Akide Liu
Jing Liu
Zizheng Pan
Yefei He
Gholamreza Haffari
Bohan Zhuang
MQ
91
37
0
23 May 2024
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Wei Huang
Haotong Qin
Yangdong Liu
Yawei Li
Qinshuo Liu
Xianglong Liu
Luca Benini
Michele Magno
Shiming Zhang
Xiaojuan Qi
MQ
137
19
0
23 May 2024
eXmY: A Data Type and Technique for Arbitrary Bit Precision Quantization
Aditya Agrawal
Matthew Hedlund
Blake A. Hechtman
MQ
89
4
0
22 May 2024