arXiv:1911.11641
PIQA: Reasoning about Physical Commonsense in Natural Language
26 November 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
Papers citing "PIQA: Reasoning about Physical Commonsense in Natural Language"
50 / 1,393 papers shown
Title
Reverse Training to Nurse the Reversal Curse
O. Yu. Golovneva
Zeyuan Allen-Zhu
Jason Weston
Sainbayar Sukhbaatar
116
38
0
20 Mar 2024
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers
Vidhi Jain
Maria Attarian
Nikhil J. Joshi
Ayzaan Wahid
Danny Driess
...
Stefan Welker
Christine Chan
Igor Gilitschenski
Yonatan Bisk
Debidatta Dwibedi
136
32
0
19 Mar 2024
Think Twice Before Trusting: Self-Detection for Large Language Models through Comprehensive Answer Reflection
Moxin Li
Wenjie Wang
Fuli Feng
Fengbin Zhu
Qifan Wang
Tat-Seng Chua
HILM
LRM
115
23
0
15 Mar 2024
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot
Adrian Łańcucki
Marcin Chochowski
David Tarjan
Edoardo Ponti
98
56
0
14 Mar 2024
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Brandon McKinzie
Zhe Gan
J. Fauconnier
Sam Dodge
Bowen Zhang
...
Zirui Wang
Ruoming Pang
Peter Grasch
Alexander Toshev
Yinfei Yang
MLLM
127
209
0
14 Mar 2024
Meaningful Learning: Advancing Abstract Reasoning in Large Language Models via Generic Fact Guidance
Kai Xiong
Xiao Ding
Ting Liu
Bing Qin
Dongliang Xu
Qing Yang
Hongtao Liu
Yixin Cao
LRM
72
7
0
14 Mar 2024
Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference
Muhammad Adnan
Akhil Arunkumar
Gaurav Jain
Prashant J. Nair
Ilya Soloveychik
Purushotham Kamath
112
62
0
14 Mar 2024
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Adam Ibrahim
Benjamin Thérien
Kshitij Gupta
Mats L. Richter
Quentin Anthony
Timothée Lesort
Eugene Belilovsky
Irina Rish
KELM
CLL
109
63
0
13 Mar 2024
Language models scale reliably with over-training and on downstream tasks
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM
ELM
LRM
183
48
0
13 Mar 2024
Gemma: Open Models Based on Gemini Research and Technology
Gemma Team
Thomas Mesnard
Cassidy Hardin
Robert Dadashi
Surya Bhupatiraju
...
Armand Joulin
Noah Fiedel
Evan Senter
Alek Andreev
Kathleen Kenealy
VLM
LLMAG
242
513
0
13 Mar 2024
CHAI: Clustered Head Attention for Efficient LLM Inference
Saurabh Agarwal
Bilge Acun
Basil Homer
Mostafa Elhoushi
Yejin Lee
Shivaram Venkataraman
Dimitris Papailiopoulos
Carole-Jean Wu
107
11
0
12 Mar 2024
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Fangyun Wei
Xi Chen
Linzi Luo
ELM
ALM
LRM
63
8
0
12 Mar 2024
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Sainbayar Sukhbaatar
O. Yu. Golovneva
Vasu Sharma
Hu Xu
Xi Lin
...
Jacob Kahn
Shang-Wen Li
Wen-tau Yih
Jason Weston
Xian Li
MoMe
OffRL
MoE
96
69
0
12 Mar 2024
Harder Tasks Need More Experts: Dynamic Routing in MoE Models
Quzhe Huang
Zhenwei An
Zhuang Nan
Mingxu Tao
Chen Zhang
...
Kun Xu
Kun Xu
Liwei Chen
Songfang Huang
Yansong Feng
MoE
94
28
0
12 Mar 2024
Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs
Tianqing Fang
Zeming Chen
Yangqiu Song
Antoine Bosselut
ReLM
LRM
71
14
0
12 Mar 2024
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
Xin Wang
Yu Zheng
Zhongwei Wan
Mi Zhang
MQ
161
64
0
12 Mar 2024
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu
Zhanpeng Zeng
Li Zhang
Vikas Singh
MQ
60
8
0
10 Mar 2024
Yi: Open Foundation Models by 01.AI
01.AI
Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM
LRM
317
576
0
07 Mar 2024
QAQ: Quality Adaptive Quantization for LLM KV Cache
Shichen Dong
Wenfang Cheng
Jiayu Qin
Wei Wang
MQ
118
36
0
07 Mar 2024
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Xin Men
Mingyu Xu
Qingyu Zhang
Bingning Wang
Hongyu Lin
Yaojie Lu
Xianpei Han
Weipeng Chen
117
142
0
06 Mar 2024
Should We Fear Large Language Models? A Structural Analysis of the Human Reasoning System for Elucidating LLM Capabilities and Risks Through the Lens of Heidegger's Philosophy
Jianqiu Zhang
ELM
67
1
0
05 Mar 2024
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset
Arda Uzunoğlu
Abdalfatah Rashid Safa
Gözde Gül Şahin
LRM
70
2
0
05 Mar 2024
How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider Transformer Models
Xin Lu
Yanyan Zhao
Bing Qin
73
0
0
04 Mar 2024
Right for Right Reasons: Large Language Models for Verifiable Commonsense Knowledge Graph Question Answering
Armin Toroghi
Willis Guo
Mohammad Mahdi Torabi pour
Scott Sanner
LRM
101
10
0
03 Mar 2024
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Xiang Meng
Shibal Ibrahim
Kayhan Behdin
Hussein Hazimeh
Natalia Ponomareva
Rahul Mazumder
VLM
104
8
0
02 Mar 2024
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
Tianyi Zhang
Jonah Yi
Bowen Yao
Zhaozhuo Xu
Anshumali Shrivastava
MQ
104
7
0
02 Mar 2024
LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization
Juntao Zhao
Borui Wan
Size Zheng
Yanghua Peng
Chuan Wu
MQ
63
15
0
02 Mar 2024
FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition
Xiaoqiang Wang
Bang Liu
Lingfei Wu
87
0
0
29 Feb 2024
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap
Saurabh Srivastava
Annarose M B
Anto P V
Shashank Menon
Ajay Sukumar
Adwaith Samod T
Alan Philipose
Stevin Prince
Sooraj Thomas
ELM
ReLM
LRM
79
56
0
29 Feb 2024
Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge
Ansh Arora
Xuanli He
Maximilian Mozes
Srinibas Swain
Mark Dras
Xingliang Yuan
SILM
MoMe
AAML
121
14
0
29 Feb 2024
Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning
Weijieying Ren
Xinlong Li
Lei Wang
Tianxiang Zhao
Wei Qin
CLL
KELM
117
39
0
29 Feb 2024
Tokenization Is More Than Compression
Craig W. Schmidt
Varshini Reddy
Haoran Zhang
Alec Alameddine
Omri Uzan
Yuval Pinter
Chris Tanner
124
38
0
28 Feb 2024
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning
Jiachun Li
Pengfei Cao
Chenhao Wang
Zhuoran Jin
Yubo Chen
Daojian Zeng
Kang Liu
Jun Zhao
LRM
101
10
0
28 Feb 2024
Evaluating Quantized Large Language Models
Shiyao Li
Xuefei Ning
Luning Wang
Tengxuan Liu
Xiangsheng Shi
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MQ
119
53
0
28 Feb 2024
FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
Yi Zhang
Fei Yang
Shuang Peng
Fangyu Wang
Aimin Pan
MQ
71
2
0
28 Feb 2024
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Shuming Ma
Hongyu Wang
Lingxiao Ma
Lei Wang
Wenhui Wang
Shaohan Huang
Lifeng Dong
Ruiping Wang
Jilong Xue
Furu Wei
MQ
95
234
0
27 Feb 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
126
81
0
27 Feb 2024
KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark
Seongbo Jang
Seonghyeon Lee
Hwanjo Yu
ELM
71
0
0
27 Feb 2024
Measuring Vision-Language STEM Skills of Neural Models
Jianhao Shen
Ye Yuan
Srbuhi Mirzoyan
Ming Zhang
Chenguang Wang
VLM
119
12
0
27 Feb 2024
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
Sunghyeon Woo
Baeseong Park
Byeongwook Kim
Minjung Jo
S. Kwon
Dongsuk Jeon
Dongsoo Lee
132
3
0
27 Feb 2024
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Omkar Thawakar
Ashmal Vayani
Salman Khan
Hisham Cholakkal
Rao M. Anwer
Michael Felsberg
Timothy Baldwin
Eric P. Xing
Fahad Shahbaz Khan
115
35
0
26 Feb 2024
Nemotron-4 15B Technical Report
Jupinder Parmar
Shrimai Prabhumoye
Pritam Gundecha
M. Patwary
Sandeep Subramanian
...
Ashwath Aithal
Oleksii Kuchaiev
Mohammad Shoeybi
Jonathan Cohen
Bryan Catanzaro
101
23
0
26 Feb 2024
DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models
Wei He
Kai Han
Yehui Tang
Chengcheng Wang
Yujie Yang
Tianyu Guo
Yunhe Wang
Mamba
121
27
0
26 Feb 2024
Data-free Weight Compress and Denoise for Large Language Models
Runyu Peng
Yunhua Zhou
Qipeng Guo
Yang Gao
Hang Yan
Xipeng Qiu
Dahua Lin
160
1
0
26 Feb 2024
GPTVQ: The Blessing of Dimensionality for LLM Quantization
M. V. Baalen
Andrey Kuzmin
Ivan Koryakovskiy
Markus Nagel
Peter Couperus
Cédric Bastoul
E. Mahurin
Tijmen Blankevoort
Paul N. Whatmough
MQ
110
35
0
23 Feb 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu
Changsheng Zhao
Forrest N. Iandola
Chen Lai
Yuandong Tian
...
Ernie Chang
Yangyang Shi
Raghuraman Krishnamoorthi
Liangzhen Lai
Vikas Chandra
ALM
137
102
0
22 Feb 2024
"My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models
Xinpeng Wang
Bolei Ma
Chengzhi Hu
Leon Weber-Genzel
Paul Röttger
Frauke Kreuter
Dirk Hovy
Barbara Plank
83
46
0
22 Feb 2024
On the Tip of the Tongue: Analyzing Conceptual Representation in Large Language Models with Reverse-Dictionary Probe
Ningyu Xu
Qi Zhang
Menghan Zhang
Peng Qian
Xuanjing Huang
LRM
124
3
0
22 Feb 2024
Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
Ning Bian
Xianpei Han
Hongyu Lin
Yaojie Lu
Le Sun
82
1
0
22 Feb 2024
Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization
Xuxi Chen
Zhendong Wang
Daouda Sow
Junjie Yang
Tianlong Chen
Yingbin Liang
Mingyuan Zhou
Zhangyang Wang
88
7
0
22 Feb 2024