arXiv: 1911.11641
PIQA: Reasoning about Physical Commonsense in Natural Language
26 November 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
Papers citing
"PIQA: Reasoning about Physical Commonsense in Natural Language"
50 / 1,393 papers shown
An Empirical Study of Qwen3 Quantization
Xingyu Zheng
Yuye Li
Haoran Chu
Yue Feng
Xudong Ma
Jie Luo
Jinyang Guo
Haotong Qin
Michele Magno
Xianglong Liu
MQ
84
6
0
04 May 2025
Efficient Fine-Tuning of Quantized Models via Adaptive Rank and Bitwidth
Changhai Zhou
Yuhua Zhou
Qian Qiao
Weizhong Zhang
Cheng Jin
MQ
68
1
0
02 May 2025
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Li
Blake Bordelon
Shane Bergsma
Cengiz Pehlevan
Boris Hanin
Joel Hestness
118
2
0
02 May 2025
Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free
Euntae Choi
Sumin Song
Woosang Lim
Sungjoo Yoo
78
0
0
02 May 2025
RWKVQuant: Quantizing the RWKV Family with Proxy Guided Hybrid of Scalar and Vector Quantization
Chen Xu
Yuxuan Yue
Zukang Xu
Xing Hu
Jiangyong Yu
Zhixuan Chen
Sifan Zhou
Zhihang Yuan
Dawei Yang
MQ
64
0
0
02 May 2025
ICQuant: Index Coding enables Low-bit LLM Quantization
Xinlin Li
Osama A. Hanna
Christina Fragouli
Suhas Diggavi
MQ
145
1
0
01 May 2025
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
Piotr Piekos
Róbert Csordás
Jürgen Schmidhuber
MoE
VLM
278
2
0
01 May 2025
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax
Zayd Muhammad Kawakibi Zuhri
Erland Hilman Fuadi
Alham Fikri Aji
54
0
0
29 Apr 2025
Efficient LLMs with AMP: Attention Heads and MLP Pruning
Leandro Giusti Mugnaini
Bruno Yamamoto
Lucas Lauton de Alcantara
Victor Zacarias
Edson Bollis
Lucas Pellicer
A. H. R. Costa
Artur Jordao
86
1
0
29 Apr 2025
Combatting Dimensional Collapse in LLM Pre-Training Data via Diversified File Selection
Ziqing Fan
Siyuan Du
Shengchao Hu
Pingjie Wang
Li Shen
Yanzhe Zhang
Dacheng Tao
Yucheng Wang
94
2
0
29 Apr 2025
Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
Junxuan Zhang
Flood Sung
Zhiyong Yang
Yang Gao
Chongjie Zhang
LLMAG
126
0
0
28 Apr 2025
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
Zhenyu Zhang
Zechun Liu
Yuandong Tian
Harshit Khaitan
Ziyi Wang
Steven Li
106
3
0
28 Apr 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
253
7
0
26 Apr 2025
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs
Hongyu Wang
Shuming Ma
Furu Wei
MQ
96
4
0
25 Apr 2025
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining
Fengze Liu
Weidong Zhou
Binbin Liu
Zhimiao Yu
Yifan Zhang
...
Yifeng Yu
Bingni Zhang
Xiaohuan Zhou
Taifeng Wang
Yong Cao
134
1
0
23 Apr 2025
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation
Luca Moroni
Giovanni Puccetti
Pere-Lluís Huguet Cabot
Andrei Stefan Bejgu
Edoardo Barba
Alessio Miaschi
F. Dell’Orletta
Andrea Esuli
Roberto Navigli
81
2
0
23 Apr 2025
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism
Aviv Bick
Eric P. Xing
Albert Gu
RALM
146
1
0
22 Apr 2025
Natural Fingerprints of Large Language Models
Teppei Suzuki
Ryokan Ri
Sho Takase
63
0
0
21 Apr 2025
Efficient Pretraining Length Scaling
Bohong Wu
Shen Yan
Sijun Zhang
Jianqiao Lu
Yutao Zeng
Ya Wang
Xun Zhou
477
0
0
21 Apr 2025
CoLoTa: A Dataset for Entity-based Commonsense Reasoning over Long-Tail Knowledge
Armin Toroghi
Willis Guo
Scott Sanner
RALM
LRM
70
0
0
20 Apr 2025
NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models
Lawrence Liu
Inesh Chakrabarti
Yixiao Li
Mengdi Wang
Tuo Zhao
Lin F. Yang
MQ
69
0
0
20 Apr 2025
Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator
Akshat Ramachandran
Souvik Kundu
Arnab Raha
Shamik Kundu
Deepak K. Mathaikutty
Tushar Krishna
67
1
0
19 Apr 2025
Empirical Evaluation of Knowledge Distillation from Transformers to Subquadratic Language Models
Patrick Haller
Jonas Golde
Alan Akbik
120
0
0
19 Apr 2025
D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Model
Grace Byun
Jinho D. Choi
EGVM
85
0
0
18 Apr 2025
D²MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving
Haodong Wang
Qihua Zhou
Zicong Hong
Song Guo
MoE
83
0
0
17 Apr 2025
It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Ali Behrouz
Meisam Razaviyayn
Peilin Zhong
Vahab Mirrokni
116
5
0
17 Apr 2025
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Shizhe Diao
Yu Yang
Y. Fu
Xin Dong
Dan Su
...
Hongxu Yin
M. Patwary
Yingyan
Jan Kautz
Pavlo Molchanov
122
2
0
17 Apr 2025
Can Pre-training Indicators Reliably Predict Fine-tuning Outcomes of LLMs?
Hansi Zeng
Kai Hui
Honglei Zhuang
Zhen Qin
Zhenrui Yue
Hamed Zamani
Dana Alon
63
0
0
16 Apr 2025
FLIP Reasoning Challenge
Andreas Plesner
Turlan Kuzhagaliyev
Roger Wattenhofer
AAML
VLM
LRM
187
0
0
16 Apr 2025
Unveiling Hidden Collaboration within Mixture-of-Experts in Large Language Models
Yuanbo Tang
Yan Tang
N. Zhang
Meixuan Chen
Yang Li
MoE
135
1
0
16 Apr 2025
DataDecide: How to Predict Best Pretraining Data with Small Experiments
Ian H. Magnusson
Nguyen Tai
Ben Bogin
David Heineman
Jena D. Hwang
...
Dirk Groeneveld
Oyvind Tafjord
Noah A. Smith
Pang Wei Koh
Jesse Dodge
ALM
83
3
0
15 Apr 2025
Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
Deyu Cao
Samin Aref
MQ
86
0
0
14 Apr 2025
Can the capability of Large Language Models be described by human ability? A Meta Study
Mingrui Zan
Yunquan Zhang
Boyang Zhang
Fangming Liu
Daning Cheng
ELM
LM&MA
84
1
0
13 Apr 2025
Alleviating the Fear of Losing Alignment in LLM Fine-tuning
Kang Yang
Guanhong Tao
X. Chen
Jun Xu
81
1
0
13 Apr 2025
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai
Yuma Ichikawa
MQ
107
0
0
13 Apr 2025
Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance
Ram Mohan Rao Kadiyala
Siddartha Pullakhandam
Siddhant Gupta
Drishti Sharma
Jebish Purbey
Kanwal Mehreen
Muhammad Arham
Hamza Farooq
130
0
0
13 Apr 2025
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation
Juzheng Zhang
Jiacheng You
Ashwinee Panda
Tom Goldstein
MoMe
109
4
0
10 Apr 2025
A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty
Xiaohua Feng
Yuyuan Li
C. Wang
Junlin Liu
Lulu Zhang
Chaochao Chen
MU
57
0
0
09 Apr 2025
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
Dongyang Fan
Vinko Sabolčec
Matin Ansaripour
Ayush Kumar Tarun
Martin Jaggi
Antoine Bosselut
Imanol Schlag
63
1
0
08 Apr 2025
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Biao Zhang
Fedor Moiseev
Joshua Ainslie
Paul Suganthan
Min Ma
Surya Bhupatiraju
Fede Lebron
Orhan Firat
Armand Joulin
Zhe Dong
AI4CE
47
0
0
08 Apr 2025
Graph-based Approaches and Functionalities in Retrieval-Augmented Generation: A Comprehensive Survey
Zulun Zhu
Tiancheng Huang
Kai Wang
Junda Ye
Xiao Chen
Siqiang Luo
3DV
141
0
0
08 Apr 2025
Achieving binary weight and activation for LLMs using Post-Training Quantization
Siqing Song
Chuang Wang
Ruiqi Wang
Yi Yang
Xuyao Zhang
MQ
132
0
0
07 Apr 2025
Saliency-driven Dynamic Token Pruning for Large Language Models
Yao Tao
Yehui Tang
Yun Wang
Mingjian Zhu
Hailin Hu
Yunhe Wang
131
2
0
06 Apr 2025
Compression Laws for Large Language Models
Ayan Sengupta
Siddhant Chaudhary
Tanmoy Chakraborty
57
0
0
06 Apr 2025
Efficient Evaluation of Large Language Models via Collaborative Filtering
Xu-Xiang Zhong
Chao Yi
Han-Jia Ye
118
0
0
05 Apr 2025
A Perplexity and Menger Curvature-Based Approach for Similarity Evaluation of Large Language Models
Yuantao Zhang
Zhankui Yang
AAML
76
0
0
05 Apr 2025
STEP: Staged Parameter-Efficient Pre-training for Large Language Models
Kazuki Yano
Takumi Ito
Jun Suzuki
LRM
132
1
0
05 Apr 2025
Entropy-Based Block Pruning for Efficient Large Language Models
Liangwei Yang
Yuhui Xu
Juntao Tan
Doyen Sahoo
Siyang Song
Caiming Xiong
Han Wang
Shelby Heinecke
AAML
64
0
0
04 Apr 2025
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration
Yuhang Li
Ruokai Yin
Donghyun Lee
Shiting Xiao
Priyadarshini Panda
MQ
124
0
0
03 Apr 2025
Large (Vision) Language Models are Unsupervised In-Context Learners
Artyom Gadetsky
Andrei Atanov
Yulun Jiang
Zhitong Gao
Ghazal Hosseini Mighan
Amir Zamir
Maria Brbić
VLM
MLLM
LRM
279
0
0
03 Apr 2025