PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk, Rowan Zellers, Ronan Le Bras, Jianfeng Gao, Yejin Choi
arXiv 1911.11641 · 26 November 2019 · Tags: OOD, LRM

Papers citing "PIQA: Reasoning about Physical Commonsense in Natural Language"
50 / 1,393 papers shown
MoORE: SVD-based Model MoE-ization for Conflict- and Oblivion-Resistant Multi-Task Adaptation
Shen Yuan, Yin Zheng, Taifeng Wang, Binbin Liu, Hongteng Xu · MoMe · 01 Jul 2025

Revisiting LoRA through the Lens of Parameter Redundancy: Spectral Encoding Helps
Jiashun Cheng, Aochuan Chen, Nuo Chen, Ziqi Gao, Yuhan Li, Jia Li, Fugee Tsung · 20 Jun 2025

EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi, Fan Nie, Alexandre Alahi, James Zou, Himabindu Lakkaraju, Yilun Du, Eric P. Xing, Sham Kakade, Hanlin Zhang · 19 Jun 2025

SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
Samir Khaki, Xiuyu Li, Junxian Guo, Ligeng Zhu, Chenfeng Xu, Konstantinos N. Plataniotis, Amir Yazdanbakhsh, Kurt Keutzer, Song Han, Zhijian Liu · 19 Jun 2025

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Zhiyuan Liang, Dongwen Tang, Yuhao Zhou, Xuanlei Zhao, Mingjia Shi, ..., Damian Borth, Michael M. Bronstein, Yang You, Zhangyang Wang, Kai Wang · OffRL · 19 Jun 2025

Learning-Time Encoding Shapes Unlearning in LLMs
Ruihan Wu, Konstantin Garov, Kamalika Chaudhuri · MU · 18 Jun 2025

RATTENTION: Towards the Minimal Sliding Window Size in Local-Global Attention Models
Bailin Wang, Chang Lan, Chong-Jun Wang, Ruoming Pang · 18 Jun 2025
Context-Informed Grounding Supervision
Hyunji Lee, Seunghyun Yoon, Yunjae Won, Hanseok Oh, Geewook Kim, Trung H. Bui, Franck Dernoncourt, Elias Stengel-Eskin, Mohit Bansal, Minjoon Seo · LRM · 18 Jun 2025

Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality
Yuto Harada, Yusuke Yamauchi, Yusuke Oda, Yohei Oseki, Yusuke Miyao, Yu Takagi · ALM · 17 Jun 2025

Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization
Badr AlKhamissi, C. Nicolò De Sabbata, Zeming Chen, Martin Schrimpf, Antoine Bosselut · MoE, LRM · 16 Jun 2025

BOW: Bottlenecked Next Word Exploration
Ming Shen, Zhikun Xu, Xiao Ye, Jacob Dineen, Ben Zhou · OffRL, LRM · 16 Jun 2025

EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
Zhongqian Fu, Ning Ding, Kai Han, Xianzhi Yu, Xiaosong Li, Xinghao Chen, Yehui Tang, Yunhe Wang · MQ, MoE · 16 Jun 2025

Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization
Guanghui Song, Dongping Liao, Yiren Zhao, Kejiang Ye, Cheng-zhong Xu, X. Gao · MoE · 16 Jun 2025

TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices
Mingxue Xu, Y. Xu, Danilo Mandic · 16 Jun 2025
Assessing the Role of Data Quality in Training Bilingual Language Models
Skyler Seto, Maartje ter Hoeve, Maureen de Seyssel, David Grangier · 15 Jun 2025

Unveiling Confirmation Bias in Chain-of-Thought Reasoning
Yue Wan, Xiaowei Jia, Xiang Li · LRM · 14 Jun 2025

LoRA-Gen: Specializing Large Language Model via Online LoRA Generation
Yicheng Xiao, Lin Song, Rui Yang, Cheng Cheng, Yixiao Ge, Xiu Li, Y. Shan · OffRL · 13 Jun 2025

Curriculum-Guided Layer Scaling for Language Model Pretraining
Karanpartap Singh, Neil Band, Ehsan Adeli · ALM, LRM · 13 Jun 2025

Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training
Mozhi Zhang, Howe Tissue, Lu Wang, Xipeng Qiu · 12 Jun 2025

Beyond Random Sampling: Efficient Language Model Pretraining via Curriculum Learning
Yang Zhang, Amr Mohamed, Hadi Abdine, Guokan Shang, Michalis Vazirgiannis · 12 Jun 2025

One Tokenizer To Rule Them All: Emergent Language Plasticity via Multilingual Tokenizers
Diana Abagyan, Alejandro Salamanca, Andres Felipe Cruz-Salinas, Kris Cao, Hangyu Lin, Acyr Locatelli, Marzieh Fadaee, Ahmet Üstün, Sara Hooker · CLL · 12 Jun 2025

DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts
Yuchen Feng, Bowen Shen, Naibin Gu, Jiaxuan Zhao, Peng Fu, Zheng Lin, Weiping Wang · MoMe, MoE · 11 Jun 2025
TransXSSM: A Hybrid Transformer State Space Model with Unified Rotary Position Embedding
Bingheng Wu, Jingze Shi, Yifan Wu, Nan Tang, Yuyu Luo · 11 Jun 2025

IntPhys 2: Benchmarking Intuitive Physics Understanding In Complex Synthetic Environments
Florian Bordes, Q. Garrido, Justine T Kao, Adina Williams, Michael G. Rabbat, Emmanuel Dupoux · PINN · 11 Jun 2025

Unifying Block-wise PTQ and Distillation-based QAT for Progressive Quantization toward 2-bit Instruction-Tuned LLMs
Jung Hyun Lee, Seungjae Shin, Vinnam Kim, Jaeseong You, An Chen · MQ · 10 Jun 2025

Olica: Efficient Structured Pruning of Large Language Models without Retraining
Jiujun He, Huazhen Lin · 10 Jun 2025

ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving
Yongkang Li, Kaixin Xiong, Xiangyu Guo, Fang Li, Sixu Yan, ..., Bing Wang, Guang Chen, Hangjun Ye, Wenyu Liu, Xinggang Wang · VLM · 09 Jun 2025

Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models
Elena Sofia Ruzzetti, Giancarlo A. Xompero, Davide Venditti, Fabio Massimo Zanzotto · KELM, PILM · 09 Jun 2025

Not quite Sherlock Holmes: Language model predictions do not reliably differentiate impossible from improbable events
J. Michaelov, Reeka Estacio, Zhien Zhang, Benjamin Bergen · ReLM, LRM · 07 Jun 2025

Adapt Once, Thrive with Updates: Transferable Parameter-Efficient Fine-Tuning on Evolving Base Models
Naibin Gu, Peng Fu, Xiyu Liu, Ke Ma, Zheng Lin, Weiping Wang · 07 Jun 2025
MoA: Heterogeneous Mixture of Adapters for Parameter-Efficient Fine-Tuning of Large Language Models
Jie Cao, Tianwei Lin, Hongyang He, Rolan Yan, Wenqiao Zhang, Juncheng Billy Li, D. Zhang, Siliang Tang, Yueting Zhuang · MoE · 06 Jun 2025

dots.llm1 Technical Report
Bi Huo, Bin Tu, Cheng Qin, Da Zheng, Debing Zhang, ..., Yuqiu Ji, Ze Wen, Zhenhai Liu, Zichao Li, Zilong Liao · MoE · 06 Jun 2025

Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models
Peijie Liu, Fengli Xu, Yong Li · LRM · 06 Jun 2025

Come Together, But Not Right Now: A Progressive Strategy to Boost Low-Rank Adaptation
Zhan Zhuang, Xiequn Wang, Wei Li, Yulong Zhang, Qiushi Huang, ..., Yanbin Wei, Yuhe Nie, Kede Ma, Yu Zhang, Ying Wei · 06 Jun 2025

Text-to-LoRA: Instant Transformer Adaption
Rujikorn Charakorn, Edoardo Cetin, Yujin Tang, Robert Tjarko Lange · AI4CE · 06 Jun 2025

DynamicMind: A Tri-Mode Thinking System for Large Language Models
Wei Li, Yanbin Wei, Qiushi Huang, Jiangyue Yan, Yang Chen, James T. Kwok, Yu Zhang · LLMAG, LRM · 06 Jun 2025
MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
J. Oswald, Nino Scherrer, Seijin Kobayashi, Luca Versari, Songlin Yang, ..., Guillaume Lajoie, Charlotte Frenkel, Razvan Pascanu, Blaise Agüera y Arcas, João Sacramento · 05 Jun 2025

FPTQuant: Function-Preserving Transforms for LLM Quantization
Boris van Breugel, Yelysei Bondarenko, Paul N. Whatmough, Markus Nagel · MQ · 05 Jun 2025

SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling
Anhao Zhao, Fanghua Ye, Yingqi Fan, Junlong Tong, Zhiwei Fei, Hui Su, Xiaoyu Shen · 04 Jun 2025

TokAlign: Efficient Vocabulary Adaptation via Token Alignment
Chong Li, Jiajun Zhang, Chengqing Zong · VLM · 04 Jun 2025

A Statistical Physics of Language Model Reasoning
Jack David Carson, Amir Reisizadeh · LRM, AI4CE · 04 Jun 2025

Accurate Sublayer Pruning for Large Language Models by Exploiting Latency and Tunability Information
Seungcheol Park, Sojin Lee, Jongjin Kim, Jinsik Lee, Hyunjik Jo, U. Kang · 04 Jun 2025

MANBench: Is Your Multimodal Model Smarter than Human?
Han Zhou, Qitong Xu, Yiheng Dong, Xin Yang · 04 Jun 2025
Beyond Text Compression: Evaluating Tokenizers Across Scales
Jonas F. Lotz, António V. Lopes, Stephan Peitz, Hendra Setiawan, Leonardo Emili · 03 Jun 2025

PoLAR: Polar-Decomposed Low-Rank Adapter Representation
Kai Lion, Liang Zhang, Bingcong Li, Niao He · 03 Jun 2025

ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations
Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba · 03 Jun 2025

EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
Shihan Dou, Ming Zhang, Chenhao Huang, Jiayi Chen, F. Chen, ..., Wei Chengzhi, Lin Yan, Qi Zhang, Xuanjing Huang · ELM · 03 Jun 2025

Scaling Fine-Grained MoE Beyond 50B Parameters: Empirical Evaluation and Practical Insights
Jakub Krajewski, Marcin Chochowski, Daniel Korzekwa · MoE, ALM · 03 Jun 2025

ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding
Junliang Ye, Zhengyi Wang, Ruowen Zhao, Shenghao Xie, Jun Zhu · 02 Jun 2025

Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
Yoonjun Cho, Soeun Kim, Dongjae Jeon, Kyelim Lee, Beomsoo Lee, Albert No · MQ · 02 Jun 2025