1911.11641
PIQA: Reasoning about Physical Commonsense in Natural Language
26 November 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
Papers citing
"PIQA: Reasoning about Physical Commonsense in Natural Language"
50 / 1,393 papers shown
Title
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models
Luohe Shi
Yao Yao
Zuchao Li
Lefei Zhang
Hai Zhao
60
0
0
30 Sep 2024
Hyper-Connections
Defa Zhu
Hongzhi Huang
Zihao Huang
Yutao Zeng
Yunyao Mao
Banggu Wu
Qiyang Min
Xun Zhou
87
6
0
29 Sep 2024
Analog In-Memory Computing Attention Mechanism for Fast and Energy-Efficient Large Language Models
Nathan Leroux
Paul-Philipp Manea
Chirag Sudarshan
Jan Finkbeiner
Sebastian Siegel
J. Strachan
Emre Neftci
51
1
0
28 Sep 2024
Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning
Yu Fu
Jie He
Yifan Yang
Qun Liu
Deyi Xiong
OffRL
LRM
107
0
0
27 Sep 2024
Scene Exploration by Vision-Language Models
Venkatesh Sripada
Samuel Carter
Frank Guerin
Amir Ghalamzan
LM&Ro
79
1
0
26 Sep 2024
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
Yifei Liu
Jicheng Wen
Yang Wang
Shengyu Ye
Li Lyna Zhang
Ting Cao
Cheng Li
Mao Yang
MQ
239
16
0
25 Sep 2024
PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning
Qibin Wang
Xiaolin Hu
Weikai Xu
Wei Liu
Jian Luan
Bin Wang
54
1
0
25 Sep 2024
RISCORE: Enhancing In-Context Riddle Solving in Language Models through Context-Reconstructed Example Augmentation
Ioannis Panagiotopoulos
Giorgos Filandrianos
Maria Lymperaiou
Giorgos Stamou
LRM
ReLM
75
1
0
24 Sep 2024
MonoFormer: One Transformer for Both Diffusion and Autoregression
Chuyang Zhao
Yuxing Song
Wenhao Wang
Haocheng Feng
Errui Ding
Yifan Sun
Xinyan Xiao
Jingdong Wang
DiffM
77
22
0
24 Sep 2024
Small Language Models: Survey, Measurements, and Insights
Zhenyan Lu
Xiang Li
Dongqi Cai
Rongjie Yi
Fangming Liu
Xiwen Zhang
Nicholas D. Lane
Mengwei Xu
ObjD
LRM
169
58
0
24 Sep 2024
Target-Aware Language Modeling via Granular Data Sampling
Ernie Chang
Pin-Jie Lin
Yang Li
Changsheng Zhao
Daeil Kim
Rastislav Rabatin
Zechun Liu
Yangyang Shi
Vikas Chandra
SyDa
63
1
0
23 Sep 2024
Investigating Layer Importance in Large Language Models
Yang Zhang
Yanfei Dong
Kenji Kawaguchi
FAtt
98
10
0
22 Sep 2024
CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information
Yuxin Wang
Minghua Ma
Zekun Wang
Jingchang Chen
Huiming Fan
Liping Shan
Qing Yang
Dongliang Xu
Ming Liu
Bing Qin
79
4
0
20 Sep 2024
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang
Vardan Papyan
VLM
164
3
0
20 Sep 2024
Bilingual Evaluation of Language Models on General Knowledge in University Entrance Exams with Minimal Contamination
Eva Sánchez Salido
Roser Morante
Julio Gonzalo
Guillermo Marco
Jorge Carrillo-de-Albornoz
...
Enrique Amigó
Andrés Fernández
Alejandro Benito-Santos
Adrián Ghajari Espinosa
Victor Fresno
ELM
131
0
0
19 Sep 2024
MEOW: MEMOry Supervised LLM Unlearning Via Inverted Facts
Tianle Gu
Kexin Huang
Ruilin Luo
Yuanqi Yao
Yujiu Yang
Yan Teng
Yingchun Wang
MU
143
9
0
18 Sep 2024
The Factuality of Large Language Models in the Legal Domain
Rajaa El Hamdani
Thomas Bonald
Fragkiskos D. Malliaros
Nils Holzenberger
Fabian M. Suchanek
AILaw
HILM
110
1
0
18 Sep 2024
Mixture of Diverse Size Experts
Manxi Sun
Wei Liu
Jian Luan
Pengzhi Gao
Bin Wang
MoE
40
1
0
18 Sep 2024
Reward-Robust RLHF in LLMs
Yuzi Yan
Xingzhou Lou
Jialian Li
Yiping Zhang
Jian Xie
Chao Yu
Yu Wang
Dong Yan
Yuan Shen
99
13
0
18 Sep 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
247
132
0
18 Sep 2024
AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs
Basel Mousi
Nadir Durrani
Fatema Ahmad
Md. Arid Hasan
Maram Hasanain
Tameem Kabbani
Fahim Dalvi
Shammur A. Chowdhury
Firoj Alam
97
9
0
17 Sep 2024
Propulsion: Steering LLM with Tiny Fine-Tuning
Md. Kowsher
Nusrat Jahan Prottasha
Prakash Bhat
93
7
0
17 Sep 2024
MotIF: Motion Instruction Fine-tuning
Minyoung Hwang
Joey Hejna
Dorsa Sadigh
Yonatan Bisk
86
1
0
16 Sep 2024
MARCA: Mamba Accelerator with ReConfigurable Architecture
Jinhao Li
Shan Huang
Jiaming Xu
Jun Liu
Li Ding
Ningyi Xu
Guohao Dai
83
9
0
16 Sep 2024
The 20 questions game to distinguish large language models
Gurvan Richardeau
Erwan Le Merrer
C. Penzo
Gilles Tredan
106
1
0
16 Sep 2024
Flash STU: Fast Spectral Transform Units
Y. Isabel Liu
Windsor Nguyen
Yagiz Devre
Evan Dogariu
Anirudha Majumdar
Elad Hazan
AI4TS
156
1
0
16 Sep 2024
Understanding Foundation Models: Are We Back in 1924?
Alan F. Smeaton
AI4CE
70
3
0
11 Sep 2024
Gated Slot Attention for Efficient Linear-Time Sequence Modeling
Yu Zhang
Aaron Courville
Ruijie Zhu
Yue Zhang
Leyang Cui
...
Freda Shi
Bailin Wang
Wei Bi
P. Zhou
Guohong Fu
117
24
0
11 Sep 2024
DiPT: Enhancing LLM reasoning through diversified perspective-taking
H. Just
Mahavir Dabas
Lifu Huang
Ming Jin
Ruoxi Jia
LRM
72
1
0
10 Sep 2024
Improving Pretraining Data Using Perplexity Correlations
Tristan Thrush
Christopher Potts
Tatsunori Hashimoto
107
22
0
09 Sep 2024
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini
Pierre Ablin
David Grangier
ODL
91
13
0
05 Sep 2024
Unveiling the Vulnerability of Private Fine-Tuning in Split-Based Frameworks for Large Language Models: A Bidirectionally Enhanced Attack
Guanzhong Chen
Zhenghan Qin
Mingxin Yang
Yajie Zhou
Tao Fan
Tianyu Du
Zenglin Xu
AAML
121
6
0
02 Sep 2024
Does Alignment Tuning Really Break LLMs' Internal Confidence?
Hongseok Oh
Wonseok Hwang
137
0
0
31 Aug 2024
OnlySportsLM: Optimizing Sports-Domain Language Models with SOTA Performance under Billion Parameters
Zexin Chen
Chengxi Li
Xiangyu Xie
Parijat Dube
ALM
64
2
0
30 Aug 2024
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models
Yuncheng Yang
Yulei Qin
Tong Wu
Zihan Xu
Gang Li
...
Yuchen Shi
Ke Li
Xing Sun
Jie Yang
Yun Gu
ALM
OffRL
MoE
127
0
0
28 Aug 2024
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
Nikolas Gritsch
Qizhen Zhang
Acyr Locatelli
Sara Hooker
Ahmet Üstün
MoE
86
3
0
28 Aug 2024
Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough
Konstantin Dobler
Gerard de Melo
78
1
0
28 Aug 2024
Legilimens: Practical and Unified Content Moderation for Large Language Model Services
Jialin Wu
Jiangyi Deng
Shengyuan Pang
Yanjiao Chen
Jiayang Xu
Xinfeng Li
Wei Dong
131
8
0
28 Aug 2024
Focused Large Language Models are Stable Many-Shot Learners
Peiwen Yuan
Shaoxiong Feng
Yiwei Li
Xinglin Wang
Y. Zhang
Chuyi Tan
Boyuan Pan
Heda Wang
Yao Hu
Kan Li
102
5
0
26 Aug 2024
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler
Songlin Yang
Matthew Stallone
Mayank Mishra
Gaoyuan Zhang
Shawn Tan
Aditya Prasad
Adriana Meza Soria
David D. Cox
Yikang Shen
109
16
0
23 Aug 2024
CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher
Derry Pratama
Naufal Suryanto
Andro Aprila Adiputra
Thi-Thu-Huong Le
Ahmada Yusril Kadiptya
Muhammad Iqbal
Howon Kim
82
9
0
21 Aug 2024
Differentiating Choices via Commonality for Multiple-Choice Question Answering
Wenqing Deng
Zhe Wang
Kewen Wang
Shirui Pan
Xiaowang Zhang
Zhiyong Feng
67
0
0
21 Aug 2024
Diagnosing and Remedying Knowledge Deficiencies in LLMs via Label-free Curricular Meaningful Learning
Kai Xiong
Xiao Ding
Li Du
Jiahao Ying
Ting Liu
Bing Qin
Yixin Cao
92
2
0
21 Aug 2024
First Activations Matter: Training-Free Methods for Dynamic Activation in Large Language Models
Chi Ma
Mincong Huang
Ying Zhang
Chao Wang
Yujie Wang
Lei Yu
Chuan Liu
Wei Lin
AI4CE
LLMSV
81
2
0
21 Aug 2024
CoDi: Conversational Distillation for Grounded Question Answering
Patrick Huber
Arash Einolghozati
Rylan Conway
Kanika Narang
Matt Smith
Waqar Nayyar
Adithya Sagar
Ahmed Aly
Akshat Shrivastava
35
0
0
20 Aug 2024
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Chunting Zhou
Lili Yu
Arun Babu
Kushal Tirumala
Michihiro Yasunaga
Leonid Shamis
Jacob Kahn
Xuezhe Ma
Luke Zettlemoyer
Omer Levy
DiffM
130
190
0
20 Aug 2024
HMoE: Heterogeneous Mixture of Experts for Language Modeling
An Wang
Xingwu Sun
Ruobing Xie
Shuaipeng Li
Jiaqi Zhu
...
J. N. Han
Zhanhui Kang
Di Wang
Naoaki Okazaki
Cheng-zhong Xu
MoE
127
18
0
20 Aug 2024
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
Guanchen Li
Xiandong Zhao
Lian Liu
Zeping Li
Dong Li
Lu Tian
Jie He
Ashish Sirasao
E. Barsoum
VLM
52
1
0
20 Aug 2024
Flexora: Flexible Low Rank Adaptation for Large Language Models
Chenxing Wei
Yao Shu
Y. He
Fei Richard Yu
AI4CE
96
4
0
20 Aug 2024
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Aviv Bick
Kevin Y. Li
Eric P. Xing
J. Zico Kolter
Albert Gu
Mamba
154
32
0
19 Aug 2024