Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1911.11641
Cited By
PIQA: Reasoning about Physical Commonsense in Natural Language
26 November 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"PIQA: Reasoning about Physical Commonsense in Natural Language"
50 / 1,393 papers shown
Title
Towards smaller, faster decoder-only transformers: Architectural variants and their implications
Sathya Krishnan Suresh
P. Shunmugapriya
91
1
0
22 Apr 2024
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts
Dengchun Li
Yingzi Ma
Naizheng Wang
Zhengmao Ye
Zhiyuan Cheng
...
Yan Zhang
Lei Duan
Jie Zuo
Cal Yang
Mingjie Tang
MoE
128
59
0
22 Apr 2024
Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications
Charith Chandra Sai Balne
S. Bhaduri
Tamoghna Roy
Vinija Jain
Aman Chadha
106
20
0
21 Apr 2024
When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering
Stephen Choi
William Gazeley
KELM
46
2
0
19 Apr 2024
Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning
Ahmed Elshabrawy
Yongix Huang
Iryna Gurevych
Alham Fikri Aji
72
1
0
19 Apr 2024
Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration
Yi-Chong Huang
Xiaocheng Feng
Baohang Li
Yang Xiang
Hui Wang
Bing Qin
Ting Liu
FedML
97
30
0
19 Apr 2024
Shears: Unstructured Sparsity with Neural Low-rank Adapter Search
J. P. Muñoz
Jinjie Yuan
Nilesh Jain
56
7
0
16 Apr 2024
Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng
Tianhao Hu
Han Xu
Zhisong Zhang
Yong Dai
Lei Han
Nan Du
Nan Du
Xiaolong Li
SyDa
LRM
ReLM
188
38
0
16 Apr 2024
HLAT: High-quality Large Language Model Pre-trained on AWS Trainium
Haozheng Fan
Hao Zhou
Guangtai Huang
Parameswaran Raman
Xinwei Fu
Gaurav Gupta
Dhananjay Ram
Yida Wang
Jun Huan
81
6
0
16 Apr 2024
Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model
Hyunsoo Cho
ALM
31
0
0
15 Apr 2024
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models
Guangyan Li
Yongqiang Tang
Wensheng Zhang
84
6
0
15 Apr 2024
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
OffRL
129
35
0
15 Apr 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Xuezhe Ma
Xiaomeng Yang
Wenhan Xiong
Beidi Chen
Lili Yu
Hao Zhang
Jonathan May
Luke Zettlemoyer
Omer Levy
Chunting Zhou
90
33
0
12 Apr 2024
Rho-1: Not All Tokens Are What You Need
Zheng-Wen Lin
Zhibin Gou
Yeyun Gong
Xiao Liu
Yelong Shen
...
Chen Lin
Yujiu Yang
Jian Jiao
Nan Duan
Weizhu Chen
CLL
160
75
0
11 Apr 2024
Scalable Language Model with Generalized Continual Learning
Bohao Peng
Zhuotao Tian
Shu Liu
Mingchang Yang
Jiaya Jia
ALM
CLL
KELM
89
18
0
11 Apr 2024
JetMoE: Reaching Llama2 Performance with 0.1M Dollars
Yikang Shen
Zhen Guo
Tianle Cai
Zengyi Qin
MoE
ALM
96
31
0
11 Apr 2024
ONNXPruner: ONNX-Based General Model Pruning Adapter
Dongdong Ren
Wenbin Li
Tianyu Ding
Lei Wang
Qi Fan
Jing Huo
Hongbing Pan
Yang Gao
95
3
0
10 Apr 2024
Vision-Language Model-based Physical Reasoning for Robot Liquid Perception
Wenqiang Lai
Yuan Gao
T. Lam
LRM
LM&Ro
119
7
0
10 Apr 2024
CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
Longwei Zou
Qingyang Wang
Han Zhao
Jiangang Kong
Yi Yang
Yangdong Deng
107
0
0
10 Apr 2024
Latent Distance Guided Alignment Training for Large Language Models
Haotian Luo
23
0
0
09 Apr 2024
RAR-b: Reasoning as Retrieval Benchmark
Chenghao Xiao
G. Thomas
Al Moubayed
LRM
RALM
143
12
0
09 Apr 2024
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Bo Peng
Daniel Goldstein
Quentin G. Anthony
Alon Albalak
Eric Alcaide
...
Bingchen Zhao
Qihang Zhao
Peng Zhou
Jian Zhu
Ruijie Zhu
119
82
0
08 Apr 2024
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models
Bowen Pan
Songlin Yang
Haokun Liu
Mayank Mishra
Gaoyuan Zhang
Aude Oliva
Colin Raffel
Yikang Shen
MoE
95
22
0
08 Apr 2024
DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model
Chao Gao
Sai Qian Zhang
ALM
162
7
0
08 Apr 2024
Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts
Weilin Cai
Juyong Jiang
Le Qin
Junwei Cui
Sunghun Kim
Jiayi Huang
185
10
0
07 Apr 2024
Your Finetuned Large Language Model is Already a Powerful Out-of-distribution Detector
Andi Zhang
Tim Z. Xiao
Weiyang Liu
Robert Bamler
Damon J. Wischik
OODD
116
6
0
07 Apr 2024
Language Models as Critical Thinking Tools: A Case Study of Philosophers
Andre Ye
Jared Moore
Rose Novick
Amy X. Zhang
KELM
ELM
LRM
LLMAG
58
10
0
06 Apr 2024
ReFT: Representation Finetuning for Language Models
Zhengxuan Wu
Aryaman Arora
Zheng Wang
Atticus Geiger
Daniel Jurafsky
Christopher D. Manning
Christopher Potts
OffRL
119
72
0
04 Apr 2024
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
Yifan Xu
Xiao Liu
Xinghan Liu
Zhenyu Hou
Yueyan Li
...
Aohan Zeng
Zhengxiao Du
Wenyi Zhao
Jie Tang
Yuxiao Dong
LRM
101
42
0
03 Apr 2024
Cross-Architecture Transfer Learning for Linear-Cost Inference Transformers
Sehyun Choi
68
3
0
03 Apr 2024
Emergent Abilities in Reduced-Scale Generative Language Models
Sherin Muckatira
Vijeta Deshpande
Vladislav Lialin
Anna Rumshisky
ReLM
ELM
LRM
69
5
0
02 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
88
9
0
02 Apr 2024
IndoCulture: Exploring Geographically-Influenced Cultural Commonsense Reasoning Across Eleven Indonesian Provinces
Fajri Koto
Rahmad Mahendra
Nurul Aisyah
Timothy Baldwin
LRM
160
19
0
02 Apr 2024
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
Yuxin Wen
Leo Marchyok
Sanghyun Hong
Jonas Geiping
Tom Goldstein
Nicholas Carlini
SILM
AAML
84
16
0
01 Apr 2024
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis
Chen Yang
Junzhuo Li
Xinyao Niu
Xinrun Du
Songyang Gao
...
Stephen W. Huang
Shawn Yue
Wenhu Chen
Jie Fu
Ge Zhang
78
2
0
01 Apr 2024
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
Saleh Ashkboos
Amirkeivan Mohtashami
Maximilian L. Croci
Bo Li
Martin Jaggi
Dan Alistarh
Torsten Hoefler
James Hensman
MQ
150
184
0
30 Mar 2024
Communication Efficient Distributed Training with Distributed Lion
Bo Liu
Lemeng Wu
Lizhang Chen
Kaizhao Liang
Jiaxu Zhu
Chen Liang
Raghuraman Krishnamoorthi
Qiang Liu
105
7
0
30 Mar 2024
DiJiang: Efficient Large Language Models through Compact Kernelization
Hanting Chen
Zhicheng Liu
Xutao Wang
Yuchuan Tian
Yunhe Wang
VLM
92
5
0
29 Mar 2024
MANGO: A Benchmark for Evaluating Mapping and Navigation Abilities of Large Language Models
Peng Ding
Jiading Fang
Peng Li
Kangrui Wang
Xiaochen Zhou
Mo Yu
Jing Li
Matthew R. Walter
Hongyuan Mei
RALM
ELM
97
6
0
29 Mar 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
124
228
0
28 Mar 2024
A Review of Multi-Modal Large Language and Vision Models
Kilian Carolan
Laura Fennelly
Alan F. Smeaton
VLM
186
28
0
28 Mar 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
97
18
0
28 Mar 2024
Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation
Chuwen Wang
Shirong Zeng
Cheng Wang
LLMAG
LRM
34
2
0
27 Mar 2024
Naive Bayes-based Context Extension for Large Language Models
Jianlin Su
Murtadha Ahmed
Wenbo Luo
Abhishek Rao
Denny Zhou
Hyeontaek Lim
74
6
0
26 Mar 2024
ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
Youpeng Zhao
Di Wu
Jun Wang
96
28
0
26 Mar 2024
Understanding Emergent Abilities of Language Models from the Loss Perspective
Zhengxiao Du
Aohan Zeng
Yuxiao Dong
Jie Tang
UQCV
LRM
162
56
0
23 Mar 2024
Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention
Bin Gao
Zhuomin He
Puru Sharma
Qingxuan Kang
Djordje Jevdjic
Junbo Deng
Xingkun Yang
Zhou Yu
Pengfei Zuo
141
56
0
23 Mar 2024
Extending Token Computation for LLM Reasoning
Bingli Liao
Danilo Vasconcellos Vargas
LRM
41
2
0
22 Mar 2024
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han
Chao Gao
Jinyang Liu
Jeff Zhang
Sai Qian Zhang
307
403
0
21 Mar 2024
Locating and Mitigating Gender Bias in Large Language Models
Yuchen Cai
Ding Cao
Rongxi Guo
Yaqin Wen
Guiquan Liu
Enhong Chen
58
5
0
21 Mar 2024
Previous
1
2
3
...
16
17
18
...
26
27
28
Next