arXiv:2210.09261
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM
ELM
LRM
ReLM
Papers citing
"Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"
50 / 793 papers shown
Title
Rethinking Prompt Optimizers: From Prompt Merits to Optimization
Zixiao Zhu
Hanzhang Zhou
Zijian Feng
Tianjiao Li
Chua Jia Jim Deryl
Mak Lee Onn
Gee Wah Ng
Kezhi Mao
LRM
23
0
0
15 May 2025
Qwen3 Technical Report
A. Yang
A. Li
Baosong Yang
Beichen Zhang
Binyuan Hui
...
Zekun Wang
Zeyu Cui
Z. Zhang
Z. Zhou
Z. Qiu
LLMAG
OSLM
LRM
40
0
0
14 May 2025
Evaluating GPT- and Reasoning-based Large Language Models on Physics Olympiad Problems: Surpassing Human Performance and Implications for Educational Assessment
Paul Tschisgale
Holger Maus
Fabian Kieser
Ben Kroehs
Stefan Petersen
Peter Wulff
ELM
LRM
35
0
0
14 May 2025
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining
Xiaomi LLM-Core Team
Bingquan Xia
B. S.
Cici
Dawei Zhu
...
Y. Wang
Yue Yu
Zhenru Lin
Zhichao Song
Zihao Yue
MoE
ReLM
LRM
AI4CE
37
0
0
12 May 2025
Semantic Retention and Extreme Compression in LLMs: Can We Have Both?
Stanislas Laborde
Martin Cousseau
Antoun Yaacoub
Lionel Prevost
MQ
23
0
0
12 May 2025
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection
Kai Hua
Steven Wu
Ge Zhang
Ke Shen
LRM
23
0
0
12 May 2025
Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models
Rei Higuchi
Taiji Suzuki
31
0
0
12 May 2025
Measuring General Intelligence with Generated Games
Vivek Verma
David Huang
William Chen
Dan Klein
Nicholas Tomlin
ReLM
ELM
LM&MA
LRM
45
0
0
12 May 2025
xGen-small Technical Report
Erik Nijkamp
Bo Pang
Egor Pakhomov
Akash Gokul
Jin Qu
Silvio Savarese
Yingbo Zhou
Caiming Xiong
LLMAG
53
0
0
10 May 2025
Stability in Single-Peaked Strategic Resource Selection Games
Henri Zeiler
29
3
0
09 May 2025
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation
Stefan Vasilev
Christian Herold
Baohao Liao
Seyyed Hadi Hashemi
Shahram Khadivi
Christof Monz
MU
135
0
0
09 May 2025
ICon: In-Context Contribution for Automatic Data Selection
Yixin Yang
Qingxiu Dong
Linli Yao
Fangwei Zhu
Zhifang Sui
48
0
0
08 May 2025
LiteLMGuard: Seamless and Lightweight On-Device Prompt Filtering for Safeguarding Small Language Models against Quantization-induced Risks and Vulnerabilities
Kalyan Nakka
Jimmy Dani
Ausmit Mondal
Nitesh Saxena
AAML
30
0
0
08 May 2025
Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data
Y. Wang
Z. Fu
Jie Cai
Peijun Tang
Hongya Lyu
...
Jie Zhou
Guoyang Zeng
Chaojun Xiao
Xu Han
Zhiyuan Liu
49
0
0
08 May 2025
Rewriting Pre-Training Data Boosts LLM Performance in Math and Code
Kazuki Fujii
Yukito Tajima
Sakae Mizuki
Hinari Shimada
Taihei Shiotani
...
Kakeru Hattori
Youmi Ma
Hiroya Takamura
Rio Yokota
Naoaki Okazaki
SyDa
49
0
0
05 May 2025
TRAVELER: A Benchmark for Evaluating Temporal Reasoning across Vague, Implicit and Explicit References
Svenja Kenneweg
J. Deigmöller
Philipp Cimiano
Julian Eggert
51
0
0
02 May 2025
Understanding LLM Scientific Reasoning through Promptings and Model's Explanation on the Answers
Alice Rueda
Mohammed S. Hassan
Argyrios Perivolaris
Bazen G. Teferra
Reza Samavi
...
Y. Wu
Y. Zhang
Bo Cao
Divya Sharma
Sridhar Krishnan Venkat Bhat
ELM
LRM
55
0
0
02 May 2025
Thoughts without Thinking: Reconsidering the Explanatory Value of Chain-of-Thought Reasoning in LLMs through Agentic Pipelines
R. Manuvinakurike
Emanuel Moss
E. A. Watkins
Saurav Sahay
G. Raffa
L. Nachman
LRM
31
0
0
01 May 2025
ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
Jingyang Yi
Jiazheng Wang
ReLM
OODD
LRM
134
0
0
30 Apr 2025
Combatting Dimensional Collapse in LLM Pre-Training Data via Diversified File Selection
Ziqing Fan
Siyuan Du
Shengchao Hu
Pingjie Wang
Li Shen
Y. Zhang
Dacheng Tao
Y. Wang
41
1
0
29 Apr 2025
Local Prompt Optimization
Yash Jain
Vishal Chowdhary
53
0
0
29 Apr 2025
SAS-Prompt: Large Language Models as Numerical Optimizers for Robot Self-Improvement
H. B. Amor
L. Graesser
Atil Iscen
David B. D'Ambrosio
Saminda Abeyruwan
Alex Bewley
Yifan Zhou
Kamalesh Kalirathinam
Swaroop Mishra
Pannag R. Sanketi
LLMAG
LM&Ro
LRM
102
0
0
29 Apr 2025
Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance
Takuya Tamura
Taro Yano
Masafumi Enomoto
M. Oyamada
39
0
0
28 Apr 2025
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges
Y. Li
Qizhi Pei
Mengyuan Sun
Honglin Lin
Chenlin Ming
Xin Gao
Jiang Wu
C. He
Lijun Wu
ELM
LRM
40
0
0
27 Apr 2025
An Empirical Study on Prompt Compression for Large Language Models
Z. Zhang
Jinyi Li
Yihuai Lan
X. Wang
Hao Wang
MQ
42
0
0
24 Apr 2025
MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores
Fengwei Zhou
Jiafei Song
Wenjin Jason Li
Gengjian Xue
Zhikang Zhao
Yichao Lu
Bailin Na
17
0
0
23 Apr 2025
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism
Aviv Bick
Eric P. Xing
Albert Gu
RALM
88
0
0
22 Apr 2025
Trillion 7B Technical Report
Sungjun Han
Juyoung Suk
Suyeong An
Hyungguk Kim
Kyuseok Kim
Wonsuk Yang
Seungtaek Choi
Jamin Shin
105
0
0
21 Apr 2025
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
Tong Chen
Faeze Brahman
Jiacheng Liu
Niloofar Mireshghallah
Weijia Shi
Pang Wei Koh
Luke Zettlemoyer
Hannaneh Hajishirzi
36
0
0
20 Apr 2025
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
Yicheng Chen
Yining Li
Kai Hu
Zerun Ma
Haochen Ye
Kai Chen
27
0
0
18 Apr 2025
Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models
Zhouhao Sun
Xiao Ding
Li Du
Yunpeng Xu
Yixuan Ma
Yang Zhao
Bing Qin
Ting Liu
27
0
0
17 Apr 2025
Dynamic Compressing Prompts for Efficient Inference of Large Language Models
Jinwu Hu
W. Zhang
Yufeng Wang
Yu Hu
Bin Xiao
Mingkui Tan
Qing Du
26
0
0
15 Apr 2025
VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
Ryota Tanaka
Taichi Iki
Taku Hasegawa
Kyosuke Nishida
Kuniko Saito
Jun Suzuki
VLM
52
0
0
14 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Z. Liu
Shenglong Ye
...
D. Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
W. Wang
MLLM
VLM
68
7
1
14 Apr 2025
Can the capability of Large Language Models be described by human ability? A Meta Study
Mingrui Zan
Yunquan Zhang
Boyang Zhang
Fangming Liu
Daning Cheng
ELM
LM&MA
55
0
0
13 Apr 2025
Short-Path Prompting in LLMs: Analyzing Reasoning Instability and Solutions for Robust Performance
Zuoli Tang
Junjie Ou
Kaiqin Hu
Chunwei Wu
Zhaoxin Huan
Chilin Fu
Xiaolu Zhang
Jun Zhou
Chenliang Li
ReLM
LRM
38
0
0
13 Apr 2025
Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance
Ram Mohan Rao Kadiyala
Siddartha Pullakhandam
Siddhant Gupta
Drishti Sharma
Jebish Purbey
Kanwal Mehreen
Muhammad Arham
Hamza Farooq
27
0
0
13 Apr 2025
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
Xin Gao
Qizhi Pei
Zinan Tang
Y. Li
Honglin Lin
Jiang Wu
C. He
Lijun Wu
SyDa
28
0
0
11 Apr 2025
Playpen: An Environment for Exploring Learning Through Conversational Interaction
Nicola Horst
Davide Mazzaccara
Antonia Schmidt
Michael Sullivan
Filippo Momentè
...
Alexander Koller
Oliver Lemon
David Schlangen
Mario Giulianelli
Alessandro Suglia
OffRL
32
0
0
11 Apr 2025
DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning
Atharva Pandey
Kshitij Dubey
Rahul Sharma
Amit Sharma
ReLM
ELM
LRM
52
0
0
09 Apr 2025
Do Reasoning Models Show Better Verbalized Calibration?
Qingcheng Zeng
Weihao Xuan
Leyang Cui
Rob Voigt
LRM
28
0
0
09 Apr 2025
SEA-LION: Southeast Asian Languages in One Network
Raymond Ng
Thanh Ngan Nguyen
Yuli Huang
Ngee Chia Tai
Wai Yi Leong
...
David Ong Tat-Wee
B. Liu
William-Chandra Tjhi
Erik Cambria
Leslie Teo
36
11
0
08 Apr 2025
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Biao Zhang
Fedor Moiseev
Joshua Ainslie
Paul Suganthan
Min Ma
Surya Bhupatiraju
Fede Lebron
Orhan Firat
Armand Joulin
Zhe Dong
AI4CE
26
0
0
08 Apr 2025
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
Gleb Rodionov
Roman Garipov
Alina Shutova
George Yakushev
Vage Egiazarian
Anton Sinitsin
Denis Kuznedelev
Dan Alistarh
LRM
27
1
0
08 Apr 2025
GREATERPROMPT: A Unified, Customizable, and High-Performing Open-Source Toolkit for Prompt Optimization
Wenliang Zheng
Sarkar Snigdha Sarathi Das
Yusen Zhang
Rui Zhang
28
0
0
04 Apr 2025
Universal Collection of Euclidean Invariants between Pairs of Position-Orientations
Gijs Bellaard
B. Smets
R. Duits
59
0
0
04 Apr 2025
Representation Bending for Large Language Model Safety
Ashkan Yousefpour
Taeheon Kim
Ryan S. Kwon
Seungbeen Lee
Wonje Jeung
Seungju Han
Alvin Wan
Harrison Ngan
Youngjae Yu
Jonghyun Choi
AAML
ALM
KELM
52
0
0
02 Apr 2025
TransforMerger: Transformer-based Voice-Gesture Fusion for Robust Human-Robot Communication
Petr Vanc
Karla Stepanova
38
0
0
02 Apr 2025
When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks
Nan Zhang
Yusen Zhang
Prasenjit Mitra
Rui Zhang
MQ
LRM
51
2
0
02 Apr 2025
AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems
Y. Yang
Huacan Chai
Shuai Shao
Y. Song
Siyuan Qi
Renting Rui
Weinan Zhang
AIFin
41
0
0
01 Apr 2025