2402.18667
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability
28 February 2024
Congying Xia
Chen Xing
Jiangshu Du
Xinyi Yang
Yihao Feng
Ran Xu
Wenpeng Yin
Caiming Xiong
ALM
ArXiv (abs)
PDF
HTML
Github (24★)
Papers citing
"FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability"
19 / 19 papers shown
Title
EIFBENCH: Extremely Complex Instruction Following Benchmark for Large Language Models
Tao Zou
Xinghua Zhang
Haiyang Yu
Minzheng Wang
Fei Huang
Yongbin Li
23
0
0
10 Jun 2025
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
Shihan Dou
Ming Zhang
Chenhao Huang
Jiayi Chen
F. Chen
...
Wei Chengzhi
Lin Yan
Qi Zhang
Xuanjing Huang
ELM
77
0
0
03 Jun 2025
CiteEval: Principle-Driven Citation Evaluation for Source Attribution
Yumo Xu
Peng Qi
Jifan Chen
Kunlun Liu
Rujun Han
Lan Liu
Bonan Min
Vittorio Castelli
Arshit Gupta
Zhiguo Wang
HILM
47
0
0
02 Jun 2025
StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs
J. Yang
Dongfu Jiang
Lipeng He
Sherman Siu
Yuxuan Zhang
...
Yi Lu
Quy Duc Do
Ziyan Jiang
Ping Nie
Wenhu Chen
24
0
0
26 May 2025
The Price of Format: Diversity Collapse in LLMs
Longfei Yun
Chenyang An
Zilong Wang
Letian Peng
Jingbo Shang
39
0
0
25 May 2025
Contrastive Distillation of Emotion Knowledge from LLMs for Zero-Shot Emotion Recognition
Minxue Niu
E. Provost
VLM
208
0
0
23 May 2025
Training with Pseudo-Code for Instruction Following
Praveen Venkateswaran
Rudra Murthy
Riyaz Ahmad Bhat
Danish Contractor
ALM
LRM
97
0
0
23 May 2025
GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents
Lingxiao Diao
Xinyue Xu
Wanxuan Sun
Cheng Yang
Zhuosheng Zhang
LLMAG
ALM
ELM
107
0
0
16 May 2025
Learning to Generate Structured Output with Schema Reinforcement Learning
Yaxi Lu
Haolun Li
Xin Cong
Zhong Zhang
Yesai Wu
Yankai Lin
Zhiyuan Liu
Fangming Liu
Maosong Sun
89
1
0
26 Feb 2025
Enabling Autoregressive Models to Fill In Masked Tokens
Daniel Israel
Aditya Grover
Guy Van den Broeck
AI4CE
185
2
0
09 Feb 2025
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Jiale Cheng
Xiao-Chang Liu
C. Wang
Xiaotao Gu
Yida Lu
Dan Zhang
Yuxiao Dong
J. Tang
Hongning Wang
Minlie Huang
LRM
186
4
0
16 Dec 2024
Do LLMs "know" internally when they follow instructions?
Juyeon Heo
Christina Heinze-Deml
Oussama Elachqar
Shirley Ren
Udhay Nallasamy
Andy Miller
Kwan Ho Ryan Chan
Jaya Narain
152
10
0
18 Oct 2024
Do LLMs estimate uncertainty well in instruction-following?
Juyeon Heo
Miao Xiong
Christina Heinze-Deml
Jaya Narain
ELM
126
4
0
18 Oct 2024
LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs
Do Xuan Long
Hai Nguyen Ngoc
Tiviatis Sim
Hieu Dao
Shafiq Joty
Kenji Kawaguchi
Nancy F. Chen
Min-Yen Kan
129
11
0
16 Aug 2024
CFBench: A Comprehensive Constraints-Following Benchmark for LLMs
Leo Micklem
Yan-Bin Shen
Wenjing Luo
Yan Zhang
Hao Liang
...
Weipeng Chen
Bin Cui
Blair Thornton
Wentao Zhang
Guosheng Dong
ELM
136
21
0
02 Aug 2024
AgentInstruct: Toward Generative Teaching with Agentic Flows
Arindam Mitra
Luciano Del Corro
Guoqing Zheng
Shweti Mahajan
Dany Rouhana
...
Corby Rosset
Fillipe Silva
Hamed Khanpour
Yash Lara
Ahmed Awadallah
SyDa
101
35
0
03 Jul 2024
Evaluation of Instruction-Following Ability for Large Language Models on Story-Ending Generation
Rem Hida
Junki Ohmura
Toshiyuki Sekiya
ELM
64
0
0
24 Jun 2024
RuleR: Improving LLM Controllability by Rule-based Data Recycling
Ming Li
Han Chen
Chenguang Wang
Dang Nguyen
Dianqi Li
Dinesh Manocha
147
10
0
22 Jun 2024
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models
Hengyi Wang
Haizhou Shi
Shiwei Tan
Weiyi Qin
Wenyuan Wang
Tunyu Zhang
A. Nambi
T. Ganu
Hao Wang
137
21
0
17 Jun 2024