ResearchTrend.AI
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
    ALM
    ELM
    LRM
    ReLM

Papers citing "Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"

50 / 798 papers shown
Harder Tasks Need More Experts: Dynamic Routing in MoE Models
Quzhe Huang
Zhenwei An
Zhuang Nan
Mingxu Tao
Chen Zhang
...
Kun Xu
Kun Xu
Liwei Chen
Songfang Huang
Yansong Feng
MoE
39
26
0
12 Mar 2024
Academically intelligent LLMs are not necessarily socially intelligent
Ruoxi Xu
Hongyu Lin
Xianpei Han
Le Sun
Yingfei Sun
ELM
37
6
0
11 Mar 2024
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
Hao Kang
Qingru Zhang
Souvik Kundu
Geonhwa Jeong
Zaoxing Liu
Tushar Krishna
Tuo Zhao
MQ
43
81
0
08 Mar 2024
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
James Chua
Edward Rees
Hunar Batra
Samuel R. Bowman
Julian Michael
Ethan Perez
Miles Turpin
LRM
47
13
0
08 Mar 2024
Will GPT-4 Run DOOM?
Adrian de Wynter
LM&Ro
MLLM
49
5
0
08 Mar 2024
ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models
Jio Oh
Soyeon Kim
Junseok Seo
Jindong Wang
Ruochen Xu
Xing Xie
Steven Euijong Whang
41
1
0
08 Mar 2024
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error
Boshi Wang
Hao Fang
Jason Eisner
Benjamin Van Durme
Yu-Chuan Su
CLL
29
7
0
07 Mar 2024
Chain of Thought Explanation for Dialogue State Tracking
Lin Xu
Ningxin Peng
Daquan Zhou
See-Kiong Ng
Jinlan Fu
LRM
27
1
0
07 Mar 2024
Yi: Open Foundation Models by 01.AI
01.AI
Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLM
LRM
150
502
0
07 Mar 2024
Scope of Large Language Models for Mining Emerging Opinions in Online Health Discourse
Joseph Gatto
Madhusudan Basak
Yash Srivastava
Philip Bohlman
S. Preum
40
1
0
05 Mar 2024
Exploring the Limitations of Large Language Models in Compositional Relation Reasoning
Jinman Zhao
Xueyan Zhang
BDL
LRM
35
4
0
05 Mar 2024
Eliciting Better Multilingual Structured Reasoning from LLMs through Code
Bryan Li
Tamer Alkhouli
Daniele Bonadiman
Nikolaos Pappas
Saab Mansour
LRM
42
7
0
05 Mar 2024
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
Changyu Chen
Xiting Wang
Ting-En Lin
Ang Lv
Yuchuan Wu
Xin Gao
Ji-Rong Wen
Rui Yan
Yongbin Li
ReLM
LRM
31
9
0
04 Mar 2024
SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis
Hengxing Cai
Xiaochen Cai
Junhan Chang
Sihang Li
Lin Yao
...
Changhong Chen
Zheng Cheng
Zifeng Zhao
Linfeng Zhang
Guolin Ke
ELM
36
24
0
04 Mar 2024
LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems
Tasnim Ahmed
Salimur Choudhury
30
11
0
02 Mar 2024
Formulation Comparison for Timeline Construction using LLMs
Kimihiro Hasegawa
Nikhil Kandukuri
Susan Holm
Yukari Yamakawa
Teruko Mitamura
41
0
0
01 Mar 2024
Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning
Weijieying Ren
Xinlong Li
Lei Wang
Tianxiang Zhao
Wei Qin
CLL
KELM
40
34
0
29 Feb 2024
KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark
Seongbo Jang
Seonghyeon Lee
Hwanjo Yu
ELM
29
0
0
27 Feb 2024
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning
Debrup Das
Debopriyo Banerjee
Somak Aditya
Ashish Kulkarni
ReLM
LRM
36
10
0
27 Feb 2024
Nemotron-4 15B Technical Report
Jupinder Parmar
Shrimai Prabhumoye
Joseph Jennings
M. Patwary
Sandeep Subramanian
...
Ashwath Aithal
Oleksii Kuchaiev
M. Shoeybi
Jonathan Cohen
Bryan Catanzaro
36
22
0
26 Feb 2024
SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
Liangxin Liu
Xuebo Liu
Derek F. Wong
Dongfang Li
Ziyi Wang
Baotian Hu
Min Zhang
53
17
0
26 Feb 2024
PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA
Sheng Wang
Boyang Xue
Jiacheng Ye
Jiyue Jiang
Liheng Chen
Lingpeng Kong
Chuan Wu
30
14
0
24 Feb 2024
Unintended Impacts of LLM Alignment on Global Representation
Michael Joseph Ryan
William B. Held
Diyi Yang
45
41
0
22 Feb 2024
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
Zicheng Lin
Zhibin Gou
Tian Liang
Ruilin Luo
Haowei Liu
Yujiu Yang
LRM
42
43
0
22 Feb 2024
Balanced Data Sampling for Language Model Training with Clustering
Yunfan Shao
Linyang Li
Zhaoye Fei
Hang Yan
Dahua Lin
Xipeng Qiu
37
9
0
22 Feb 2024
BIRCO: A Benchmark of Information Retrieval Tasks with Complex Objectives
Xiaoyue Wang
Jianyou Wang
Weili Cao
Kaicheng Wang
R. Paturi
Leon Bergen
37
6
0
21 Feb 2024
Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning
Debjit Paul
Robert West
Antoine Bosselut
Boi Faltings
ReLM
LRM
41
21
0
21 Feb 2024
Dynamic Evaluation of Large Language Models by Meta Probing Agents
Kaijie Zhu
Jindong Wang
Qinlin Zhao
Ruochen Xu
Xing Xie
50
31
0
21 Feb 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
53
24
0
21 Feb 2024
Structure Guided Prompt: Instructing Large Language Model in Multi-Step Reasoning by Exploring Graph Structure of the Text
Kewei Cheng
Nesreen K. Ahmed
Theodore L. Willke
Yizhou Sun
LRM
54
5
0
20 Feb 2024
TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning
Xiang Li
Yunshi Lan
Chao Yang
ELM
46
8
0
20 Feb 2024
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Haoran Li
Qingxiu Dong
Zhengyang Tang
Chaojun Wang
Xingxing Zhang
...
Wei Lu
Zhifang Sui
Benyou Wang
Wai Lam
Furu Wei
SyDa
56
51
0
20 Feb 2024
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
Zhiyuan Li
Hong Liu
Denny Zhou
Tengyu Ma
LRM
AI4CE
30
101
0
20 Feb 2024
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
Hao Zhao
Zihan Qiu
Huijia Wu
Zili Wang
Zhaofeng He
Jie Fu
MoE
32
10
0
20 Feb 2024
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies
Xiao Ye
Andrew Wang
Jacob Choi
Yining Lu
Shreya Sharma
Lingfeng Shen
Vijay Tiyyala
Nicholas Andrews
Daniel Khashabi
ELM
41
8
0
19 Feb 2024
Reformatted Alignment
Run-Ze Fan
Xuefeng Li
Haoyang Zou
Junlong Li
Shwai He
Ethan Chern
Jiewen Hu
Pengfei Liu
65
8
0
19 Feb 2024
Revisiting Knowledge Distillation for Autoregressive Language Models
Qihuang Zhong
Liang Ding
Li Shen
Juhua Liu
Bo Du
Dacheng Tao
KELM
47
16
0
19 Feb 2024
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu
Siyu An
Min Zhang
Yulan He
Di Yin
Xing Sun
48
2
0
19 Feb 2024
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
S. Hayati
Taehee Jung
Tristan Bodding-Long
Sudipta Kar
A. Sethy
Joo-Kyung Kim
Dongyeop Kang
ALM
LRM
38
6
0
18 Feb 2024
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation
Xunjian Yin
Xu Zhang
Jie Ruan
Xiaojun Wan
ELM
36
17
0
18 Feb 2024
PhaseEvo: Towards Unified In-Context Prompt Optimization for Large Language Models
Wendi Cui
Jiaxin Zhang
Zhuohang Li
Hao Sun
Damien Lopez
Kamalika Das
Bradley Malin
Kumar Sricharan
22
7
0
17 Feb 2024
Chain-of-Thought Reasoning Without Prompting
Xuezhi Wang
Denny Zhou
ReLM
LRM
152
102
0
15 Feb 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit
James Liu
Guangxuan Xiao
Kai Li
Jason D. Lee
Song Han
Tri Dao
Tianle Cai
33
21
0
15 Feb 2024
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence
Weixiang Zhao
Zhuojun Li
Shilong Wang
Yang Wang
Yulin Hu
Yanyan Zhao
Chen Wei
Bing Qin
22
4
0
15 Feb 2024
NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models
Shengrui Li
Junzhe Chen
Xueting Han
Jing Bai
24
6
0
15 Feb 2024
Efficient Prompt Optimization Through the Lens of Best Arm Identification
Chengshuai Shi
Kun Yang
Zihan Chen
Jundong Li
Jing Yang
Cong Shen
50
6
0
15 Feb 2024
AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability
Siwei Yang
Bingchen Zhao
Cihang Xie
LRM
17
6
0
14 Feb 2024
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment
Jianing Wang
Junda Wu
Yupeng Hou
Yao Liu
Ming Gao
Julian McAuley
33
32
0
13 Feb 2024
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Mikail Khona
Maya Okawa
Jan Hula
Rahul Ramesh
Kento Nishi
Robert P. Dick
Ekdeep Singh Lubana
Hidenori Tanaka
46
5
0
12 Feb 2024
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
Jiacheng Ye
Shansan Gong
Liheng Chen
Lin Zheng
Jiahui Gao
...
Chuan Wu
Xin Jiang
Zhenguo Li
Wei Bi
Lingpeng Kong
DiffM
LRM
AI4CE
59
13
0
12 Feb 2024