ResearchTrend.AI

InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback

26 June 2023
John Yang
Akshara Prabhakar
Karthik R. Narasimhan
Shunyu Yao
arXiv: 2306.14898

Papers citing "InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback"

33 / 83 papers shown
Title
Large Language Model-based Human-Agent Collaboration for Complex Task Solving
Xueyang Feng
Zhiyuan Chen
Yujia Qin
Yankai Lin
Xu Chen
Zhiyuan Liu
Zhicheng Dou
LLMAG
54
18
0
20 Feb 2024
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
Runlong Zhou
Simon S. Du
Beibin Li
OffRL
47
3
0
20 Feb 2024
An Empirical Evaluation of LLMs for Solving Offensive Security Challenges
Minghao Shao
Boyuan Chen
Sofija Jancheska
Brendan Dolan-Gavitt
Siddharth Garg
Ramesh Karri
Muhammad Shafique
32
25
0
19 Feb 2024
When is Tree Search Useful for LLM Planning? It Depends on the Discriminator
Ziru Chen
Michael White
Raymond Mooney
Ali Payani
Yu-Chuan Su
Huan Sun
LLMAG
75
32
0
16 Feb 2024
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
Zhiyong Wu
Chengcheng Han
Zichen Ding
Zhenmin Weng
Zhoumianze Liu
Shunyu Yao
Tao Yu
Lingpeng Kong
LLMAG
LM&Ro
135
82
0
12 Feb 2024
User Centric Evaluation of Code Generation Tools
Tanha Miah
Hong Zhu
ELM
28
3
0
05 Feb 2024
Large Language Models as Hyper-Heuristics for Combinatorial Optimization
Haoran Ye
Jiarui Wang
Zhiguang Cao
Federico Berto
Chuanbo Hua
Haeyeon Kim
Jinkyoo Park
Guojie Song
42
46
0
02 Feb 2024
Executable Code Actions Elicit Better LLM Agents
Xingyao Wang
Yangyi Chen
Lifan Yuan
Yizhe Zhang
Yunzhu Li
Hao Peng
Heng Ji
ELM
LLMAG
LM&Ro
40
132
0
01 Feb 2024
EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records
Wenqi Shi
Ran Xu
Yuchen Zhuang
Yue Yu
Jieyu Zhang
Hang Wu
Yuanda Zhu
Joyce C. Ho
Carl Yang
M. D. Wang
29
27
0
13 Jan 2024
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
Xueyu Hu
Ziyu Zhao
Shuang Wei
Ziwei Chai
Qianli Ma
...
Jiwei Li
Kun Kuang
Yang Yang
Hongxia Yang
Fei Wu
LMTD
ELM
24
45
0
10 Jan 2024
AI capabilities can be significantly improved without expensive retraining
Tom Davidson
Jean-Stanislas Denain
Pablo Villalobos
Guillem Bas
OffRL
VLM
26
26
0
12 Dec 2023
Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning
Sabrina McCallum
Max Taylor-Davies
Stefano V. Albrecht
Alessandro Suglia
21
1
0
07 Dec 2023
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Chenyu You
ELM
CLL
AI4MH
LRM
ALM
85
27
0
28 Nov 2023
Robot Learning in the Era of Foundation Models: A Survey
Xuan Xiao
Jiahang Liu
Zhipeng Wang
Yanmin Zhou
Yong Qi
Qian Cheng
Bin He
Shuo Jiang
AI4CE
LM&Ro
33
27
0
24 Nov 2023
Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback
Seungjun Moon
Hyungjoo Chae
Yongho Song
Taeyoon Kwon
Dongjin Kang
Kai Tzu-iunn Ong
Seung-won Hwang
Jinyoung Yeo
KELM
23
11
0
13 Nov 2023
Agent Lumos: Unified and Modular Training for Open-Source Language Agents
Da Yin
Faeze Brahman
Abhilasha Ravichander
Khyathi Raghavi Chandu
Kai-Wei Chang
Yejin Choi
Bill Yuchen Lin
LLMAG
45
36
0
09 Nov 2023
ADaPT: As-Needed Decomposition and Planning with Language Models
Archiki Prasad
Alexander Koller
Mareike Hartmann
Peter Clark
Ashish Sabharwal
Mohit Bansal
Tushar Khot
LM&Ro
31
76
0
08 Nov 2023
Fine-Tuning Language Models Using Formal Methods Feedback
Yunhao Yang
N. Bhatt
Tyler Ingebrand
William Ward
Steven Carr
Zhangyang Wang
Ufuk Topcu
29
9
0
27 Oct 2023
S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models
Fangyu Lei
Qian Liu
Yiming Huang
Shizhu He
Jun Zhao
Kang Liu
ELM
LRM
25
12
0
23 Oct 2023
OpenAgents: An Open Platform for Language Agents in the Wild
Tianbao Xie
Fan Zhou
Zhoujun Cheng
Peng Shi
Luoxuan Weng
...
Yiheng Xu
Hongjin Su
Dongchan Shin
Caiming Xiong
Tao Yu
LLMAG
33
87
0
16 Oct 2023
Lemur: Harmonizing Natural Language and Code for Language Agents
Yiheng Xu
Hongjin Su
Chen Xing
Boyu Mi
Qian Liu
...
Siheng Zhao
Lingpeng Kong
Bailin Wang
Caiming Xiong
Tao Yu
32
68
0
10 Oct 2023
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
Carlos E. Jimenez
John Yang
Alexander Wettig
Shunyu Yao
Kexin Pei
Ofir Press
Karthik R. Narasimhan
ELM
34
473
0
10 Oct 2023
EcoAssistant: Using LLM Assistant More Affordably and Accurately
Jieyu Zhang
Ranjay Krishna
Ahmed Hassan Awadallah
Chi Wang
38
35
0
03 Oct 2023
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
133
142
0
19 Sep 2023
Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models
Leonard Barmann
Rainer Kartmann
Fabian Peller-Konrad
Jan Niehues
Alexander H. Waibel
Tamim Asfour
LM&Ro
26
24
0
08 Sep 2023
Cognitive Architectures for Language Agents
T. Sumers
Shunyu Yao
Karthik R. Narasimhan
Thomas Griffiths
LLMAG
LM&Ro
56
154
0
05 Sep 2023
AgentBench: Evaluating LLMs as Agents
Xiao Liu
Hao Yu
Hanchen Zhang
Yifan Xu
Xuanyu Lei
...
Yu-Chuan Su
Huan Sun
Minlie Huang
Yuxiao Dong
Jie Tang
ELM
LLMAG
37
262
0
07 Aug 2023
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik R. Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
273
2,510
0
06 Oct 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
Guosheng Lin
SyDa
ALM
135
240
0
05 Jul 2022
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Chenyu You
Guosheng Lin
246
1,492
0
02 Sep 2021
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
208
627
0
20 May 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu
Daya Guo
Shuo Ren
Junjie Huang
Alexey Svyatkovskiy
...
Nan Duan
Neel Sundaresan
Shao Kun Deng
Shengyu Fu
Shujie Liu
ELM
204
853
0
09 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
282
2,000
0
31 Dec 2020