Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models

19 June 2024
Guanting Dong
Keming Lu
Chengpeng Li
Tingyu Xia
Bowen Yu
Chang Zhou
Jingren Zhou
    SyDa
    ALM
    LRM

Papers citing "Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models"

12 / 12 papers shown
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
X. Li
Jiajie Jin
Guanting Dong
Hongjin Qian
Yutao Zhu
Yongkang Wu
Ji-Rong Wen
Zhicheng Dou
LLMAG
LRM
97
2
0
30 Apr 2025
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Jiale Cheng
Xiao-Chang Liu
C. Wang
Xiaotao Gu
Yunfan LU
Dan Zhang
Yuxiao Dong
J. Tang
Hongning Wang
Minlie Huang
LRM
126
3
0
16 Dec 2024
Language Models can Self-Lengthen to Generate Long Texts
Shanghaoran Quan
Tianyi Tang
Bowen Yu
An Yang
Dayiheng Liu
Bofei Gao
Jianhong Tu
Yichang Zhang
Jingren Zhou
Junyang Lin
ALM
SyDa
55
7
0
31 Oct 2024
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li
Zichun Yu
Chenyan Xiong
SyDa
33
1
0
18 Oct 2024
MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark
Elliot L. Epstein
Kaisheng Yao
Jing Li
Xinyi Bai
Hamid Palangi
LRM
47
0
0
26 Sep 2024
Qwen2 Technical Report
An Yang
Baosong Yang
Binyuan Hui
Jian Xu
Bowen Yu
...
Yuqiong Liu
Zeyu Cui
Zhenru Zhang
Zhifang Guo
Zhi-Wei Fan
OSLM
VLM
MU
60
792
0
15 Jul 2024
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
Runqi Qiao
Qiuna Tan
Guanting Dong
Minhui Wu
Chong Sun
...
Yida Xu
Muxi Diao
Zhimin Bao
Chen Li
Honggang Zhang
VLM
LRM
44
31
0
01 Jul 2024
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation
Guanting Dong
Yutao Zhu
Chenghao Zhang
Zechen Wang
Zhicheng Dou
Ji-Rong Wen
RALM
44
10
0
26 Jun 2024
Towards Scalable Automated Alignment of LLMs: A Survey
Boxi Cao
Keming Lu
Xinyu Lu
Jiawei Chen
Mengjie Ren
...
Xianpei Han
Le Sun
Hongyu Lin
Bowen Yu
LM&MA
25
23
0
03 Jun 2024
FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research
Jiajie Jin
Yutao Zhu
Xinyu Yang
Chenghao Zhang
Tong Zhao
Zhao Yang
Zhicheng Dou
Ji-Rong Wen
VLM
85
49
0
22 May 2024
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
162
585
0
06 Apr 2023
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
Steven C.H. Hoi
SyDa
ALM
132
240
0
05 Jul 2022