2505.20973
Towards Conversational Development Environments: Using Theory-of-Mind and Multi-Agent Architectures for Requirements Refinement
27 May 2025
Keheliya Gallaba
Ali Arabat
Dayi Lin
Mohammed Sayagh
Ahmed E. Hassan
AI4CE
Papers citing
"Towards Conversational Development Environments: Using Theory-of-Mind and Multi-Agent Architectures for Requirements Refinement"
25 / 25 papers shown
Title
Preference Leakage: A Contamination Problem in LLM-as-a-judge
Dawei Li
Renliang Sun
Yue Huang
Ming Zhong
Bohan Jiang
Jiawei Han
Wei Wei
Wei Wang
Huan Liu
117
29
0
03 Feb 2025
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
236
104
0
25 Nov 2024
Towards AI-Native Software Engineering (SE 3.0): A Vision and a Challenge Roadmap
Ahmed E. Hassan
G. Oliva
Dayi Lin
Boyuan Chen
Zhen Ming Jiang
71
6
0
08 Oct 2024
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
John Yang
Carlos E. Jimenez
Alex Zhang
K. Lieret
Joyce Yang
...
Gabriel Synnaeve
Karthik Narasimhan
Diyi Yang
Sida I. Wang
Ofir Press
64
32
0
04 Oct 2024
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
Haojun Shi
Suyu Ye
Xinyu Fang
Chuanyang Jin
Leyla Isik
Yen-Ling Kuo
Tianmin Shu
LLMAG
101
14
0
22 Aug 2024
What should I wear to a party in a Greek taverna? Evaluation for Conversational Agents in the Fashion Domain
Antonis Maronikolakis
Ana Peleteiro Ramallo
Weiwei Cheng
Thomas Kober
LLMAG
37
2
0
13 Aug 2024
Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
Toufique Ahmed
Premkumar Devanbu
Christoph Treude
Michael Pradel
109
16
0
10 Aug 2024
SpecRover: Code Intent Extraction via LLMs
Haifeng Ruan
Yuntong Zhang
Abhik Roychoudhury
52
23
0
05 Aug 2024
Automatic Generation of Fashion Images using Prompting in Generative Machine Learning Models
Georgia Argyrou
Angeliki Dimitriou
Maria Lymperaiou
Giorgos Filandrianos
Giorgos Stamou
63
4
0
20 Jul 2024
Scaling Synthetic Data Creation with 1,000,000,000 Personas
Tao Ge
Xin Chan
Dian Yu
Haitao Mi
Dong Yu
SyDa
156
142
0
28 Jun 2024
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
Pat Verga
Sebastian Hofstätter
Sophia Althammer
Yixuan Su
Aleksandra Piktus
Arkady Arkhangorodsky
Minjie Xu
Naomi White
Patrick Lewis
ALM
ELM
91
99
0
29 Apr 2024
LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery
Samuel R. Bowman
Shi Feng
84
185
0
15 Apr 2024
Exploring the Impact of the Output Format on the Evaluation of Large Language Models for Code Translation
Marcos Macedo
Yuan Tian
F. Côgo
Bram Adams
56
17
0
25 Mar 2024
Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions
Zahra Abbasiantaeb
Yifei Yuan
Evangelos Kanoulas
Mohammad Aliannejadi
64
64
0
05 Dec 2023
Magicoder: Empowering Code Generation with OSS-Instruct
Yuxiang Wei
Zhe Wang
Jiawei Liu
Yifeng Ding
Lingming Zhang
SyDa
52
111
0
04 Dec 2023
Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities
Alex Wilf
Sihyun Shawn Lee
Paul Pu Liang
Louis-Philippe Morency
LRM
82
44
0
16 Nov 2023
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
Xuhui Zhou
Hao Zhu
Leena Mathur
Ruohong Zhang
Haofei Yu
...
Louis-Philippe Morency
Yonatan Bisk
Daniel Fried
Graham Neubig
Maarten Sap
LLMAG
69
142
0
18 Oct 2023
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
Ziyang Luo
Can Xu
Pu Zhao
Qingfeng Sun
Xiubo Geng
Wenxiang Hu
Chongyang Tao
Jing Ma
Qingwei Lin
Daxin Jiang
ELM
SyDa
ALM
79
678
0
14 Jun 2023
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
Ge Li
Hasan Hammoud
Hani Itani
Dmitrii Khizbullin
Guohao Li
SyDa
ALM
118
488
0
31 Mar 2023
Large Language Models Are State-of-the-Art Evaluators of Translation Quality
Tom Kocmi
C. Federmann
ELM
88
361
0
28 Feb 2023
An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation
Max Schäfer
Sarah Nadi
A. Eghbali
F. Tip
LM&MA
55
245
0
13 Feb 2023
Large Language Models Can Be Easily Distracted by Irrelevant Context
Freda Shi
Xinyun Chen
Kanishka Misra
Nathan Scales
David Dohan
Ed H. Chi
Nathanael Schärli
Denny Zhou
ReLM
RALM
LRM
96
583
0
31 Jan 2023
Mutual Theory of Mind for Human-AI Communication
Qiaosi Wang
Ashok K. Goel
30
12
0
07 Oct 2022
The Use of NLP-Based Text Representation Techniques to Support Requirement Engineering Tasks: A Systematic Mapping Review
R. Sonbol
Ghaida Rebdawi
Nada Ghneim
AI4TS
13
30
0
17 May 2022
Neural Text Summarization: A Critical Evaluation
Wojciech Kryściński
N. Keskar
Bryan McCann
Caiming Xiong
R. Socher
74
366
0
23 Aug 2019