ResearchTrend.AI
IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering

24 August 2024
Ruosen Li
Barry Wang
Ruochen Li
Xinya Du
    ELM

Papers citing "IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering"

17 / 17 papers shown
PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations
Ruosen Li
Teerth Patel
Xinya Du
LLMAG
ALM
124
101
0
03 Jan 2025
Thinking Fair and Slow: On the Efficacy of Structured Prompts for Debiasing Language Models
Shaz Furniturewala
Surgan Jandial
Abhinav Java
Pragyan Banerjee
Simra Shahid
Sumita Bhatia
Kokil Jaidka
78
11
0
16 May 2024
The Rise and Potential of Large Language Model Based Agents: A Survey
Zhiheng Xi
Wenxiang Chen
Xin Guo
Wei He
Yiwen Ding
...
Wenjuan Qin
Yongyan Zheng
Xipeng Qiu
Xuanjing Huang
Tao Gui
LM&MA
LM&Ro
3DV
AI4CE
92
919
0
14 Sep 2023
InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback
John Yang
Akshara Prabhakar
Karthik Narasimhan
Shunyu Yao
81
109
0
26 Jun 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
314
4,288
0
09 Jun 2023
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
85
102
0
19 Dec 2022
Out of One, Many: Using Language Models to Simulate Human Samples
Lisa P. Argyle
Ethan C. Busby
Nancy Fulda
Joshua R Gubler
Christopher Rytting
David Wingate
SyDa
79
589
0
14 Sep 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM
ReLM
LRM
265
2,462
0
15 Jun 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
742
9,330
0
28 Jan 2022
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
133
1,194
0
24 Sep 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
706
41,736
0
28 May 2020
Personalizing Dialogue Agents via Meta-Learning
Zhaojiang Lin
Andrea Madotto
Chien-Sheng Wu
Pascale Fung
133
183
0
24 May 2019
Survey on Evaluation Methods for Dialogue Systems
Jan Deriu
Álvaro Rodrigo
Arantxa Otegi
Guillermo Echegoyen
S. Rosset
Eneko Agirre
Mark Cieliebak
52
282
0
10 May 2019
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Zhilin Yang
Peng Qi
Saizheng Zhang
Yoshua Bengio
William W. Cohen
Ruslan Salakhutdinov
Christopher D. Manning
RALM
150
2,635
0
25 Sep 2018
QuAC : Question Answering in Context
Eunsol Choi
He He
Mohit Iyyer
Mark Yatskar
Wen-tau Yih
Yejin Choi
Percy Liang
Luke Zettlemoyer
104
826
0
21 Aug 2018
Personalizing Dialogue Agents: I have a dog, do you have pets too?
Saizheng Zhang
Emily Dinan
Jack Urbanek
Arthur Szlam
Douwe Kiela
Jason Weston
89
1,453
0
22 Jan 2018
A Neural Conversational Model
Oriol Vinyals
Quoc V. Le
BDL
128
1,767
0
19 Jun 2015