arXiv:2303.16634
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
29 March 2023
Yang Liu
Dan Iter
Yichong Xu
Shuohang Wang
Ruochen Xu
Chenguang Zhu
ELM
ALM
LM&MA
Papers citing "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment"
50 / 763 papers shown
Calibrating Long-form Generations from Large Language Models
Yukun Huang
Yixin Liu
Raghuveer Thirukovalluru
Arman Cohan
Bhuwan Dhingra
27
7
0
09 Feb 2024
The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate
Juhyun Oh
Eunsu Kim
Inha Cha
Alice Oh
ELM
49
8
0
09 Feb 2024
Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing
Yong Cao
Wenyan Li
Jiaang Li
Yifei Yuan
Antonia Karamolegkou
Daniel Hershcovich
VLM
48
8
0
08 Feb 2024
UFO: A UI-Focused Agent for Windows OS Interaction
Chaoyun Zhang
Liqun Li
Shilin He
Xu Zhang
Bo Qiao
...
Yu Kang
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
LLMAG
66
70
0
08 Feb 2024
FaithLM: Towards Faithful Explanations for Large Language Models
Yu-Neng Chuang
Guanchu Wang
Chia-Yuan Chang
Ruixiang Tang
Shaochen Zhong
Fan Yang
Mengnan Du
Xuanting Cai
Xia Hu
LRM
77
0
0
07 Feb 2024
The Future of Cognitive Strategy-enhanced Persuasive Dialogue Agents: New Perspectives and Trends
Mengqi Chen
Bin Guo
Hao Wang
Haoyu Li
Qian Zhao
Jingqi Liu
Yasan Ding
Yan Pan
Zhiwen Yu
LLMAG
43
1
0
07 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
35
159
0
06 Feb 2024
Empowering Language Models with Active Inquiry for Deeper Understanding
Jing-Cheng Pang
Heng-Bo Fan
Pengyuan Wang
Jia-Hao Xiao
Nan Tang
Si-Hang Yang
Chengxing Jia
Sheng-Jun Huang
Yang Yu
21
5
0
06 Feb 2024
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
Zhiyuan Hu
Chumin Liu
Xidong Feng
Yilun Zhao
See-Kiong Ng
A. Luu
Junxian He
Pang Wei Koh
Bryan Hooi
LRM
58
11
0
05 Feb 2024
DeAL: Decoding-time Alignment for Large Language Models
James Y. Huang
Sailik Sengupta
Daniele Bonadiman
Yi-An Lai
Arshit Gupta
Nikolaos Pappas
Saab Mansour
Katrin Kirchoff
Dan Roth
64
29
0
05 Feb 2024
Probing Critical Learning Dynamics of PLMs for Hate Speech Detection
Sarah Masud
Mohammad Aflah Khan
Vikram Goyal
Md. Shad Akhtar
Tanmoy Chakraborty
26
0
0
03 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
73
30
0
02 Feb 2024
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
Ming Li
Yong Zhang
Shwai He
Zhitao Li
Hongyu Zhao
Jianzong Wang
Ning Cheng
Dinesh Manocha
35
69
0
01 Feb 2024
Making a Long Story Short in Conversation Modeling
Yufei Tao
Tiernan Mines
Ameeta Agrawal
30
0
0
31 Jan 2024
Synthetic Dialogue Dataset Generation using LLM Agents
Yelaman Abdullin
Diego Mollá Aliod
B. Ofoghi
John Yearwood
Qingyang Li
26
29
0
30 Jan 2024
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Ansar Aynetdinov
Alan Akbik
ALM
44
12
0
30 Jan 2024
MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models
Wai-Chung Kwan
Xingshan Zeng
Yuxin Jiang
Yufei Wang
Liangyou Li
Lifeng Shang
Xin Jiang
Qun Liu
Kam-Fai Wong
LRM
ELM
25
13
0
30 Jan 2024
E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models
Jinchang Hou
Chang Ao
Haihong Wu
Xiangtao Kong
Zhigang Zheng
...
Chengming Li
Xiping Hu
Ruifeng Xu
Shiwen Ni
Min Yang
AI4Ed
ELM
29
6
0
29 Jan 2024
Equipping Language Models with Tool Use Capability for Tabular Data Analysis in Finance
Adrian Theuma
Ehsan Shareghi
32
4
0
27 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
48
12
0
26 Jan 2024
F-Eval: Asssessing Fundamental Abilities with Refined Evaluation Methods
Yu Sun
Keyu Chen
Shujie Wang
Qipeng Guo
Hang Yan
Xipeng Qiu
Xuanjing Huang
Dahua Lin
ELM
19
0
0
26 Jan 2024
From RAG to QA-RAG: Integrating Generative AI for Pharmaceutical Regulatory Compliance Process
Jaewoong Kim
Moohong Min
49
0
0
26 Jan 2024
Generating Zero-shot Abstractive Explanations for Rumour Verification
I. Bilal
Preslav Nakov
Rob Procter
M. Liakata
24
0
0
23 Jan 2024
Assessing and Understanding Creativity in Large Language Models
Yunpu Zhao
Rui Zhang
Wenyi Li
Di Huang
Jiaming Guo
...
Xingui Hu
Zidong Du
Qi Guo
Ling Li
Yunji Chen
LRM
37
20
0
23 Jan 2024
Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation
Zdeněk Kasner
Ondrej Dusek
33
8
0
18 Jan 2024
Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap
Xingyu Wu
Sheng-hao Wu
Jibin Wu
Liang Feng
Kay Chen Tan
ELM
53
59
0
18 Jan 2024
Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models
Jianhui Pang
Fanghua Ye
Longyue Wang
Dian Yu
Derek F. Wong
Shuming Shi
Zhaopeng Tu
ALM
46
6
0
16 Jan 2024
The Chronicles of RAG: The Retriever, the Chunk and the Generator
Paulo Finardi
Leonardo Avila
Rodrigo Castaldoni
P. Gengo
Celio H. N. Larcher
Marcos Piau
Pablo B. Costa
Vinicius Fernandes Caridá
RALM
22
29
0
15 Jan 2024
Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation
Xu Huang
Zhirui Zhang
Xiang Geng
Yichao Du
Jiajun Chen
Shujian Huang
53
7
0
12 Jan 2024
INACIA: Integrating Large Language Models in Brazilian Audit Courts: Opportunities and Challenges
J. Pereira
Andre Assumpcao
J. Trecenti
Luiz Airosa
Caio Lente
Jhonatan Cléto
Guilherme Dobins
Rodrigo Nogueira
Luis Mitchell
R. Lotufo
40
2
0
10 Jan 2024
Reinforcement Learning for Optimizing RAG for Domain Chatbots
Mandar Kulkarni
Praveen Tangarajan
Kyung Kim
Anusua Trivedi
OffRL
RALM
SILM
28
25
0
10 Jan 2024
Can Large Language Models Beat Wall Street? Unveiling the Potential of AI in Stock Selection
G. Fatouros
Konstantinos Metaxas
John Soldatos
D. Kyriazis
AIFin
44
20
0
08 Jan 2024
InFoBench: Evaluating Instruction Following Ability in Large Language Models
Yiwei Qin
Kaiqiang Song
Yebowen Hu
Wenlin Yao
Sangwoo Cho
Xiaoyang Wang
Xuansheng Wu
Fei Liu
Pengfei Liu
Dong Yu
ELM
36
42
0
07 Jan 2024
Large Language Models for Social Networks: Applications, Challenges, and Solutions
Jingying Zeng
Richard Huang
Waleed Malik
Langxuan Yin
Bojan Babic
Danny Shacham
Xiao Yan
Jaewon Yang
Qi He
22
7
0
04 Jan 2024
DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models
Wendi Cui
Jiaxin Zhang
Zhuohang Li
Lopez Damien
Kamalika Das
Bradley Malin
Kumar Sricharan
30
2
0
04 Jan 2024
Social Media Ready Caption Generation for Brands
Himanshu Maheshwari
Koustava Goswami
Apoorv Saxena
Balaji Vasan Srinivasan
24
1
0
03 Jan 2024
LLM Harmony: Multi-Agent Communication for Problem Solving
Sumedh Rasal
LLMAG
24
22
0
02 Jan 2024
BatchEval: Towards Human-like Text Evaluation
Peiwen Yuan
Shaoxiong Feng
Yiwei Li
Xinglin Wang
Boyuan Pan
Heda Wang
Kan Li
ALM
31
11
0
31 Dec 2023
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
Julien Piet
Maha Alrashed
Chawin Sitawarin
Sizhe Chen
Zeming Wei
Elizabeth Sun
Basel Alomair
David Wagner
AAML
SyDa
83
53
0
29 Dec 2023
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators
Chen Zhang
L. F. D’Haro
Yiming Chen
Malu Zhang
Haizhou Li
ELM
21
29
0
24 Dec 2023
LingoQA: Video Question Answering for Autonomous Driving
Ana-Maria Marcu
Long Chen
Jan Hünermann
Alice Karnsund
Benoît Hanotte
...
Vijay Badrinarayanan
Alex Kendall
Jamie Shotton
Elahe Arani
Oleg Sinavski
29
32
0
21 Dec 2023
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?
Tannon Kew
Florian Schottmann
Rico Sennrich
LRM
34
36
0
20 Dec 2023
Faithful Persona-based Conversational Dataset Generation with Large Language Models
Pegah Jandaghi
XiangHai Sheng
Xinyi Bai
Jay Pujara
Hakim Sidahmed
39
21
0
15 Dec 2023
LLMEval: A Preliminary Study on How to Evaluate Large Language Models
Yue Zhang
Ming Zhang
Haipeng Yuan
Shichun Liu
Yongyao Shi
Tao Gui
Qi Zhang
Xuanjing Huang
ALM
ELM
24
10
0
12 Dec 2023
Honeybee: Locality-enhanced Projector for Multimodal LLM
Junbum Cha
Wooyoung Kang
Jonghwan Mun
Byungseok Roh
MLLM
43
113
0
11 Dec 2023
Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding
Lifu Tu
Semih Yavuz
Jin Qu
Jiacheng Xu
Rui Meng
Caiming Xiong
Yingbo Zhou
24
1
0
11 Dec 2023
Make Them Spill the Beans! Coercive Knowledge Extraction from (Production) LLMs
Zhuo Zhang
Guangyu Shen
Guanhong Tao
Shuyang Cheng
Xiangyu Zhang
41
13
0
08 Dec 2023
Building Trustworthy NeuroSymbolic AI Systems: Consistency, Reliability, Explainability, and Safety
Manas Gaur
Amit P. Sheth
26
17
0
05 Dec 2023
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
Bill Yuchen Lin
Abhilasha Ravichander
Ximing Lu
Nouha Dziri
Melanie Sclar
Khyathi Raghavi Chandu
Chandra Bhagavatula
Yejin Choi
22
169
0
04 Dec 2023
Mark My Words: Analyzing and Evaluating Language Model Watermarks
Julien Piet
Chawin Sitawarin
Vivian Fang
Norman Mu
David Wagner
WaLM
45
33
0
01 Dec 2023