2302.14520
Cited By
Large Language Models Are State-of-the-Art Evaluators of Translation Quality
28 February 2023
Tom Kocmi
C. Federmann
ELM
Papers citing "Large Language Models Are State-of-the-Art Evaluators of Translation Quality"
50 / 229 papers shown
Title
CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation
Matthew DeLorenzo
Vasudev Gohil
Jeyavijayan Rajendran
38
11
0
12 Apr 2024
Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations
Dayeon Ki
Marine Carpuat
38
17
0
11 Apr 2024
Concept -- An Evaluation Protocol on Conversational Recommender Systems with System-centric and User-centric Factors
Chen Huang
Peixin Qin
Yang Deng
Wenqiang Lei
Jiancheng Lv
Tat-Seng Chua
39
6
0
04 Apr 2024
Testing the Effect of Code Documentation on Large Language Model Code Understanding
William Macke
Michael Doyle
36
1
0
03 Apr 2024
CheckEval: A reliable LLM-as-a-Judge framework for evaluating text generation using checklists
Yukyung Lee
Joonghoon Kim
Jaehee Kim
Hyowon Cho
Pilsung Kang
Najoung Kim
ELM
47
4
0
27 Mar 2024
Large Language Models Are State-of-the-Art Evaluator for Grammatical Error Correction
Masamune Kobayashi
Masato Mita
Mamoru Komachi
ELM
42
3
0
26 Mar 2024
Optimization-based Prompt Injection Attack to LLM-as-a-Judge
Jiawen Shi
Zenghui Yuan
Yinuo Liu
Yue Huang
Pan Zhou
Lichao Sun
Neil Zhenqiang Gong
AAML
45
41
0
26 Mar 2024
Enhanced Facet Generation with LLM Editing
Joosung Lee
Jinhong Kim
21
2
0
25 Mar 2024
Multi-Review Fusion-in-Context
Aviv Slobodkin
Ori Shapira
Ran Levy
Ido Dagan
105
1
0
22 Mar 2024
From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation
Haofei Zhao
Yilun Liu
Shimin Tao
Weibin Meng
Yimeng Chen
Xiang Geng
Chang Su
Min Zhang
Hao Yang
34
9
0
21 Mar 2024
Enhancing Taiwanese Hokkien Dual Translation by Exploring and Standardizing of Four Writing Systems
Bo-Han Lu
Yi-Hsuan Lin
En-Shiun Annie Lee
Richard Tzong-Han Tsai
19
0
0
18 Mar 2024
Word Order's Impacts: Insights from Reordering and Generation Analysis
Qinghua Zhao
Jiaang Li
Lei Li
Zenghui Zhou
Junfeng Liu
30
0
0
18 Mar 2024
Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models
Laura Fernández-Becerra
Miguel Ángel González Santamarta
Ángel Manuel Guerrero Higueras
Francisco J. Rodríguez-Lera
Vicente Matellán Olivera
36
0
0
14 Mar 2024
LMStyle Benchmark: Evaluating Text Style Transfer for Chatbots
Jianlin Chen
43
4
0
13 Mar 2024
Duwak: Dual Watermarks in Large Language Models
Chaoyi Zhu
Jeroen Galjaard
Pin-Yu Chen
Lydia Y. Chen
AAML
WaLM
35
5
0
12 Mar 2024
Can Large Language Models Automatically Score Proficiency of Written Essays?
Watheq Mansour
Salam Albatarni
Sohaila Eltanbouly
Tamer Elsayed
ELM
18
14
0
10 Mar 2024
Revisiting Meta-evaluation for Grammatical Error Correction
Masamune Kobayashi
Masato Mita
Mamoru Komachi
46
0
0
05 Mar 2024
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning
Alexander Scarlatos
Digory Smith
Simon Woodhead
Andrew S. Lan
OffRL
52
11
0
02 Mar 2024
Fine-Tuned Machine Translation Metrics Struggle in Unseen Domains
Vilém Zouhar
Shuoyang Ding
Anna Currey
Tatyana Badeka
Jenyuan Wang
Brian Thompson
33
14
0
28 Feb 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
56
17
0
28 Feb 2024
Consistency Matters: Explore LLMs Consistency From a Black-Box Perspective
Fufangchen Zhao
Guoqiang Jin
Jiaheng Huang
Rui Zhao
Fei Tan
35
1
0
27 Feb 2024
Likelihood-based Mitigation of Evaluation Bias in Large Language Models
Masanari Ohi
Masahiro Kaneko
Ryuto Koike
Mengsay Loem
Naoaki Okazaki
40
4
0
25 Feb 2024
HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition
Yuxuan Liu
Tianchi Yang
Shaohan Huang
Zihan Zhang
Haizhen Huang
Furu Wei
Weiwei Deng
Feng Sun
Qi Zhang
34
13
0
24 Feb 2024
The Lay Person's Guide to Biomedicine: Orchestrating Large Language Models
Zheheng Luo
Qianqian Xie
Sophia Ananiadou
45
0
0
21 Feb 2024
Factual consistency evaluation of summarization in the Era of large language models
Zheheng Luo
Qianqian Xie
Sophia Ananiadou
HILM
35
1
0
21 Feb 2024
Are LLM-based Evaluators Confusing NLG Quality Criteria?
Xinyu Hu
Mingqi Gao
Sen Hu
Yang Zhang
Yicheng Chen
Teng Xu
Xiaojun Wan
AAML
ELM
46
22
0
19 Feb 2024
One Prompt To Rule Them All: LLMs for Opinion Summary Evaluation
Tejpalsingh Siledar
Swaroop Nath
Sankara Sri Raghava Ravindra Muddu
Rupasai Rangaraju
Swaprava Nath
...
Suman Banerjee
Amey Patil
Sudhanshu Singh
M. Chelliah
Nikesh Garera
ALM
LRM
35
6
0
18 Feb 2024
Improving Black-box Robustness with In-Context Rewriting
Kyle O'Brien
Nathan Ng
Isha Puri
Jorge Mendez
Hamid Palangi
Yoon Kim
Marzyeh Ghassemi
Tom Hartvigsen
52
6
0
13 Feb 2024
Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing
Yong Cao
Wenyan Li
Jiaang Li
Yifei Yuan
Antonia Karamolegkou
Daniel Hershcovich
VLM
33
7
0
08 Feb 2024
In-context learning agents are asymmetric belief updaters
Johannes A. Schubert
Akshay K. Jagadish
Marcel Binz
Eric Schulz
LLMAG
15
9
0
06 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
26
158
0
06 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
65
29
0
02 Feb 2024
MT-Ranker: Reference-free machine translation evaluation by inter-system ranking
Ibraheem Muhammad Moosa
Rui Zhang
Wenpeng Yin
22
5
0
30 Jan 2024
Leveraging Professional Radiologists' Expertise to Enhance LLMs' Evaluation for Radiology Reports
Qingqing Zhu
Xiuying Chen
Qiao Jin
Benjamin Hou
T. Mathai
Pritam Mukherjee
Xin Gao
Ronald M. Summers
Zhiyong Lu
LM&MA
23
5
0
29 Jan 2024
Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation
Zdeněk Kasner
Ondrej Dusek
33
8
0
18 Jan 2024
Gradable ChatGPT Translation Evaluation
Hui Jiao
Bei Peng
Lu Zong
Xiaojun Zhang
Xinwei Li
33
2
0
18 Jan 2024
Machine Translation with Large Language Models: Prompt Engineering for Persian, English, and Russian Directions
Nooshin Pourkamali
Shler Ebrahim Sharifi
LRM
50
9
0
16 Jan 2024
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
Zhen Li
Xiaohan Xu
Tao Shen
Can Xu
Jia-Chen Gu
Yuxuan Lai
Chongyang Tao
Shuai Ma
LM&MA
ELM
39
9
0
13 Jan 2024
Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation
Xu Huang
Zhirui Zhang
Xiang Geng
Yichao Du
Jiajun Chen
Shujian Huang
48
7
0
12 Jan 2024
Convergences and Divergences between Automatic Assessment and Human Evaluation: Insights from Comparing ChatGPT-Generated Translation and Neural Machine Translation
Zhaokun Jiang
Ziyin Zhang
EGVM
24
3
0
10 Jan 2024
MERA: A Comprehensive LLM Evaluation in Russian
Alena Fenogenova
Artem Chervyakov
Nikita Martynov
Anastasia Kozlova
Maria Tikhonova
...
Nikita Savushkin
Polina Mikhailova
Denis Dimitrov
Alexander Panchenko
Sergey Markov
ELM
39
10
0
09 Jan 2024
InFoBench: Evaluating Instruction Following Ability in Large Language Models
Yiwei Qin
Kaiqiang Song
Yebowen Hu
Wenlin Yao
Sangwoo Cho
Xiaoyang Wang
Xuansheng Wu
Fei Liu
Pengfei Liu
Dong Yu
ELM
36
42
0
07 Jan 2024
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models
Cheng Niu
Yuanhao Wu
Juno Zhu
Siliang Xu
Kashun Shum
Randy Zhong
Juntong Song
Tong Zhang
HILM
28
87
0
31 Dec 2023
Speech Translation with Large Language Models: An Industrial Practice
Zhichao Huang
Rong Ye
Tom Ko
Qianqian Dong
Shanbo Cheng
Mingxuan Wang
Hang Li
70
15
0
21 Dec 2023
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?
Tannon Kew
Florian Schottmann
Rico Sennrich
LRM
28
36
0
20 Dec 2023
Instruct-SCTG: Guiding Sequential Controlled Text Generation through Instructions
Yinhong Liu
Yixuan Su
Ehsan Shareghi
Nigel Collier
27
1
0
19 Dec 2023
Split and Rephrase with Large Language Models
David Ponce
Thierry Etchegoyhen
Jesús Calleja-Perez
Harritxu Gete
ReLM
LRM
46
2
0
18 Dec 2023
Distinguishing Translations by Human, NMT, and ChatGPT: A Linguistic and Statistical Approach
Zhaokun Jiang
Qianxi Lv
Ziyin Zhang
22
1
0
17 Dec 2023
CoAScore: Chain-of-Aspects Prompting for NLG Evaluation
Peiyuan Gong
Jiaxin Mao
ELM
54
10
0
16 Dec 2023
How should the advent of large language models affect the practice of science?
Marcel Binz
Stephan Alaniz
Adina Roskies
B. Aczel
Carl T. Bergstrom
...
Emily M. Bender
M. Marelli
Matthew M. Botvinick
Zeynep Akata
Eric Schulz
36
9
0
05 Dec 2023