ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2004.04696
  4. Cited By
BLEURT: Learning Robust Metrics for Text Generation

BLEURT: Learning Robust Metrics for Text Generation

9 April 2020
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
ArXivPDFHTML

Papers citing "BLEURT: Learning Robust Metrics for Text Generation"

50 / 73 papers shown
Title
Teaching with Lies: Curriculum DPO on Synthetic Negatives for Hallucination Detection
Shrey Pandit
Ashwin Vinod
Liu Leqi
Ying Ding
HILM
44
0
0
23 May 2025
LLMs Are Not Scorers: Rethinking MT Evaluation with Generation-Based Methods
LLMs Are Not Scorers: Rethinking MT Evaluation with Generation-Based Methods
Hyang Cui
LRM
67
0
0
22 May 2025
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
Banca Calvo Figueras
Rodrigo Agerri
ALM
ELM
LRM
160
1
0
16 May 2025
Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data
Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data
Wei Zou
Sen Yang
Yu Bao
Shujian Huang
Jiajun Chen
Shanbo Cheng
SyDa
89
1
0
20 Apr 2025
LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline
LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline
Biao Fu
Minpeng Liao
Kai Fan
Chengxi Li
Li Zhang
Yidong Chen
Xiaodong Shi
OffRL
341
1
0
13 Apr 2025
FUSE : A Ridge and Random Forest-Based Metric for Evaluating MT in Indigenous Languages
FUSE : A Ridge and Random Forest-Based Metric for Evaluating MT in Indigenous Languages
Rahul Raja
A. Vats
110
1
0
28 Mar 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
148
7
0
08 Mar 2025
Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems
Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems
Jędrzej Warczyński
Mateusz Lango
Ondrej Dusek
63
0
0
28 Feb 2025
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Zhaopeng Feng
Jiayuan Su
Jiamei Zheng
Jiahan Ren
Yan Zhang
Jian Wu
Hongwei Wang
Zuozhu Liu
ELM
234
1
0
21 Feb 2025
Mind the Style Gap: Meta-Evaluation of Style and Attribute Transfer Metrics
Mind the Style Gap: Meta-Evaluation of Style and Attribute Transfer Metrics
Amalie Brogaard Pauli
Isabelle Augenstein
Ira Assent
117
0
0
20 Feb 2025
Prompting a Weighting Mechanism into LLM-as-a-Judge in Two-Step: A Case Study
Prompting a Weighting Mechanism into LLM-as-a-Judge in Two-Step: A Case Study
Wenwen Xie
Gray Gwizdz
Dongji Feng
112
0
0
20 Feb 2025
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond
Shreya Shukla
Jose Torres
Abhijit Mishra
Jacek Gwizdka
Shounak Roychowdhury
81
0
0
20 Feb 2025
Learning to Substitute Words with Model-based Score Ranking
Learning to Substitute Words with Model-based Score Ranking
Hongye Liu
Ricardo Henao
117
0
0
09 Feb 2025
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Borui Xu
Yao Chen
Zeyi Wen
Weiguo Liu
Bingsheng He
133
2
0
02 Feb 2025
MDEval: Evaluating and Enhancing Markdown Awareness in Large Language Models
Zhongpu Chen
Yixiao Liu
Long Shi
Zhi-Jie Wang
Xingyan Chen
Yu Zhao
Fuji Ren
87
1
0
28 Jan 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Mingqi Gao
Xinyu Hu
Li Lin
Xiaojun Wan
53
2
0
28 Jan 2025
Dynamic Scene Understanding from Vision-Language Representations
Dynamic Scene Understanding from Vision-Language Representations
Shahaf Pruss
Morris Alper
Hadar Averbuch-Elor
OCL
412
0
0
20 Jan 2025
BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation
BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation
Suvodip Dey
M. Desarkar
OffRL
75
0
0
20 Jan 2025
A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
Sheikh Shafayat
Dongkeun Yoon
Woori Jang
Jiwoo Choi
Alice Oh
Seohyon Jung
148
1
0
03 Jan 2025
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
Ziyan Jiang
Rui Meng
Xinyi Yang
Semih Yavuz
Yingbo Zhou
Wenhu Chen
MLLM
VLM
160
26
0
03 Jan 2025
LLM-based Translation Inference with Iterative Bilingual Understanding
LLM-based Translation Inference with Iterative Bilingual Understanding
Andong Chen
Kehai Chen
Yang Xiang
Xuefeng Bai
Muyun Yang
Yang Feng
Tiejun Zhao
Min Zhang
LRM
105
5
0
31 Dec 2024
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Miguel Moura Ramos
Tomás Almeida
Daniel Vareta
Filipe Azevedo
Sweta Agrawal
Patrick Fernandes
André F. T. Martins
82
4
0
08 Nov 2024
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs
Ran Zhang
Wei Zhao
Steffen Eger
106
8
0
24 Oct 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
69
3
0
12 Oct 2024
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
H. Xia
Zhengbang Yang
Junbo Zou
Rhys Tracy
Yuqing Wang
...
Xun Shao
Zhuoqing Xie
Yuan-fang Wang
Weining Shen
Hanjie Chen
ReLM
LRM
ELM
64
3
0
11 Oct 2024
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
Hadas Orgad
Michael Toker
Zorik Gekhman
Roi Reichart
Idan Szpektor
Hadas Kotek
Yonatan Belinkov
HILM
AIFin
92
42
0
03 Oct 2024
Better Instruction-Following Through Minimum Bayes Risk
Better Instruction-Following Through Minimum Bayes Risk
Ian Wu
Patrick Fernandes
Amanda Bertsch
Seungone Kim
Sina Pakazad
Graham Neubig
116
11
0
03 Oct 2024
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
Genta Indra Winata
David Anugraha
Lucky Susanto
Garry Kuwanto
Derry Wijaya
99
11
0
03 Oct 2024
Cross-lingual Human-Preference Alignment for Neural Machine Translation with Direct Quality Optimization
Cross-lingual Human-Preference Alignment for Neural Machine Translation with Direct Quality Optimization
Kaden Uhlig
Joern Wuebker
Raphael Reinauer
John DeNero
78
0
0
26 Sep 2024
Your Weak LLM is Secretly a Strong Teacher for Alignment
Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
109
8
0
13 Sep 2024
An Efficient Sign Language Translation Using Spatial Configuration and Motion Dynamics with LLMs
An Efficient Sign Language Translation Using Spatial Configuration and Motion Dynamics with LLMs
Eui Jun Hwang
Sukmin Cho
Junmyeong Lee
Jong C. Park
SLR
85
4
0
20 Aug 2024
Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts
Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts
Jiaqing Liu
Chong Deng
Qinglin Zhang
Shilin Zhou
Hai Yu
Hai Yu
Wen Wang
71
0
0
19 Aug 2024
On Speeding Up Language Model Evaluation
On Speeding Up Language Model Evaluation
Jin Peng Zhou
Christian K. Belardi
Ruihan Wu
Travis Zhang
Carla P. Gomes
Wen Sun
Kilian Q. Weinberger
82
1
0
08 Jul 2024
Sentence-level Aggregation of Lexical Metrics Correlates Stronger with Human Judgements than Corpus-level Aggregation
Sentence-level Aggregation of Lexical Metrics Correlates Stronger with Human Judgements than Corpus-level Aggregation
Paulo Cavalin
P. Domingues
Claudio S. Pinhanez
69
1
0
03 Jul 2024
Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories
Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories
Tianlong Wang
Xianfeng Jiao
Yifan He
Zhongzhi Chen
Yinghao Zhu
Xu Chu
Junyi Gao
Yasha Wang
Liantao Ma
LLMSV
99
12
0
26 May 2024
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
Minghao Wu
Jiahao Xu
Yulin Yuan
Gholamreza Haffari
Longyue Wang
Weihua Luo
Kaifu Zhang
LLMAG
150
27
0
20 May 2024
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Junchao Wu
Runzhe Zhan
Derek F. Wong
Shu Yang
Xuebo Liu
Lidia S. Chao
Min Zhang
DeLMO
100
5
0
07 May 2024
From Matching to Generation: A Survey on Generative Information Retrieval
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
154
55
0
23 Apr 2024
A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods
A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods
Hanlei Jin
Yang Zhang
Dan Meng
Jun Wang
Jinghua Tan
210
90
0
05 Mar 2024
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models
A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models
Xiujie Song
Mengyue Wu
Ke Zhu
Chunhao Zhang
Yanyi Chen
LRM
ELM
69
3
0
28 Feb 2024
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions
Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions
Hanjie Chen
Zhouxiang Fang
Yash Singla
Mark Dredze
ELM
AI4MH
92
39
0
28 Feb 2024
Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation
Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation
Yun-Wei Chu
Dong-Jun Han
Christopher G. Brinton
95
4
0
15 Jan 2024
Semantic Consistency for Assuring Reliability of Large Language Models
Semantic Consistency for Assuring Reliability of Large Language Models
Harsh Raj
Vipul Gupta
Domenic Rosati
S. Majumdar
HILM
132
14
0
17 Aug 2023
Learning Evaluation Models from Large Language Models for Sequence Generation
Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang
Hang Zhou
Kai-Chun Chang
Tongran Liu
Chunliang Zhang
Quan Du
Tong Xiao
Yue Zhang
Jingbo Zhu
ELM
106
3
0
08 Aug 2023
Why is constrained neural language generation particularly challenging?
Why is constrained neural language generation particularly challenging?
Cristina Garbacea
Qiaozhu Mei
96
15
0
11 Jun 2022
Towards a Decomposable Metric for Explainable Evaluation of Text
  Generation from AMR
Towards a Decomposable Metric for Explainable Evaluation of Text Generation from AMR
Juri Opitz
Anette Frank
107
35
0
20 Aug 2020
Sticking to the Facts: Confident Decoding for Faithful Data-to-Text
  Generation
Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation
Ran Tian
Shashi Narayan
Thibault Sellam
Ankur P. Parikh
HILM
84
96
0
19 Oct 2019
Automatic Quality Estimation for Natural Language Generation: Ranting
  (Jointly Rating and Ranking)
Automatic Quality Estimation for Natural Language Generation: Ranting (Jointly Rating and Ranking)
Ondrej Dusek
Karin Sevegnani
Ioannis Konstas
Verena Rieser
ALM
44
9
0
10 Oct 2019
MoverScore: Text Generation Evaluating with Contextualized Embeddings
  and Earth Mover Distance
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Wei Zhao
Maxime Peyrard
Fei Liu
Yang Gao
Christian M. Meyer
Steffen Eger
167
598
0
05 Sep 2019
SumQE: a BERT-based Summary Quality Estimation Model
SumQE: a BERT-based Summary Quality Estimation Model
Stratos Xenouleas
Prodromos Malakasiotis
Marianna Apidianaki
Ion Androutsopoulos
45
37
0
02 Sep 2019
12
Next