BARTScore: Evaluating Generated Text as Text Generation

22 June 2021
Weizhe Yuan
Graham Neubig
Pengfei Liu
arXiv: 2106.11520

Papers citing "BARTScore: Evaluating Generated Text as Text Generation"

50 / 535 papers shown
Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence
Yu Qiao
Huy Q. Le
Avi Deb Raha
Phuong-Nam Tran
Apurba Adhikary
Mengchun Zhang
Loc X. Nguyen
Eui-nam Huh
Dusit Niyato
Choong Seon Hong
AI4CE
31
0
0
11 May 2025
Summarisation of German Judgments in conjunction with a Class-based Evaluation
Bianca Steffes
Nils Torben Wiedemann
Alexander Gratz
Pamela Hochreither
Jana Elina Meyer
Katharina Luise Schilke
AILaw
ELM
58
0
0
09 May 2025
Adaptive Stress Testing Black-Box LLM Planners
Neeloy Chakraborty
John Pohovey
Melkior Ornik
Katherine Driggs-Campbell
28
0
0
08 May 2025
ConSens: Assessing context grounding in open-book question answering
Ivan Vankov
Matyo Ivanov
Adriana Correia
Victor Botev
ELM
69
0
0
30 Apr 2025
Robust Misinformation Detection by Visiting Potential Commonsense Conflict
Bing Wang
Ximing Li
C. Li
Bingrui Zhao
Bo Fu
Renchu Guan
Shengsheng Wang
53
0
0
30 Apr 2025
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts
Hanhua Hong
Chenghao Xiao
Yang Wang
Y. Liu
Wenge Rong
Chenghua Lin
31
0
0
29 Apr 2025
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers
Dylan Bouchard
Mohit Singh Chauhan
HILM
84
0
0
27 Apr 2025
Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach
Ruizhe Li
Chiwei Zhu
Benfeng Xu
Xiaorui Wang
Zhendong Mao
27
0
0
22 Apr 2025
aiXamine: Simplified LLM Safety and Security
Fatih Deniz
Dorde Popovic
Yazan Boshmaf
Euisuh Jeong
M. Ahmad
Sanjay Chawla
Issa M. Khalil
ELM
80
0
0
21 Apr 2025
MusFlow: Multimodal Music Generation via Conditional Flow Matching
Jiahao Song
Yuzhao Wang
37
0
0
18 Apr 2025
Summarization of Multimodal Presentations with Vision-Language Models: Study of the Effect of Modalities and Structure
Théo Gigant
Camille Guinaudeau
Frédéric Dufaux
29
0
0
14 Apr 2025
From Punchlines to Predictions: A Metric to Assess LLM Performance in Identifying Humor in Stand-Up Comedy
Adrianna Romanowski
Pedro Valois
Kazuhiro Fukui
36
0
0
12 Apr 2025
Large Language Models as Span Annotators
Zdeněk Kasner
Vilém Zouhar
Patrícia Schmidtová
Ivan Kartáč
Kristýna Onderková
Ondřej Plátek
Dimitra Gkatzia
Saad Mahamood
Ondrej Dusek
Simone Balloccu
ALM
37
0
0
11 Apr 2025
LLM for Comparative Narrative Analysis
Leo Kampen
Carlos Rabat Villarreal
Louis Yu
Santu Karmaker
Dongji Feng
25
0
0
11 Apr 2025
VLMT: Vision-Language Multimodal Transformer for Multimodal Multi-hop Question Answering
Qi Zhi Lim
C. Lee
K. Lim
Kalaiarasi Sonai Muthu Anbananthen
31
0
0
11 Apr 2025
From Speech to Summary: A Comprehensive Survey of Speech Summarization
Fabian Retkowski
Maike Züfle
Andreas Sudmann
Dinah Pfau
Jan Niehues
Alexander Waibel
46
0
0
10 Apr 2025
Toward Holistic Evaluation of Recommender Systems Powered by Generative Models
Yashar Deldjoo
Nikhil Mehta
M. Sathiamoorthy
Shuai Zhang
Pablo Castells
Julian McAuley
EGVM
ELM
72
1
0
09 Apr 2025
HypoEval: Hypothesis-Guided Evaluation for Natural Language Generation
Mingxuan Li
Hanchen Li
Chenhao Tan
ALM
ELM
49
0
0
09 Apr 2025
CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization
Brihi Joshi
Sriram Venkatapathy
Mohit Bansal
Nanyun Peng
Haw-Shiuan Chang
LRM
49
0
0
21 Mar 2025
Can one size fit all?: Measuring Failure in Multi-Document Summarization Domain Transfer
Alexandra DeLucia
Mark Dredze
47
0
0
20 Mar 2025
Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings
Austin Xu
Srijan Bansal
Yifei Ming
Semih Yavuz
Chenyu You
ELM
95
3
0
19 Mar 2025
Inspecting the Representation Manifold of Differentially-Private Text
Stefan Arnold
42
0
0
19 Mar 2025
A Survey on Transformer Context Extension: Approaches and Evaluation
Yijun Liu
Jinzheng Yu
Yang Xu
Zhongyang Li
Qingfu Zhu
LLMAG
68
0
0
17 Mar 2025
OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs
Ivan Kartáč
Mateusz Lango
Ondrej Dusek
ELM
51
1
0
14 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
44
0
0
13 Mar 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
54
2
0
08 Mar 2025
RocketEval: Efficient Automated LLM Evaluation via Grading Checklist
Tianjun Wei
Wei Wen
Ruizhi Qiao
Xing Sun
Jianghong Ma
ALM
ELM
50
1
0
07 Mar 2025
SINdex: Semantic INconsistency Index for Hallucination Detection in LLMs
Samir Abdaljalil
Hasan Kurban
Parichit Sharma
Erchin Serpedin
Rachad Atat
HILM
58
0
0
07 Mar 2025
Argument Summarization and its Evaluation in the Era of Large Language Models
Moritz Altemeyer
Steffen Eger
Johannes Daxenberger
Tim Altendorf
Philipp Cimiano
Benjamin Schiller
LM&MA
ELM
LRM
67
0
0
02 Mar 2025
Towards Efficient Educational Chatbots: Benchmarking RAG Frameworks
Umar Ali Khan
Ekram Khan
Fiza Khan
A. A. Moinuddin
48
0
0
02 Mar 2025
Multi2: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing
Juntai Cao
Xiang Zhang
Raymond Li
Chuyuan Li
Chenyu You
Shafiq Joty
Giuseppe Carenini
59
1
0
27 Feb 2025
MultiOCR-QA: Dataset for Evaluating Robustness of LLMs in Question Answering on Multilingual OCR Texts
Bhawna Piryani
Jamshid Mozafari
Abdelrahman Abdallah
Antoine Doucet
Adam Jatowt
47
1
0
24 Feb 2025
OrderSum: Semantic Sentence Ordering for Extractive Summarization
Taewan Kwon
Sangyong Lee
46
0
0
22 Feb 2025
IPAD: Inverse Prompt for AI Detection -- A Robust and Explainable LLM-Generated Text Detector
Zheng Chen
Yushi Feng
Changyang He
Yue Deng
Hongxi Pu
Bo-wen Li
DeLMO
49
1
0
21 Feb 2025
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation
SeongYeub Chu
JongWoo Kim
MunYong Yi
60
3
0
21 Feb 2025
Prompting a Weighting Mechanism into LLM-as-a-Judge in Two-Step: A Case Study
Wenwen Xie
Gray Gwizdz
Dongji Feng
85
0
0
20 Feb 2025
G-Refer: Graph Retrieval-Augmented Large Language Model for Explainable Recommendation
Yuhan Li
Xinni Zhang
Linhao Luo
Heng Chang
Yuxiang Ren
Irwin King
Jiajian Li
60
3
0
18 Feb 2025
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Hieu Nguyen
Zihao He
Shoumik Atul Gandre
Ujjwal Pasupulety
Sharanya Kumari Shivakumar
Kristina Lerman
HILM
59
1
0
16 Feb 2025
Accelerating Unbiased LLM Evaluation via Synthetic Feedback
Zhaoyi Zhou
Yuda Song
Andrea Zanette
ALM
73
0
0
14 Feb 2025
Learning to Substitute Words with Model-based Score Ranking
Hongye Liu
Ricardo Henao
43
0
0
09 Feb 2025
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Borui Xu
Yao Chen
Zeyi Wen
Weiguo Liu
Bingsheng He
79
1
0
02 Feb 2025
Learning to Explore and Select for Coverage-Conditioned Retrieval-Augmented Generation
Takyoung Kim
Kyungjae Lee
Y. Jang
Ji Yong Cho
Gangwoo Kim
Minseok Cho
Moontae Lee
156
0
0
28 Jan 2025
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
Mingqi Gao
Xinyu Hu
Li Lin
Xiaojun Wan
28
1
0
28 Jan 2025
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators
Yinhong Liu
Han Zhou
Zhijiang Guo
Ehsan Shareghi
Ivan Vulić
Anna Korhonen
Nigel Collier
ALM
132
69
0
20 Jan 2025
Clinical Insights: A Comprehensive Review of Language Models in Medicine
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
58
4
0
08 Jan 2025
CaseSumm: A Large-Scale Dataset for Long-Context Summarization from U.S. Supreme Court Opinions
Mourad Heddaya
Kyle MacMillan
Anup Malani
Hongyuan Mei
Chenhao Tan
AILaw
ELM
34
0
0
03 Jan 2025
Evaluate Summarization in Fine-Granularity: Auto Evaluation with LLM
Dong Yuan
Eti Rastogi
Fen Zhao
Sagar Goyal
Gautam Naik
Sree Prasanna Rajagopal
44
0
0
31 Dec 2024
Towards Automatic Evaluation for Image Transcreation
Simran Khanuja
Vivek Iyer
Claire He
Graham Neubig
ViT
90
1
0
18 Dec 2024
Coverage-based Fairness in Multi-document Summarization
Haoyuan Li
Yusen Zhang
Rui Zhang
Snigdha Chaturvedi
80
0
0
11 Dec 2024
QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization
Shiyue Zhang
David Wan
Arie Cattan
Ayal Klein
Ido Dagan
Joey Tianyi Zhou
86
0
0
10 Dec 2024