ResearchTrend.AI
BERTScore: Evaluating Text Generation with BERT

21 April 2019
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi

Papers citing "BERTScore: Evaluating Text Generation with BERT"

50 / 3,519 papers shown
PsycoLLM: Enhancing LLM for Psychological Understanding and Evaluation
Jinpeng Hu
Tengteng Dong
Luo Gang
Hui Ma
Peng Zou
Xiao Sun
Dan Guo
Meng Wang
AI4MH
92
7
0
08 Jul 2024
A Factuality and Diversity Reconciled Decoding Method for Knowledge-Grounded Dialogue Generation
Chenxu Yang
Zheng Lin
Chong Tian
Liang Pang
Lanrui Wang
Zhengyang Tong
Qirong Ho
Yanan Cao
Weiping Wang
HILM
85
1
0
08 Jul 2024
Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment
Qizhang Feng
Siva Rajesh Kasa
Santhosh Kumar Kasa
Hyokun Yun
C. Teo
S. Bodapati
146
8
0
08 Jul 2024
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation
Gaurav Sahu
Abhay Puri
Juan A. Rodriguez
Alexandre Drouin
Perouz Taslakian
...
Christopher Pal
Nicolas Chapados
Sai Rajeswar Mudumba
Issam Hadj Laradji
ELM
130
7
0
08 Jul 2024
On Speeding Up Language Model Evaluation
Jin Peng Zhou
Christian K. Belardi
Ruihan Wu
Travis Zhang
Carla P. Gomes
Wen Sun
Kilian Q. Weinberger
159
2
0
08 Jul 2024
IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning
Abhinav Joshi
Shounak Paul
Akshat Sharma
Pawan Goyal
Saptarshi Ghosh
Ashutosh Modi
AILaw, ELM
71
12
0
07 Jul 2024
MINDECHO: Role-Playing Language Agents for Key Opinion Leaders
Rui Xu
Dakuan Lu
Jue Chen
Xintao Wang
Siyu Yuan
Jiangjie Chen
Wei Chu
Xu Yinghui
LLMAG
99
4
0
07 Jul 2024
Faux Polyglot: A Study on Information Disparity in Multilingual Large Language Models
Nikhil Sharma
Kenton Murray
Ziang Xiao
153
1
0
07 Jul 2024
FlowLearn: Evaluating Large Vision-Language Models on Flowchart Understanding
Huitong Pan
Qi Zhang
Cornelia Caragea
Eduard Constantin Dragut
Longin Jan Latecki
82
6
0
06 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
100
19
0
06 Jul 2024
EVA-Score: Evaluation of Long-form Summarization on Informativeness through Extraction and Validation
Yuchen Fan
Xin Zhong
Chengsi Wang
Gaoche Wu
Bowen Zhou
67
2
0
06 Jul 2024
Applicability of Large Language Models and Generative Models for Legal Case Judgement Summarization
Aniket Deroy
Kripabandhu Ghosh
Saptarshi Ghosh
ELM, AILaw
105
23
0
06 Jul 2024
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
Yuzhe Gu
Ziwei Ji
Wenwei Zhang
Chengqi Lyu
Dahua Lin
Kai Chen
HILM
81
5
0
05 Jul 2024
Not (yet) the whole story: Evaluating Visual Storytelling Requires More than Measuring Coherence, Grounding, and Repetition
Aditya K Surikuchi
Raquel Fernández
Sandro Pezzelle
61
6
0
05 Jul 2024
Aligning Model Evaluations with Human Preferences: Mitigating Token Count Bias in Language Model Assessments
Roland Daynauth
Jason Mars
ALM
58
0
0
05 Jul 2024
Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
Yuyan Chen
Qiang Fu
Yichen Yuan
Zhihao Wen
Ge Fan
Dayiheng Liu
Dongmei Zhang
Zhixu Li
Yanghua Xiao
HILM
74
77
0
04 Jul 2024
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations
Md Tahmid Rahman Laskar
Sawsan Alqahtani
M Saiful Bari
Mizanur Rahman
Mohammad Abdullah Matin Khan
...
Chee Wei Tan
Md. Rizwan Parvez
Enamul Hoque
Shafiq Joty
Jimmy Huang
ELM, ALM
105
41
0
04 Jul 2024
Semantic Graphs for Syntactic Simplification: A Revisit from the Age of LLM
Peiran Yao
Kostyantyn Guzhva
Denilson Barbosa
87
4
0
04 Jul 2024
Systematic Task Exploration with LLMs: A Study in Citation Text Generation
Furkan Şahinuç
Ilia Kuznetsov
Yufang Hou
Iryna Gurevych
65
6
0
04 Jul 2024
A Survey on Natural Language Counterfactual Generation
Yongjie Wang
Xiaoqi Qiu
Yu Yue
Xu Guo
Zhiwei Zeng
Yuhong Feng
Zhiqi Shen
81
9
0
04 Jul 2024
TartuNLP @ AXOLOTL-24: Leveraging Classifier Output for New Sense Detection in Lexical Semantics
Aleksei Dorkin
Kairit Sirts
52
2
0
04 Jul 2024
M5 -- A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks
Florian Schneider
Sunayana Sitaram
VLM
81
12
0
04 Jul 2024
HEMM: Holistic Evaluation of Multimodal Foundation Models
Paul Pu Liang
Akshay Goindani
Talha Chafekar
Leena Mathur
Haofei Yu
Ruslan Salakhutdinov
Louis-Philippe Morency
96
15
0
03 Jul 2024
Evaluating Automatic Metrics with Incremental Machine Translation Systems
Guojun Wu
Shay B. Cohen
Rico Sennrich
96
0
0
03 Jul 2024
Artificial Inductive Bias for Synthetic Tabular Data Generation in Data-Scarce Scenarios
Patricia A. Apellániz
Ana Jiménez
Borja Arroyo Galende
J. Parras
Santiago Zazo
115
1
0
03 Jul 2024
VIVA: A Benchmark for Vision-Grounded Decision-Making with Human Values
Zhe Hu
Yixiao Ren
Jing Li
Yu Yin
VLM
83
7
0
03 Jul 2024
Large Language Models as Evaluators for Scientific Synthesis
Julia Evans
Jennifer D'Souza
Sören Auer
ELM
95
4
0
03 Jul 2024
MentalAgora: A Gateway to Advanced Personalized Care in Mental Health through Multi-Agent Debating and Attribute Control
Yeonji Lee
Sangjun Park
Kyunghyun Cho
Jinyeong Bak
102
2
0
03 Jul 2024
e-Health CSIRO at "Discharge Me!" 2024: Generating Discharge Summary Sections with Fine-tuned Language Models
Jinghui Liu
Aaron Nicolson
Jason Dowling
Bevan Koopman
Anthony N. Nguyen
85
5
0
03 Jul 2024
MedPix 2.0: A Comprehensive Multimodal Biomedical Data set for Advanced AI Applications
Irene Siragusa
Salvatore Contino
Massimo La Ciura
Rosario Alicata
Roberto Pirrone
203
3
0
03 Jul 2024
Change My Frame: Reframing in the Wild in r/ChangeMyView
Arturo Martinez Peguero
Taro Watanabe
32
0
0
02 Jul 2024
ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions
Chan Young Park
Shuyue Stella Li
Hayoung Jung
Svitlana Volkova
Tanushree Mitra
David Jurgens
Yulia Tsvetkov
95
8
0
02 Jul 2024
Efficient Sparse Attention needs Adaptive Token Release
Chaoran Zhang
Lixin Zou
Dan Luo
Min Tang
Xiangyang Luo
Zihao Li
Chenliang Li
108
5
0
02 Jul 2024
Towards a Holistic Framework for Multimodal Large Language Models in Three-dimensional Brain CT Report Generation
Cheng-Yi Li
Kao-Jung Chang
Cheng-Fu Yang
Hsin-Yu Wu
Wenting Chen
...
Yu-Chun Chen
Shih-Pin Chen
J. Lirng
Kai-Wei Chang
Shih-Hwa Chiou
LM&MA, MedIm
72
7
0
02 Jul 2024
Synthetic Multimodal Question Generation
Ian Wu
Sravan Jayanthi
Vijay Viswanathan
Simon Rosenberg
Sina Pakazad
Tongshuang Wu
Graham Neubig
89
5
0
02 Jul 2024
Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation
Xinglin Wang
Yiwei Li
Shaoxiong Feng
Peiwen Yuan
Boyuan Pan
Heda Wang
Yao Hu
Kan Li
96
11
0
02 Jul 2024
An End-to-End Speech Summarization Using Large Language Model
Hengchao Shang
Zongyao Li
Jiaxin Guo
Shaojun Li
Zhiqiang Rao
Yuanchang Luo
Daimeng Wei
Hao Yang
74
0
0
02 Jul 2024
Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation
Pablo Messina
René Vidal
Denis Parra
Álvaro Soto
Vladimir Araujo
MedIm
113
4
0
02 Jul 2024
Compare without Despair: Reliable Preference Evaluation with Generation Separability
Sayan Ghosh
Tejas Srinivasan
Swabha Swayamdipta
77
2
0
02 Jul 2024
AutoFlow: Automated Workflow Generation for Large Language Model Agents
Zelong Li
Shuyuan Xu
Kai Mei
Wenyue Hua
Balaji Rama
Om Raheja
Hao Wang
He Zhu
Yongfeng Zhang
AIFin, AI4CE, LLMAG
100
19
0
01 Jul 2024
Face4RAG: Factual Consistency Evaluation for Retrieval Augmented Generation in Chinese
Yunqi Xu
Tianchi Cai
Jiyan Jiang
Xierui Song
95
3
0
01 Jul 2024
Hybrid RAG-empowered Multi-modal LLM for Secure Healthcare Data Management: A Diffusion-based Contract Theory Approach
Cheng Su
Jinbo Wen
Jiawen Kang
Yonghua Wang
Hudan Pan
M. S. Hossain
MedIm
45
0
0
01 Jul 2024
ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions
Jingheng Ye
Yong Jiang
Xiaobin Wang
Yangning Li
Hai-Tao Zheng
Pengjun Xie
Fei Huang
80
2
0
01 Jul 2024
FineSurE: Fine-grained Summarization Evaluation using LLMs
Hwanjun Song
Hang Su
Igor Shalyminov
Jason (Jinglun) Cai
Saab Mansour
HILM
85
36
0
01 Jul 2024
From Introspection to Best Practices: Principled Analysis of Demonstrations in Multimodal In-Context Learning
Nan Xu
Fei Wang
Sheng Zhang
Hoifung Poon
Muhao Chen
139
7
0
01 Jul 2024
Free-text Rationale Generation under Readability Level Control
Yi-Sheng Hsu
Nils Feldhus
Sherzod Hakimov
112
2
0
01 Jul 2024
Cross-Lingual Transfer Learning for Speech Translation
Rao Ma
Yassir Fathullah
Mengjie Qian
Siyuan Tang
Mark Gales
Kate Knill
174
4
0
01 Jul 2024
A Comparative Study of Quality Evaluation Methods for Text Summarization
Huyen Nguyen
Haihua Chen
Lavanya Pobbathi
Junhua Ding
ELM
83
6
0
30 Jun 2024
Locate&Edit: Energy-based Text Editing for Efficient, Flexible, and Faithful Controlled Text Generation
Hye Ryung Son
Jay-Yoon Lee
73
0
0
30 Jun 2024
"I understand why I got this grade": Automatic Short Answer Grading with Feedback
Dishank Aggarwal
Pushpak Bhattacharyya
Bhaskaran Raman
43
4
0
30 Jun 2024