ResearchTrend.AI
BERTScore: Evaluating Text Generation with BERT

21 April 2019
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi

Papers citing "BERTScore: Evaluating Text Generation with BERT"

50 / 3,519 papers shown
Enhancing Step-by-Step and Verifiable Medical Reasoning in MLLMs
Haoran Sun
Yankai Jiang
Wenjie Lou
Yujie Zhang
Wenjie Li
Lilong Wang
Mianxin Liu
Lei Liu
Xiaosong Wang
LRM
15
0
0
20 Jun 2025
PersonalAI: Towards digital twins in the graph form
Mikhail Menschikov
Dmitry Evseev
R. Kostoev
Ilya Perepechkin
Ilnaz Salimov
Victoria Dochkina
Petr Anokhin
Evgeny Burnaev
Nikita Semenov
RALM
14
0
0
20 Jun 2025
Comparative Analysis of Abstractive Summarization Models for Clinical Radiology Reports
Anindita Bhattacharya
Tohida Rehman
Debarshi Kumar Sanyal
S. Chattopadhyay
15
0
0
19 Jun 2025
Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models
Biao Yi
Tiansheng Huang
Sishuo Chen
Tong Li
Zheli Liu
Zhixuan Chu
Yiming Li
AAML
24
9
0
19 Jun 2025
Aligning ASR Evaluation with Human and LLM Judgments: Intelligibility Metrics Using Phonetic, Semantic, and NLI Approaches
Bornali Phukon
Xiuwen Zheng
Mark Hasegawa-Johnson
15
0
0
19 Jun 2025
GeoGuess: Multimodal Reasoning based on Hierarchy of Visual Information in Street View
Fenghua Cheng
Jinxiang Wang
Sen Wang
Zi Huang
Xue Li
LRM
19
0
0
19 Jun 2025
Reranking-based Generation for Unbiased Perspective Summarization
Narutatsu Ri
Nicholas Deas
Kathleen McKeown
OffRL
15
0
0
19 Jun 2025
Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation
Zongxia Li
Yapei Chang
Yuhang Zhou
Xiyang Wu
Zichao Liang
Yoo Yeon Sung
Jordan L. Boyd-Graber
22
0
0
18 Jun 2025
COSMMIC: Comment-Sensitive Multimodal Multilingual Indian Corpus for Summarization and Headline Generation
Raghvendra Kumar
S. A. Mohammed Salman
Aryan Sahu
Tridib Nandi
Pragathi Y. P.
S. Saha
Jose G. Moreno
15
0
0
18 Jun 2025
SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning
Anuradha Chopra
Abhinaba Roy
Dorien Herremans
15
0
0
18 Jun 2025
MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs
Yongqi Fan
Yating Wang
Guandong Wang
Jie Zhai
Jingping Liu
Qi Ye
Tong Ruan
21
0
0
18 Jun 2025
Evaluation Should Not Ignore Variation: On the Impact of Reference Set Choice on Summarization Metrics
Silvia Casola
Yang Liu
Siyao Peng
Oliver Kraus
Albert Gatt
Barbara Plank
23
0
0
17 Jun 2025
From What to Respond to When to Respond: Timely Response Generation for Open-domain Dialogue Agents
Seongbo Jang
Minjin Jeon
Jaehoon Lee
Seonghyeon Lee
Dongha Lee
Hwanjo Yu
23
0
0
17 Jun 2025
An Empirical Study of LLM-as-a-Judge: How Design Choices Impact Evaluation Reliability
Yusuke Yamauchi
Taro Yano
Masafumi Oyamada
ELM
17
0
0
16 Jun 2025
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
Yinghao Ma
Siyou Li
Juntao Yu
Emmanouil Benetos
Akira Maezawa
AuLLM
VLM
22
0
0
14 Jun 2025
Language Surgery in Multilingual Large Language Models
Joanito Agili Lopo
Muhammad Ravi Shulthan Habibi
Tack Hwa Wong
Muhammad Ilham Ghozali
Fajri Koto
Genta Indra Winata
Peerat Limkonchotiwat
Alham Fikri Aji
Samuel Cahyawijaya
20
0
0
14 Jun 2025
From Persona to Person: Enhancing the Naturalness with Multiple Discourse Relations Graph Learning in Personalized Dialogue Generation
Chih-Hao Hsu
Ying-Jia Lin
Hung-Yu Kao
13
0
0
13 Jun 2025
Enhance Multimodal Consistency and Coherence for Text-Image Plan Generation
Xiaoxin Lu
Ranran Haoran Zhang
Yusen Zhang
Rui Zhang
DiffM
15
0
0
13 Jun 2025
A Gamified Evaluation and Recruitment Platform for Low Resource Language Machine Translation Systems
Carlos Rafael Catalan
ELM
62
0
0
13 Jun 2025
Constructing and Evaluating Declarative RAG Pipelines in PyTerrier
Craig Macdonald
Jinyuan Fang
Andrew Parry
Zaiqiao Meng
AI4TS
104
0
0
12 Jun 2025
Neural at ArchEHR-QA 2025: Agentic Prompt Optimization for Evidence-Grounded Clinical Question Answering
Sai Prasanna Teja Reddy Bogireddy
Abrar Majeedi
Viswanatha Reddy Gajjala
Zhuoyan Xu
Siddhant Rai
Vaishnav Potlapalli
104
0
0
12 Jun 2025
LLM-Driven Personalized Answer Generation and Evaluation
Mohammadreza Molavi
Mohammadreza Tavakoli
Mohammad Moein
Abdolali Faraji
Gábor Kismihók
AI4Ed
116
0
0
12 Jun 2025
Burn After Reading: Do Multimodal Large Language Models Truly Capture Order of Events in Image Sequences?
Yingjin Song
Yupei Du
Denis Paperno
Albert Gatt
MLLM
121
0
0
12 Jun 2025
Towards Scalable SOAP Note Generation: A Weakly Supervised Multimodal Framework
Sadia Kamal
Tim Oates
Joy Wan
110
0
0
12 Jun 2025
Can LLMs Generate Good Stories? Insights and Challenges from a Narrative Planning Perspective
Yi Wang
Max Kreminski
45
0
0
11 Jun 2025
Is Fine-Tuning an Effective Solution? Reassessing Knowledge Editing for Unstructured Data
Hao Xiong
Chuanyuan Tan
Wenliang Chen
KELM
49
0
0
11 Jun 2025
AI5GTest: AI-Driven Specification-Aware Automated Testing and Validation of 5G O-RAN Components
Abiodun Ganiyu
Pranshav Gajjar
Vijay K. Shah
48
0
0
11 Jun 2025
HSENet: Hybrid Spatial Encoding Network for 3D Medical Vision-Language Understanding
Yanzhao Shi
Xiaodan Zhang
Junzhong Ji
Haoning Jiang
Chengxin Zheng
Y. Wang
Liangqiong Qu
86
0
0
11 Jun 2025
Prompt Variability Effects On LLM Code Generation
Andrei Paleyes
Radzim Sendyka
Diana Robinson
Christian Cabrera
Neil D. Lawrence
54
0
0
11 Jun 2025
3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks
Xiaotang Gai
Jiaxiang Liu
Yichen Li
Zijie Meng
Jian Wu
Zuozhu Liu
VGen
13
0
0
11 Jun 2025
Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy
Sushant Gautam
Michael A. Riegler
Pål Halvorsen
51
0
0
11 Jun 2025
Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search
Linhao Yu
Xinguang Ji
Yahui Liu
Fanheng Kong
Chenxi Sun
Jingyuan Zhang
Hongzhi Zhang
Victoria A. Webster-Wood
Fuzheng Zhang
Deyi Xiong
18
0
0
11 Jun 2025
Your Agent Can Defend Itself against Backdoor Attacks
Li Changjiang
Liang Jiacheng
Cao Bochuan
Chen Jinghui
Wang Ting
AAML
LLMAG
51
0
0
10 Jun 2025
Addressing Pitfalls in Auditing Practices of Automatic Speech Recognition Technologies: A Case Study of People with Aphasia
Katelyn Mei
A. S. G. Choi
Hilke Schellmann
Mona Sloane
Allison Koenecke
23
0
0
10 Jun 2025
CC-RAG: Structured Multi-Hop Reasoning via Theme-Based Causal Graphs
Jash Rajesh Parekh
Pengcheng Jiang
Jiawei Han
LRM
22
0
0
10 Jun 2025
TreeReview: A Dynamic Tree of Questions Framework for Deep and Efficient LLM-based Scientific Peer Review
Yuan Chang
Ziyue Li
Hengyuan Zhang
Yuanbo Kong
Yanru Wu
Zhijiang Guo
Ngai Wong
23
1
0
09 Jun 2025
Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models
Kyeonghyun Kim
Jinhee Jang
Juhwan Choi
Yoonji Lee
Kyohoon Jin
Youngbin Kim
16
0
0
09 Jun 2025
Federated In-Context Learning: Iterative Refinement for Improved Answer Quality
Ruhan Wang
Zhiyong Wang
Chengkai Huang
Rui Wang
Tong Yu
Lina Yao
John C. S. Lui
Dongruo Zhou
15
0
0
09 Jun 2025
Less is More: some Computational Principles based on Parcimony, and Limitations of Natural Intelligence
Laura Cohen
Xavier Hinaut
Lilyana Petrova
Alexandre Pitti
Syd Reynal
Ichiro Tsuda
23
0
0
08 Jun 2025
Breaking the Reviewer: Assessing the Vulnerability of Large Language Models in Automated Peer Review Under Textual Adversarial Attacks
Tzu-Ling Lin
Wei Chen
Teng-Fang Hsiao
Hou-I Liu
Ya-Hsin Yeh
Yu Kai Chan
Wen-Sheng Lien
Po-Yen Kuo
Philip S. Yu
Hong-Han Shuai
AAML
22
0
0
08 Jun 2025
RARL: Improving Medical VLM Reasoning and Generalization with Reinforcement Learning and LoRA under Data and Hardware Constraints
Tan-Hanh Pham
Chris Ngo
OffRL
LRM
23
0
0
07 Jun 2025
LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles
Ho Yin Sam Ng
Ting-Yao Hsu
Aashish Anantha Ramakrishnan
Branislav Kveton
Nedim Lipka
...
Dongwon Lee
Tong Yu
Sungchul Kim
Ryan Rossi
Ting-Hao 'Kenneth' Huang
21
0
0
06 Jun 2025
Improving LLM-Powered EDA Assistants with RAFT
Luyao Shi
Michael A. Kazda
Charles Schmitter
Hemlata Gupta
18
0
0
06 Jun 2025
Peer-Ranked Precision: Creating a Foundational Dataset for Fine-Tuning Vision Models from DataSeeds' Annotated Imagery
Sajjad Abdoli
Freeman Lewin
Gediminas Vasiliauskas
Fabian Schonholz
EGVM
AI4TS
VLM
52
0
0
06 Jun 2025
BioMol-MQA: A Multi-Modal Question Answering Dataset For LLM Reasoning Over Bio-Molecular Interactions
Saptarshi Sengupta
Shuhua Yang
Paul Kwong Yu
Fali Wang
Suhang Wang
37
0
0
06 Jun 2025
Tau-Eval: A Unified Evaluation Framework for Useful and Private Text Anonymization
Gabriel Loiseau
Damien Sileo
Damien Riquet
Maxime Meyer
Marc Tommasi
55
0
0
06 Jun 2025
Dynamic Context Tuning for Retrieval-Augmented Generation: Enhancing Multi-Turn Planning and Tool Adaptation
Jubin Abhishek Soni
Amit Anand
Rajesh Kumar Pandey
Aniket Abhishek Soni
15
0
0
05 Jun 2025
StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models
Ya Jiang
Chuxiong Wu
Massieh Kordi Boroujeny
Brian L. Mark
Kai Zeng
WaLM
33
0
0
05 Jun 2025
Coordinated Robustness Evaluation Framework for Vision-Language Models
Ashwin Ramesh Babu
Sajad Mousavi
Vineet Gundecha
Sahand Ghorbanpour
Avisek Naug
Antonio Guillen
Ricardo Luna Gutierrez
Soumyendu Sarkar
AAML
25
0
0
05 Jun 2025
Prompting LLMs: Length Control for Isometric Machine Translation
Dávid Javorský
Ondrej Bojar
François Yvon
91
0
0
05 Jun 2025