BERTScore: Evaluating Text Generation with BERT

21 April 2019
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi

Papers citing "BERTScore: Evaluating Text Generation with BERT"

50 / 3,519 papers shown
Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction
Haoqiu Yan
Yongxin Zhu
Kai Zheng
Bing Liu
Haoyu Cao
Deqiang Jiang
Linli Xu
AuLLM
88
5
0
18 Jun 2024
Using LLMs to Aid Annotation and Collection of Clinically-Enriched Data in Bipolar Disorder and Schizophrenia
Ankit Aich
Avery Quynh
Pamela Osseyi
Amy Pinkham
Philip Harvey
Brenda L. Curtis
Colin A. Depp
Natalie Parde
AI4MH
49
2
0
18 Jun 2024
MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts
Dominik Macko
Jakub Kopal
Robert Moro
Ivan Srba
DeLMO
71
3
0
18 Jun 2024
EMO-KNOW: A Large Scale Dataset on Emotion and Emotion-cause
M. Nguyen
Yasith Samaradivakara
P. Sasikumar
Chitralekha Gupta
Suranga Nanayakkara
66
1
0
18 Jun 2024
Unveiling Implicit Table Knowledge with Question-Then-Pinpoint Reasoner for Insightful Table Summarization
Kwangwook Seo
Jinyoung Yeo
Dongha Lee
ReLM, LMTD, LRM
53
2
0
18 Jun 2024
DTGB: A Comprehensive Benchmark for Dynamic Text-Attributed Graphs
Jiasheng Zhang
Jialin Chen
Menglin Yang
Aosong Feng
Shuang Liang
Jie Shao
Rex Ying
71
12
0
17 Jun 2024
Extrinsic Evaluation of Cultural Competence in Large Language Models
Shaily Bhatt
Fernando Diaz
ELM, EGVM
114
9
0
17 Jun 2024
Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments
Han Zhou
Xingchen Wan
Yinhong Liu
Nigel Collier
Ivan Vulić
Anna Korhonen
ALM
79
12
0
17 Jun 2024
A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models
Haopeng Zhang
Philip S. Yu
Jiawei Zhang
134
27
0
17 Jun 2024
ComperDial: Commonsense Persona-grounded Dialogue Dataset and Benchmark
Hiromi Wakaki
Yuki Mitsufuji
Yoshinori Maeda
Yukiko Nishimura
Silin Gao
Mengjie Zhao
Keiichi Yamada
Antoine Bosselut
96
0
0
17 Jun 2024
WeatherQA: Can Multimodal Language Models Reason about Severe Weather?
Chengqian Ma
Zhanxiang Hua
Alexandra Anderson-Frey
Vikram Iyer
Xin Liu
Lianhui Qin
106
6
0
17 Jun 2024
InstructCMP: Length Control in Sentence Compression through Instruction-based Large Language Models
Juseon-Do
Jingun Kwon
Hidetaka Kamigaito
Manabu Okumura
81
2
0
16 Jun 2024
Revisiting Cosine Similarity via Normalized ICA-transformed Embeddings
Hiroaki Yamagiwa
Momose Oyama
Hidetoshi Shimodaira
LLMSV
88
2
0
16 Jun 2024
Improving Adversarial Robustness via Decoupled Visual Representation Masking
Decheng Liu
Tao Chen
Chunlei Peng
Nannan Wang
Ruimin Hu
Xinbo Gao
AAML
73
1
0
16 Jun 2024
Distilling Opinions at Scale: Incremental Opinion Summarization using XL-OPSUMM
Sri Raghava Muddu
Rupasai Rangaraju
Tejpalsingh Siledar
Swaroop Nath
Pushpak Bhattacharyya
...
Suman Banerjee
Amey Patil
M. Chelliah
Sudhanshu Singh
Nikesh Garera
ELM, LRM
81
1
0
16 Jun 2024
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
Joykirat Singh
A. Nambi
Vibhav Vineet
LRM
97
6
0
16 Jun 2024
KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs
Aihua Pei
Zehua Yang
Shunan Zhu
Ruoxi Cheng
Ju Jia
Lina Wang
121
1
0
16 Jun 2024
EchoGuide: Active Acoustic Guidance for LLM-Based Eating Event Analysis from Egocentric Videos
Vineet Parikh
Saif Mahmud
Devansh Agarwal
Ke Li
François Guimbretière
Cheng Zhang
64
4
0
15 Jun 2024
Facts-and-Feelings: Capturing both Objectivity and Subjectivity in Table-to-Text Generation
Tathagata Dey
Pushpak Bhattacharyya
51
0
0
15 Jun 2024
SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading
Tu Anh Dinh
Carlos Mullov
Leonard Barmann
Zhaolin Li
Danni Liu
...
Michael Beigl
Rainer Stiefelhagen
Carsten Dachsbacher
Klemens Bohm
Jan Niehues
ELM
88
12
0
14 Jun 2024
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
Abhimanyu Hans
Yuxin Wen
Neel Jain
John Kirchenbauer
Hamid Kazemi
...
Siddharth Singh
Gowthami Somepalli
Jonas Geiping
A. Bhatele
Tom Goldstein
113
37
0
14 Jun 2024
Inclusive ASR for Disfluent Speech: Cascaded Large-Scale Self-Supervised Learning with Targeted Fine-Tuning and Data Augmentation
Dena F. Mujtaba
Nihar R. Mahapatra
Megan Arney
J Scott Yaruss
Caryn Herring
Jia Bin
67
1
0
14 Jun 2024
On the Evaluation of Speech Foundation Models for Spoken Language Understanding
Siddhant Arora
Ankita Pasad
Chung-Ming Chien
Jionghao Han
Roshan S. Sharma
...
William Chen
Suwon Shon
Hung-yi Lee
Karen Livescu
Shinji Watanabe
ELM
87
6
0
14 Jun 2024
Nymeria: A Massive Collection of Multimodal Egocentric Daily Motion in the Wild
Lingni Ma
Yuting Ye
Fangzhou Hong
Vladimir Guzov
Yifeng Jiang
...
C. Karen Liu
Ziwei Liu
Jakob Engel
R. D. Nardi
Richard Newcombe
94
25
0
14 Jun 2024
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MA, AI4MH, ELM
143
8
0
14 Jun 2024
Improving Autoregressive Training with Dynamic Oracles
Jianing Yang
Harshine Visvanathan
Yilin Wang
Xinyi Hu
Matthew R. Gormley
65
0
0
13 Jun 2024
Language-driven Grasp Detection
An Dinh Vuong
Minh Nhat Vu
Baoru Huang
Nghia Nguyen
Hieu Le
T. Vo
Anh Nguyen
VLM
116
19
0
13 Jun 2024
Chain-of-Though (CoT) prompting strategies for medical error detection and correction
Zhaolong Wu
Abul Hasan
Jinge Wu
Yunsoo Kim
Jason PY Cheung
Teng Zhang
Honghan Wu
LRM
63
4
0
13 Jun 2024
Word Order in English-Japanese Simultaneous Interpretation: Analyses and Evaluation using Chunk-wise Monotonic Translation
Kosuke Doi
Yuka Ko
Mana Makinae
Katsuhito Sudoh
Satoshi Nakamura
97
2
0
13 Jun 2024
No perspective, no perception!! Perspective-aware Healthcare Answer Summarization
Gauri Naik
Sharad Chandakacherla
S. Yadav
Md. Shad Akhtar
88
11
0
13 Jun 2024
cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers
Anirudh S. Sundar
Jin Xu
William Gay
Christopher Richardson
Larry Heck
110
1
0
12 Jun 2024
A Concept-Based Explainability Framework for Large Multimodal Models
Jayneel Parekh
Pegah Khayatan
Mustafa Shukor
A. Newson
Matthieu Cord
102
18
0
12 Jun 2024
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
Masaya Kawamura
Ryuichi Yamamoto
Yuma Shirahata
Takuya Hasumi
Kentaro Tachibana
VLM
73
12
0
12 Jun 2024
Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling
Jie Ruan
Xiao Pu
Mingqi Gao
Xiaojun Wan
Yuesheng Zhu
56
5
0
12 Jun 2024
Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation
Jie Ruan
Wenqing Wang
Xiaojun Wan
AAML, ELM
80
6
0
12 Jun 2024
Dynamic Stochastic Decoding Strategy for Open-Domain Dialogue Generation
Yiwei Li
Fei Mi
Yitong Li
Yasheng Wang
Bin Sun
Shaoxiong Feng
Kan Li
77
4
0
12 Jun 2024
Are Large Language Models Good Statisticians?
Yizhang Zhu
Shiyin Du
Boyan Li
Yuyu Luo
Nan Tang
ELM
85
18
0
12 Jun 2024
Prompt-Based Length Controlled Generation with Multiple Control Types
Renlong Jie
Xiaojun Meng
Lifeng Shang
Xin Jiang
Qun Liu
87
8
0
12 Jun 2024
Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation
Zhi Qu
Chenchen Ding
Taro Watanabe
163
1
0
12 Jun 2024
Estimating the Hallucination Rate of Generative AI
Andrew Jesson
Nicolas Beltran-Velez
Quentin Chu
Sweta Karlekar
Jannik Kossen
Yarin Gal
John P. Cunningham
David M. Blei
118
12
0
11 Jun 2024
Textual Similarity as a Key Metric in Machine Translation Quality Estimation
Kun Sun
Rong Wang
82
1
0
11 Jun 2024
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation
Wen Luo
Tianshu Shen
Wei Li
Guangyue Peng
Richeng Xuan
Houfeng Wang
Xi Yang
HILM
111
12
0
11 Jun 2024
Merlin: A Vision Language Foundation Model for 3D Computed Tomography
Louis Blankemeier
Joseph Paul Cohen
Ashwin Kumar
Dave Van Veen
Syed Jamal Safdar Gardezi
...
Andrew L. Wentland
C. Langlotz
Jason Hom
S. Gatidis
Akshay S. Chaudhari
LM&MA, MedIm
89
41
0
10 Jun 2024
Evaluating the Retrieval Component in LLM-Based Question Answering Systems
Ashkan Alinejad
Krtin Kumar
Ali Vahdat
103
5
0
10 Jun 2024
MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows
Xingjian Zhang
Yutong Xie
Jin Huang
Jinge Ma
Zhaoying Pan
...
Ziyang Xiong
Tolga Ergen
Dongsub Shim
Honglak Lee
Qiaozhu Mei
107
12
0
10 Jun 2024
MedExQA: Medical Question Answering Benchmark with Multiple Explanations
Yunsoo Kim
Jinge Wu
Yusuf Abdulle
Honghan Wu
ELM
106
24
0
10 Jun 2024
HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs
Pranoy Panda
Ankush Agarwal
Chaitanya Devaguptapu
Manohar Kaul
Prathosh A P
RALM
106
13
0
10 Jun 2024
FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model
Yebin Lee
Imseong Park
Myungjoo Kang
75
18
0
10 Jun 2024
Prompting Large Language Models with Audio for General-Purpose Speech Summarization
Wonjune Kang
Deb Roy
LRM
71
7
0
10 Jun 2024
MedREQAL: Examining Medical Knowledge Recall of Large Language Models via Question Answering
Juraj Vladika
Phillip Schneider
Florian Matthes
75
1
0
09 Jun 2024