1904.09675
v3 (latest)
BERTScore: Evaluating Text Generation with BERT
21 April 2019
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
Papers citing
"BERTScore: Evaluating Text Generation with BERT"
50 / 3,519 papers shown
Title
CinePile: A Long Video Question Answering Dataset and Benchmark
Ruchit Rawal
Khalid Saifullah
Ronen Basri
David Jacobs
Gowthami Somepalli
Tom Goldstein
103
57
0
14 May 2024
Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models
Andrea Piergentili
Beatrice Savoldi
Matteo Negri
L. Bentivogli
86
6
0
14 May 2024
PromptMind Team at MEDIQA-CORR 2024: Improving Clinical Text Correction with Error Categorization and LLM Ensembles
Kesav Gundabathula
Sriram R Kolar
LRM
63
7
0
14 May 2024
Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers
Alena Tsanda
E. Bruches
50
0
0
13 May 2024
Open-vocabulary Auditory Neural Decoding Using fMRI-prompted LLM
Xiaoyu Chen
Changde Du
Che Liu
Yizhe Wang
Huiguang He
54
3
0
13 May 2024
Synthetic Tabular Data Validation: A Divergence-Based Approach
Patricia A. Apellániz
Ana Jiménez
Borja Arroyo Galende
J. Parras
Santiago Zazo
48
4
0
13 May 2024
Evaluation of Retrieval-Augmented Generation: A Survey
Hao Yu
Aoran Gan
Kai Zhang
Shiwei Tong
Qi Liu
Zhaofeng Liu
3DV
140
100
0
13 May 2024
MedVersa: A Generalist Foundation Model for Medical Image Interpretation
Hong-Yu Zhou
Subathra Adithan
J. N. Acosta
Suvrankar Datta
E. Topol
Pranav Rajpurkar
MedIm
143
29
0
13 May 2024
SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset
Sushant Gautam
Mehdi Houshmand Sarkhoosh
Jan Held
Cise Midoglu
A. Cioppa
Silvio Giancola
Vajira Thambawita
Michael A. Riegler
Pål Halvorsen
Mubarak Shah
75
6
0
12 May 2024
PHUDGE: Phi-3 as Scalable Judge
Mahesh Deshwal
Apoorva Chawla
ALM
29
0
0
12 May 2024
Disentangling Specificity for Abstractive Multi-document Summarization
Congbo Ma
Wei Emma Zhang
Hu Wang
Haojie Zhuang
Mingyu Guo
59
0
0
12 May 2024
AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents
Shuyuan Xu
Zelong Li
Kai Mei
Yongfeng Zhang
73
5
0
11 May 2024
Automatic Generation of Model and Data Cards: A Step Towards Responsible AI
Jiarui Liu
Wenkai Li
Zhijing Jin
Mona T. Diab
SyDa
96
7
0
10 May 2024
Lost in Transcription: Identifying and Quantifying the Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent Speech
Dena F. Mujtaba
Nihar R. Mahapatra
Megan Arney
J Scott Yaruss
Hope Gerlach-Houck
Caryn Herring
Jia Bin
67
4
0
10 May 2024
Efficient LLM Comparative Assessment: a Product of Experts Framework for Pairwise Comparisons
Adian Liusie
Vatsal Raina
Yassir Fathullah
Mark Gales
104
12
0
09 May 2024
Review-based Recommender Systems: A Survey of Approaches, Challenges and Future Perspectives
Emrul Hasan
Mizanur Rahman
Chen Ding
Jimmy Xiangji Huang
Shaina Raza
92
5
0
09 May 2024
MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning
Inderjeet Nair
Lu Wang
LRM
49
1
0
08 May 2024
QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs
Weijia Zhang
Vaishali Pal
Jia-Hong Huang
Evangelos Kanoulas
Maarten de Rijke
LMTD
109
8
0
08 May 2024
Topicwise Separable Sentence Retrieval for Medical Report Generation
Junting Zhao
Yang Zhou
Zhihao Chen
Huazhu Fu
Liang Wan
MedIm
62
1
0
07 May 2024
MEDVOC: Vocabulary Adaptation for Fine-tuning Pre-trained Language Models on Medical Text Summarization
Gunjan Balde
Soumyadeep Roy
Mainack Mondal
Niloy Ganguly
VLM
51
6
0
07 May 2024
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Junchao Wu
Runzhe Zhan
Derek F. Wong
Shu Yang
Xuebo Liu
Lidia S. Chao
Min Zhang
DeLMO
123
5
0
07 May 2024
Self-Improving Customer Review Response Generation Based on LLMs
Guy Azov
Tatiana Pelc
Adi Fledel Alon
Gila Kamhi
68
2
0
06 May 2024
GREEN: Generative Radiology Report Evaluation and Error Notation
Sophie Ostmeier
Justin Xu
Zhihong Chen
Maya Varma
Louis Blankemeier
...
Arne Edward Michalson
Michael E. Moseley
Curtis P. Langlotz
Akshay S. Chaudhari
Jean-Benoit Delbrouck
MedIm
103
28
0
06 May 2024
Instruction-Guided Bullet Point Summarization of Long Financial Earnings Call Transcripts
Subhendu Khatuya
Koushiki Sinha
Niloy Ganguly
Saptarshi Ghosh
Pawan Goyal
63
4
0
03 May 2024
OARelatedWork: A Large-Scale Dataset of Related Work Sections with Full-texts from Open Access Sources
Martin Docekal
Martin Fajcik
Pavel Smrz
VLM
75
1
0
03 May 2024
ModelShield: Adaptive and Robust Watermark against Model Extraction Attack
Kaiyi Pang
Tao Qi
Chuhan Wu
Minhao Bai
Minghu Jiang
Yongfeng Huang
AAML
WaLM
166
5
0
03 May 2024
SUKHSANDESH: An Avatar Therapeutic Question Answering Platform for Sexual Education in Rural India
Salam Michael Singh
Shubhmoy Kumar Garg
Amitesh Misra
Aaditeshwar Seth
Tanmoy Chakraborty
68
0
0
03 May 2024
Understanding Position Bias Effects on Fairness in Social Multi-Document Summarization
Olubusayo Olabisi
Ameeta Agrawal
81
2
0
03 May 2024
Large Language Models are Inconsistent and Biased Evaluators
Rickard Stureborg
Dimitris Alikaniotis
Yoshi Suhara
ALM
123
66
0
02 May 2024
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Seungone Kim
Juyoung Suk
Shayne Longpre
Bill Yuchen Lin
Jamin Shin
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
MoMe
ALM
ELM
147
205
0
02 May 2024
On the Evaluation of Machine-Generated Reports
James Mayfield
Eugene Yang
Dawn J Lawrie
Sean MacAvaney
Paul McNamee
...
Orion Weller
Efsun Kayi
Kate Sanders
Marc Mason
Noah Hibbler
ALM
182
17
0
02 May 2024
RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive Summarization
Dongqi Pu
Vera Demberg
138
6
0
01 May 2024
RepEval: Effective Text Evaluation with LLM Representation
Shuqian Sheng
Yi Xu
Tianhang Zhang
Zanwei Shen
Luoyi Fu
Jiaxin Ding
Lei Zhou
Xinbing Wang
Cheng Zhou
72
2
0
30 Apr 2024
Does Whisper understand Swiss German? An automatic, qualitative, and human evaluation
Eyal Liron Dolev
Clemens Fidel Lutz
Noemi Aepli
69
7
0
30 Apr 2024
Game-MUG: Multimodal Oriented Game Situation Understanding and Commentary Generation Dataset
Zhihao Zhang
Feiqi Cao
Yingbin Mo
Yiran Zhang
Josiah Poon
S. Han
44
1
0
30 Apr 2024
In-Context Learning with Long-Context Models: An In-Depth Exploration
Amanda Bertsch
Maor Ivgi
Uri Alon
Jonathan Berant
Matthew R. Gormley
Graham Neubig
ReLM
AIMat
189
80
0
30 Apr 2024
Calibration of Large Language Models on Code Summarization
Yuvraj Virk
Prem Devanbu
Toufique Ahmed
99
11
0
30 Apr 2024
How Did We Get Here? Summarizing Conversation Dynamics
Yilun Hua
Nicholas Chernogor
Yuzhe Gu
Seoyeon Julie Jeong
Miranda Luo
Cristian Danescu-Niculescu-Mizil
88
6
0
29 Apr 2024
3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset
Xinyu Ma
Xuebo Liu
Derek F. Wong
Jun Rao
Bei Li
Liang Ding
Lidia S. Chao
Dacheng Tao
Min Zhang
63
3
0
29 Apr 2024
Hallucination of Multimodal Large Language Models: A Survey
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM
LRM
258
197
0
29 Apr 2024
Quality Estimation with k-nearest Neighbors and Automatic Evaluation for Model-specific Quality Estimation
Tu Anh Dinh
Tobias Palzer
Jan Niehues
70
1
0
27 Apr 2024
GPT for Games: A Scoping Review (2020-2023)
Daijin Yang
Erica Kleinman
Casper Harteveld
AI4TS
AI4CE
137
14
0
27 Apr 2024
MRScore: Evaluating Radiology Report Generation with LLM-based Reward System
Yunyi Liu
Zhanyu Wang
Yingshu Li
Xinyu Liang
Lingqiao Liu
Lei Wang
Luping Zhou
LM&MA
28
3
0
27 Apr 2024
Automating Customer Needs Analysis: A Comparative Study of Large Language Models in the Travel Industry
Simone Barandoni
F. Chiarello
Lorenzo Cascone
Emiliano Marrale
Salvatore Puccio
133
6
0
27 Apr 2024
On the Limitations of Embedding Based Methods for Measuring Functional Correctness for Code Generation
Atharva Naik
92
3
0
26 Apr 2024
Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning
Tianhui Zhang
Bei Peng
Danushka Bollegala
LRM
53
10
0
25 Apr 2024
Label-Free Topic-Focused Summarization Using Query Augmentation
Wenchuan Mu
Kwan Hui Lim
RALM
65
1
0
25 Apr 2024
Learning Long-form Video Prior via Generative Pre-Training
Jinheng Xie
Jiajun Feng
Zhaoxu Tian
Kevin Qinghong Lin
Yawen Huang
...
Nanxu Gong
Xu Zuo
Jiaqi Yang
Yefeng Zheng
Mike Zheng Shou
69
6
0
24 Apr 2024
CASPR: Automated Evaluation Metric for Contrastive Summarization
Nirupan Ananthamurugan
Dat Duong
Philip George
Ankita Gupta
Sandeep Tata
Beliz Gunel
64
0
0
23 Apr 2024
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
João Monteiro
Étienne Marcotte
Pierre-Andre Noel
Valentina Zantedeschi
David Vázquez
Nicolas Chapados
Christopher Pal
Perouz Taslakian
77
5
0
23 Apr 2024