1904.09675
Cited By
BERTScore: Evaluating Text Generation with BERT
21 April 2019
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
Papers citing
"BERTScore: Evaluating Text Generation with BERT"
50 / 3,519 papers shown
Title
Advocating Character Error Rate for Multilingual ASR Evaluation
Thennal D K
Jesin James
D. Gopinath
Muhammed Ashraf K
52
3
0
09 Oct 2024
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training
Sara Sarto
Nicholas Moratelli
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
80
4
0
09 Oct 2024
Exploring the Readiness of Prominent Small Language Models for the Democratization of Financial Literacy
Tagore Rao Kosireddy
Jeffrey D. Wall
Evan Lucas
53
1
0
09 Oct 2024
Mind Your Questions! Towards Backdoor Attacks on Text-to-Visualization Models
Shuaimin Li
Yuanfeng Song
Xuanang Chen
Anni Peng
Zhuoyue Wan
Chen Jason Zhang
Raymond Chi-Wing Wong
SILM
75
0
0
09 Oct 2024
Detecting Bias and Enhancing Diagnostic Accuracy in Large Language Models for Healthcare
Pardis Sadat Zahraei
Zahra Shakeri
LM&MA
61
0
0
09 Oct 2024
AuditWen: An Open-Source Large Language Model for Audit
Jiajia Huang
Haoran Zhu
Chao Xu
Tianming Zhan
Qianqian Xie
J. Huang
25
1
0
09 Oct 2024
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints
Thomas Palmeira Ferraz
Kartik Mehta
Yu-Hsiang Lin
Haw-Shiuan Chang
Shereen Oraby
Sijia Liu
Vivek Subramanian
Tagyoung Chung
Mohit Bansal
Nanyun Peng
104
14
0
09 Oct 2024
LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
Zhe Li
Weihao Yuan
Yisheng He
Lingteng Qiu
Shenhao Zhu
Xiaodong Gu
Weichao Shen
Yuan Dong
Zilong Dong
Laurence T. Yang
88
10
0
09 Oct 2024
PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories
Stephane Aroca-Ouellette
Natalie Mackraz
B. Theobald
Katherine Metcalf
56
0
0
08 Oct 2024
Mitigating the Impact of Reference Quality on Evaluation of Summarization Systems with Reference-Free Metrics
Théo Gigant
Camille Guinaudeau
Marc Decombas
Frédéric Dufaux
75
1
0
08 Oct 2024
Automatic Summarization of Long Documents
Naman Chhibbar
Jugal Kalita
120
0
0
08 Oct 2024
CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept
YuXuan Wu
Bonaventure F. P. Dossou
Dianbo Liu
MU
49
0
0
08 Oct 2024
Label Confidence Weighted Learning for Target-level Sentence Simplification
Xinying Qiu
Jingshen Zhang
47
1
0
08 Oct 2024
Translation Canvas: An Explainable Interface to Pinpoint and Analyze Translation Systems
Chinmay Dandekar
Wenyuan Xu
Xi Xu
Siqi Ouyang
Lei Li
ELM
43
0
0
07 Oct 2024
SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks
Fenia Christopoulou
Ronald Cardenas
Gerasimos Lampouras
Haitham Bou-Ammar
Jun Wang
79
2
0
07 Oct 2024
EgoOops: A Dataset for Mistake Action Detection from Egocentric Videos Referring to Procedural Texts
Yuto Haneji
Taichi Nishimura
Hirotaka Kameko
Keisuke Shirai
Tomoya Yoshida
Keiya Kajimura
Koki Yamamoto
Taiyu Cui
Tomohiro Nishimoto
Shinsuke Mori
EgoV
84
2
0
07 Oct 2024
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang
Yufei Wang
Tiezheng Yu
Yuxin Jiang
Chuhan Wu
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Fuyuan Lyu
Chen Ma
124
7
0
07 Oct 2024
Realizing Video Summarization from the Path of Language-based Semantic Understanding
Kuan-Chen Mu
Zhi-Yi Chin
Wei-Chen Chiu
47
0
0
06 Oct 2024
GlobeSumm: A Challenging Benchmark Towards Unifying Multi-lingual, Cross-lingual and Multi-document News Summarization
Yangfan Ye
Xiachong Feng
Xiaocheng Feng
Weitao Ma
Libo Qin
Dongliang Xu
Qing Yang
Hongtao Liu
Bing Qin
70
6
0
05 Oct 2024
Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback
Fatemeh Pesaran Zadeh
Juyeon Kim
Jin-Hwa Kim
Gunhee Kim
ALM
126
5
0
05 Oct 2024
PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models
Lemei Zhang
Peng Liu
Marcus Tiedemann Oekland Henriksboe
Even W. Lauvrak
J. Gulla
Heri Ramampiaro
87
1
0
04 Oct 2024
Crafting Narrative Closures: Zero-Shot Learning with SSM Mamba for Short Story Ending Generation
Divyam Sharma
Divya Santhanam
23
0
0
04 Oct 2024
What do Large Language Models Need for Machine Translation Evaluation?
Shenbin Qian
Archchana Sindhujan
Minnie Kabra
Diptesh Kanojia
Constantin Orasan
Tharindu Ranasinghe
Frédéric Blain
ELM
LRM
ALM
LM&MA
77
1
0
04 Oct 2024
Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval
Seungheon Doh
Minhee Lee
Dasaem Jeong
Juhan Nam
116
12
0
04 Oct 2024
PersoBench: Benchmarking Personalized Response Generation in Large Language Models
Saleh Afzoon
Usman Naseem
Amin Beheshti
Zahra Jamali
59
4
0
04 Oct 2024
Self-Powered LLM Modality Expansion for Large Speech-Text Models
Tengfei Yu
Xuebo Liu
Zhiyi Hou
Liang Ding
Dacheng Tao
Min Zhang
63
1
0
04 Oct 2024
Ward: Provable RAG Dataset Inference via LLM Watermarks
Nikola Jovanović
Robin Staab
Maximilian Baader
Martin Vechev
466
5
0
04 Oct 2024
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs
Wei Wu
Chao Wang
L. Chen
Mingze Yin
Yiheng Zhu
Kun Fu
Jieping Ye
Hui Xiong
Zheng Wang
143
1
0
04 Oct 2024
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
Wenhao Chai
Enxin Song
Y. Du
Chenlin Meng
Vashisht Madhavan
Omer Bar-Tal
Jeng-Neng Hwang
Saining Xie
Christopher D. Manning
3DV
217
37
0
04 Oct 2024
CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation
Han He
Qianchu Liu
Lei Xu
Chaitanya P. Shivade
Yi Zhang
S. Srinivasan
Katrin Kirchhoff
105
1
0
03 Oct 2024
Demonstration Attack against In-Context Learning for Code Intelligence
Yifei Ge
Weisong Sun
Yihang Lou
Chunrong Fang
Yiran Zhang
Yiming Li
Xiaofang Zhang
Yang Liu
Zhihong Zhao
Zhenyu Chen
AAML
59
2
0
03 Oct 2024
CodeJudge: Evaluating Code Generation with Large Language Models
Weixi Tong
Tianyi Zhang
ELM
ALM
58
17
0
03 Oct 2024
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
Genta Indra Winata
David Anugraha
Lucky Susanto
Garry Kuwanto
Derry Wijaya
169
11
0
03 Oct 2024
Agents' Room: Narrative Generation through Multi-step Collaboration
Fantine Huot
Reinald Kim Amplayo
Jennimaria Palomaki
Alice Shoshana Jakobovits
Elizabeth Clark
Mirella Lapata
102
16
0
03 Oct 2024
Better Instruction-Following Through Minimum Bayes Risk
Ian Wu
Patrick Fernandes
Amanda Bertsch
Seungone Kim
Sina Pakazad
Graham Neubig
135
11
0
03 Oct 2024
CALF: Benchmarking Evaluation of LFQA Using Chinese Examinations
Yuchen Fan
Xin Zhong
Heng Zhou
Yuchen Zhang
Mingyu Liang
Chengxing Xie
Ermo Hua
Ning Ding
Bowen Zhou
ALM
ELM
49
0
0
02 Oct 2024
On The Adaptation of Unlimiformer for Decoder-Only Transformers
Kian Ahrabian
Alon Benhaim
Barun Patra
Jay Pujara
Saksham Singhal
Xia Song
68
0
0
02 Oct 2024
GADFA: Generator-Assisted Decision-Focused Approach for Opinion Expressing Timing Identification
Chung-Chi Chen
Hiroya Takamura
Ichiro Kobayashi
Yusuke Miyao
62
0
0
02 Oct 2024
Frozen Large Language Models Can Perceive Paralinguistic Aspects of Speech
Wonjune Kang
Junteng Jia
Chunyang Wu
Wei Zhou
Egor Lakomkin
...
Leda Sari
Suyoun Kim
Ke Li
Jay Mahadeokar
Ozlem Kalinli
AuLLM
129
6
0
02 Oct 2024
Can visual language models resolve textual ambiguity with visual cues? Let visual puns tell you!
Jiwan Chung
Seungwon Lim
Jaehyun Jeon
Seungbeen Lee
Youngjae Yu
76
1
0
01 Oct 2024
Efficient Technical Term Translation: A Knowledge Distillation Approach for Parenthetical Terminology Translation
Jiyoon Myung
Jihyeon Park
Jungki Son
Kyungro Lee
Joohyung Han
71
1
0
01 Oct 2024
A Hitchhikers Guide to Fine-Grained Face Forgery Detection Using Common Sense Reasoning
Niki Maria Foteinopoulou
Enjie Ghorbel
Djamila Aouada
136
4
0
01 Oct 2024
Self-controller: Controlling LLMs with Multi-round Step-by-step Self-awareness
Xiao Peng
Xufan Geng
LLMAG
92
1
0
01 Oct 2024
DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation
Yi-Hao Peng
Faria Huq
Yue Jiang
Jason Wu
Amanda Li
Jeffrey P. Bigham
Amy Pavel
DiffM
81
5
0
30 Sep 2024
Analysing Zero-Shot Readability-Controlled Sentence Simplification
Abdullah Barayan
Jose Camacho-Collados
Fernando Alva-Manchego
83
3
0
30 Sep 2024
BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain
Kaisi Guan
Qian Cao
Yuchong Sun
Xiting Wang
Ruihua Song
84
1
0
30 Sep 2024
Beyond Scores: A Modular RAG-Based System for Automatic Short Answer Scoring with Feedback
Menna Fateen
Bo Wang
Tsunenori Mine
AI4Ed
91
6
0
30 Sep 2024
Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues
Lei Sun
Jinming Zhao
Qin Jin
64
2
0
29 Sep 2024
CERD: A Comprehensive Chinese Rhetoric Dataset for Rhetorical Understanding and Generation in Essays
Nuowei Liu
Xinhao Chen
Hongyi Wu
Changzhi Sun
Man Lan
Yuanbin Wu
Xiaopeng Bai
Shaoguang Mao
Yan Xia
65
1
0
29 Sep 2024
A Critical Look at Meta-evaluating Summarisation Evaluation Metrics
Xiang Dai
Sarvnaz Karimi
Biaoyan Fang
64
0
0
29 Sep 2024