arXiv:1904.09675, v3 (latest)
BERTScore: Evaluating Text Generation with BERT
21 April 2019
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
Papers citing
"BERTScore: Evaluating Text Generation with BERT"
50 / 3,520 papers shown
Title
Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts
Fan Gao
Hang Jiang
Rui Yang
Qingcheng Zeng
Jinghui Lu
Moritz Blum
Dairui Liu
Tianwei She
Yuang Jiang
Irene Li
ELM
ALM
LM&MA
93
9
0
21 Aug 2023
FairMonitor: A Four-Stage Automatic Framework for Detecting Stereotypes and Biases in Large Language Models
Yanhong Bai
Jiabao Zhao
Jinxin Shi
Tingjiang Wei
Xingjiao Wu
Liangbo He
58
0
0
21 Aug 2023
PACE: Improving Prompt with Actor-Critic Editing for Large Language Model
Yihong Dong
Kangcheng Luo
Xue Jiang
Zhi Jin
Ge Li
LRM
KELM
107
9
0
19 Aug 2023
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models
Yilin Wen
Zifeng Wang
Jimeng Sun
ReLM
97
78
0
17 Aug 2023
CMD: a framework for Context-aware Model self-Detoxification
Zecheng Tang
Keyan Zhou
Juntao Li
Yuyang Ding
Pinzheng Wang
Bowen Yan
Min Zhang
MU
65
5
0
16 Aug 2023
MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation
Junru Lu
Siyu An
Mingbao Lin
Gabriele Pergola
Yulan He
Di Yin
Xing Sun
Yunsheng Wu
127
40
0
16 Aug 2023
Development and Evaluation of Three Chatbots for Postpartum Mood and Anxiety Disorders
X. Yao
M. Mikhelson
S. C. Watkins
Eunsol Choi
Edison Thomaz
K. D. Barbaro
AI4MH
94
1
0
14 Aug 2023
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
Chi-Min Chan
Weize Chen
Yusheng Su
Jianxuan Yu
Wei Xue
Shan Zhang
Jie Fu
Zhiyuan Liu
ELM
LLMAG
ALM
99
504
0
14 Aug 2023
Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage
Dario Cioni
Lorenzo Berlincioni
Federico Becattini
A. Bimbo
DiffM
61
10
0
14 Aug 2023
Can Knowledge Graphs Simplify Text?
Anthony Colas
Haodi Ma
Xuanli He
Yang Bai
D. Wang
83
4
0
14 Aug 2023
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
L. Schmidt
VLM
129
82
0
12 Aug 2023
Three Ways of Using Large Language Models to Evaluate Chat
Ondřej Plátek
Vojtěch Hudeček
Patrícia Schmidtová
Mateusz Lango
Ondřej Dušek
ALM
54
6
0
12 Aug 2023
A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment
Ying Zhao
Yu Bowen
Binyuan Hui
Haiyang Yu
Fei Huang
Yongbin Li
N. Zhang
125
25
0
10 Aug 2023
LLaMA-E: Empowering E-commerce Authoring with Object-Interleaved Instruction Following
Kaize Shi
Xueyao Sun
Dingxian Wang
Yinlin Fu
Guandong Xu
Qing Li
86
4
0
09 Aug 2023
Adapting Foundation Models for Information Synthesis of Wireless Communication Specifications
Manikanta Kotaru
134
10
0
08 Aug 2023
ALens: An Adaptive Domain-Oriented Abstract Writing Training Tool for Novice Researchers
Chen Cheng
Ziang Li
Zhenhui Peng
Quan Li
70
0
0
08 Aug 2023
Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang
Hang Zhou
Kai-Chun Chang
Tongran Liu
Chunliang Zhang
Quan Du
Tong Xiao
Yue Zhang
Jingbo Zhu
ELM
160
4
0
08 Aug 2023
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies
Liangming Pan
Michael Stephen Saxon
Wenda Xu
Deepak Nathani
Xinyi Wang
William Yang Wang
KELM
LRM
116
216
0
06 Aug 2023
Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation
Xianfeng Zeng
Yanjun Liu
Fandong Meng
Jie Zhou
55
0
0
06 Aug 2023
PromptSum: Parameter-Efficient Controllable Abstractive Summarization
Mathieu Ravaut
Hailin Chen
Ruochen Zhao
Chengwei Qin
Shafiq Joty
Nancy Chen
54
2
0
06 Aug 2023
System-Initiated Transitions from Chit-Chat to Task-Oriented Dialogues with Transition Info Extractor and Transition Sentence Generator
Ye Liu
Stefan Ultes
Wolfgang Minker
Wolfgang Maier
86
4
0
06 Aug 2023
Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data
Chaoyi Wu
Xiaoman Zhang
Ya Zhang
Yanfeng Wang
Weidi Xie
MedIm
LM&MA
101
168
0
04 Aug 2023
Redundancy Aware Multi-Reference Based Gainwise Evaluation of Extractive Summarization
Mousumi Akter
Shubhra (Santu) Karmaker
60
1
0
04 Aug 2023
Wider and Deeper LLM Networks are Fairer LLM Evaluators
Xinghua Zhang
Yu Bowen
Haiyang Yu
Yangyu Lv
Tingwen Liu
Fei Huang
Hongbo Xu
Yongbin Li
ALM
146
90
0
03 Aug 2023
Learning Implicit Entity-object Relations by Bidirectional Generative Alignment for Multimodal NER
Feng Chen
Jiajia Liu
Kaixiang Ji
Wang Ren
Jian Wang
Jingdong Wang
56
10
0
03 Aug 2023
Multimodal Neurons in Pretrained Text-Only Transformers
Sarah Schwettmann
Neil Chowdhury
Samuel J. Klein
David Bau
Antonio Torralba
MILM
92
32
0
03 Aug 2023
Towards Effective Ancient Chinese Translation: Dataset, Model, and Evaluation
Geyang Guo
Jiarong Yang
Fengyuan Lu
Jiaxin Qin
Tianyi Tang
Wayne Xin Zhao
26
9
0
01 Aug 2023
Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges
Giorgio Franceschelli
Mirco Musolesi
AI4CE
139
22
0
31 Jul 2023
HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution
Ehsan Kamalloo
A. Jafari
Xinyu Crystina Zhang
Nandan Thakur
Jimmy J. Lin
70
44
0
31 Jul 2023
Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering
Vaibhav Adlakha
Parishad BehnamGhader
Xing Han Lù
Nicholas Meade
Siva Reddy
107
129
0
31 Jul 2023
No that's not what I meant: Handling Third Position Repair in Conversational Question Answering
Vevake Balaraman
Arash Eshghi
Ioannis Konstas
Ioannis V. Papaioannou
KELM
23
4
0
31 Jul 2023
Camoscio: an Italian Instruction-tuned LLaMA
Andrea Santilli
Emanuele Rodolà
88
27
0
31 Jul 2023
LP-MusicCaps: LLM-Based Pseudo Music Captioning
Seungheon Doh
Keunwoo Choi
Jongpil Lee
Juhan Nam
72
82
0
31 Jul 2023
Uncertainty in Natural Language Generation: From Theory to Applications
Joris Baan
Nico Daheim
Evgenia Ilia
Dennis Ulmer
Haau-Sing Li
Raquel Fernández
Barbara Plank
Rico Sennrich
Chrysoula Zerva
Wilker Aziz
UQLM
158
45
0
28 Jul 2023
Reasoning before Responding: Integrating Commonsense-based Causality Explanation for Empathetic Response Generation
Yahui Fu
K. Inoue
Chenhui Chu
Tatsuya Kawahara
LRM
80
13
0
28 Jul 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
79
61
0
27 Jul 2023
Med-Flamingo: a Multimodal Medical Few-shot Learner
Michael Moor
Qian Huang
Shirley Wu
Michihiro Yasunaga
C. Zakka
Yashodhara Dalmia
E. Reis
Pranav Rajpurkar
J. Leskovec
LM&MA
MedIm
93
273
0
27 Jul 2023
What Makes a Good Paraphrase: Do Automated Evaluations Work?
A. Moskvina
Bhushan Kotnis
C. Catacata
Michael Janz
Nasrin Saef
26
0
0
27 Jul 2023
Metric-Based In-context Learning: A Case Study in Text Simplification
Subhadra Vadlamannati
Gözde Gül Sahin
73
2
0
27 Jul 2023
Controllable Generation of Dialogue Acts for Dialogue Systems via Few-Shot Response Generation and Ranking
Angela Ramirez
Karik Agarwal
Juraj Juraska
Utkarsh Garg
M. Walker
78
6
0
26 Jul 2023
This is not correct! Negation-aware Evaluation of Language Generation Systems
Miriam Anschütz
Diego Miguel Lozano
Georg Groh
95
11
0
26 Jul 2023
Trustworthiness of Children Stories Generated by Large Language Models
Prabin Bhandari
H. M. Brennan
73
2
0
25 Jul 2023
Kefa: A Knowledge Enhanced and Fine-grained Aligned Speaker for Navigation Instruction Generation
Haitian Zeng
Xiaohan Wang
Wenguan Wang
Yi Yang
80
7
0
25 Jul 2023
Prot2Text: Multimodal Protein's Function Generation with GNNs and Transformers
Hadi Abdine
Michail Chatzianastasis
Costas Bouyioukos
Michalis Vazirgiannis
71
45
0
25 Jul 2023
Guidance in Radiology Report Summarization: An Empirical Evaluation and Error Analysis
Jan Trienes
Paul Youssef
Jörg Schlötterer
Christin Seifert
51
0
0
24 Jul 2023
Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation
Neel Bhandari
Pin-Yu Chen
AAML
SILM
88
3
0
24 Jul 2023
On the Effectiveness of Offline RL for Dialogue Response Generation
Paloma Sodhi
Felix Wu
Ethan R. Elenberg
Kilian Q. Weinberger
Ryan T. McDonald
OffRL
82
5
0
23 Jul 2023
PSentScore: Evaluating Sentiment Polarity in Dialogue Summarization
Yongxin Zhou
Fabien Ringeval
François Portet
89
1
0
23 Jul 2023
A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks
Yanis Labrak
Mickael Rouvier
Richard Dufour
LM&MA
84
29
0
22 Jul 2023
Incorporating Human Translator Style into English-Turkish Literary Machine Translation
Zeynep Yirmibeşoğlu
Olgun Dursun
Harun Dalli
Mehmet Şahin
Ena Hodzik
Sabri Gürses
Tunga Güngör
60
0
0
21 Jul 2023