1904.09675
BERTScore: Evaluating Text Generation with BERT
21 April 2019
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
Papers citing
"BERTScore: Evaluating Text Generation with BERT"
50 / 3,519 papers shown
Title
A Pilot Empirical Study on When and How to Use Knowledge Graphs as Retrieval Augmented Generation
Xujie Yuan
Yongxu Liu
Shimin Di
Shiwen Wu
Libin Zheng
Rui Meng
Lei Chen
Xiaofang Zhou
Jian Yin
146
0
0
28 Feb 2025
A Survey of Uncertainty Estimation Methods on Large Language Models
Zhiqiu Xia
Jinxuan Xu
Yuqian Zhang
Hang Liu
104
3
0
28 Feb 2025
LLM as a Broken Telephone: Iterative Generation Distorts Information
Amr Mohamed
Mingmeng Geng
Michalis Vazirgiannis
Guokan Shang
147
2
0
27 Feb 2025
EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models
Che Hyun Lee
Heeseung Kim
Jiheum Yeom
Sungroh Yoon
DiffM
118
1
0
27 Feb 2025
Multi2: Multi-Agent Test-Time Scalable Framework for Multi-Document Processing
Juntai Cao
Xiang Zhang
Raymond Li
Chuyuan Li
Shafiq Joty
Giuseppe Carenini
172
2
0
27 Feb 2025
Advancements in Natural Language Processing for Automatic Text Summarization
Nevidu Jayatilleke
Ruvan Weerasinghe
Nipuna Senanayake
368
1
0
27 Feb 2025
Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in QA Agents
A. Lewis
Michael White
Jing Liu
T. Koike-Akino
K. Parsons
Yanjie Wang
HILM
165
0
0
26 Feb 2025
Agent-centric Information Access
Evangelos Kanoulas
Panagiotis Eustratiadis
Yongkang Li
Yougang Lyu
Vaishali Pal
Gabrielle Poerwawinata
Jingfen Qiao
Zihan Wang
AIFin
53
1
0
26 Feb 2025
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Zhengping Jiang
Anqi Liu
Benjamin Van Durme
168
3
0
26 Feb 2025
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Jiawei Huang
Bingcong Li
Christoph Dann
Niao He
OffRL
266
3
0
26 Feb 2025
Less or More: Towards Glanceable Explanations for LLM Recommendations Using Ultra-Small Devices
Xinru Wang
Mengjie Yu
Hannah Nguyen
Michael Iuzzolino
Tianyi Wang
...
Ting Zhang
Naveen Sendhilnathan
Hrvoje Benko
Haijun Xia
Tanya R. Jonker
81
0
0
26 Feb 2025
IndicEval-XL: Bridging Linguistic Diversity in Code Generation Across Indic Languages
Ujjwal Singh
Aditi Sharma
Nikhil Gupta
Deepakshi
Vivek Kumar Jha
ELM
45
0
0
26 Feb 2025
Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets
Tohida Rehman
Soumabha Ghosh
Kuntal Das
Souvik Bhattacharjee
Debarshi Kumar Sanyal
S. Chattopadhyay
121
0
0
26 Feb 2025
MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
Jakub Macina
Nico Daheim
Ido Hakimi
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
ELM
124
4
0
26 Feb 2025
Stay Focused: Problem Drift in Multi-Agent Debate
Jonas Becker
Lars Benedikt Kaesberg
Andreas Stephan
Jan Philip Wahle
Terry Ruas
Bela Gipp
143
2
0
26 Feb 2025
Evidence-Driven Marker Extraction for Social Media Suicide Risk Detection
Carter Adams
Caleb Carter
Jackson Simmons
110
0
0
26 Feb 2025
CaseGen: A Benchmark for Multi-Stage Legal Case Documents Generation
Haitao Li
Jiaying Ye
Yiran Hu
Jia Chen
Qingyao Ai
...
Junjie Chen
Yuxiao Chen
Cheng Luo
Quan Zhou
Yixiao Liu
AILaw
ELM
126
2
0
25 Feb 2025
Independent Mobility GPT (IDM-GPT): A Self-Supervised Multi-Agent Large Language Model Framework for Customized Traffic Mobility Analysis Using Machine Learning Models
Fengze Yang
Xiaoyue Cathy Liu
Lingjiu Lu
Bingzhang Wang
Chenxi
90
1
0
25 Feb 2025
NUTSHELL: A Dataset for Abstract Generation from Scientific Talks
Maike Züfle
Sara Papi
Beatrice Savoldi
Marco Gaido
L. Bentivogli
Jan Niehues
86
2
0
24 Feb 2025
StatLLM: A Dataset for Evaluating the Performance of Large Language Models in Statistical Analysis
Xinyi Song
Lina Lee
Kexin Xie
Xueying Liu
Xinwei Deng
Yili Hong
ALM
ELM
449
1
0
24 Feb 2025
PosterSum: A Multimodal Benchmark for Scientific Poster Summarization
Rohit Saxena
Pasquale Minervini
Frank Keller
VLM
104
2
0
24 Feb 2025
Is Relevance Propagated from Retriever to Generator in RAG?
Fangzheng Tian
Debasis Ganguly
Craig Macdonald
RALM
94
2
0
24 Feb 2025
Towards Conditioning Clinical Text Generation for User Control
Osman Alperen Koras
Rabi Bahnan
Jens Kleesiek
Amin Dada
70
0
0
24 Feb 2025
All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark
Davide Testa
Giovanni Bonetta
Raffaella Bernardi
Alessandro Bondielli
Alessandro Lenci
Alessio Miaschi
Lucia Passaro
Bernardo Magnini
VGen
LRM
88
0
0
24 Feb 2025
Lost in Space: Optimizing Tokens for Grammar-Constrained Decoding
Sil Hamilton
David Mimno
104
0
0
24 Feb 2025
Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks
Rylan Schaeffer
Punit Singh Koura
Binh Tang
R. Subramanian
Aaditya K. Singh
...
Vedanuj Goswami
Sergey Edunov
Dieuwke Hupkes
Sanmi Koyejo
Sharan Narang
ALM
144
1
0
24 Feb 2025
Bridging Information Gaps with Comprehensive Answers: Improving the Diversity and Informativeness of Follow-Up Questions
Zhe Liu
Taekyu Kang
Haoran Wang
S. Alavi
Vered Shwartz
102
0
0
24 Feb 2025
Mutual Reinforcement of LLM Dialogue Synthesis and Summarization Capabilities for Few-Shot Dialogue Summarization
Yen-Ju Lu
Ting-Yao Hu
H. Koppula
Hadi Pouransari
Jen-Hao Rick Chang
...
Xiang Kong
Qi Zhu
Simon Wang
Oncel Tuzel
Raviteja Vemulapalli
72
0
0
24 Feb 2025
BP-GPT: Auditory Neural Decoding Using fMRI-prompted LLM
Xiaoyu Chen
Changde Du
Che Liu
Yizhe Wang
Huiguang He
140
0
0
24 Feb 2025
Reasoning About Persuasion: Can LLMs Enable Explainable Propaganda Detection?
Maram Hasanain
Md. Arid Hasan
Mohamed Bayan Kmainasi
Elisa Sartori
Ali Ezzat Shahroor
Giovanni Da San Martino
Firoj Alam
89
0
0
23 Feb 2025
Fine-Grained Video Captioning through Scene Graph Consolidation
Sanghyeok Chu
Seonguk Seo
Bohyung Han
110
1
0
23 Feb 2025
Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge
Heegyu Kim
Taeyang Jeon
Seungtaek Choi
Jihoon Hong
Dongwon Jeon
...
Jisu Bae
Chihoon Lee
Yunseo Kim
Jinsung Park
Hyunsouk Cho
ELM
125
0
1
23 Feb 2025
Code Summarization Beyond Function Level
Vladimir Makharev
Vladimir Ivanov
101
0
0
23 Feb 2025
OrderSum: Semantic Sentence Ordering for Extractive Summarization
Taewan Kwon
Sangyong Lee
53
0
0
22 Feb 2025
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents
Ivoline Ngong
Swanand Kadhe
Hao Wang
K. Murugesan
Justin D. Weisz
Amit Dhurandhar
Karthikeyan N. Ramamurthy
74
5
0
22 Feb 2025
eC-Tab2Text: Aspect-Based Text Generation from e-Commerce Product Tables
Luis Antonio Gutiérrez Guanilo
Mir Tafseer Nayeem
Cristian López
Davood Rafiei
LMTD
159
0
0
21 Feb 2025
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Zhaopeng Feng
Jiayuan Su
Jiamei Zheng
Jiahan Ren
Yan Zhang
Jian Wu
Hongwei Wang
Zuozhu Liu
ELM
273
1
0
21 Feb 2025
Enhancing RWKV-based Language Models for Long-Sequence Text Generation
Xinghan Pan
134
0
0
21 Feb 2025
Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral
Shivani Kumar
David Jurgens
LRM
126
1
0
21 Feb 2025
On Synthesizing Data for Context Attribution in Question Answering
Gorjan Radevski
Kiril Gashteovski
Shahbaz Syed
Christopher Malon
Sebastien Nicolas
...
Masafumi Enomoto
Kunihiro Takeoka
Masafumi Oyamada
Goran Glavaš
Carolin (Haas) Lawrence
39
0
0
21 Feb 2025
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation
SeongYeub Chu
JongWoo Kim
MunYong Yi
136
4
0
21 Feb 2025
RAG-Optimized Tibetan Tourism LLMs: Enhancing Accuracy and Personalization
Jinhu Qi
Shuai Yan
Yibo Zhang
Wentao Zhang
Rong Jin
Yihan Hu
Ke Wang
3DV
136
2
0
21 Feb 2025
Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews
Mengqiao Liu
Tevin Wang
Cassandra A. Cohen
Sarah Li
Chenyan Xiong
LRM
118
0
0
21 Feb 2025
Generative Video Semantic Communication via Multimodal Semantic Fusion with Large Model
Hang Yin
Li Qiao
Yu Ma
Shuo Sun
Kan Li
Zhen Gao
Dusit Niyato
DiffM
VGen
476
0
0
20 Feb 2025
A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond
Shreya Shukla
Jose Torres
Abhijit Mishra
Jacek Gwizdka
Shounak Roychowdhury
118
0
0
20 Feb 2025
Prompting a Weighting Mechanism into LLM-as-a-Judge in Two-Step: A Case Study
Wenwen Xie
Gray Gwizdz
Dongji Feng
134
0
0
20 Feb 2025
Mind the Style Gap: Meta-Evaluation of Style and Attribute Transfer Metrics
Amalie Brogaard Pauli
Isabelle Augenstein
Ira Assent
143
0
0
20 Feb 2025
Batayan: A Filipino NLP benchmark for evaluating Large Language Models
Jann Railey Montalan
Jimson Paulo Layacan
David Demitri Africa
Richell Isaiah Flores
Michael T. Lopez II
Theresa Denise Magsajo
Anjanette Cayabyab
William-Chandra Tjhi
67
0
0
19 Feb 2025
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators
Bosi Wen
Pei Ke
Yufei Sun
C. Wang
Xiaotao Gu
Jinfeng Zhou
Jie Tang
Hongning Wang
Minlie Huang
17
0
0
18 Feb 2025
G-Refer: Graph Retrieval-Augmented Large Language Model for Explainable Recommendation
Yuhan Li
Xinni Zhang
Linhao Luo
Heng Chang
Yuxiang Ren
Irwin King
Jiajian Li
122
8
0
18 Feb 2025