BERTScore: Evaluating Text Generation with BERT

21 April 2019

Papers citing "BERTScore: Evaluating Text Generation with BERT"

50 / 3,519 papers shown

Title
CRScore: Grounding Automated Evaluation of Code Review Comments in Code Claims and Smells Atharva Naik Marcus Alenius Daniel Fried Carolyn Rose 110 1 0 29 Sep 2024
DENEB: A Hallucination-Robust Automatic Evaluation Metric for Image Captioning Kazuki Matsuda Yuiga Wada Komei Sugiura 61 1 0 28 Sep 2024
Edit-Constrained Decoding for Sentence Simplification Tatsuya Zetsu Yuki Arase Tomoyuki Kajiwara 53 0 0 28 Sep 2024
SciDoc2Diagrammer-MAF: Towards Generation of Scientific Diagrams from Documents guided by Multi-Aspect Feedback Refinement Ishani Mondal Zongxia Li Yufang Hou Anandhavelu Natarajan Aparna Garimella Jordan Boyd-Graber 65 4 0 28 Sep 2024
Show and Guide: Instructional-Plan Grounded Vision and Language Model Diogo Glória-Silva David Semedo João Magalhães 46 0 0 27 Sep 2024
LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis Hamed Babaei Giglou Jennifer D'Souza Sören Auer 75 5 0 27 Sep 2024
A Survey on the Honesty of Large Language Models Siheng Li Cheng Yang Taiqiang Wu Chufan Shi Yuji Zhang ... Jie Zhou Yujiu Yang Ngai Wong Xixin Wu Wai Lam HILM 108 6 0 27 Sep 2024
Rehearsing Answers to Probable Questions with Perspective-Taking Yung-Yu Shih Ziwei Xu Hiroya Takamura Yun-Nung Chen Chung-Chi Chen 70 1 0 27 Sep 2024
Co-Trained Retriever-Generator Framework for Question Generation in Earnings Calls Yining Juan Chung-Chi Chen Hen-Hsen Huang Hsin-Hsi Chen RALM 66 0 0 27 Sep 2024
Exploring Language Model Generalization in Low-Resource Extractive QA Saptarshi Sengupta Wenpeng Yin Preslav Nakov Shreya Ghosh Suhang Wang 96 1 0 27 Sep 2024
Multimodal Pragmatic Jailbreak on Text-to-image Models Tong Liu Zhixin Lai Jiawen Wang Gengyuan Zhang Shuo Chen Philip Torr Vera Demberg Volker Tresp Jindong Gu 71 5 0 27 Sep 2024
EgoLM: Multi-Modal Language Model of Egocentric Motions Fangzhou Hong Vladimir Guzov Hyo Jin Kim Yuting Ye Richard Newcombe Ziwei Liu Lingni Ma 78 4 0 26 Sep 2024
Evaluation of Large Language Models for Summarization Tasks in the Medical Domain: A Narrative Review Emma Croxford Yanjun Gao Nicholas Pellegrino Karen K. Wong Graham Wills Elliot First Frank J. Liao Cherodeep Goswami Brian Patterson Majid Afshar HILM ELM LM&MA 129 1 0 26 Sep 2024
Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect Guokan Shang Hadi Abdine Yousef Khoubrane Amr Mohamed Yassine Abbahaddou ... Xuguang Ren Eric Moulines Preslav Nakov Michalis Vazirgiannis Eric Xing 85 6 0 26 Sep 2024
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models Shaoxiong Ji Zihao Li Indraneil Paul Jaakko Paavola Peiqin Lin ... Dayyán O'Brien Hengyu Luo Hinrich Schütze Jörg Tiedemann Barry Haddow CLL 120 7 0 26 Sep 2024
Explanation Bottleneck Models Shin'ya Yamaguchi Kosuke Nishida LRM BDL 134 2 0 26 Sep 2024
AXCEL: Automated eXplainable Consistency Evaluation using LLMs P Aditya Sreekar Sahil Verma Suransh Chopra Sarik Ghazarian Abhishek Persad Narayanan Sadagopan LRM 42 1 0 25 Sep 2024
Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts Taehun Cha Donghun Lee HILM 62 1 0 25 Sep 2024
Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data Kota Dohi Aoi Ito Harsh Purohit Tomoya Nishida Takashi Endo Yohei Kawaguchi 50 3 0 25 Sep 2024
Overview of the First Shared Task on Clinical Text Generation: RRG24 and "Discharge Me!" Justin Xu Zhihong Chen Andrew Johnston Louis Blankemeier Maya Varma ... Ankit Modi Robert Lloyd Benjamin Hopkins Curtis Langlotz Jean-Benoit Delbrouck LM&MA 95 26 0 25 Sep 2024
DiaSynth: Synthetic Dialogue Generation Framework for Low Resource Dialogue Applications Sathya Krishnan Suresh Wu Mengjun Tushar Pranav Eng Siong Chng 45 0 0 25 Sep 2024
Do the Right Thing, Just Debias! Multi-Category Bias Mitigation Using LLMs Amartya Roy Danush Khanna Devanshu Mahapatra Vasanthakumar Avirup Das Kripabandhu Ghosh 21 0 0 24 Sep 2024
FMDLlama: Financial Misinformation Detection based on Large Language Models Zhiwei Liu Xin Zhang Kailai Yang Qianqian Xie Jimin Huang Sophia Ananiadou ALM 72 3 0 24 Sep 2024
Learning When to Retrieve, What to Rewrite, and How to Respond in Conversational QA Nirmal Roy Leonardo F. R. Ribeiro Rexhina Blloshmi Kevin Small RALM 75 4 0 23 Sep 2024
Using Similarity to Evaluate Factual Consistency in Summaries Yuxuan Ye Edwin Simpson Raul Santos Rodriguez HILM 43 2 0 23 Sep 2024
Advancing Video Quality Assessment for AIGC Xinli Yue Jianhui Sun Han Kong Liangchao Yao Tianyi Wang ... Jing Lv Fan Xia Yuetang Deng Qian Wang Lingchen Zhao VGen EGVM 82 0 0 23 Sep 2024
LINKAGE: Listwise Ranking among Varied-Quality References for Non-Factoid QA Evaluation via LLMs Sihui Yang Keping Bi Wanqing Cui Jiafeng Guo Xueqi Cheng 100 3 0 23 Sep 2024
Parse Trees Guided LLM Prompt Compression Wenhao Mao Chengbin Hou Tianyu Zhang Xinyu Lin Ke Tang Hairong Lv 59 0 0 23 Sep 2024
Can pre-trained language models generate titles for research papers? Tohida Rehman Debarshi Kumar Sanyal S. Chattopadhyay 97 3 0 22 Sep 2024
Beyond Persuasion: Towards Conversational Recommender System with Credible Explanations Peixin Qin Chen Huang Yang Deng Wenqiang Lei Tat-Seng Chua LRM 128 4 0 22 Sep 2024
Can AI writing be salvaged? Mitigating Idiosyncrasies and Improving Human-AI Alignment in the Writing Process through Edits Tuhin Chakrabarty Philippe Laban Chien-Sheng Wu 129 13 0 22 Sep 2024
SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information Jiashuo Sun Jihai Zhang Yucheng Zhou Zhaochen Su Xiaoye Qu Yu Cheng 91 13 0 21 Sep 2024
YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models Abhilash Nandy Yash Agarwal Ashish Patwa Millon Madhur Das Aman Bansal Ankit Raj Pawan Goyal Niloy Ganguly 61 0 0 20 Sep 2024
SLaVA-CXR: Small Language and Vision Assistant for Chest X-ray Report Automation Jinge Wu Yunsoo Kim Daqian Shi David Cliffton Fenglin Liu Honghan Wu 38 1 0 20 Sep 2024
FIHA: Autonomous Hallucination Evaluation in Vision-Language Models with Davidson Scene Graphs Bowen Yan Zhengsong Zhang Liqiang Jing Eftekhar Hossain Xinya Du 116 3 0 20 Sep 2024
Kalahi: A handcrafted, grassroots cultural LLM evaluation suite for Filipino Jann Railey Montalan Jian Gang Ngui Wei Qi Leong Yosephine Susanto Hamsawardhini Rengarajan William-Chandra Tjhi Alham Fikri Aji 84 3 0 20 Sep 2024
Guided Profile Generation Improves Personalization with LLMs Jiarui Zhang 80 7 0 19 Sep 2024
HeadCT-ONE: Enabling Granular and Controllable Automated Evaluation of Head CT Radiology Report Generation J. N. Acosta Xiaoman Zhang Siddhant Dogra Hong-Yu Zhou Seyedmehdi Payabvash Guido J. Falcone Eric K. Oermann Pranav Rajpurkar MedIm 46 0 0 19 Sep 2024
TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning Shivam Shandilya Menglin Xia Supriyo Ghosh Huiqiang Jiang Jue Zhang Qianhui Wu Victor Rühle 82 7 0 19 Sep 2024
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions Tsung-Han Wu Joseph E. Gonzalez Trevor Darrell David M. Chan 127 2 0 19 Sep 2024
Pay Attention to What Matters Pedro Luiz Silva Antonio De Domenico Ali Maatouk Fadhel Ayed ALM 46 1 0 19 Sep 2024
Text2Traj2Text: Learning-by-Synthesis Framework for Contextual Captioning of Human Movement Trajectories Hikaru Asano Ryo Yonetani Taiki Sekii Hiroki Ouchi 95 0 0 19 Sep 2024
Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework Yuping Wu Hao Li Hongbo Zhu Goran Nenadic Xiao-Jun Zeng 106 1 0 18 Sep 2024
CREAM: Comparison-Based Reference-Free ELO-Ranked Automatic Evaluation for Meeting Summarization Ziwei Gong Lin Ai Harshsaiprasad Deshpande Alexander Johnson Emmy Phung Zehui Wu Ahmad Emami Julia Hirschberg 105 2 0 17 Sep 2024
Exploring Fine-tuned Generative Models for Keyphrase Selection: A Case Study for Russian Anna Glazkova Dmitry A. Morozov 55 1 0 16 Sep 2024
A Benchmark Dataset with Larger Context for Non-Factoid Question Answering over Islamic Text Faiza Qamar Seemab Latif R. Latif 62 1 0 15 Sep 2024
Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types Neelabh Sinha Vinija Jain Aman Chadha 70 3 0 14 Sep 2024
NovAScore: A New Automated Metric for Evaluating Document Level Novelty Lin Ai Ziwei Gong Harshsaiprasad Deshpande Alexander Johnson Emmy Phung Ahmad Emami Julia Hirschberg 42 1 0 14 Sep 2024
Protecting Copyright of Medical Pre-trained Language Models: Training-Free Backdoor Model Watermarking Cong Kong Rui Xu Weixi Chen Jiawei Chen Z. Yin AAML MedIm 48 0 0 14 Sep 2024
Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised Modeling Jialu Tang Tong Xia Yuan Lu Cecilia Mascolo Aaqib Saeed 75 5 0 13 Sep 2024