Out of the BLEU: how should we assess quality of the Code Generation models?

5 August 2022
Mikhail Evtikhiev, Egor Bogomolov, Yaroslav Sokolov, T. Bryksin
ALM

Papers citing "Out of the BLEU: how should we assess quality of the Code Generation models?" (12 papers)

BLEUBERI: BLEU is a surprisingly effective reward for instruction following
Yapei Chang, Yekyung Kim, Michael Krumdick, Amir Zadeh, Chuan Li, Chris Tanner, Mohit Iyyer
ALM · 16 May 2025

Bridging LLM-Generated Code and Requirements: Reverse Generation technique and SBC Metric for Developer Insights
Ahilan Ayyachamy Nadar Ponnusamy
11 Feb 2025

SyntheT2C: Generating Synthetic Data for Fine-Tuning Large Language Models on the Text2Cypher Task
Zijie Zhong, Linqing Zhong, Zhaoze Sun, Qingyun Jin, Zengchang Qin, Xiaofan Zhang
28 Jan 2025

CRScore: Grounding Automated Evaluation of Code Review Comments in Code Claims and Smells
Atharva Naik, Marcus Alenius, Daniel Fried, Carolyn Rosé
29 Sep 2024

Retrieval-augmented code completion for local projects using large language models
Marko Hostnik, Marko Robnik-Šikonja
RALM · 09 Aug 2024

Automating the Correctness Assessment of AI-generated Code for Security Contexts
Domenico Cotroneo, Alessio Foggia, Cristina Improta, Pietro Liguori, R. Natella
28 Oct 2023

Bias Testing and Mitigation in LLM-based Code Generation
Dong Huang, Qingwen Bu, Jie M. Zhang, Xiaofei Xie, Junjie Chen, Heming Cui
03 Sep 2023

CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig
ELM, ALM · 10 Feb 2023

Who Evaluates the Evaluators? On Automatic Metrics for Assessing AI-based Offensive Code Generators
Pietro Liguori, Cristina Improta, R. Natella, B. Cukic, Domenico Cotroneo
ELM · 12 Dec 2022

Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming
Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz
25 Oct 2022

Don't Complete It! Preventing Unhelpful Code Completion for Productive and Sustainable Neural Code Completion Systems
Zhensu Sun, Xiaoning Du, Fu Song, Shangwen Wang, Mingze Ni, Li Li
13 Sep 2022

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, ..., Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu
ELM · 09 Feb 2021