ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.16437
  4. Cited By
Reasoning Runtime Behavior of a Program with LLM: How Far Are We?
v1v2 (latest)

Reasoning Runtime Behavior of a Program with LLM: How Far Are We?

25 March 2024
Junkai Chen
Zhiyuan Pan
Xing Hu
Zhenhao Li
Ge Li
Xin Xia
    LRM
ArXiv (abs)PDFHTML

Papers citing "Reasoning Runtime Behavior of a Program with LLM: How Far Are We?"

19 / 19 papers shown
Title
CodeCrash: Stress Testing LLM Reasoning under Structural and Semantic Perturbations
CodeCrash: Stress Testing LLM Reasoning under Structural and Semantic Perturbations
Man Ho Lam
Chaozheng Wang
Jen-tse Huang
Michael R. Lyu
LRM
72
1
0
19 Apr 2025
The Hitchhiker's Guide to Program Analysis, Part II: Deep Thoughts by LLMs
The Hitchhiker's Guide to Program Analysis, Part II: Deep Thoughts by LLMs
Haonan Li
Hang Zhang
Kexin Pei
Zhiyun Qian
106
1
0
16 Apr 2025
L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
Simeng Sun
Cheng-Ping Hsieh
Faisal Ladhak
Erik Arakelyan
Santiago Akle Serano
Boris Ginsburg
ReLMELMLRM
472
0
0
28 Mar 2025
Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference
Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference
Thanh Le-Cong
Bach Le
Toby Murray
LRM
99
1
0
22 Feb 2025
Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code
Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code
Shahin Honarvar
Mark van der Wilk
Alastair Donaldson
184
8
0
28 Jan 2025
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code
  Completion
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Yangruibo Ding
Zijian Wang
Wasi Uddin Ahmad
Hantian Ding
Ming Tan
...
M. K. Ramanathan
Ramesh Nallapati
Parminder Bhatia
Dan Roth
Bing Xiang
ELM
86
130
0
17 Oct 2023
Large Language Models for Test-Free Fault Localization
Large Language Models for Test-Free Fault Localization
Aidan Z. H. Yang
Ruben Martins
Claire Le Goues
Vincent J. Hellendoorn
LRM
61
97
0
03 Oct 2023
LEVER: Learning to Verify Language-to-Code Generation with Execution
LEVER: Learning to Verify Language-to-Code Generation with Execution
Ansong Ni
Srini Iyer
Dragomir R. Radev
Ves Stoyanov
Wen-tau Yih
Sida I. Wang
Xi Lin
78
226
0
16 Feb 2023
Learning Performance-Improving Code Edits
Learning Performance-Improving Code Edits
Alex Shypula
Aman Madaan
Yiming Yang
Uri Alon
Jacob R. Gardner
Milad Hashemi
Graham Neubig
Parthasarathy Ranganathan
Osbert Bastani
Amir Yazdanbakhsh
SyDa
76
89
0
15 Feb 2023
Selection-Inference: Exploiting Large Language Models for Interpretable
  Logical Reasoning
Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning
Antonia Creswell
Murray Shanahan
I. Higgins
ReLMLRM
107
364
0
19 May 2022
A Comprehensive Survey of Few-shot Learning: Evolution, Applications,
  Challenges, and Opportunities
A Comprehensive Survey of Few-shot Learning: Evolution, Applications, Challenges, and Opportunities
Yisheng Song
Ting-Yuan Wang
S. Mondal
J. P. Sahoo
SLR
118
375
0
13 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&RoLRMAI4CEReLM
823
9,576
0
28 Jan 2022
Recent Advances in Natural Language Processing via Large Pre-Trained
  Language Models: A Survey
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
Bonan Min
Hayley L Ross
Elior Sulem
Amir Pouran Ben Veyseh
Thien Huu Nguyen
Oscar Sainz
Eneko Agirre
Ilana Heinz
Dan Roth
LM&MAVLMAI4CE
162
1,080
0
01 Nov 2021
Program Synthesis with Large Language Models
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELMAIMatReCodALM
200
2,004
0
16 Aug 2021
QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering
  and Reading Comprehension
QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension
Anna Rogers
Matt Gardner
Isabelle Augenstein
120
167
0
27 Jul 2021
Evaluating Large Language Models Trained on Code
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELMALM
233
5,635
0
07 Jul 2021
XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis
  and Beyond
XLM-T: Multilingual Language Models in Twitter for Sentiment Analysis and Beyond
Francesco Barbieri
Luis Espinosa Anke
Jose Camacho-Collados
221
222
0
25 Apr 2021
Measuring and Improving Consistency in Pretrained Language Models
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
329
367
0
01 Feb 2021
GraphCodeBERT: Pre-training Code Representations with Data Flow
GraphCodeBERT: Pre-training Code Representations with Data Flow
Daya Guo
Shuo Ren
Shuai Lu
Zhangyin Feng
Duyu Tang
...
Dawn Drain
Neel Sundaresan
Jian Yin
Daxin Jiang
M. Zhou
145
1,143
0
17 Sep 2020
1