Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.02883
Cited By
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models
3 July 2024
Xiangyang Li
Kuicai Dong
Yi Quan Lee
Wei Xia
Yichun Yin
Xinyi Dai
Yasheng Wang
Ruiming Tang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CoIR: A Comprehensive Benchmark for Code Information Retrieval Models"
19 / 19 papers shown
Title
SweRank: Software Issue Localization with Code Ranking
R. Reddy
Tarun Suresh
JaeHyeok Doo
Yong-Jin Liu
Xuan-Phi Nguyen
Yingbo Zhou
Semih Yavuz
Caiming Xiong
Heng Ji
Shafiq R. Joty
26
0
0
07 May 2025
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents
Nandan Thakur
Jimmy J. Lin
Sam Havens
Michael Carbin
Omar Khattab
Andrew Drozdov
41
2
0
17 Apr 2025
Code-Craft: Hierarchical Graph-Based Code Summarization for Enhanced Context Retrieval
David Sounthiraraj
Jared Hancock
Yassin Kortam
Ashok Javvaji
Prabhat Singh
Shaila Shankar
21
0
0
11 Apr 2025
LoRACode: LoRA Adapters for Code Embeddings
Saumya Chaturvedi
Aman Chadha
Laurent Bindschaedler
63
0
0
07 Mar 2025
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol
Roham Koohestani
Philippe de Bekker
M. Izadi
VLM
45
0
0
07 Mar 2025
Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence
Mohsen Fayyaz
Ali Modarressi
Hinrich Schuetze
Nanyun Peng
57
0
0
06 Mar 2025
Granite Embedding Models
Parul Awasthy
Aashka Trivedi
Yulong Li
Mihaela A. Bornea
David D. Cox
...
Sukriti Sharma
Avirup Sil
Kate Soule
Arafat Sultan
Radu Florian
RALM
64
1
0
27 Feb 2025
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Tianyu Zheng
Ge Zhang
Tianhao Shen
Xueling Liu
Bill Yuchen Lin
Jie Fu
Wenhu Chen
Xiang Yue
SyDa
91
102
0
08 Jan 2025
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Benjamin Warner
Antoine Chaffin
Benjamin Clavié
Orion Weller
Oskar Hallström
...
Tom Aarsen
Nathan Cooper
Griffin Adams
Jeremy Howard
Iacopo Poli
88
77
0
18 Dec 2024
CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
Tarun Suresh
R. Reddy
Yifei Xu
Zach Nussbaum
Andriy Mulyar
Brandon Duderstadt
Heng Ji
89
3
0
01 Dec 2024
CodeXEmbed: A Generalist Embedding Model Family for Multiligual and Multi-task Code Retrieval
Y. Liu
Rui Meng
Shafiq R. Joty
Silvio Savarese
Caiming Xiong
Yingbo Zhou
Semih Yavuz
92
3
0
19 Nov 2024
A Deep Dive Into Large Language Model Code Generation Mistakes: What and Why?
QiHong Chen
Jiawei Li
Jiecheng Deng
Jiachen Yu
Justin Tian Jin Chen
Iftekhar Ahmed
56
0
0
03 Nov 2024
Retrieval-Augmented Generation for AI-Generated Content: A Survey
Penghao Zhao
Hailin Zhang
Qinhan Yu
Zhengren Wang
Yunteng Geng
Fangcheng Fu
Ling Yang
Wentao Zhang
Jie Jiang
Bin Cui
3DV
115
228
0
29 Feb 2024
Repetition Improves Language Model Embeddings
Jacob Mitchell Springer
Suhas Kotha
Daniel Fried
Graham Neubig
Aditi Raghunathan
45
29
0
23 Feb 2024
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
Jianlv Chen
Shitao Xiao
Peitian Zhang
Kun Luo
Defu Lian
Zheng Liu
115
328
0
05 Feb 2024
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
Shitao Xiao
Zheng Liu
Yingxia Shao
Zhao Cao
RALM
118
109
0
24 May 2022
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
208
624
0
20 May 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rucklé
Abhishek Srivastava
Iryna Gurevych
VLM
231
966
0
17 Apr 2021
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Shuai Lu
Daya Guo
Shuo Ren
Junjie Huang
Alexey Svyatkovskiy
...
Nan Duan
Neel Sundaresan
Shao Kun Deng
Shengyu Fu
Shujie Liu
ELM
198
1,105
0
09 Feb 2021
1