2201.10005
Text and Code Embeddings by Contrastive Pre-Training
24 January 2022
Arvind Neelakantan
Tao Xu
Raul Puri
Alec Radford
Jesse Michael Han
Jerry Tworek
Qiming Yuan
Nikolas Tezak
Jong Wook Kim
Chris Hallacy
Johannes Heidecke
Pranav Shyam
Boris Power
Tyna Eloundou Nekoul
Girish Sastry
Gretchen Krueger
David Schnurr
F. Such
K. Hsu
Madeleine Thompson
Tabarak Khan
Toki Sherbakov
Joanne Jang
Peter Welinder
Lilian Weng
SSL
AI4TS
Papers citing "Text and Code Embeddings by Contrastive Pre-Training"
50 / 245 papers shown
Title
Towards High-Fidelity Synthetic Multi-platform Social Media Datasets via Large Language Models
Henry Tari
Nojus Sereiva
Rishabh Kaushal
T. Bertaglia
Adriana Iamnitchi
30
0
0
02 May 2025
Prompt Injection Attack to Tool Selection in LLM Agents
Jiawen Shi
Zenghui Yuan
Guiyao Tie
Pan Zhou
Neil Zhenqiang Gong
Lichao Sun
LLMAG
51
0
0
28 Apr 2025
AlphaFuse: Learn ID Embeddings for Sequential Recommendation in Null Space of Language Embeddings
Guoqing Hu
An Zhang
Shuo Liu
Zhibo Cai
Xun Yang
X. Wang
34
0
0
27 Apr 2025
VeriDebug: A Unified LLM for Verilog Debugging via Contrastive Embedding and Guided Correction
N. Wang
Bingkun Yao
Jie Zhou
Yuchen Hu
Xi Wang
Nan Guan
Zhe Jiang
36
0
0
27 Apr 2025
MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation
Chanhee Park
Hyeonseok Moon
Chanjun Park
Heuiseok Lim
RALM
58
0
0
23 Apr 2025
A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs
Musfiqur Rahman
SayedHassan Khatoonabadi
Emad Shihab
ALM
34
0
0
22 Apr 2025
MIEB: Massive Image Embedding Benchmark
Chenghao Xiao
Isaac Chung
Imene Kerboua
Jamie Stirling
Xin Zhang
Márton Kardos
Roman Solomatin
Noura Al Moubayed
K. Enevoldsen
Niklas Muennighoff
VLM
37
0
0
14 Apr 2025
AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery
Amirhossein Abaskohi
A. Ramesh
Shailesh Nanisetty
Chirag Goel
David Vazquez
Christopher Pal
Spandana Gella
Giuseppe Carenini
I. Laradji
34
0
0
10 Apr 2025
Towards Distribution Matching between Collaborative and Language Spaces for Generative Recommendation
Yi-cui Zhang
Yiwen Zhang
Y. X. R. Wang
Tong Chen
Hongzhi Yin
28
0
0
10 Apr 2025
Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling
Hengran Zhang
Keping Bi
J. Guo
Xiaojie Sun
Shihao Liu
Daiting Shi
Dawei Yin
Xueqi Cheng
RALM
129
0
0
07 Apr 2025
Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data
Waris Gill
Justin Cechmanek
Tyler Hutcherson
Srijith Rajamohan
Jen Agarwal
Muhammad Ali Gulzar
Manvinder Singh
Benoit Dion
35
0
0
03 Apr 2025
Patience is all you need! An agentic system for performing scientific literature review
David Brett
Anniek Myatt
25
0
0
28 Mar 2025
A Survey on Knowledge-Oriented Retrieval-Augmented Generation
Mingyue Cheng
Yucong Luo
Jie Ouyang
Q. Liu
Huijie Liu
...
Bohou Zhang
Jiawei Cao
Jie Ma
Daoyu Wang
Enhong Chen
3DV
70
3
0
11 Mar 2025
AutoIOT: LLM-Driven Automated Natural Language Programming for AIoT Applications
Leming Shen
Qiang Yang
Yuanqing Zheng
Mo Li
43
1
0
07 Mar 2025
ATEB: Evaluating and Improving Advanced NLP Tasks for Text Embedding Models
Simeng Han
Frank Palma Gomez
Tu Vu
Zefei Li
Daniel Matthew Cer
Hansi Zeng
Chris Tar
Arman Cohan
Gustavo Hernández Ábrego
51
1
0
24 Feb 2025
Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models
A. Narayan
D. Biderman
Sabri Eyuboglu
Avner May
Scott W. Linderman
James Zou
Christopher Ré
50
1
0
21 Feb 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
50
2
0
21 Feb 2025
Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation
Shuo Tang
Xianghe Pang
Zexi Liu
Bohan Tang
Rui Ye
Xiaowen Dong
Y. Wang
Yanfeng Wang
S. Chen
SyDa
LLMAG
127
3
0
21 Feb 2025
Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search
Yifan Ji
Zhipeng Xu
Zhenghao Liu
Yukun Yan
S. Yu
Y. Li
Zhiyuan Liu
Yu Gu
Ge Yu
Maosong Sun
RALM
58
0
0
18 Feb 2025
The Odyssey of the Fittest: Can Agents Survive and Still Be Good?
Dylan Waldner
Risto Miikkulainen
51
0
0
08 Feb 2025
Consistent estimation of generative model representations in the data kernel perspective space
Aranyak Acharyya
M. Trosset
Carey E. Priebe
Hayden Helm
DiffM
60
3
0
20 Jan 2025
Multi-task retriever fine-tuning for domain-specific and efficient RAG
Patrice Béchard
Orlando Marquez Ayala
35
0
0
08 Jan 2025
QuIM-RAG: Advancing Retrieval-Augmented Generation with Inverted Question Matching for Enhanced QA Performance
Binita Saha
Utsha Saha
Muhammad Zubair Malik
RALM
3DV
56
2
0
06 Jan 2025
Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning
Alex Beutel
Kai Y. Xiao
Johannes Heidecke
Lilian Weng
AAML
43
3
0
24 Dec 2024
AUEB-Archimedes at RIRAG-2025: Is obligation concatenation really all you need?
Ioannis Chasandras
Odysseas S. Chlapanis
Ion Androutsopoulos
74
0
0
16 Dec 2024
NoteContrast: Contrastive Language-Diagnostic Pretraining for Medical Text
Prajwal Kailas
Max Homilius
Rahul C. Deo
Calum A. MacRae
99
0
0
16 Dec 2024
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters
Jianping Jiang
Weiye Xiao
Zhengyu Lin
H. Zhang
Tianxiang Ren
Yang Gao
Zhiqian Lin
Zhongang Cai
Lei Yang
Ziwei Liu
86
3
0
29 Nov 2024
A Comparative Study of Text Retrieval Models on DaReCzech
Jakub Stetina
Martin Fajcik
Michal Stefanik
Michal Hradis
76
0
0
19 Nov 2024
Advancing Large Language Models for Spatiotemporal and Semantic Association Mining of Similar Environmental Events
Yuanyuan Tian
Wenwen Li
Lei Hu
X. Chen
Michael Brook
Michael Brubaker
Fan Zhang
A. Liljedahl
KELM
81
1
0
19 Nov 2024
CodeXEmbed: A Generalist Embedding Model Family for Multiligual and Multi-task Code Retrieval
Y. Liu
Rui Meng
Shafiq R. Joty
Silvio Savarese
Caiming Xiong
Yingbo Zhou
Semih Yavuz
92
3
0
19 Nov 2024
Sentiment Analysis of Cyberbullying Data in Social Media
Arvapalli Sai Susmitha
Pradeep Pujari
26
0
0
08 Nov 2024
Investigating Idiomaticity in Word Representations
Wei He
Tiago Kramer Vieira
Marcos García
Carolina Scarton
M. Idiart
Aline Villavicencio
34
1
0
04 Nov 2024
CmdCaliper: A Semantic-Aware Command-Line Embedding Model and Dataset for Security Research
Sian-Yao Huang
Cheng-Lin Yang
C. Lin
Chun-Ying Huang
35
1
0
02 Nov 2024
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
Vipul Gupta
Candace Ross
David Pantoja
R. Passonneau
Megan Ung
Adina Williams
70
1
0
26 Oct 2024
Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction
Sergio Burdisso
S. Madikeri
P. Motlícek
32
1
0
24 Oct 2024
Link, Synthesize, Retrieve: Universal Document Linking for Zero-Shot Information Retrieval
Dae Yon Hwang
Bilal Taha
Harshit Pande
Yaroslav Nechaev
SyDa
31
0
0
24 Oct 2024
FinQAPT: Empowering Financial Decisions with End-to-End LLM-driven Question Answering Pipeline
Kuldeep Singh
Simerjot Kaur
Charese Smiley
AIFin
26
2
0
17 Oct 2024
LAR-ECHR: A New Legal Argument Reasoning Task and Dataset for Cases of the European Court of Human Rights
Odysseas S. Chlapanis
D. Galanis
Ion Androutsopoulos
AILaw
ELM
26
0
0
17 Oct 2024
Preference Diffusion for Recommendation
Shuo Liu
An Zhang
Guoqing Hu
Hong Qian
Tat-Seng Chua
53
1
0
17 Oct 2024
AutoPersuade: A Framework for Evaluating and Explaining Persuasive Arguments
Till Raphael Saenger
Musashi Hinck
Justin Grimmer
Brandon M Stewart
30
2
0
11 Oct 2024
Exploring the Meaningfulness of Nearest Neighbor Search in High-Dimensional Space
Zhonghan Chen
Ruiyuan Zhang
Xi Zhao
Xiaojun Cheng
Xiaofang Zhou
47
0
0
08 Oct 2024
Crafting Personalized Agents through Retrieval-Augmented Generation on Editable Memory Graphs
Zheng Wang
Zhongyang Li
Zeren Jiang
Dandan Tu
Wei Shi
42
7
0
28 Sep 2024
Generative AI Is Not Ready for Clinical Use in Patient Education for Lower Back Pain Patients, Even With Retrieval-Augmented Generation
Yi-Fei Zhao
A. Bove
David Thompson
James Hill
Yi Xu
Yufan Ren
Andrea Hassman
Leming Zhou
Yanshan Wang
16
0
0
23 Sep 2024
Past Meets Present: Creating Historical Analogy with Large Language Models
Nianqi Li
Siyu Yuan
Jiangjie Chen
Jiaqing Liang
Feng Wei
Zujie Liang
Deqing Yang
Yanghua Xiao
30
0
0
23 Sep 2024
Beyond Persuasion: Towards Conversational Recommender System with Credible Explanations
Peixin Qin
Chen Huang
Yang Deng
Wenqiang Lei
Tat-Seng Chua
LRM
27
3
0
22 Sep 2024
Enhancing Large Language Models with Domain-specific Retrieval Augment Generation: A Case Study on Long-form Consumer Health Question Answering in Ophthalmology
Aidan Gilson
Xuguang Ai
Thilaka Arunachalam
Ziyou Chen
Ki Xiong Cheong
...
Zhiyong Lu
Hua Xu
Ron A. Adelman
Yih-Chung Tham
Qingyu Chen
RALM
29
1
0
20 Sep 2024
Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles
Qiujing Lu
Xuanhan Wang
Yiwei Jiang
Guangming Zhao
Mingyue Ma
Shuo Feng
48
7
0
10 Sep 2024
MessIRve: A Large-Scale Spanish Information Retrieval Dataset
Francisco Valentini
Viviana Cotik
D. Furman
Ivan Bercovich
Edgar Altszyler
Juan Manuel Pérez
51
1
0
09 Sep 2024
Leveraging LLMs for Influence Path Planning in Proactive Recommendation
Mingze Wang
Shuxian Bi
W. Wang
Chongming Gao
Yangyang Li
Fuli Feng
36
0
0
07 Sep 2024
Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?
Yixuan Tang
Yi Yang
33
3
0
04 Sep 2024