arXiv:2308.03281
Towards General Text Embeddings with Multi-stage Contrastive Learning
7 August 2023
Zehan Li
Xin Zhang
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Papers citing "Towards General Text Embeddings with Multi-stage Contrastive Learning"
50 / 260 papers shown
M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
Masahiro Yasuda
Shunsuke Tsubaki
Keisuke Imoto
VLM
95
7
0
04 Jun 2024
CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks
Maciej Besta
Lorenzo Paleari
Marcin Copik
Robert Gerstenberger
Aleš Kubíček
...
Eric Schreiber
Tomasz Lehmann
H. Niewiadomski
Torsten Hoefler
171
7
0
04 Jun 2024
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
M. Rusanovsky
Or Hirschorn
S. Avidan
76
3
0
01 Jun 2024
Towards Ontology-Enhanced Representation Learning for Large Language Models
Francesco Ronzano
Jay Nanavati
67
5
0
30 May 2024
From Zero to Hero: Cold-Start Anomaly Detection
Tal Reiss
George Kour
Naama Zwerdling
Ateret Anaby-Tavor
Yedid Hoshen
90
1
0
30 May 2024
Don't Forget to Connect! Improving RAG with Graph-based Reranking
Jialin Dong
Bahare Fatemi
Bryan Perozzi
Lin F. Yang
Anton Tsitsulin
121
29
0
28 May 2024
Recent advances in text embedding: A Comprehensive Review of Top-Performing Methods on the MTEB Benchmark
Hongliu Cao
AI4TS
109
15
0
27 May 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
308
205
0
27 May 2024
Crafting Interpretable Embeddings by Asking LLMs Questions
Vinamra Benara
Chandan Singh
John X. Morris
Richard Antonello
Ion Stoica
Alexander G. Huth
Jianfeng Gao
69
6
0
26 May 2024
The correlation between nativelike selection and prototypicality: a multilingual onomasiological case study using semantic embedding
Huasheng Zhang
47
0
0
22 May 2024
TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models
Pengzhou Cheng
Yidong Ding
Tianjie Ju
Zongru Wu
Wei Du
Ping Yi
Zhuosheng Zhang
Gongshen Liu
SILM
AAML
98
29
0
22 May 2024
Question-Based Retrieval using Atomic Units for Enterprise RAG
Vatsal Raina
Mark Gales
72
14
0
20 May 2024
INDUS: Effective and Efficient Language Models for Scientific Applications
Bishwaranjan Bhattacharjee
Aashka Trivedi
Masayasu Muraoka
Muthukumaran Ramasubramanian
Takuma Udagawa
...
Peter W. J. Staar
S. Vahidinia
Ryan McGranaghan
A. Mehrabian
Tsendgar Lee
AI4CE
97
6
0
17 May 2024
FinTextQA: A Dataset for Long-form Financial Question Answering
Jian Chen
Peilin Zhou
Yining Hua
Yingxin Loh
Kehui Chen
Ziyuan Li
Bing Zhu
Junwei Liang
56
18
0
16 May 2024
Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training
Junqin Huang
Zhongjie Hu
Zihao Jing
Mengya Gao
Yichao Wu
MoE
VLM
72
6
0
11 May 2024
Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models
Luke Merrick
Danmei Xu
Gaurav Nuti
Daniel Campos
75
27
0
08 May 2024
URL: Universal Referential Knowledge Linking via Task-instructed Representation Compression
Zhuoqun Li
Hongyu Lin
Tianshu Wang
Boxi Cao
Yaojie Lu
Weixiang Zhou
Hao Wang
Zhenyu Zeng
Le Sun
Xianpei Han
73
1
0
24 Apr 2024
Multi-view Content-aware Indexing for Long Document Retrieval
Kuicai Dong
Derrick-Goh-Xin Deik
Yi Quan Lee
Hao Zhang
Xiangyang Li
Cong Zhang
Yong Liu
78
3
0
23 Apr 2024
LongEmbed: Extending Embedding Models for Long Context Retrieval
Dawei Zhu
Liang Wang
Nan Yang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
RALM
95
27
0
18 Apr 2024
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
Navonil Majumder
Chia-Yu Hung
Deepanway Ghosal
Wei-Ning Hsu
Rada Mihalcea
Soujanya Poria
143
61
0
15 Apr 2024
ToNER: Type-oriented Named Entity Recognition with Generative Language Model
Guochao Jiang
Ziqin Luo
Yuchen Shi
Dixuan Wang
Jiaqing Liang
Deqing Yang
106
12
0
14 Apr 2024
Event-enhanced Retrieval in Real-time Search
Yanan Zhang
Xiaoling Bai
Tianhua Zhou
91
1
0
09 Apr 2024
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Parishad BehnamGhader
Vaibhav Adlakha
Marius Mosbach
Dzmitry Bahdanau
Nicolas Chapados
Siva Reddy
113
242
0
09 Apr 2024
Gecko: Versatile Text Embeddings Distilled from Large Language Models
Jinhyuk Lee
Zhuyun Dai
Xiaoqi Ren
Blair Chen
Daniel Cer
...
Aditya Kusupati
Prateek Jain
Siddhartha Reddy Jonnalagadda
Ming-Wei Chang
Iftekhar Naim
RALM
VLM
SyDa
105
51
0
29 Mar 2024
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
Manuel Tonneau
Pedro Vitor Quinta de Castro
Karim Lasri
I. Farouq
Lakshminarayanan Subramanian
Victor Orozco-Olvera
Samuel Fraiberger
100
12
0
28 Mar 2024
BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models
Haitao Li
Qingyao Ai
Jia Chen
Qian Dong
Zhijing Wu
Yiqun Liu
Chong Chen
Qi Tian
AILaw
102
14
0
27 Mar 2024
SMART: Submodular Data Mixture Strategy for Instruction Tuning
Kowndinya Renduchintala
S. Bhatia
Ganesh Ramakrishnan
90
5
0
13 Mar 2024
OffensiveLang: A Community Based Implicit Offensive Language Dataset
Amit Das
Mostafa Rahgouy
Dongji Feng
Zheng Zhang
Tathagata Bhattacharya
...
Aman Chadha
Mary J. Sandage
Lauramarie Pope
Gerry V. Dozier
Cheryl Seals
107
2
0
04 Mar 2024
GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning
Aivin V. Solatorio
70
24
0
26 Feb 2024
CLAP: Learning Transferable Binary Code Representations with Natural Language Supervision
Hao Wang
Zeyu Gao
Chao Zhang
Zihan Sha
Mingyang Sun
Yuchen Zhou
Wenyu Zhu
Wenju Sun
Han Qiu
Xiangwei Xiao
92
22
0
26 Feb 2024
OAG-Bench: A Human-Curated Benchmark for Academic Graph Mining
Fanjin Zhang
Shijie Shi
Yifan Zhu
Bo Chen
Yukuo Cen
...
Huihui Yuan
Jian Song
Xiaoyan Li
Yuxiao Dong
Jie Tang
95
20
0
24 Feb 2024
Repetition Improves Language Model Embeddings
Jacob Mitchell Springer
Suhas Kotha
Daniel Fried
Graham Neubig
Aditi Raghunathan
95
33
0
23 Feb 2024
Triple-Encoders: Representations That Fire Together, Wire Together
Justus-Jonas Erker
Florian Mai
Nils Reimers
Gerasimos Spanakis
Iryna Gurevych
85
2
0
19 Feb 2024
FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation
Shuai Wang
Ekaterina Khramtsova
Shengyao Zhuang
Guido Zuccon
96
13
0
19 Feb 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù
Zdeněk Kasner
Siva Reddy
98
77
0
08 Feb 2024
Multimodal Rationales for Explainable Visual Question Answering
Kun Li
G. Vosselman
Michael Ying Yang
132
2
0
06 Feb 2024
UNSEE: Unsupervised Non-contrastive Sentence Embeddings
Ömer Veysel Çağatan
SSL
69
0
0
27 Jan 2024
Tracing the Genealogies of Ideas with Large Language Model Embeddings
Lucian Li
AI4CE
26
0
0
13 Jan 2024
Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning
Yingqian Min
Kun Zhou
Dawei Gao
Wayne Xin Zhao
He Hu
Yaliang Li
80
1
0
07 Jan 2024
Are we describing the same sound? An analysis of word embedding spaces of expressive piano performance
S. Peter
Shreyan Chowdhury
Carlos Eduardo Cancino-Chacón
Gerhard Widmer
46
0
0
31 Dec 2023
Improving Text Embeddings with Large Language Models
Liang Wang
Nan Yang
Xiaolong Huang
Linjun Yang
Rangan Majumder
Furu Wei
SyDa
133
190
0
31 Dec 2023
WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning
Zhaojian Yu
Xin Zhang
Ning Shang
Yangyu Huang
Can Xu
Yishujie Zhao
Wenxiang Hu
Qiufeng Yin
SyDa
133
28
0
20 Dec 2023
Language-Conditioned Semantic Search-Based Policy for Robotic Manipulation Tasks
Jannik Sheikh
Andrew Melnik
G. C. Nandi
R. Haschke
OffRL
LM&Ro
55
3
0
10 Dec 2023
PECANN: Parallel Efficient Clustering with Graph-Based Approximate Nearest Neighbor Search
Shangdi Yu
Joshua Engels
Yihao Huang
Julian Shun
87
3
0
06 Dec 2023
Social Bias Probing: Fairness Benchmarking for Language Models
Marta Marchiori Manerba
Karolina Stańczak
Riccardo Guidotti
Isabelle Augenstein
115
20
0
15 Nov 2023
Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models
Tingyu Xie
Qi Li
Yan Zhang
Zuozhu Liu
Hongwei Wang
92
19
0
15 Nov 2023
Unveiling Safety Vulnerabilities of Large Language Models
George Kour
Marcel Zalmanovici
Naama Zwerdling
Esther Goldbraich
Ora Nova Fandina
Ateret Anaby-Tavor
Orna Raz
E. Farchi
AAML
86
18
0
07 Nov 2023
RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models
Zefan Wang
Zichuan Liu
Yingying Zhang
Aoxiao Zhong
Lunting Fan
Lingfei Wu
Qingsong Wen
93
32
0
25 Oct 2023
NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval
Uri Katz
Matan Vetzler
Amir D. N. Cohen
Yoav Goldberg
95
10
0
22 Oct 2023
Search-Adaptor: Embedding Customization for Information Retrieval
Jinsung Yoon
Sercan O. Arik
Yanfei Chen
Tomas Pfister
76
2
0
12 Oct 2023