ResearchTrend.AI
arXiv:2405.17428
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

27 May 2024
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
M. Shoeybi
Bryan Catanzaro
Ming-Yu Liu
    RALM

Papers citing "NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models"

50 / 112 papers shown
TestNUC: Enhancing Test-Time Computing Approaches through Neighboring Unlabeled Data Consistency
Henry Peng Zou
Zhengyao Gu
Yue Zhou
Yankai Chen
Weizhi Zhang
Liancheng Fang
Yibo Wang
Yangning Li
Kay Liu
Philip S. Yu
79
0
0
26 Feb 2025
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers
Xueguang Ma
Xi Lin
Barlas Oğuz
Jimmy Lin
Wen-tau Yih
Xilun Chen
RALM
88
3
0
25 Feb 2025
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
Qiuchen Wang
Ruixue Ding
Zehui Chen
Weiqi Wu
Shihang Wang
Pengjun Xie
Feng Zhao
60
1
0
25 Feb 2025
ATEB: Evaluating and Improving Advanced NLP Tasks for Text Embedding Models
Simeng Han
Frank Palma Gomez
Tu Vu
Zefei Li
Daniel Cer
Hansi Zeng
Chris Tar
Arman Cohan
Gustavo Hernández Ábrego
59
1
0
24 Feb 2025
Towards Foundation Models for Mixed Integer Linear Programming
Sirui Li
Janardhan Kulkarni
Ishai Menache
Cathy Wu
Beibin Li
57
4
0
24 Feb 2025
Large Language Models are Powerful Electronic Health Record Encoders
S. Hegselmann
Georg von Arnim
Tillmann Rheude
Noel Kronenberg
David Sontag
Gerhard Hindricks
R. Eils
Benjamin Wild
LM&MA
49
1
0
24 Feb 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
67
2
0
21 Feb 2025
Is This Collection Worth My LLM's Time? Automatically Measuring Information Potential in Text Corpora
Tristan Karch
Luca Engel
Philippe Schwaller
Frédéric Kaplan
82
0
0
19 Feb 2025
Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
Haoyuan Wu
Haisheng Zheng
Yuan Pu
Bei Yu
61
1
0
18 Feb 2025
Following the Autoregressive Nature of LLM Embeddings via Compression and Alignment
Jingcheng Deng
Zhongtao Jiang
Liang Pang
Liwei Chen
Kun Xu
Zihao Wei
Huawei Shen
Xueqi Cheng
57
1
0
17 Feb 2025
FinMTEB: Finance Massive Text Embedding Benchmark
Yixuan Tang
Yi Yang
AIFin
66
0
0
16 Feb 2025
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Zihuiwen Ye
Luckeciano C. Melo
Younesse Kaddar
Phil Blunsom
Shivalika Singh
Yarin Gal
LRM
49
1
0
16 Feb 2025
When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks
Felix Drinkall
J. Pierrehumbert
Stefan Zohren
58
0
0
04 Feb 2025
Al-Khwarizmi: Discovering Physical Laws with Foundation Models
Christopher E. Mower
Haitham Bou-Ammar
AI4CE
79
1
0
03 Feb 2025
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
Ziyan Jiang
Rui Meng
Xinyi Yang
Semih Yavuz
Yingbo Zhou
Wenhu Chen
MLLM
VLM
51
20
0
03 Jan 2025
LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models
Hieu Man
Nghia Trung Ngo
Viet Dac Lai
Ryan Rossi
Franck Dernoncourt
T. Nguyen
193
0
0
01 Jan 2025
Zero-Indexing Internet Search Augmented Generation for Large Language Models
Guangxin He
Zonghong Dai
Jiangcheng Zhu
Binqiang Zhao
Qicheng Hu
Chenyue Li
You Peng
Chen Wang
Binhang Yuan
69
0
0
31 Dec 2024
Boosting LLM via Learning from Data Iteratively and Selectively
Qi Jia
Siyu Ren
Ziheng Qin
Fuzhao Xue
Jinjie Ni
Yang You
36
0
0
23 Dec 2024
Cannot or Should Not? Automatic Analysis of Refusal Composition in IFT/RLHF Datasets and Refusal Behavior of Black-Box LLMs
Alexander von Recum
Christoph Schnabl
Gabor Hollbeck
Silas Alberti
Philip Blinde
Marvin von Hagen
92
2
0
22 Dec 2024
GME: Improving Universal Multimodal Retrieval by Multimodal LLMs
Xin Zhang
Yanzhao Zhang
Wen Xie
Mingxin Li
Ziqi Dai
Dingkun Long
Pengjun Xie
Meishan Zhang
Wenjie Li
Hao Fei
116
8
0
22 Dec 2024
LLMs are Also Effective Embedding Models: An In-depth Overview
Chongyang Tao
Tao Shen
Shen Gao
Junshuo Zhang
Zhen Li
Zhengwei Tao
Shuai Ma
83
7
0
17 Dec 2024
Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs
Yuchen Fu
Zifeng Cheng
Zhiwei Jiang
Zhonghui Wang
Yafeng Yin
Zhengliang Li
Qing Gu
LLMAG
77
1
0
16 Dec 2024
Adaptive Two-Phase Finetuning LLMs for Japanese Legal Text Retrieval
Quang Hoang Trung
Nguyen Van Hoang Phuc
Le Trung Hoang
Quang Huu Hieu
Vo Nguyen Le Duy
AILaw
RALM
81
0
0
03 Dec 2024
Improved Large Language Model Jailbreak Detection via Pretrained Embeddings
Erick Galinkin
Martin Sablotny
76
0
0
02 Dec 2024
Advanced System Integration: Analyzing OpenAPI Chunking for Retrieval-Augmented Generation
Robin D. Pesl
Jerin G. Mathew
Massimo Mecella
Marco Aiello
78
1
0
29 Nov 2024
CodeXEmbed: A Generalist Embedding Model Family for Multiligual and Multi-task Code Retrieval
Y. Liu
Rui Meng
Chenyu You
Silvio Savarese
Caiming Xiong
Yingbo Zhou
Semih Yavuz
94
3
0
19 Nov 2024
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
Sheng-Chieh Lin
Chankyu Lee
M. Shoeybi
Jimmy J. Lin
Bryan Catanzaro
Ming-Yu Liu
70
12
0
04 Nov 2024
RARe: Retrieval Augmented Retrieval with In-Context Examples
Atula Tejaswi
Yoonsang Lee
Sujay Sanghavi
Eunsol Choi
RALM
LRM
25
1
0
26 Oct 2024
Large Language Models Are Overparameterized Text Encoders
Thennal D K
Tim Fischer
Chris Biemann
46
2
0
18 Oct 2024
Understanding the Role of LLMs in Multimodal Evaluation Benchmarks
Botian Jiang
Lei Li
Xiaonan Li
Zhaowei Li
Xiachong Feng
Lingpeng Kong
Qiang Liu
Xipeng Qiu
41
2
0
16 Oct 2024
On Debiasing Text Embeddings Through Context Injection
Thomas Uriot
37
0
0
14 Oct 2024
Advancing Academic Knowledge Retrieval via LLM-enhanced Representation Similarity Fusion
Wei Dai
Peng Fu
Chunjing Gan
41
0
0
14 Oct 2024
Diagnosing Hate Speech Classification: Where Do Humans and Machines Disagree, and Why?
Xilin Yang
22
1
0
14 Oct 2024
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
Di Wu
Hongwei Wang
W. Yu
Yuwei Zhang
Kai-Wei Chang
Dong Yu
RALM
KELM
46
13
0
14 Oct 2024
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
S. Yu
C. Tang
Bokai Xu
Junbo Cui
Junhao Ran
...
Zhenghao Liu
Shuo Wang
Xu Han
Zhiyuan Liu
Maosong Sun
VLM
39
23
0
14 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
95
4
2
10 Oct 2024
Exploring the Meaningfulness of Nearest Neighbor Search in High-Dimensional Space
Zhonghan Chen
Ruiyuan Zhang
Xi Zhao
Xiaojun Cheng
Xiaofang Zhou
52
0
0
08 Oct 2024
Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions
Qian Ruan
Ilia Kuznetsov
Iryna Gurevych
40
2
0
02 Oct 2024
Open-World Evaluation for Retrieving Diverse Perspectives
Hung-Ting Chen
Eunsol Choi
35
0
0
26 Sep 2024
Making Text Embedders Few-Shot Learners
Chaofan Li
Minghao Qin
Shitao Xiao
Jianlyu Chen
Kun Luo
Yingxia Shao
Defu Lian
Zheng Liu
35
23
0
24 Sep 2024
jina-embeddings-v3: Multilingual Embeddings With Task LoRA
Saba Sturua
Isabelle Mohr
Mohammad Kalim Akram
Michael Gunther
Bo Wang
...
Feng Wang
Georgios Mastrapas
Andreas Koukounas
Nan Wang
Han Xiao
RALM
45
25
0
16 Sep 2024
Ruri: Japanese General Text Embeddings
Hayato Tsukagoshi
Ryohei Sasano
29
1
0
12 Sep 2024
Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?
Yixuan Tang
Yi Yang
33
3
0
04 Sep 2024
Evaluating Computational Representations of Character: An Austen Character Similarity Benchmark
Funing Yang
Carolyn Jane Anderson
40
0
0
28 Aug 2024
ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning
Hieu Man
Nghia Trung Ngo
Franck Dernoncourt
Thien Huu Nguyen
AI4TS
46
4
0
06 Aug 2024
Language-Conditioned Offline RL for Multi-Robot Navigation
Steven D. Morad
Ajay Shankar
J. Blumenkamp
Amanda Prorok
LM&Ro
OffRL
48
6
0
29 Jul 2024
mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval
Xin Zhang
Yanzhao Zhang
Dingkun Long
Wen Xie
Ziqi Dai
...
Pengjun Xie
Fei Huang
Meishan Zhang
Wenjie Li
Min Zhang
42
78
0
29 Jul 2024
Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow
Tian Guo
E. Hauptmann
AIFin
41
3
0
25 Jul 2024
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng Xu
Ming-Yu Liu
Xianchao Wu
Zihan Liu
Mohammad Shoeybi
Bryan Catanzaro
RALM
52
14
0
19 Jul 2024
Human-like Episodic Memory for Infinite Context LLMs
Z. Fountas
Martin A Benfeghoul
Adnan Oomerjee
Fenia Christopoulou
Gerasimos Lampouras
Haitham Bou-Ammar
Jun Wang
31
18
0
12 Jul 2024