Title
Unsupervised Text Representation Learning via Instruction-Tuning for Zero-Shot Dense Retrieval Qiuhai Zeng Zimeng Qiu Dae Yon Hwang Xin He William M. Campbell RALM 52 0 0 24 Sep 2024
IRSC: A Zero-shot Evaluation Benchmark for Information Retrieval through Semantic Comprehension in Retrieval-Augmented Generation Scenarios Hai Lin Shaoxiong Zhan Junyou Su Haitao Zheng Hui Wang RALM 56 1 0 24 Sep 2024
Making Text Embedders Few-Shot Learners Chaofan Li Minghao Qin Shitao Xiao Jianlyu Chen Kun Luo Yingxia Shao Defu Lian Zheng Liu 111 37 0 24 Sep 2024
Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely Siyun Zhao Yuqing Yang Zilong Wang Zhiyuan He Luna Qiu Lili Qiu SyDa RALM 3DV 118 42 0 23 Sep 2024
Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling Benjamin Clavié Antoine Chaffin Griffin Adams 48 4 0 23 Sep 2024
A Multimodal Dense Retrieval Approach for Speech-Based Open-Domain Question Answering Georgios Sidiropoulos Evangelos Kanoulas RALM 68 0 0 20 Sep 2024
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models Orion Weller Benjamin Van Durme Dawn J Lawrie Ashwin Paranjape Yuhao Zhang Jack Hessel LRM RALM 97 25 0 17 Sep 2024
GenCRF: Generative Clustering and Reformulation Framework for Enhanced Intent-Driven Information Retrieval Wonduk Seo Haojie Zhang Yueyang Zhang Changhao Zhang Songyao Duan Lixin Su Daiting Shi Jiashu Zhao Dawei Yin 67 1 0 17 Sep 2024
jina-embeddings-v3: Multilingual Embeddings With Task LoRA Saba Sturua Isabelle Mohr Mohammad Kalim Akram Michael Gunther Bo Wang ... Feng Wang Georgios Mastrapas Andreas Koukounas Nan Wang Han Xiao RALM 123 36 0 16 Sep 2024
A Benchmark Dataset with Larger Context for Non-Factoid Question Answering over Islamic Text Faiza Qamar Seemab Latif R. Latif 62 1 0 15 Sep 2024
Ruri: Japanese General Text Embeddings Hayato Tsukagoshi Ryohei Sasano 56 1 0 12 Sep 2024
Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG Gabriel de Souza P. Moreira Ronay Ak Benedikt Schifferer Mengyao Xu Radek Osmulski Even Oldridge 60 7 0 12 Sep 2024
On the Vulnerability of Applying Retrieval-Augmented Generation within Knowledge-Intensive Application Domains Xun Xian Ganghua Wang Xuan Bi Jayanth Srinivasa Ashish Kundu Charles Fleming Mingyi Hong Jie Ding SILM 70 2 0 12 Sep 2024
MessIRve: A Large-Scale Spanish Information Retrieval Dataset Francisco Valentini Viviana Cotik D. Furman Ivan Bercovich Edgar Altszyler Juan Manuel Pérez 79 2 0 09 Sep 2024
RAG based Question-Answering for Contextual Response Prediction System Sriram Veturi Saurabh Vaichal Reshma Lal Jagadheesh Nafis Irtiza Tripto Nian Yan RALM 70 7 0 05 Sep 2024
Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models? Yixuan Tang Yi Yang 74 4 0 04 Sep 2024
A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks Nicholas Monath Will Grathwohl Michael Boratko Rob Fergus Andrew McCallum Manzil Zaheer 64 0 0 03 Sep 2024
Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain Antoine Louis Gijs van Dijck Gerasimos Spanakis 52 0 0 02 Sep 2024
Masked Mixers for Language Generation and Retrieval Benjamin L. Badger 167 0 0 02 Sep 2024
ContextCite: Attributing Model Generation to Context Benjamin Cohen-Wang Harshay Shah Kristian Georgiev Aleksander Madry LRM 93 30 0 01 Sep 2024
Understanding the User: An Intent-Based Ranking Dataset Abhijit Anand Jurek Leonhardt Venktesh V Avishek Anand 57 0 0 30 Aug 2024
Jina-ColBERT-v2: A General-Purpose Multilingual Late Interaction Retriever Rohan Jha Bo Wang Michael Gunther Georgios Mastrapas Saba Sturua Isabelle Mohr Andreas Koukounas Mohammad Kalim Akram Nan Wang Han Xiao 68 6 0 29 Aug 2024
Genetic Approach to Mitigate Hallucination in Generative IR Hrishikesh Kulkarni Nazli Goharian O. Frieder Sean MacAvaney HILM 62 2 0 25 Aug 2024
Large Language Models as Foundations for Next-Gen Dense Retrieval: A Comprehensive Empirical Assessment Kun Luo Minghao Qin Zheng Liu Shitao Xiao Jun Zhao Kang Liu 76 13 0 22 Aug 2024
Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks Yiyi Chen Russa Biswas Heather Lent Johannes Bjerva AAML 92 5 0 21 Aug 2024
Hindi-BEIR : A Large Scale Retrieval Benchmark in Hindi Arkadeep Acharya Rudra Murthy Vishwajeet Kumar Jaydeep Sen 80 1 0 18 Aug 2024
W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering Jinming Nian Zhiyuan Peng Qifan Wang Yi Fang RALM 141 2 0 15 Aug 2024
WeKnow-RAG: An Adaptive Approach for Retrieval-Augmented Generation Integrating Web Search and Knowledge Graphs Weijian Xie Xuefeng Liang Yuhui Liu Kaihua Ni Hong Cheng Zetian Hu 3DV RALM 113 3 0 14 Aug 2024
ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning Hieu Man Nghia Trung Ngo Franck Dernoncourt Thien Huu Nguyen AI4TS 71 5 0 06 Aug 2024
Developing PUGG for Polish: A Modern Approach to KBQA, MRC, and IR Dataset Construction Albert Sawczyn Katsiaryna Viarenich Konrad Wojtasik Aleksandra Domogała Marcin Oleksy Maciej Piasecki Tomasz Kajdanowicz 59 0 0 05 Aug 2024
Generative Retrieval with Few-shot Indexing Arian Askari Chuan Meng Mohammad Aliannejadi Zhaochun Ren Evangelos Kanoulas Suzan Verberne RALM 111 3 0 04 Aug 2024
An Encoding–Searching Separation Perspective on Bi-Encoder Neural Search Danbinaerin Han Akiko Aizawa Sihun Lee 59 0 0 02 Aug 2024
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework Kunlun Zhu Yifan Luo Dingling Xu Ruobing Wang Shi Yu ... Yishan Li Zhiyuan Liu Xu Han Zhiyuan Liu Maosong Sun 223 21 0 02 Aug 2024
Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models Zhengxuan Wu Yuhao Zhang Linquan Wei Yumo Xu Rujun Han Yi Liu Jifan Chen Bonan Min Zhiheng Huang 90 0 0 31 Jul 2024
Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins Lukas Gienapp Niklas Deckers Martin Potthast Harrisen Scells 17 1 0 31 Jul 2024
Introducing a new hyper-parameter for RAG: Context Window Utilization Kush Juvekar A. Purwar 76 4 0 29 Jul 2024
mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval Xin Zhang Yanzhao Zhang Dingkun Long Wen Xie Ziqi Dai ... Pengjun Xie Fei Huang Meishan Zhang Wenjie Li Min Zhang 134 107 0 29 Jul 2024
QAEA-DR: A Unified Text Augmentation Framework for Dense Retrieval Hongming Tan Shaoxiong Zhan Hai Lin Hai-Tao Zheng Wai Kin Chan RALM 110 2 0 29 Jul 2024
Open Sentence Embeddings for Portuguese with the Serafim PT* encoders family Luís Gomes António Branco Joao Silva João Rodrigues Rodrigo Santos 3DV 53 0 0 28 Jul 2024
Embedding And Clustering Your Data Can Improve Contrastive Pretraining Luke Merrick 65 5 0 26 Jul 2024
Enhancing LLM's Cognition via Structurization Kai-Chun Liu Zhihang Fu Chao Chen Wei Zhang Rongxin Jiang Fan Zhou Yao-Shen Chen Yue-bo Wu Jieping Ye 81 1 0 23 Jul 2024
NV-Retriever: Improving text embedding models with effective hard-negative mining Gabriel de Souza P. Moreira Radek Osmulski Mengyao Xu Ronay Ak Benedikt Schifferer Even Oldridge RALM 138 47 0 22 Jul 2024
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities To Eun Kim Alireza Salemi Andrew Drozdov Fernando Diaz Hamed Zamani 122 8 0 17 Jul 2024
Crafting the Path: Robust Query Rewriting for Information Retrieval Ingeol Baek Jimin Lee Joonho Yang Hwanhee Lee 87 5 0 17 Jul 2024
Conversational Query Reformulation with the Guidance of Retrieved Documents Jeonghyun Park Hwanhee Lee 113 0 0 17 Jul 2024
Numbers Matter! Bringing Quantity-awareness to Retrieval Systems Satya Almasian Milena Bruseva Michael Gertz 58 0 0 14 Jul 2024
CompAct: Compressing Retrieved Documents Actively for Question Answering Chanwoong Yoon Taewhoo Lee Hyeon Hwang Minbyul Jeong Jaewoo Kang KELM RALM MQ 111 18 0 12 Jul 2024
Neural Networks Meet Elliptic Curve Cryptography: A Novel Approach to Secure Communication Mina Cecilie Wøien Ferhat Ozgur Catak Murat Kuzlu Umit Cali 26 1 0 11 Jul 2024
LitSearch: A Retrieval Benchmark for Scientific Literature Search Anirudh Ajith Mengzhou Xia Alexis Chevalier Tanya Goyal Danqi Chen Tianyu Gao RALM 89 14 0 10 Jul 2024
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective Yu-An Liu Ruqing Zhang Jiafeng Guo Maarten de Rijke Yixing Fan Xueqi Cheng 118 11 0 09 Jul 2024