2404.18443
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers
29 April 2024
Ran Xu
Wenqi Shi
Yue Yu
Yuchen Zhuang
Yanqiao Zhu
M. D. Wang
Joyce C. Ho
Chao Zhang
Carl Yang
LM&MA
Papers citing "BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers"
32 / 32 papers shown
Title
SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains
Ran Xu
Hui Liu
Sreyashi Nag
Zhenwei Dai
Yaochen Xie
...
Chen Luo
Yang Li
Joyce C. Ho
Carl Yang
Qi He
RALM
149
11
0
28 Jan 2025
AutoMIR: Effective Zero-Shot Medical Information Retrieval without Relevance Labels
Lei Li
Xiangxu Zhang
Xiao Zhou
Zheng Liu
VLM
RALM
88
2
0
26 Oct 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
203
187
0
27 May 2024
MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval
Qiao Jin
Won Kim
Qingyu Chen
Donald C. Comeau
Lana Yeganova
John Wilbur
Zhiyong Lu
LM&MA
MedIm
63
111
0
02 Jul 2023
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding
Yu Zhang
Hao Cheng
Zhihong Shen
Xiaodong Liu
Yejiang Wang
Jianfeng Gao
57
14
0
23 May 2023
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Liang Wang
Nan Yang
Xiaolong Huang
Binxing Jiao
Linjun Yang
Daxin Jiang
Rangan Majumder
Furu Wei
VLM
200
593
0
07 Dec 2022
Task-aware Retrieval with Instructions
Akari Asai
Timo Schick
Patrick Lewis
Xilun Chen
Gautier Izacard
Sebastian Riedel
Hannaneh Hajishirzi
Wen-tau Yih
69
96
0
16 Nov 2022
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
Shitao Xiao
Zheng Liu
Yingxia Shao
Bo Zhao
RALM
225
122
0
24 May 2022
SGPT: GPT Sentence Embeddings for Semantic Search
Niklas Muennighoff
RALM
120
183
0
17 Feb 2022
Text and Code Embeddings by Contrastive Pre-Training
Arvind Neelakantan
Tao Xu
Raul Puri
Alec Radford
Jesse Michael Han
...
Tabarak Khan
Toki Sherbakov
Joanne Jang
Peter Welinder
Lilian Weng
SSL
AI4TS
343
438
0
24 Jan 2022
Improving Biomedical Information Retrieval with Neural Retrievers
Man Luo
Arindam Mitra
Tejas Gokhale
Chitta Baral
56
34
0
19 Jan 2022
Unsupervised Dense Information Retrieval with Contrastive Learning
Gautier Izacard
Mathilde Caron
Lucas Hosseini
Sebastian Riedel
Piotr Bojanowski
Armand Joulin
Edouard Grave
RALM
152
892
0
16 Dec 2021
Large Dual Encoders Are Generalizable Retrievers
Jianmo Ni
Chen Qu
Jing Lu
Zhuyun Dai
Gustavo Hernández Ábrego
...
Vincent Zhao
Yi Luan
Keith B. Hall
Ming-Wei Chang
Yinfei Yang
DML
134
450
0
15 Dec 2021
GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
Kexin Wang
Nandan Thakur
Nils Reimers
Iryna Gurevych
VLM
118
157
0
14 Dec 2021
GooAQ: Open Question Answering with Diverse Answer Types
Daniel Khashabi
Amos Ng
Tushar Khot
Ashish Sabharwal
Hannaneh Hajishirzi
Chris Callison-Burch
52
54
0
18 Apr 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rucklé
Abhishek Srivastava
Iryna Gurevych
VLM
393
1,030
0
17 Apr 2021
Effective Transfer Learning for Identifying Similar Questions: Matching User Questions to COVID-19 FAQs
Clara H. McCreery
Namit Katariya
A. Kannan
Manish Chablani
X. Amatriain
MedIm
OOD
26
74
0
04 Aug 2020
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong
Chenyan Xiong
Ye Li
Kwok-Fung Tang
Jialin Liu
Paul N. Bennett
Junaid Ahmed
Arnold Overwijk
107
1,218
0
01 Jul 2020
Fact or Fiction: Verifying Scientific Claims
David Wadden
Shanchuan Lin
Kyle Lo
Lucy Lu Wang
Madeleine van Zuylen
Arman Cohan
Hannaneh Hajishirzi
HAI
112
450
0
30 Apr 2020
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
134
2,414
0
23 Apr 2020
CORD-19: The COVID-19 Open Research Dataset
Lucy Lu Wang
Kyle Lo
Yoganand Chandrasekhar
Russell Reas
Jiangjiang Yang
...
Boya Xie
Douglas A. Raymond
Daniel S. Weld
Oren Etzioni
Sebastian Kohlmeier
84
806
0
22 Apr 2020
SPECTER: Document-level Representation Learning using Citation-informed Transformers
Arman Cohan
Sergey Feldman
Iz Beltagy
Doug Downey
Daniel S. Weld
AI4TS
77
549
0
15 Apr 2020
MedDialog: Two Large-scale Medical Dialogue Datasets
Xuehai He
Shu Chen
Zeqian Ju
Xiangyu Dong
Hongchao Fang
...
Ruisi Zhang
Ruoyu Zhang
Meng Zhou
Penghui Zhu
P. Xie
LM&MA
MedIm
42
176
0
07 Apr 2020
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
81
654
0
01 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
367
20,053
0
23 Oct 2019
Learning Dense Representations for Entity Retrieval
D. Gillick
Sayali Kulkarni
L. Lansing
Alessandro Presta
Jason Baldridge
Eugene Ie
Diego Garcia-Olano
RALM
64
206
0
23 Sep 2019
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
353
883
0
13 Sep 2019
ELI5: Long Form Question Answering
Angela Fan
Yacine Jernite
Ethan Perez
David Grangier
Jason Weston
Michael Auli
AI4MH
ELM
82
616
0
22 Jul 2019
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Zhilin Yang
Peng Qi
Saizheng Zhang
Yoshua Bengio
William W. Cohen
Ruslan Salakhutdinov
Christopher D. Manning
RALM
147
2,635
0
25 Sep 2018
FEVER: a large-scale dataset for Fact Extraction and VERification
James Thorne
Andreas Vlachos
Christos Christodoulopoulos
Arpit Mittal
HILM
121
1,645
0
14 Mar 2018
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
Payal Bajaj
Daniel Fernando Campos
Nick Craswell
Li Deng
Jianfeng Gao
...
Mir Rosenberg
Xia Song
Alina Stoica
Saurabh Tiwary
Tong Wang
RALM
135
2,721
0
28 Nov 2016
A large annotated corpus for learning natural language inference
Samuel R. Bowman
Gabor Angeli
Christopher Potts
Christopher D. Manning
280
4,278
0
21 Aug 2015