2410.13230
Starbucks-v2: Improved Training for 2D Matryoshka Embeddings
17 October 2024
Shengyao Zhuang
Shuai Wang
Fabio Zheng
Bevan Koopman
Guido Zuccon
Papers citing "Starbucks-v2: Improved Training for 2D Matryoshka Embeddings"
17 / 17 papers shown
Title
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Guilherme Penedo
Hynek Kydlíček
Loubna Ben Allal
Anton Lozhkov
Margaret Mitchell
Colin Raffel
Leandro von Werra
Thomas Wolf
117
259
0
25 Jun 2024
MatFormer: Nested Transformer for Elastic Inference
Devvrit
Sneha Kudugunta
Aditya Kusupati
Tim Dettmers
Kaifeng Chen
...
Yulia Tsvetkov
Hannaneh Hajishirzi
Sham Kakade
Ali Farhadi
Prateek Jain
97
30
0
11 Oct 2023
AdANNS: A Framework for Adaptive Semantic Search
Aniket Rege
Aditya Kusupati
S. SharanRanjit
Alan Fan
Qingqing Cao
Sham Kakade
Prateek Jain
Ali Farhadi
73
6
0
30 May 2023
Neural Rankers for Effective Screening Prioritisation in Medical Systematic Review Literature Search
Shuai Wang
Harrisen Scells
Bevan Koopman
Guido Zuccon
65
24
0
18 Dec 2022
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
Shitao Xiao
Zheng Liu
Yingxia Shao
Bo Zhao
RALM
274
124
0
24 May 2022
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao
Xingcheng Yao
Danqi Chen
AILaw
SSL
274
3,407
0
18 Apr 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rücklé
Abhishek Srivastava
Iryna Gurevych
VLM
425
1,050
0
17 Apr 2021
Condenser: a Pre-training Architecture for Dense Retrieval
Luyu Gao
Jamie Callan
AI4CE
59
263
0
16 Apr 2021
Overview of the TREC 2020 deep learning track
Nick Craswell
Bhaskar Mitra
Emine Yilmaz
Daniel Fernando Campos
124
387
0
15 Feb 2021
Dense Passage Retrieval for Open-Domain Question Answering
Vladimir Karpukhin
Barlas Oğuz
Sewon Min
Patrick Lewis
Ledell Yu Wu
Sergey Edunov
Danqi Chen
Wen-tau Yih
RALM
198
3,778
0
10 Apr 2020
Overview of the TREC 2019 deep learning track
Nick Craswell
Bhaskar Mitra
Emine Yilmaz
Daniel Fernando Campos
E. Voorhees
234
494
0
17 Mar 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,295
0
27 Aug 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,114
0
11 Oct 2018
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation
Daniel Cer
Mona T. Diab
Eneko Agirre
I. Lopez-Gazpio
Lucia Specia
430
1,887
0
31 Jul 2017
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
Adina Williams
Nikita Nangia
Samuel R. Bowman
524
4,492
0
18 Apr 2017
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
Payal Bajaj
Daniel Fernando Campos
Nick Craswell
Li Deng
Jianfeng Gao
...
Mir Rosenberg
Xia Song
Alina Stoica
Saurabh Tiwary
Tong Wang
RALM
142
2,741
0
28 Nov 2016
A large annotated corpus for learning natural language inference
Samuel R. Bowman
Gabor Angeli
Christopher Potts
Christopher D. Manning
321
4,287
0
21 Aug 2015