Starbucks-v2: Improved Training for 2D Matryoshka Embeddings

17 October 2024
Shengyao Zhuang
Shuai Wang
Fabio Zheng
Bevan Koopman
Guido Zuccon

Papers citing "Starbucks-v2: Improved Training for 2D Matryoshka Embeddings"

17 / 17 papers shown
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Guilherme Penedo
Hynek Kydlíček
Loubna Ben Allal
Anton Lozhkov
Margaret Mitchell
Colin Raffel
Leandro von Werra
Thomas Wolf
117
259
0
25 Jun 2024
MatFormer: Nested Transformer for Elastic Inference
Devvrit
Sneha Kudugunta
Aditya Kusupati
Tim Dettmers
Kaifeng Chen
...
Yulia Tsvetkov
Hannaneh Hajishirzi
Sham Kakade
Ali Farhadi
Prateek Jain
97
30
0
11 Oct 2023
AdANNS: A Framework for Adaptive Semantic Search
Aniket Rege
Aditya Kusupati
S. SharanRanjit
Alan Fan
Qingqing Cao
Sham Kakade
Prateek Jain
Ali Farhadi
73
6
0
30 May 2023
Neural Rankers for Effective Screening Prioritisation in Medical Systematic Review Literature Search
Shuai Wang
Harrisen Scells
Bevan Koopman
Guido Zuccon
65
24
0
18 Dec 2022
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
Shitao Xiao
Zheng Liu
Yingxia Shao
Bo Zhao
RALM
274
124
0
24 May 2022
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao
Xingcheng Yao
Danqi Chen
AILaw, SSL
274
3,407
0
18 Apr 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rucklé
Abhishek Srivastava
Iryna Gurevych
VLM
425
1,050
0
17 Apr 2021
Condenser: a Pre-training Architecture for Dense Retrieval
Luyu Gao
Jamie Callan
AI4CE
59
263
0
16 Apr 2021
Overview of the TREC 2020 deep learning track
Nick Craswell
Bhaskar Mitra
Emine Yilmaz
Daniel Fernando Campos
124
387
0
15 Feb 2021
Dense Passage Retrieval for Open-Domain Question Answering
Vladimir Karpukhin
Barlas Oğuz
Sewon Min
Patrick Lewis
Ledell Yu Wu
Sergey Edunov
Danqi Chen
Wen-tau Yih
RALM
198
3,778
0
10 Apr 2020
Overview of the TREC 2019 deep learning track
Nick Craswell
Bhaskar Mitra
Emine Yilmaz
Daniel Fernando Campos
E. Voorhees
234
494
0
17 Mar 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,295
0
27 Aug 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg
1.8K
95,114
0
11 Oct 2018
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation
Daniel Cer
Mona T. Diab
Eneko Agirre
I. Lopez-Gazpio
Lucia Specia
430
1,887
0
31 Jul 2017
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
Adina Williams
Nikita Nangia
Samuel R. Bowman
524
4,492
0
18 Apr 2017
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
Payal Bajaj
Daniel Fernando Campos
Nick Craswell
Li Deng
Jianfeng Gao
...
Mir Rosenberg
Xia Song
Alina Stoica
Saurabh Tiwary
Tong Wang
RALM
142
2,741
0
28 Nov 2016
A large annotated corpus for learning natural language inference
Samuel R. Bowman
Gabor Angeli
Christopher Potts
Christopher D. Manning
321
4,293
0
21 Aug 2015