2406.13695
Cited By
Multilingual De-Duplication Strategies: Applying scalable similarity search with monolingual & multilingual embedding models
19 June 2024
Stefan Pasch
Dimitirios Petridis
Jannic Cutura
Papers citing
"Multilingual De-Duplication Strategies: Applying scalable similarity search with monolingual & multilingual embedding models"
2 / 2 papers shown
Title
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
242
595
0
14 Jul 2021
Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
Nandan Thakur
Nils Reimers
Johannes Daxenberger
Iryna Gurevych
208
243
0
16 Oct 2020