ResearchTrend.AI
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment


20 July 2024
Yongxin Huang
Kexin Wang
Goran Glavaš
Iryna Gurevych

Papers citing "Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment"

50 / 51 papers shown
Title
Multilingual Sentence-T5: Scalable Sentence Encoders for Multilingual Applications
Chihiro Yano
Akihiko Fukuchi
Shoko Fukasawa
Hideyuki Tachibana
Yotaro Watanabe
41
4
0
26 Mar 2024
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models
Terra Blevins
Tomasz Limisiewicz
Suchin Gururangan
Margaret Li
Hila Gonen
Noah A. Smith
Luke Zettlemoyer
65
25
0
19 Jan 2024
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Clifton A. Poth
Hannah Sterz
Indraneil Paul
Sukannya Purkayastha
Leon Arne Engländer
Timo Imhof
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
Jonas Pfeiffer
73
52
0
18 Nov 2023
Leveraging Multi-lingual Positive Instances in Contrastive Learning to Improve Sentence Embedding
Kaiyan Zhao
Qiyu Wu
Xin-Qiang Cai
Yoshimasa Tsuruoka
30
7
0
16 Sep 2023
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
David Ifeoluwa Adelani
Hannah Liu
Xiaoyu Shen
Nikita Vassilyev
Jesujoba Oluwadara Alabi
Yanke Mao
Haonan Gao
Annie En-Shiun Lee
ELM
76
77
0
14 Sep 2023
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Sneha Kudugunta
Isaac Caswell
Biao Zhang
Xavier Garcia
Christopher A. Choquette-Choo
...
Derrick Xin
Aditya Kusupati
Romi Stella
Ankur Bapna
Orhan Firat
93
135
0
09 Sep 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Lucas Bandarkar
Davis Liang
Benjamin Muller
Mikel Artetxe
Satya Narayan Shukla
Don Husa
Naman Goyal
Abhinandan Krishnan
Luke Zettlemoyer
Madian Khabsa
88
153
0
31 Aug 2023
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations
Paul-Ambroise Duquenne
Holger Schwenk
Benoît Sagot
AI4TS
VLM
95
68
0
22 Aug 2023
Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization
Pengzhi Gao
Liwen Zhang
Zhongjun He
Hua Wu
Haifeng Wang
39
7
0
12 Jun 2023
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
Ayyoob Imani
Peiqin Lin
Amir Hossein Kargaran
Silvia Severini
Masoud Jalili Sabet
...
Chunlan Ma
Helmut Schmid
André F. T. Martins
François Yvon
Hinrich Schütze
ALM
LRM
80
105
0
20 May 2023
Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval
John Wieting
J. Clark
William W. Cohen
Graham Neubig
Taylor Berg-Kirkpatrick
78
6
0
21 Dec 2022
English Contrastive Learning Can Learn Universal Cross-lingual Sentence Embeddings
Yau-Shian Wang
Ashley Wu
Graham Neubig
SSL
78
31
0
11 Nov 2022
No Language Left Behind: Scaling Human-Centered Machine Translation
NLLB Team
Marta R. Costa-jussá
James Cross
Onur Çelebi
Maha Elbayad
...
Alexandre Mourachko
C. Ropers
Safiyyah Saleem
Holger Schwenk
Jeff Wang
MoE
220
1,260
0
11 Jul 2022
Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages
Kevin Heffernan
Onur Çelebi
Holger Schwenk
119
55
0
25 May 2022
Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models
Terra Blevins
Hila Gonen
Luke Zettlemoyer
LRM
85
31
0
24 May 2022
Lifting the Curse of Multilinguality by Pre-training Modular Transformers
Jonas Pfeiffer
Naman Goyal
Xi Lin
Xian Li
James Cross
Sebastian Riedel
Mikel Artetxe
LRM
83
143
0
12 May 2022
Evaluation Benchmarks for Spanish Sentence Representations
Vladimir Araujo
Andrés Carvallo
Souvik Kundu
J. Canete
Marcelo Mendoza
Robert E. Mercer
Felipe Bravo-Marquez
Marie-Francine Moens
Alvaro Soto
ELM
45
10
0
15 Apr 2022
WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models
Benjamin Minixhofer
Fabian Paischer
Navid Rekabsaz
69
84
0
13 Dec 2021
Towards a Unified View of Parameter-Efficient Transfer Learning
Junxian He
Chunting Zhou
Xuezhe Ma
Taylor Berg-Kirkpatrick
Graham Neubig
AAML
129
935
0
08 Oct 2021
A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations
Ziyi Yang
Yinfei Yang
Daniel Cer
Eric F. Darve
46
24
0
10 Sep 2021
Specializing Multilingual Language Models: An Empirical Study
Ethan C. Chau
Noah A. Smith
61
27
0
16 Jun 2021
The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
Naman Goyal
Cynthia Gao
Vishrav Chaudhary
Peng-Jen Chen
Guillaume Wenzek
Da Ju
Sanjana Krishnan
Marc'Aurelio Ranzato
Francisco Guzmán
Angela Fan
93
584
0
06 Jun 2021
Lightweight Cross-Lingual Sentence Representation Learning
Zhuoyuan Mao
Prakhar Gupta
Pei Wang
Chenhui Chu
Martin Jaggi
Sadao Kurohashi
VLM
91
9
0
28 May 2021
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao
Xingcheng Yao
Danqi Chen
AILaw
SSL
261
3,392
0
18 Apr 2021
SICKNL: A Dataset for Dutch Natural Language Inference
G. Wijnholds
M. Moortgat
114
26
0
14 Jan 2021
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts
Jonas Pfeiffer
Ivan Vulić
Iryna Gurevych
Sebastian Ruder
72
132
0
31 Dec 2020
German's Next Language Model
Branden Chan
Stefan Schweter
Timo Möller
87
272
0
21 Oct 2020
Language-agnostic BERT Sentence Embedding
Fangxiaoyu Feng
Yinfei Yang
Daniel Cer
N. Arivazhagan
Wei Wang
162
906
0
03 Jul 2020
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer
Jonas Pfeiffer
Ivan Vulić
Iryna Gurevych
Sebastian Ruder
99
626
0
30 Apr 2020
Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
Nils Reimers
Iryna Gurevych
100
1,023
0
21 Apr 2020
LAReQA: Language-agnostic answer retrieval from a multilingual pool
Uma Roy
Noah Constant
Rami Al-Rfou
Aditya Barua
Aaron B. Phillips
Yinfei Yang
RALM
37
56
0
11 Apr 2020
Are All Good Word Vector Spaces Isomorphic?
Ivan Vulić
Sebastian Ruder
Anders Søgaard
VLM
50
65
0
08 Apr 2020
KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding
Jiyeon Ham
Yo Joong Choe
Kyubyong Park
Ilji Choi
Hyungjoon Soh
50
78
0
07 Apr 2020
AraBERT: Transformer-based Model for Arabic Language Understanding
Wissam Antoun
Fady Baly
Hazem M. Hajj
105
969
0
28 Feb 2020
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB
Holger Schwenk
Guillaume Wenzek
Sergey Edunov
Edouard Grave
Armand Joulin
82
261
0
10 Nov 2019
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
102
972
0
10 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
212
6,555
0
05 Nov 2019
Evaluation of Sentence Representations in Polish
Slawomir Dadas
Michal Perelkiewicz
Rafal Poswiata
164
15
0
25 Oct 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,193
0
27 Aug 2019
Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces
Barun Patra
Joel Ruben Antony Moniz
Sarthak Garg
Matthew R. Gormley
Graham Neubig
76
101
0
19 Aug 2019
STRASS: A Light and Effective Method for Extractive Summarization Based on Sentence Embeddings
Léo Bouscarrat
Antoine Bonnefoy
Thomas Peel
Cecile Pereira
AILaw
43
19
0
16 Jul 2019
Multilingual Universal Sentence Encoder for Semantic Retrieval
Yinfei Yang
Daniel Cer
Amin Ahmad
Mandy Guo
Jax Law
...
Steve Yuan
Chris Tar
Yun-hsuan Sung
B. Strope
R. Kurzweil
3DV
73
477
0
09 Jul 2019
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
Mikel Artetxe
Holger Schwenk
3DV
154
1,014
0
26 Dec 2018
Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings
Mikel Artetxe
Holger Schwenk
61
202
0
03 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,770
0
11 Oct 2018
XNLI: Evaluating Cross-lingual Sentence Representations
Alexis Conneau
Guillaume Lample
Ruty Rinott
Adina Williams
Samuel R. Bowman
Holger Schwenk
Veselin Stoyanov
ELM
59
1,381
0
13 Sep 2018
On the Limitations of Unsupervised Bilingual Dictionary Induction
Anders Søgaard
Sebastian Ruder
Ivan Vulić
58
261
0
09 May 2018
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation
Daniel Cer
Mona T. Diab
Eneko Agirre
I. Lopez-Gazpio
Lucia Specia
428
1,881
0
31 Jul 2017
Efficient Natural Language Response Suggestion for Smart Reply
Matthew Henderson
Rami Al-Rfou
B. Strope
Yun-hsuan Sung
László Lukács
Ruiqi Guo
Sanjiv Kumar
Balint Miklos
R. Kurzweil
157
428
0
01 May 2017
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
Adina Williams
Nikita Nangia
Samuel R. Bowman
520
4,476
0
18 Apr 2017