Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1911.00359
Cited By
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
1 November 2019
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data"
21 / 171 papers shown
Title
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Shuming Ma
Jian Yang
Haoyang Huang
Zewen Chi
Li Dong
...
Akiko Eriguchi
Saksham Singhal
Xia Song
Arul Menezes
Furu Wei
LRM
26
33
0
31 Dec 2020
Neural Machine Translation: A Review of Methods, Resources, and Tools
Zhixing Tan
Shuo Wang
Zonghan Yang
Gang Chen
Xuancheng Huang
Maosong Sun
Yang Liu
3DV
AI4TS
35
106
0
31 Dec 2020
Language Models not just for Pre-training: Fast Online Neural Noisy Channel Modeling
Shruti Bhosale
Kyra Yee
Sergey Edunov
Michael Auli
50
7
0
13 Nov 2020
FLERT: Document-Level Features for Named Entity Recognition
Stefan Schweter
Alan Akbik
32
111
0
13 Nov 2020
Multilingual AMR-to-Text Generation
Angela Fan
Claire Gardent
17
32
0
10 Nov 2020
Rethinking Evaluation in ASR: Are Our Models Robust Enough?
Tatiana Likhomanenko
Qiantong Xu
Vineel Pratap
Paden Tomasello
Jacob Kahn
Gilad Avidov
R. Collobert
Gabriel Synnaeve
39
98
0
22 Oct 2020
German's Next Language Model
Branden Chan
Stefan Schweter
Timo Möller
36
265
0
21 Oct 2020
Multi-task Learning for Multilingual Neural Machine Translation
Yiren Wang
Chengxiang Zhai
Hany Awadalla
37
68
0
06 Oct 2020
STIL -- Simultaneous Slot Filling, Translation, Intent Classification, and Language Identification: Initial Results using mBART on MultiATIS++
Jack G. M. FitzGerald
29
13
0
02 Oct 2020
Nearest Neighbor Machine Translation
Urvashi Khandelwal
Angela Fan
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
18
282
0
01 Oct 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
70
755
0
24 Jun 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
30
73
0
16 Jun 2020
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
28
227
0
11 Jun 2020
From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers
Anne Lauscher
Vinit Ravishankar
Ivan Vulić
Goran Glavaš
34
56
0
01 May 2020
MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases
Louis Martin
Angela Fan
Eric Villemonte de la Clergerie
Antoine Bordes
Benoît Sagot
33
36
0
01 May 2020
SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings
Masoud Jalili Sabet
Philipp Dufter
François Yvon
Hinrich Schütze
23
228
0
18 Apr 2020
On the Language Neutrality of Pre-trained Multilingual Representations
Jindrich Libovický
Rudolf Rosa
Alexander Fraser
25
101
0
09 Apr 2020
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation
Yaobo Liang
Nan Duan
Yeyun Gong
Ning Wu
Fenfei Guo
...
Shuguang Liu
Fan Yang
Daniel Fernando Campos
Rangan Majumder
Ming Zhou
ELM
VLM
63
343
0
03 Apr 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
M. Lewis
Luke Zettlemoyer
AI4CE
AIMat
70
1,777
0
22 Jan 2020
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
42
956
0
10 Nov 2019
Speech Intention Understanding in a Head-final Language: A Disambiguation Utilizing Intonation-dependency
Won Ik Cho
Hyeon Seung Lee
J. Yoon
Seokhwan Kim
N. Kim
41
5
0
10 Nov 2018
Previous
1
2
3
4