ResearchTrend.AI
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Yu Meng, Chenyan Xiong, Payal Bajaj, Saurabh Tiwary, Paul N. Bennett, Jiawei Han, Xia Song
arXiv:2102.08473 · 16 February 2021

Papers citing "COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining"

50 of 58 papers shown:

SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao, Xingcheng Yao, Danqi Chen
AILaw, SSL · 274 · 3,407 · 0 · 18 Apr 2021

How Many Data Points is a Prompt Worth?
Teven Le Scao, Alexander M. Rush
VLM · 155 · 302 · 0 · 15 Mar 2021

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus, Barret Zoph, Noam M. Shazeer
MoE · 88 · 2,208 · 0 · 11 Jan 2021

Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan Ye, Belinda Z. Li, Sinong Wang, Benjamin Bolte, Hao Ma, Wen-tau Yih, Xiang Ren, Madian Khabsa
OffRL · 75 · 12 · 0 · 31 Dec 2020

Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao, Adam Fisch, Danqi Chen
399 · 1,971 · 0 · 31 Dec 2020

CLEAR: Contrastive Learning for Sentence Representation
Zhuofeng Wu, Sinong Wang, Jiatao Gu, Madian Khabsa, Fei Sun, Hao Ma
SSL · 71 · 323 · 0 · 31 Dec 2020

Pre-Training Transformers as Energy-Based Cloze Models
Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
61 · 80 · 0 · 15 Dec 2020

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning
Beliz Gunel, Jingfei Du, Alexis Conneau, Ves Stoyanov
60 · 506 · 0 · 03 Nov 2020

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding
Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen
185 · 69 · 0 · 16 Oct 2020

Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
Nandan Thakur, Nils Reimers, Johannes Daxenberger, Iryna Gurevych
289 · 248 · 0 · 16 Oct 2020

It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
Timo Schick, Hinrich Schütze
128 · 973 · 0 · 15 Sep 2020

Better Fine-Tuning by Reducing Representational Collapse
Armen Aghajanyan, Akshat Shrivastava, Anchit Gupta, Naman Goyal, Luke Zettlemoyer, S. Gupta
AAML · 74 · 210 · 0 · 06 Aug 2020

Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases
Senthil Purushwalkam, Abhinav Gupta
SSL · 79 · 219 · 0 · 28 Jul 2020

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, Arnold Overwijk
139 · 1,231 · 0 · 01 Jul 2020

Knowledge-Aware Language Model Pretraining
Corby Rosset, Chenyan Xiong, M. Phan, Xia Song, Paul N. Bennett, Saurabh Tiwary
KELM · 72 · 82 · 0 · 29 Jun 2020

Rethinking Positional Encoding in Language Pre-training
Guolin Ke, Di He, Tie-Yan Liu
74 · 297 · 0 · 28 Jun 2020

Pre-training via Paraphrasing
M. Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida I. Wang, Luke Zettlemoyer
AIMat · 87 · 160 · 0 · 26 Jun 2020

MC-BERT: Efficient Language Pre-Training via a Meta Controller
Zhenhui Xu, Linyuan Gong, Guolin Ke, Di He, Shuxin Zheng, Liwei Wang, Jiang Bian, Tie-Yan Liu
BDL · 63 · 18 · 0 · 10 Jun 2020

DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen
AAML · 161 · 2,747 · 0 · 05 Jun 2020

Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere
Tongzhou Wang, Phillip Isola
SSL · 160 · 1,854 · 0 · 20 May 2020

CERT: Contrastive Self-supervised Learning for Language Understanding
Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, P. Xie
ELM, SSL · 72 · 341 · 0 · 16 May 2020

Sparse, Dense, and Attentional Representations for Text Retrieval
Y. Luan, Jacob Eisenstein, Kristina Toutanova, M. Collins
66 · 408 · 0 · 01 May 2020

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, ..., Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, H. Hon
AI4CE · 88 · 394 · 0 · 28 Feb 2020

Transformers as Soft Reasoners over Language
Peter Clark, Oyvind Tafjord, Kyle Richardson
ReLM, OffRL, LRM · 99 · 359 · 0 · 14 Feb 2020

A Simple Framework for Contrastive Learning of Visual Representations
Ting-Li Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey E. Hinton
SSL · 375 · 18,778 · 0 · 13 Feb 2020

How Much Knowledge Can You Pack Into the Parameters of a Language Model?
Adam Roberts, Colin Raffel, Noam M. Shazeer
KELM · 121 · 891 · 0 · 10 Feb 2020

REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang
RALM · 137 · 2,114 · 0 · 10 Feb 2020

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
608 · 4,893 · 0 · 23 Jan 2020

Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick, Hinrich Schütze
348 · 1,617 · 0 · 21 Jan 2020

Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, Ross B. Girshick
SSL · 207 · 12,085 · 0 · 13 Nov 2019

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer
AIMat, VLM · 260 · 10,848 · 0 · 29 Oct 2019

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
AIMat · 445 · 20,298 · 0 · 23 Oct 2019

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
SSL, AIMat · 371 · 6,463 · 0 · 26 Sep 2019

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE · 331 · 1,914 · 0 · 17 Sep 2019

How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings
Kawin Ethayarajh
86 · 875 · 0 · 02 Sep 2019

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers, Iryna Gurevych
1.3K · 12,295 · 0 · 27 Aug 2019

StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
Wei Wang, Bin Bi, Ming Yan, Chen Henry Wu, Zuyi Bao, Jiangnan Xia, Liwei Peng, Luo Si
59 · 264 · 0 · 13 Aug 2019

Representation Degeneration Problem in Training Natural Language Generation Models
Jun Gao, Di He, Xu Tan, Tao Qin, Liwei Wang, Tie-Yan Liu
62 · 270 · 0 · 28 Jul 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
AIMat · 668 · 24,528 · 0 · 26 Jul 2019

SpanBERT: Improving Pre-training by Representing and Predicting Spans
Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy
147 · 1,967 · 0 · 24 Jul 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang, Zihang Dai, Yiming Yang, J. Carbonell, Ruslan Salakhutdinov, Quoc V. Le
AI4CE · 232 · 8,444 · 0 · 19 Jun 2019

What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning
MILM · 218 · 1,601 · 0 · 11 Jun 2019

Visualizing and Measuring the Geometry of BERT
Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, F. Viégas, Martin Wattenberg
MILM · 78 · 418 · 0 · 06 Jun 2019

Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, M. Zhou, H. Hon
ELM, AI4CE · 227 · 1,559 · 0 · 08 May 2019

MASS: Masked Sequence to Sequence Pre-training for Language Generation
Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu
117 · 966 · 0 · 07 May 2019

fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli
VLM, FaML · 111 · 3,156 · 0 · 01 Apr 2019

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
VLM, SSL, SSeg · 1.8K · 95,114 · 0 · 11 Oct 2018

Representation Learning with Contrastive Predictive Coding
Aaron van den Oord, Yazhe Li, Oriol Vinyals
DRL, SSL · 330 · 10,349 · 0 · 10 Jul 2018

A Simple Method for Commonsense Reasoning
Trieu H. Trinh, Quoc V. Le
LRM, ReLM · 95 · 434 · 0 · 07 Jun 2018

Neural Network Acceptability Judgments
Alex Warstadt, Amanpreet Singh, Samuel R. Bowman
233 · 1,411 · 0 · 31 May 2018
