ResearchTrend.AI
MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

26 May 2023
Shiyue Zhang
Shijie Wu
Ozan Irsoy
Steven Lu
Joey Tianyi Zhou
Mark Dredze
David S. Rosenberg

Papers citing "MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies"

11 / 11 papers shown
A Measure-Theoretic Characterization of Tight Language Models
Li Du
Lucas Torroba Hennigen
Tiago Pimentel
Clara Meister
Jason Eisner
Ryan Cotterell
56
32
0
20 Dec 2022
Contrastive Decoding: Open-ended Text Generation as Optimization
Xiang Lisa Li
Ari Holtzman
Daniel Fried
Percy Liang
Jason Eisner
Tatsunori Hashimoto
Luke Zettlemoyer
M. Lewis
70
348
0
27 Oct 2022
Locally Typical Sampling
Clara Meister
Tiago Pimentel
Gian Wiher
Ryan Cotterell
168
88
0
01 Feb 2022
Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation
Bryan Eikema
Wilker Aziz
43
132
0
20 May 2020
Don't Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training
Margaret Li
Stephen Roller
Ilia Kulikov
Sean Welleck
Y-Lan Boureau
Kyunghyun Cho
Jason Weston
57
181
0
10 Nov 2019
On NMT Search Errors and Model Errors: Cat Got Your Tongue?
Felix Stahlberg
Bill Byrne
LRM
49
152
0
27 Aug 2019
Unifying Human and Statistical Evaluation for Natural Language Generation
Tatsunori B. Hashimoto
Hugh Zhang
Percy Liang
49
223
0
04 Apr 2019
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
Chris J. Maddison
A. Mnih
Yee Whye Teh
BDL
111
2,523
0
02 Nov 2016
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
166
2,814
0
26 Sep 2016
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Oriol Vinyals
Quoc V. Le
AIMat
280
20,491
0
10 Sep 2014
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
AIMat
388
27,205
0
01 Sep 2014