ResearchTrend.AI
fairseq: A Fast, Extensible Toolkit for Sequence Modeling

1 April 2019
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM FaML

Papers citing "fairseq: A Fast, Extensible Toolkit for Sequence Modeling"

50 / 87 papers shown
Multimodal Machine Translation with Visual Scene Graph Pruning
Chenyu Lu
Shiliang Sun
Jing Zhao
N. Zhang
Tengfei Song
Hao Yang
214
0
0
26 May 2025
Systematic Generalization in Language Models Scales with Information Entropy
Sondre Wold
Lucas Georges Gabriel Charpentier
Étienne Simon
201
0
0
19 May 2025
Lightweight End-to-end Text-to-speech Synthesis for low resource on-device applications
Biel Tura Vecino
Adam Gabryś
Daniel Mątwicki
Andrzej Pomirski
Tom Iddon
Marius Cotescu
Jaime Lorenzo-Trueba
186
3
0
12 May 2025
Self-Vocabularizing Training for Neural Machine Translation
Pin-Jie Lin
Ernie Chang
Yangyang Shi
Vikas Chandra
113
0
0
18 Mar 2025
FourierNAT: A Fourier-Mixing-Based Non-Autoregressive Transformer for Parallel Sequence Generation
Andrew Kiruluta
Eric Lundy
Andreas Lemos
AI4TS
102
0
0
04 Mar 2025
IPO: Your Language Model is Secretly a Preference Classifier
Shivank Garg
Ayush Singh
Shweta Singh
Paras Chopra
440
1
0
22 Feb 2025
Deterministic Reversible Data Augmentation for Neural Machine Translation
Jiashu Yao
Heyan Huang
Zeming Liu
Yuhang Guo
153
0
0
21 Feb 2025
Tensor Product Attention Is All You Need
Yifan Zhang
Yifeng Liu
Huizhuo Yuan
Zhen Qin
Yang Yuan
Q. Gu
Andrew Chi-Chih Yao
161
15
0
11 Jan 2025
Registering Source Tokens to Target Language Spaces in Multilingual Neural Machine Translation
Zhi Qu
Yiran Wang
Jiannan Mao
Chenchen Ding
Hideki Tanaka
Masao Utiyama
Taro Watanabe
LRM
105
0
0
06 Jan 2025
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation
Langlin Huang
Mengyu Bu
Yang Feng
86
0
0
03 Nov 2024
Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers
Akhilesh Kakolu Ramarao
Kevin Tang
Dinah Baer-Henney
75
0
0
28 Oct 2024
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong
Shivam Agarwal
Yizhe Zhang
Jiacheng Ye
Lin Zheng
...
Peilin Zhao
W. Bi
Jiawei Han
Hao Peng
Dianbo Sui
AI4CE
123
27
0
23 Oct 2024
The Mystery of the Pathological Path-star Task for Language Models
Arvid Frydenlund
LRM
97
4
0
17 Oct 2024
SCOREQ: Speech Quality Assessment with Contrastive Regression
Alessandro Ragano
Jan Skoglund
Andrew Hines
104
13
0
09 Oct 2024
A Parameter Update Balancing Algorithm for Multi-task Ranking Models in Recommendation Systems
Jun Yuan
Guohao Cai
Zhenhua Dong
191
0
0
08 Oct 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
104
1
0
06 Oct 2024
Confidential Prompting: Protecting User Prompts from Cloud LLM Providers
In Gim
Caihua Li
Lin Zhong
113
3
0
27 Sep 2024
MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder
Khai-Nguyen Nguyen
Phuc Phan
Tan-Hanh Pham
Bach Phan Tat
Minh-Huong Ngo
Chris Ngo
Thanh Nguyen-Tang
Truong-Son Hy
LM&MA
90
0
0
21 Sep 2024
How to Learn in a Noisy World? Self-Correcting the Real-World Data Noise in Machine Translation
Yan Meng
Di Wu
Christof Monz
88
1
0
02 Jul 2024
Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation
Boxuan Lyu
Hidetaka Kamigaito
Kotaro Funakoshi
Manabu Okumura
141
0
0
17 Jun 2024
Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation
Zhi Qu
Chenchen Ding
Taro Watanabe
134
1
0
12 Jun 2024
VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain
Khai-Nguyen Nguyen
LM&MA
77
10
0
08 Apr 2024
Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts
Weilin Cai
Juyong Jiang
Le Qin
Junwei Cui
Sunghun Kim
Jiayi Huang
150
9
0
07 Apr 2024
ELLEN: Extremely Lightly Supervised Learning For Efficient Named Entity Recognition
Haris Riaz
Razvan-Gabriel Dumitru
Mihai Surdeanu
MU
127
0
0
26 Mar 2024
Understanding Emergent Abilities of Language Models from the Loss Perspective
Zhengxiao Du
Aohan Zeng
Yuxiao Dong
Jie Tang
UQCV LRM
121
55
0
23 Mar 2024
Non-autoregressive Sequence-to-Sequence Vision-Language Models
Kunyu Shi
Qi Dong
Luis Goncalves
Zhuowen Tu
Stefano Soatto
VLM
119
3
0
04 Mar 2024
EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation
Yuqiao Wen
Behzad Shayegh
Chenyang Huang
Yanshuai Cao
Lili Mou
117
5
0
29 Feb 2024
Continuously Learning New Words in Automatic Speech Recognition
Christian Huber
Alexander Waibel
SSL CLL
115
1
0
09 Jan 2024
Chameleon: a Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models
Wenqi Jiang
Marco Zeller
R. Waleffe
Torsten Hoefler
Gustavo Alonso
110
17
0
15 Oct 2023
Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang
Hang Zhou
Kai-Chun Chang
Tongran Liu
Chunliang Zhang
Quan Du
Tong Xiao
Yue Zhang
Jingbo Zhu
ELM
128
4
0
08 Aug 2023
Testing the Predictions of Surprisal Theory in 11 Languages
Ethan Gotlieb Wilcox
Tiago Pimentel
Clara Meister
Ryan Cotterell
R. Levy
LRM
118
70
0
07 Jul 2023
Sheffield's Submission to the AmericasNLP Shared Task on Machine Translation into Indigenous Languages
Edward Gow-Smith
Danae Sánchez Villegas
77
9
0
16 Jun 2023
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Yonggan Fu
Yang Zhang
Kaizhi Qian
Zhifan Ye
Zhongzhi Yu
Cheng-I Jeff Lai
Yingyan Lin
152
9
0
02 Nov 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Dianbo Sui
3DV
142
9
0
14 Oct 2022
Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation
Qiming Bao
A. Peng
Tim Hartill
N. Tan
Zhenyun Deng
Michael Witbrock
Jiamou Liu
ReLM OOD NAI LRM
123
14
0
28 Jul 2022
UniSAr: A Unified Structure-Aware Autoregressive Language Model for Text-to-SQL
Longxu Dou
Yan Gao
Mingyang Pan
Dingzirui Wang
Wanxiang Che
Dechen Zhan
Jian-Guang Lou
86
26
0
15 Mar 2022
Revisiting Low Resource Status of Indian Languages in Machine Translation
Jerin Philip
Shashank Siripragada
Vinay P. Namboodiri
C. V. Jawahar
65
28
0
11 Aug 2020
PowerNorm: Rethinking Batch Normalization in Transformers
Sheng Shen
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
BDL
86
16
0
17 Mar 2020
PhoBERT: Pre-trained language models for Vietnamese
Dat Quoc Nguyen
A. Nguyen
226
356
0
02 Mar 2020
How Decoding Strategies Affect the Verifiability of Generated Text
Luca Massarelli
Fabio Petroni
Aleksandra Piktus
Myle Ott
Tim Rocktäschel
Vassilis Plachouras
Fabrizio Silvestri
Sebastian Riedel
112
50
0
09 Nov 2019
Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks
Andros Tjandra
Chunxi Liu
Frank Zhang
Xiaohui Zhang
Yongqiang Wang
Gabriel Synnaeve
Satoshi Nakamura
Geoffrey Zweig
ViT
69
46
0
23 Oct 2019
Mixture Models for Diverse Machine Translation: Tricks of the Trade
T. Shen
Myle Ott
Michael Auli
Marc'Aurelio Ranzato
MoE
93
151
0
20 Feb 2019
Strategies for Structuring Story Generation
Angela Fan
M. Lewis
Yann N. Dauphin
82
216
0
04 Feb 2019
Pay Less Attention with Lightweight and Dynamic Convolutions
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli
89
610
0
29 Jan 2019
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
Mikel Artetxe
Holger Schwenk
3DV
156
1,017
0
26 Dec 2018
Wizard of Wikipedia: Knowledge-Powered Conversational agents
Emily Dinan
Stephen Roller
Kurt Shuster
Angela Fan
Michael Auli
Jason Weston
RALM KELM
141
950
0
03 Nov 2018
Adaptive Input Representations for Neural Language Modeling
Alexei Baevski
Michael Auli
111
390
0
28 Sep 2018
Bottom-Up Abstractive Summarization
Sebastian Gehrmann
Yuntian Deng
Alexander M. Rush
CVBM
165
689
0
31 Aug 2018
Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization
Shashi Narayan
Shay B. Cohen
Mirella Lapata
AILaw
152
1,684
0
27 Aug 2018
Double Path Networks for Sequence to Sequence Learning
Kaitao Song
Xu Tan
Di He
Jianfeng Lu
Tao Qin
Tie-Yan Liu
54
14
0
13 Jun 2018