ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2202.00528
  4. Cited By
Examining Scaling and Transfer of Language Model Architectures for
  Machine Translation

Examining Scaling and Transfer of Language Model Architectures for Machine Translation

1 February 2022
Biao Zhang
Behrooz Ghorbani
Ankur Bapna
Yong Cheng
Xavier Garcia
Jonathan Shen
Orhan Firat
ArXivPDFHTML

Papers citing "Examining Scaling and Transfer of Language Model Architectures for Machine Translation"

19 / 19 papers shown
Title
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Biao Zhang
Fedor Moiseev
Joshua Ainslie
Paul Suganthan
Min Ma
Surya Bhupatiraju
Fede Lebron
Orhan Firat
Armand Joulin
Zhe Dong
AI4CE
31
0
0
08 Apr 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
Liang Lin
LRM
54
2
0
08 Mar 2025
(Mis)Fitting: A Survey of Scaling Laws
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
69
2
0
26 Feb 2025
Scaling Laws for Downstream Task Performance in Machine Translation
Scaling Laws for Downstream Task Performance in Machine Translation
Berivan Isik
Natalia Ponomareva
Hussein Hazimeh
Dimitris Paparas
Sergei Vassilvitskii
Sanmi Koyejo
108
3
0
24 Feb 2025
CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics
CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics
Kai Yin
Chengkai Liu
Ali Mostafavi
Xia Hu
59
9
0
17 Jan 2025
Registering Source Tokens to Target Language Spaces in Multilingual Neural Machine Translation
Registering Source Tokens to Target Language Spaces in Multilingual Neural Machine Translation
Zhi Qu
Yiran Wang
Jiannan Mao
Chenchen Ding
Hideki Tanaka
Masao Utiyama
Taro Watanabe
LRM
40
0
0
06 Jan 2025
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in
  the Era of Large Language Models
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
49
18
0
08 Jul 2024
A Complete Survey on LLM-based AI Chatbots
A Complete Survey on LLM-based AI Chatbots
Sumit Kumar Dam
Choong Seon Hong
Yu Qiao
Chaoning Zhang
59
51
0
17 Jun 2024
When Scaling Meets LLM Finetuning: The Effect of Data, Model and
  Finetuning Method
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
Biao Zhang
Zhongtao Liu
Colin Cherry
Orhan Firat
LRM
57
126
0
27 Feb 2024
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Jiasheng Ye
Zaixiang Zheng
Yu Bao
Lihua Qian
Quanquan Gu
DiffM
54
14
0
23 Aug 2023
Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient
  Pre-LN Transformers
Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers
Zixuan Jiang
Jiaqi Gu
Hanqing Zhu
David Z. Pan
AI4CE
27
16
0
24 May 2023
Multilingual Large Language Models Are Not (Yet) Code-Switchers
Multilingual Large Language Models Are Not (Yet) Code-Switchers
Ruochen Zhang
Samuel Cahyawijaya
Jan Christian Blaise Cruz
Genta Indra Winata
Alham Fikri Aji
LRM
33
51
0
23 May 2023
When Does Monolingual Data Help Multilingual Translation: The Role of
  Domain and Model Scale
When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale
Christos Baziotis
Biao Zhang
Alexandra Birch
Barry Haddow
30
2
0
23 May 2023
Scaling Laws for Multilingual Neural Machine Translation
Scaling Laws for Multilingual Neural Machine Translation
Patrick Fernandes
Behrooz Ghorbani
Xavier Garcia
Markus Freitag
Orhan Firat
38
29
0
19 Feb 2023
Is Encoder-Decoder Redundant for Neural Machine Translation?
Is Encoder-Decoder Redundant for Neural Machine Translation?
Yingbo Gao
Christian Herold
Zijian Yang
Hermann Ney
24
4
0
21 Oct 2022
Investigating Efficiently Extending Transformers for Long Input
  Summarization
Investigating Efficiently Extending Transformers for Long Input Summarization
Jason Phang
Yao-Min Zhao
Peter J. Liu
RALM
LLMAG
29
63
0
08 Aug 2022
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
249
4,489
0
23 Jan 2020
Investigating Multilingual NMT Representations at Scale
Investigating Multilingual NMT Representations at Scale
Sneha Kudugunta
Ankur Bapna
Isaac Caswell
N. Arivazhagan
Orhan Firat
LRM
144
120
0
05 Sep 2019
Multi-Way, Multilingual Neural Machine Translation with a Shared
  Attention Mechanism
Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
Orhan Firat
Kyunghyun Cho
Yoshua Bengio
LRM
AIMat
217
623
0
06 Jan 2016
1