ResearchTrend.AI

Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation
Mozhdeh Gheini, Xiang Ren, Jonathan May · LRM · 18 April 2021 · arXiv:2104.08771

Papers citing "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation"

50 / 53 papers shown
Non-Stationary Time Series Forecasting Based on Fourier Analysis and Cross Attention Mechanism
  Yuqi Xiong, Yang Wen · AI4TS · 11 May 2025

Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection
  Romain Thoreau, Valerio Marsocci, Dawa Derksen · AI4CE · 12 Mar 2025

M-SCAN: A Multistage Framework for Lumbar Spinal Canal Stenosis Grading Using Multi-View Cross Attention
  Arnesh Batra, Arush Gumber, Anushk Kumar · 03 Mar 2025

Cross-Attention Fusion of MRI and Jacobian Maps for Alzheimer's Disease Diagnosis
  Shijia Zhang, Xiyu Ding, Brian Caffo, Junyu Chen, Cindy Zhang, Hadi Kharrazi, Zheyu Wang · MedIm · 01 Mar 2025

GWRF: A Generalizable Wireless Radiance Field for Wireless Signal Propagation Modeling
  Kang Yang, Yuning Chen, Wan Du · 08 Feb 2025

Memory-Efficient Fine-Tuning of Transformers via Token Selection
  Antoine Simoulin, Namyong Park, Xiaoyi Liu, Grey Yang · 31 Jan 2025

Double-Flow GAN model for the reconstruction of perceived faces from brain activities
  Zihao Wang, Jing Zhao, Xuetong Ding, Hui Zhang · CVBM, AI4CE · 03 Jan 2025

Cross-Linguistic Examination of Machine Translation Transfer Learning
  Saughmon Boujkian · 03 Jan 2025

Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models
  Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis · 15 Dec 2024

FollowGen: A Scaled Noise Conditional Diffusion Model for Car-Following Trajectory Prediction
  Junwei You, Rui Gan, Weizhe Tang, Zilin Huang, Jiaxi Liu, ..., Keshu Wu, Keke Long, Sicheng Fu, Sikai Chen, Bin Ran · DiffM · 23 Nov 2024

Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
  Liwen Wang, Sheng Chen, Linnan Jiang, Shu Pan, Runze Cai, Sen Yang, Fei Yang · 24 Oct 2024

A Novel Hybrid Parameter-Efficient Fine-Tuning Approach for Hippocampus Segmentation and Alzheimer's Disease Diagnosis
  Wangang Cheng, Guanghua He, Keli Hu, Mingyu Fang, Liang Dong, Zhong Li, Hancan Zhu · 02 Sep 2024

SSRFlow: Semantic-aware Fusion with Spatial Temporal Re-embedding for Real-world Scene Flow
  Zhiyang Lu, Qinghan Chen, Zhimin Yuan, Ming Cheng · 31 Jul 2024

See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition
  Chongjie Si, Xiaokang Yang, Wei Shen · 07 Jul 2024

LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models
  Hayder Elesedy, Pedro M. Esperança, Silviu Vlad Oprea, Mete Ozay · KELM · 03 Jul 2024

A Depression Detection Method Based on Multi-Modal Feature Fusion Using Cross-Attention
  Shengjie Li, Yinhao Xiao · 02 Jul 2024

Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models
  Zihan Wang, Deli Chen, Damai Dai, Runxin Xu, Zhuoshu Li, Y. Wu · MoE, ALM · 02 Jul 2024

Increasing Model Capacity for Free: A Simple Strategy for Parameter Efficient Fine-tuning
  Haobo Song, Hao Zhao, Soumajit Majumder, Tao Lin · 01 Jul 2024

Crossfusor: A Cross-Attention Transformer Enhanced Conditional Diffusion Model for Car-Following Trajectory Prediction
  Junwei You, Haotian Shi, Keshu Wu, Keke Long, Sicheng Fu, Sikai Chen, Bin Ran · 17 Jun 2024

SA-FedLora: Adaptive Parameter Allocation for Efficient Federated Learning with LoRA Tuning
  Yuning Yang, Xiaohong Liu, Tianrun Gao, Xiaodong Xu, Guangyu Wang · 15 May 2024

Light-VQA+: A Video Quality Assessment Model for Exposure Correction with Vision-Language Guidance
  Xunchu Zhou, Xiaohong Liu, Yunlong Dong, Tengchuan Kou, Yixuan Gao, Zicheng Zhang, Chunyi Li, Haoning Wu, Guangtao Zhai · 06 May 2024

Exploring the landscape of large language models: Foundations, techniques, and challenges
  M. Moradi, Ke Yan, David Colwell, Matthias Samwald, Rhona Asgari · OffRL · 18 Apr 2024

Investigating Neural Machine Translation for Low-Resource Languages: Using Bavarian as a Case Study
  Wan-Hua Her, Udo Kruschwitz · 12 Apr 2024

Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
  Zeyu Han, Chao Gao, Jinyang Liu, Jeff Zhang, Sai Qian Zhang · 21 Mar 2024

Block-wise LoRA: Revisiting Fine-grained LoRA for Effective Personalization and Stylization in Text-to-Image Generation
  Likun Li, Haoqi Zeng, Changpeng Yang, Haozhe Jia, Di Xu · DiffM · 12 Mar 2024

Controllable Prompt Tuning For Balancing Group Distributional Robustness
  Hoang Phan, Andrew Gordon Wilson, Qi Lei · 05 Mar 2024

GeoFormer: A Vision and Sequence Transformer-based Approach for Greenhouse Gas Monitoring
  Madhav Khirwar, Ankur Narang · 11 Feb 2024

Dynamic Layer Tying for Parameter-Efficient Transformers
  Tamir David Hay, Lior Wolf · 23 Jan 2024

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation
  Nadav Benedek, Lior Wolf · 20 Jan 2024

Robust Semi-Supervised Learning for Self-learning Open-World Classes
  Wenjuan Xi, Xin Song, Weili Guo, Yang Yang · 15 Jan 2024

A Cross-Attention Augmented Model for Event-Triggered Context-Aware Story Generation
  Chen Tang, Tyler Loakman, Chenghua Lin · 19 Nov 2023

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
  Weiyang Liu, Zeju Qiu, Yao Feng, Yuliang Xiu, Yuxuan Xue, ..., Songyou Peng, Yandong Wen, Michael J. Black, Adrian Weller, Bernhard Schölkopf · 10 Nov 2023

MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation
  Cheng Chen, Juzheng Miao, Dufan Wu, Zhiling Yan, Sekeun Kim, ..., Lichao Sun, Xiang Li, Tianming Liu, Pheng-Ann Heng, Quanzheng Li · MedIm · 16 Sep 2023

IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning
  Feiyu F. Zhang, Liangzhi Li, Jun-Cheng Chen, Zhouqian Jiang, Bowen Wang, Yiming Qian · 23 Aug 2023

Neural Machine Translation for the Indigenous Languages of the Americas: An Introduction
  Manuel Mager, Rajat Bhatnagar, Graham Neubig, Ngoc Thang Vu, Katharina Kann · 11 Jun 2023

Jointly Reparametrized Multi-Layer Adaptation for Efficient and Private Tuning
  Umang Gupta, Aram Galstyan, Greg Ver Steeg · 30 May 2023

Language Models for German Text Simplification: Overcoming Parallel Data Scarcity through Style-specific Pre-training
  Miriam Anschütz, Joshua Oehms, Thomas Wimmer, Bartlomiej Jezierski, Georg Groh · 22 May 2023

Cross Attention Transformers for Multi-modal Unsupervised Whole-Body PET Anomaly Detection
  Ashay Patel, Petru-Daniel Tudosiu, W. H. Pinaya, G. Cook, Vicky Goh, Sebastien Ourselin, M. Jorge Cardoso · OOD, ViT, MedIm · 14 Apr 2023

Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
  Vladislav Lialin, Vijeta Deshpande, Anna Rumshisky · 28 Mar 2023

Parameter-Efficient Sparse Retrievers and Rerankers using Adapters
  Vaishali Pal, Carlos Lassance, Hervé Déjean, S. Clinchant · 23 Mar 2023

Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation
  Lukas Edman, Gabriele Sarti, Antonio Toral, Gertjan van Noord, Arianna Bisazza · 28 Feb 2023

Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions
  Shuhao Gu, Bojie Hu, Yang Feng · CLL · 03 Nov 2022

EtriCA: Event-Triggered Context-Aware Story Generation Augmented by Cross Attention
  Chen Tang, Chenghua Lin, Hen-Hsen Huang, Frank Guerin, Zhihao Zhang · 22 Oct 2022

UM4: Unified Multilingual Multiple Teacher-Student Model for Zero-Resource Neural Machine Translation
  Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Shuangzhi Wu, Hongcheng Guo, Zhoujun Li, Furu Wei · 11 Jul 2022

LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
  Tuan Dinh, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee · LMTD · 14 Jun 2022

Know Where You're Going: Meta-Learning for Parameter-Efficient Fine-Tuning
  Mozhdeh Gheini, Xuezhe Ma, Jonathan May · 25 May 2022

When does Parameter-Efficient Transfer Learning Work for Machine Translation?
  Ahmet Üstün, Asa Cooper Stickland · 23 May 2022

HiMODE: A Hybrid Monocular Omnidirectional Depth Estimation Model
  Masum Shah Junayed, Arezoo Sadeghzadeh, Md Baharul Islam, L. Wong, Tarkan Aydin · MDE · 11 Apr 2022

Towards Personalized Intelligence at Scale
  Yiping Kang, Ashish Mahendra, Christopher Clarke, Lingjia Tang, Jason Mars · 13 Mar 2022

Pretrained Language Models for Text Generation: A Survey
  Junyi Li, Tianyi Tang, Wayne Xin Zhao, J. Nie, Ji-Rong Wen · AI4CE · 14 Jan 2022