ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2110.12567
  4. Cited By
Alignment Attention by Matching Key and Query Distributions

Alignment Attention by Matching Key and Query Distributions

25 October 2021
Shujian Zhang
Xinjie Fan
Huangjie Zheng
Korawat Tanwisuth
Mingyuan Zhou
    OOD
ArXivPDFHTML

Papers citing "Alignment Attention by Matching Key and Query Distributions"

9 / 9 papers shown
Title
MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
Zhongwei Wan
H. Shen
Xin Wang
Junfeng Fang
Zheda Mai
Hao Fei
VLM
65
3
0
24 Feb 2025
Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models
Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models
Tianqi Chen
Shujian Zhang
Mingyuan Zhou
DiffM
83
3
0
17 Sep 2024
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Shentao Yang
Shujian Zhang
Congying Xia
Yihao Feng
Caiming Xiong
Mi Zhou
29
23
0
01 Jun 2023
A Prototype-Oriented Framework for Unsupervised Domain Adaptation
A Prototype-Oriented Framework for Unsupervised Domain Adaptation
Korawat Tanwisuth
Xinjie Fan
Huangjie Zheng
Shujian Zhang
A. Leon-Garcia
Bo Chen
Mingyuan Zhou
55
102
0
22 Oct 2021
Bayesian Attention Modules
Bayesian Attention Modules
Xinjie Fan
Shujian Zhang
Bo Chen
Mingyuan Zhou
117
59
0
20 Oct 2020
Calibration of Pre-trained Transformers
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
246
290
0
17 Mar 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018
A Decomposable Attention Model for Natural Language Inference
A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh
Oscar Täckström
Dipanjan Das
Jakob Uszkoreit
213
1,367
0
06 Jun 2016
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment
  Classification
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
Xilun Chen
Yu Sun
Ben Athiwaratkun
Claire Cardie
Kilian Q. Weinberger
224
315
0
06 Jun 2016
1