ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2101.00371
  4. Cited By
On-the-Fly Attention Modulation for Neural Generation

On-the-Fly Attention Modulation for Neural Generation

2 January 2021
Yue Dong
Chandra Bhagavatula
Ximing Lu
Jena D. Hwang
Antoine Bosselut
Jackie C.K. Cheung
Yejin Choi
ArXivPDFHTML

Papers citing "On-the-Fly Attention Modulation for Neural Generation"

9 / 9 papers shown
Title
Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning
Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning
Lefei Zhang
Lijie Hu
Di Wang
LRM
102
1
0
17 Feb 2025
DECIDER: A Dual-System Rule-Controllable Decoding Framework for Language Generation
DECIDER: A Dual-System Rule-Controllable Decoding Framework for Language Generation
Chen Xu
Tian Lan
Changlong Yu
Wei Wang
Jun Gao
...
Qunxi Dong
Kun Qian
Piji Li
Wei Bi
Bin Hu
50
0
0
04 Mar 2024
Towards a Mechanistic Interpretation of Multi-Step Reasoning
  Capabilities of Language Models
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models
Yifan Hou
Jiaoda Li
Yu Fei
Alessandro Stolfo
Wangchunshu Zhou
Guangtao Zeng
Antoine Bosselut
Mrinmaya Sachan
LRM
35
40
0
23 Oct 2023
Should We Attend More or Less? Modulating Attention for Fairness
Should We Attend More or Less? Modulating Attention for Fairness
A. Zayed
Gonçalo Mordido
Samira Shabanian
Sarath Chandar
45
10
0
22 May 2023
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
Tianxing He
Jingyu Zhang
Tianle Wang
Sachin Kumar
Kyunghyun Cho
James R. Glass
Yulia Tsvetkov
55
44
0
20 Dec 2022
Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text
  Generation via Concentrating Attention
Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention
Wenhao Li
Xiaoyuan Yi
Jinyi Hu
Maosong Sun
Xing Xie
49
0
0
14 Nov 2022
ClarET: Pre-training a Correlation-Aware Context-To-Event Transformer
  for Event-Centric Generation and Classification
ClarET: Pre-training a Correlation-Aware Context-To-Event Transformer for Event-Centric Generation and Classification
Yucheng Zhou
Tao Shen
Xiubo Geng
Guodong Long
Daxin Jiang
42
57
0
04 Mar 2022
Controlling the Focus of Pretrained Language Generation Models
Controlling the Focus of Pretrained Language Generation Models
Jiabao Ji
Yoon Kim
James R. Glass
Tianxing He
45
5
0
02 Mar 2022
Big Bird: Transformers for Longer Sequences
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
288
2,028
0
28 Jul 2020
1