Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,944 papers shown
PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition
Y. Hao
Diansong Zhou
Zhicai Wang
Chong-Wah Ngo
Meng Wang
ViT
84
5
0
03 Jul 2024
Knowledge Composition using Task Vectors with Learned Anisotropic Scaling
Frederic Z. Zhang
Paul Albert
Cristian Rodriguez-Opazo
Anton van den Hengel
Ehsan Abbasnejad
MoMe
113
13
0
03 Jul 2024
RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring
Ali Ghiasvand Mohammadkhani
51
0
0
03 Jul 2024
Let the Code LLM Edit Itself When You Edit the Code
Zhenyu He
Jun Zhang
Shengjie Luo
Jingjing Xu
Zongzhang Zhang
Di He
KELM
107
1
0
03 Jul 2024
Change My Frame: Reframing in the Wild in r/ChangeMyView
Arturo Martinez Peguero
Taro Watanabe
37
0
0
02 Jul 2024
Towards More Realistic Extraction Attacks: An Adversarial Perspective
Yash More
Prakhar Ganesh
G. Farnadi
AAML
126
7
0
02 Jul 2024
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Huiqiang Jiang
Yucheng Li
Chengruidong Zhang
Qianhui Wu
Xufang Luo
...
Amir H. Abdi
Dongsheng Li
Chin-Yew Lin
Yuqing Yang
L. Qiu
159
122
0
02 Jul 2024
Predicting vs. Acting: A Trade-off Between World Modeling & Agent Modeling
Margaret Li
Weijia Shi
Artidoro Pagnoni
Peter West
Ari Holtzman
96
10
0
02 Jul 2024
SafaRi:Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag
Koustava Goswami
Srikrishna Karanam
107
4
0
02 Jul 2024
RVISA: Reasoning and Verification for Implicit Sentiment Analysis
Wenna Lai
H. Xie
Guandong Xu
Qing Li
LRM
88
3
0
02 Jul 2024
Open foundation models for Azerbaijani language
Jafar Isbarov
Kavsar Huseynova
Elvin Mammadov
Mammad Hajili
Duygu Ataman
AI4CE
78
1
0
02 Jul 2024
Soft Language Prompts for Language Transfer
Ivan Vykopal
Simon Ostermann
Marian Simko
AAML
86
2
0
02 Jul 2024
MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing
Shangda Wu
Yashan Wang
Xiaobing Li
Feng Yu
Maosong Sun
102
5
0
02 Jul 2024
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
Xavier Suau
Pieter Delobelle
Katherine Metcalf
Armand Joulin
N. Apostoloff
Luca Zappella
P. Rodríguez
MU
AAML
107
14
0
02 Jul 2024
Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale
Wenzhen Zheng
Wenbo Pan
Xu Xu
Libo Qin
Li Yue
Ming Zhou
CLL
82
7
0
02 Jul 2024
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models
Ruihan Jin
Ruibo Fu
Zhengqi Wen
Shuai Zhang
Yukun Liu
Jianhua Tao
103
5
0
02 Jul 2024
AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment
Yilong Lai
Jialong Wu
Congzhi Zhang
Haowen Sun
Deyu Zhou
147
4
0
02 Jul 2024
Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation
Pablo Messina
René Vidal
Denis Parra
Álvaro Soto
Vladimir Araujo
MedIm
129
4
0
02 Jul 2024
Text-Aware Diffusion for Policy Learning
Calvin Luo
Mandy He
Zilai Zeng
Chen Sun
82
6
0
02 Jul 2024
Scope-enhanced Compositional Semantic Parsing for DRT
Xiulin Yang
Jonas Groschwitz
Alexander Koller
Johan Bos
78
0
0
02 Jul 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Yuchen Li
Fan Ma
Zongxin Yang
Yue Yang
175
11
0
02 Jul 2024
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding
Jinghui Lu
Haiyang Yu
Yanjie Wang
Yongjie Ye
Jingqun Tang
...
Qi Liu
Hao Feng
Han Wang
Hao Liu
Can Huang
181
23
0
02 Jul 2024
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Kepan Nan
Rui Xie
Penghao Zhou
Tiehan Fan
Zhenheng Yang
Zhijie Chen
Xiang Li
Jian Yang
Ying Tai
169
93
0
02 Jul 2024
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models
Shengqi Zhu
Jeffrey M. Rzeszotarski
KELM
211
1
0
02 Jul 2024
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury
Sayan Nag
Subhrajyoti Dasgupta
Jun Chen
Mohamed Elhoseiny
Ruohan Gao
Dinesh Manocha
VLM
MLLM
102
15
0
01 Jul 2024
Empirical Tests of Optimization Assumptions in Deep Learning
Hoang Tran
Qinzi Zhang
Ashok Cutkosky
77
2
0
01 Jul 2024
Normalization and effective learning rates in reinforcement learning
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
James Martens
H. V. Hasselt
Razvan Pascanu
Will Dabney
105
13
0
01 Jul 2024
NLPGuard: A Framework for Mitigating the Use of Protected Attributes by NLP Classifiers
Salvatore Greco
Ke Zhou
L. Capra
Tania Cerquitelli
Daniele Quercia
66
4
0
01 Jul 2024
Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning
Akshara Prabhakar
Thomas Griffiths
R. Thomas McCoy
LRM
103
20
0
01 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min Lin
MoE
182
54
1
01 Jul 2024
FORA: Fast-Forward Caching in Diffusion Transformer Acceleration
Pratheba Selvaraju
Tianyu Ding
Tianyi Chen
Ilya Zharkov
Luming Liang
125
29
0
01 Jul 2024
HyperLoader: Integrating Hypernetwork-Based LoRA and Adapter Layers into Multi-Task Transformers for Sequence Labelling
Jesús-Germán Ortiz-Barajas
Helena Gómez-Adorno
Thamar Solorio
60
2
0
01 Jul 2024
Dynamic Few-Shot Learning for Knowledge Graph Question Answering
Jacopo D'Abramo
Andrea Zugarini
Paolo Torroni
85
2
0
01 Jul 2024
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Boyuan Chen
Diego Marti Monso
Yilun Du
Max Simchowitz
Russ Tedrake
Vincent Sitzmann
DiffM
176
109
0
01 Jul 2024
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Philippe Laban
Alexander R. Fabbri
Caiming Xiong
Chien-Sheng Wu
RALM
124
51
0
01 Jul 2024
Protecting Privacy in Classifiers by Token Manipulation
Re'em Harel
Yair Elboher
Yuval Pinter
59
1
0
01 Jul 2024
An Empirical Comparison of Generative Approaches for Product Attribute-Value Identification
Kassem Sabeh
Robert Litschko
Mouna Kacimi
Barbara Plank
J. Gamper
63
3
0
01 Jul 2024
Exploring Advanced Large Language Models with LLMsuite
Giorgio Roffo
LLMAG
36
0
0
01 Jul 2024
Look Ahead or Look Around? A Theoretical Comparison Between Autoregressive and Masked Pretraining
Qi Zhang
Tianqi Du
Haotian Huang
Yifei Wang
Yisen Wang
71
5
0
01 Jul 2024
ESALE: Enhancing Code-Summary Alignment Learning for Source Code Summarization
Chunrong Fang
Weisong Sun
Yuchen Chen
Xiao Chen
Zhao Wei
Quanjun Zhang
Yudu You
Bin Luo
Yang Liu
Zhenyu Chen
AI4TS
127
14
0
01 Jul 2024
How to Leverage Digit Embeddings to Represent Numbers?
Jasivan Sivakumar
N. Moosavi
65
0
0
01 Jul 2024
Eliminating Position Bias of Language Models: A Mechanistic Approach
Ziqi Wang
Hanlin Zhang
Xiner Li
Kuan-Hao Huang
Chi Han
Shuiwang Ji
Sham Kakade
Hao Peng
Heng Ji
173
20
0
01 Jul 2024
Large Language Model Enhanced Knowledge Representation Learning: A Survey
Xin Wang
Zirui Chen
Haofen Wang
Leong Hou U
Zhao Li
Wenbin Guo
KELM
219
3
0
01 Jul 2024
A Comparative Study of Quality Evaluation Methods for Text Summarization
Huyen Nguyen
Haihua Chen
Lavanya Pobbathi
Junhua Ding
ELM
88
6
0
30 Jun 2024
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
Mushui Liu
Yuhang Ma
Yang Zhen
Jun Dan
Yunlong Yu
Zeng Zhao
Zhipeng Hu
Bai Liu
Changjie Fan
VLM
DiffM
133
20
0
30 Jun 2024
LegalTurk Optimized BERT for Multi-Label Text Classification and NER
Farnaz Zeidi
Mehmet Fatih Amasyali
Çiğdem Erol
VLM
67
2
0
30 Jun 2024
A Collocation-based Method for Addressing Challenges in Word-level Metric Differential Privacy
Stephen Meisenbacher
Maulik Chevli
Florian Matthes
67
2
0
30 Jun 2024
MasonTigers at SemEval-2024 Task 10: Emotion Discovery and Flip Reasoning in Conversation with Ensemble of Transformers and Prompting
Al Nahian Bin Emran
Amrita Ganguly
Sadiya Sayara Chowdhury Puspo
Nishat Raihan
Dhiman Goswami
LRM
67
0
0
30 Jun 2024
Toward a Diffusion-Based Generalist for Dense Vision Tasks
Yue Fan
Yongqin Xian
Xiaohua Zhai
Alexander Kolesnikov
Muhammad Ferjad Naeem
Bernt Schiele
Federico Tombari
VLM
MDE
DiffM
65
1
0
29 Jun 2024
Brevity is the soul of wit: Pruning long files for code generation
Aaditya K. Singh
Yu Yang
Kushal Tirumala
Mostafa Elhoushi
Ari S. Morcos
SyDa
92
2
0
29 Jun 2024