1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,944 papers shown
Title
PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition
Y. Hao
Diansong Zhou
Zhicai Wang
Chong-Wah Ngo
Meng Wang
ViT
84
5
0
03 Jul 2024
Knowledge Composition using Task Vectors with Learned Anisotropic Scaling
Frederic Z. Zhang
Paul Albert
Cristian Rodriguez-Opazo
Anton van den Hengel
Ehsan Abbasnejad
MoMe
113
13
0
03 Jul 2024
RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring
Ali Ghiasvand Mohammadkhani
51
0
0
03 Jul 2024
Let the Code LLM Edit Itself When You Edit the Code
Zhenyu He
Jun Zhang
Shengjie Luo
Jingjing Xu
Zongzhang Zhang
Di He
KELM
107
1
0
03 Jul 2024
Change My Frame: Reframing in the Wild in r/ChangeMyView
Arturo Martinez Peguero
Taro Watanabe
37
0
0
02 Jul 2024
Towards More Realistic Extraction Attacks: An Adversarial Perspective
Yash More
Prakhar Ganesh
G. Farnadi
AAML
126
7
0
02 Jul 2024
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Huiqiang Jiang
Yucheng Li
Chengruidong Zhang
Qianhui Wu
Xufang Luo
...
Amir H. Abdi
Dongsheng Li
Chin-Yew Lin
Yuqing Yang
L. Qiu
159
122
0
02 Jul 2024
Predicting vs. Acting: A Trade-off Between World Modeling & Agent Modeling
Margaret Li
Weijia Shi
Artidoro Pagnoni
Peter West
Ari Holtzman
96
10
0
02 Jul 2024
SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag
Koustava Goswami
Srikrishna Karanam
107
4
0
02 Jul 2024
RVISA: Reasoning and Verification for Implicit Sentiment Analysis
Wenna Lai
H. Xie
Guandong Xu
Qing Li
LRM
88
3
0
02 Jul 2024
Open foundation models for Azerbaijani language
Jafar Isbarov
Kavsar Huseynova
Elvin Mammadov
Mammad Hajili
Duygu Ataman
AI4CE
78
1
0
02 Jul 2024
Soft Language Prompts for Language Transfer
Ivan Vykopal
Simon Ostermann
Marian Simko
AAML
86
2
0
02 Jul 2024
MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing
Shangda Wu
Yashan Wang
Xiaobing Li
Feng Yu
Maosong Sun
102
5
0
02 Jul 2024
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models
Xavier Suau
Pieter Delobelle
Katherine Metcalf
Armand Joulin
N. Apostoloff
Luca Zappella
P. Rodríguez
MU
AAML
107
14
0
02 Jul 2024
Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale
Wenzhen Zheng
Wenbo Pan
Xu Xu
Libo Qin
Li Yue
Ming Zhou
CLL
82
7
0
02 Jul 2024
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models
Ruihan Jin
Ruibo Fu
Zhengqi Wen
Shuai Zhang
Yukun Liu
Jianhua Tao
103
5
0
02 Jul 2024
AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment
Yilong Lai
Jialong Wu
Congzhi Zhang
Haowen Sun
Deyu Zhou
147
4
0
02 Jul 2024
Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation
Pablo Messina
René Vidal
Denis Parra
Álvaro Soto
Vladimir Araujo
MedIm
129
4
0
02 Jul 2024
Text-Aware Diffusion for Policy Learning
Calvin Luo
Mandy He
Zilai Zeng
Chen Sun
82
6
0
02 Jul 2024
Scope-enhanced Compositional Semantic Parsing for DRT
Xiulin Yang
Jonas Groschwitz
Alexander Koller
Johan Bos
78
0
0
02 Jul 2024
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
Dewei Zhou
Yuchen Li
Fan Ma
Zongxin Yang
Yue Yang
175
11
0
02 Jul 2024
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding
Jinghui Lu
Haiyang Yu
Yanjie Wang
Yongjie Ye
Jingqun Tang
...
Qi Liu
Hao Feng
Han Wang
Hao Liu
Can Huang
181
23
0
02 Jul 2024
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Kepan Nan
Rui Xie
Penghao Zhou
Tiehan Fan
Zhenheng Yang
Zhijie Chen
Xiang Li
Jian Yang
Ying Tai
169
93
0
02 Jul 2024
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models
Shengqi Zhu
Jeffrey M. Rzeszotarski
KELM
211
1
0
02 Jul 2024
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury
Sayan Nag
Subhrajyoti Dasgupta
Jun Chen
Mohamed Elhoseiny
Ruohan Gao
Dinesh Manocha
VLM
MLLM
102
15
0
01 Jul 2024
Empirical Tests of Optimization Assumptions in Deep Learning
Hoang Tran
Qinzi Zhang
Ashok Cutkosky
77
2
0
01 Jul 2024
Normalization and effective learning rates in reinforcement learning
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
James Martens
H. V. Hasselt
Razvan Pascanu
Will Dabney
105
13
0
01 Jul 2024
NLPGuard: A Framework for Mitigating the Use of Protected Attributes by NLP Classifiers
Salvatore Greco
Ke Zhou
L. Capra
Tania Cerquitelli
Daniele Quercia
66
4
0
01 Jul 2024
Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning
Akshara Prabhakar
Thomas Griffiths
R. Thomas McCoy
LRM
103
20
0
01 Jul 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu
Xiaosen Zheng
Niklas Muennighoff
Guangtao Zeng
Longxu Dou
Tianyu Pang
Jing Jiang
Min Lin
MoE
182
54
1
01 Jul 2024
FORA: Fast-Forward Caching in Diffusion Transformer Acceleration
Pratheba Selvaraju
Tianyu Ding
Tianyi Chen
Ilya Zharkov
Luming Liang
125
29
0
01 Jul 2024
HyperLoader: Integrating Hypernetwork-Based LoRA and Adapter Layers into Multi-Task Transformers for Sequence Labelling
Jesús-Germán Ortiz-Barajas
Helena Gómez-Adorno
Thamar Solorio
60
2
0
01 Jul 2024
Dynamic Few-Shot Learning for Knowledge Graph Question Answering
Jacopo D'Abramo
Andrea Zugarini
Paolo Torroni
85
2
0
01 Jul 2024
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Boyuan Chen
Diego Marti Monso
Yilun Du
Max Simchowitz
Russ Tedrake
Vincent Sitzmann
DiffM
176
109
0
01 Jul 2024
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Philippe Laban
Alexander R. Fabbri
Caiming Xiong
Chien-Sheng Wu
RALM
124
51
0
01 Jul 2024
Protecting Privacy in Classifiers by Token Manipulation
Re'em Harel
Yair Elboher
Yuval Pinter
59
1
0
01 Jul 2024
An Empirical Comparison of Generative Approaches for Product Attribute-Value Identification
Kassem Sabeh
Robert Litschko
Mouna Kacimi
Barbara Plank
J. Gamper
63
3
0
01 Jul 2024
Exploring Advanced Large Language Models with LLMsuite
Giorgio Roffo
LLMAG
36
0
0
01 Jul 2024
Look Ahead or Look Around? A Theoretical Comparison Between Autoregressive and Masked Pretraining
Qi Zhang
Tianqi Du
Haotian Huang
Yifei Wang
Yisen Wang
71
5
0
01 Jul 2024
ESALE: Enhancing Code-Summary Alignment Learning for Source Code Summarization
Chunrong Fang
Weisong Sun
Yuchen Chen
Xiao Chen
Zhao Wei
Quanjun Zhang
Yudu You
Bin Luo
Yang Liu
Zhenyu Chen
AI4TS
127
14
0
01 Jul 2024
How to Leverage Digit Embeddings to Represent Numbers?
Jasivan Sivakumar
N. Moosavi
65
0
0
01 Jul 2024
Eliminating Position Bias of Language Models: A Mechanistic Approach
Ziqi Wang
Hanlin Zhang
Xiner Li
Kuan-Hao Huang
Chi Han
Shuiwang Ji
Sham Kakade
Hao Peng
Heng Ji
173
20
0
01 Jul 2024
Large Language Model Enhanced Knowledge Representation Learning: A Survey
Xin Wang
Zirui Chen
Haofen Wang
Leong Hou U
Zhao Li
Wenbin Guo
KELM
219
3
0
01 Jul 2024
A Comparative Study of Quality Evaluation Methods for Text Summarization
Huyen Nguyen
Haihua Chen
Lavanya Pobbathi
Junhua Ding
ELM
88
6
0
30 Jun 2024
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
Mushui Liu
Yuhang Ma
Yang Zhen
Jun Dan
Yunlong Yu
Zeng Zhao
Zhipeng Hu
Bai Liu
Changjie Fan
VLM
DiffM
133
20
0
30 Jun 2024
LegalTurk Optimized BERT for Multi-Label Text Classification and NER
Farnaz Zeidi
Mehmet Fatih Amasyali
Çiğdem Erol
VLM
67
2
0
30 Jun 2024
A Collocation-based Method for Addressing Challenges in Word-level Metric Differential Privacy
Stephen Meisenbacher
Maulik Chevli
Florian Matthes
67
2
0
30 Jun 2024
MasonTigers at SemEval-2024 Task 10: Emotion Discovery and Flip Reasoning in Conversation with Ensemble of Transformers and Prompting
Al Nahian Bin Emran
Amrita Ganguly
Sadiya Sayara Chowdhury Puspo
Nishat Raihan
Dhiman Goswami
LRM
67
0
0
30 Jun 2024
Toward a Diffusion-Based Generalist for Dense Vision Tasks
Yue Fan
Yongqin Xian
Xiaohua Zhai
Alexander Kolesnikov
Muhammad Ferjad Naeem
Bernt Schiele
Federico Tombari
VLM
MDE
DiffM
65
1
0
29 Jun 2024
Brevity is the soul of wit: Pruning long files for code generation
Aaditya K. Singh
Yu Yang
Kushal Tirumala
Mostafa Elhoushi
Ari S. Morcos
SyDa
92
2
0
29 Jun 2024