1910.10683
Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,844 papers shown
Title
CamCloneMaster: Enabling Reference-based Camera Control for Video Generation
Yawen Luo
J. Bai
Xiaoyu Shi
Menghan Xia
Xintao Wang
Pengfei Wan
Di Zhang
Kun Gai
Tianfan Xue
DiffM
VGen
48
0
0
03 Jun 2025
HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring
Zhixiong Su
Yichen Wang
Herun Wan
Zhaohan Zhang
Minnan Luo
DeLMO
52
0
0
03 Jun 2025
Towards Generating Controllable and Solvable Geometry Problem by Leveraging Symbolic Deduction Engine
Zhuoxuan Jiang
T. Zhang
Peiyan Peng
Jing Chen
Yinong Xun
Haotian Zhang
L. Li
Yong Li
Shaohua Zhang
AI4CE
47
0
0
03 Jun 2025
From Transformers to Large Language Models: A systematic review of AI applications in the energy sector towards Agentic Digital Twins
Gabriel Antonesi
T. Cioara
I. Anghel
Vasilis Michalakopoulos
Elissaios Sarmas
Liana Toderean
LLMAG
MedIm
AI4CE
13
0
0
03 Jun 2025
EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models
Mingzhe Li
Gehao Zhang
Zhenting Wang
Shiqing Ma
Siqi Pan
Richard Cartwright
Juan Zhai
DiffM
52
0
0
03 Jun 2025
FlexPainter: Flexible and Multi-View Consistent Texture Generation
Dongyu Yan
Leyi Wu
Jiantao Lin
Luozhou Wang
Tianshuo Xu
Zhifei Chen
Zhen Yang
Lie Xu
Shunsi Zhang
Yingcong Chen
DiffM
60
0
0
03 Jun 2025
ANT: Adaptive Neural Temporal-Aware Text-to-Motion Model
Wenshuo Chen
Kuimou Yu
Haozhe Jia
Kaishen Yuan
Bowen Tian
Songning Lai
Hongru Xiao
Erhang Zhang
Lei Wang
Yutao Yue
DiffM
VGen
68
0
0
03 Jun 2025
QKV Projections Require a Fraction of Their Memory
Malik Khalf
Yara Shamshoum
Nitzan Hodos
Yuval Sieradzki
Assaf Schuster
MQ
VLM
58
0
0
03 Jun 2025
SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios
Lingwei Dang
Ruizhi Shao
Hongwen Zhang
Wei Min
Yebin Liu
Qingyao Wu
DiffM
VGen
67
0
0
03 Jun 2025
The State of Large Language Models for African Languages: Progress and Challenges
Kedir Yassin Hussen
W. Sewunetie
Abinew Ali Ayele
Sukairaj Hafiz Imam
Shamsuddeen Hassan Muhammad
Seid Muhie Yimam
32
0
0
02 Jun 2025
Towards Efficient Few-shot Graph Neural Architecture Search via Partitioning Gradient Contribution
Wenhao Song
Xuan Wu
Bo Yang
You Zhou
Yubin Xiao
Yanchun Liang
H. Ge
Heow Pueh Lee
Chunguo Wu
58
0
0
02 Jun 2025
Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
Pierre-Carl Langlais
Carlos Rosas Hinostroza
Mattia Nee
Catherine Arnett
Pavel Chizhov
Eliot Jones
Irène Girard
David Mach
Anastasia Stasenko
Ivan P. Yamshchikov
AILaw
69
1
0
02 Jun 2025
GLoSS: Generative Language Models with Semantic Search for Sequential Recommendation
Krishna Acharya
Aleksandr Petrov
Juba Ziani
DiffM
RALM
VLM
63
0
0
02 Jun 2025
LLM in the Loop: Creating the ParaDeHate Dataset for Hate Speech Detoxification
Shuzhou Yuan
Ercong Nie
Lukas Kouba
Ashish Yashwanth Kangen
Helmut Schmid
Hinrich Schütze
Michael Färber
62
0
0
02 Jun 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
50
0
0
02 Jun 2025
AdaRewriter: Unleashing the Power of Prompting-based Conversational Query Reformulation via Test-Time Adaptation
Yilong Lai
Jialong Wu
Zhenglin Wang
Deyu Zhou
47
0
0
02 Jun 2025
Esoteric Language Models
Subham Sekhar Sahoo
Zhihan Yang
Yash Akhauri
Johnna Liu
Deepansha Singh
Zhoujun Cheng
Zhengzhong Liu
Eric P. Xing
John Thickstun
Arash Vahdat
59
0
0
02 Jun 2025
Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability
Yarden Bakish
Itamar Zimerman
Hila Chefer
Lior Wolf
14
0
0
02 Jun 2025
Multilingual Definition Modeling
Edison Marrese-Taylor
Erica K. Shimomoto
Alfredo Solano
Enrique Reid
50
0
0
02 Jun 2025
Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks
Tao Yang
Ruibin Li
Yangming Shi
Yuqi Zhang
Qide Dong
Haoran Cheng
Weiguo Feng
Shilei Wen
Bingyue Peng
Lei Zhang
DiffM
VGen
57
0
0
02 Jun 2025
MINT: Multimodal Instruction Tuning with Multimodal Interaction Grouping
Xiaojun Shan
Qi Cao
Xing Han
Haofei Yu
Paul Liang
42
0
0
02 Jun 2025
MLorc: Momentum Low-rank Compression for Large Language Model Adaptation
Wei Shen
Zhang Yaxiang
Minhui Huang
Mengfan Xu
Jiawei Zhang
Cong Shen
AI4CE
51
0
0
02 Jun 2025
Dual-Process Image Generation
Grace Luo
Jonathan Granskog
Aleksander Holynski
Trevor Darrell
VLM
67
0
0
02 Jun 2025
GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion
Sunkyung Lee
Minjin Choi
Eunseong Choi
Hye-young Kim
Jongwuk Lee
VLM
57
0
0
02 Jun 2025
Protocol Models: Scaling Decentralized Training with Communication-Efficient Model Parallelism
Sameera Ramasinghe
Thalaiyasingam Ajanthan
Gil Avraham
Yan Zuo
Alexander Long
GNN
77
0
0
02 Jun 2025
Unified Scaling Laws for Compressed Representations
Andrei Panferov
Alexandra Volkova
Ionut-Vlad Modoranu
Vage Egiazarian
M. Safaryan
Dan Alistarh
51
0
0
02 Jun 2025
LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale
Miran Özdogan
Gilad Landau
Gereon Elvers
Dulhan Jayalath
Pratik Somaiya
Francesco Mantegna
M. Woolrich
Oiwi Parker Jones
34
2
0
02 Jun 2025
IF-GUIDE: Influence Function-Guided Detoxification of LLMs
Zachary Coalson
Juhan Bae
Nicholas Carlini
Sanghyun Hong
TDI
68
0
0
02 Jun 2025
Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
Yoonjun Cho
Soeun Kim
Dongjae Jeon
Kyelim Lee
Beomsoo Lee
Albert No
MQ
27
0
0
02 Jun 2025
Contextual Candor: Enhancing LLM Trustworthiness Through Hierarchical Unanswerability Detection
Steven Robinson
Antonio Carlos Rivera
HILM
33
0
0
01 Jun 2025
FedRPCA: Enhancing Federated LoRA Aggregation Using Robust PCA
Divyansh Jhunjhunwala
Arian Raje
Madan Ravi Ganesh
Chaithanya Kumar Mummadi
Chaoqun Dong
Jiawei Zhou
Wan-Yi Lin
Gauri Joshi
Zhenzhen Li
45
0
0
01 Jun 2025
In-the-wild Audio Spatialization with Flexible Text-guided Localization
Tianrui Pan
Jie Liu
Z. Huang
Jie Tang
Gangshan Wu
42
0
0
01 Jun 2025
GuideX: Guided Synthetic Data Generation for Zero-Shot Information Extraction
Neil De La Fuente
Oscar Sainz
Iker García-Ferrero
Eneko Agirre
SyDa
37
0
0
31 May 2025
Assortment of Attention Heads: Accelerating Federated PEFT with Head Pruning and Strategic Client Selection
Yeshwanth Venkatesha
Souvik Kundu
Priyadarshini Panda
17
1
0
31 May 2025
Spectral Insights into Data-Oblivious Critical Layers in Large Language Models
Xuyuan Liu
Lei Hsiung
Yaoqing Yang
Yujun Yan
AAML
27
0
0
31 May 2025
Structuring Radiology Reports: Challenging LLMs with Lightweight Models
Johannes Moll
Louisa Fay
Asfandyar Azhar
Sophie Ostmeier
Tim Lueth
S. Gatidis
Curtis P. Langlotz
Jean-Benoit Delbrouck
7
0
0
30 May 2025
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
Banseok Lee
Dongkyu Kim
Youngcheon You
Youngmin Kim
MQ
19
0
0
30 May 2025
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
Yehonathan Refael
Guy Smorodinsky
Tom Tirer
Ofir Lindenbaum
35
0
0
30 May 2025
Drop Dropout on Single-Epoch Language Model Pretraining
Houjun Liu
John Bauer
Christopher D. Manning
LRM
28
0
0
30 May 2025
Structure-Aware Fill-in-the-Middle Pretraining for Code
Linyuan Gong
Alvin Cheung
Mostafa Elhoushi
Sida Wang
CLL
AI4CE
15
0
0
30 May 2025
GradPower: Powering Gradients for Faster Language Model Pre-Training
Mingze Wang
Jinbo Wang
Jiaqi Zhang
Wei Wang
Peng Pei
Xunliang Cai
Weinan E
Lei Wu
46
0
0
30 May 2025
ReCalKV: Low-Rank KV Cache Compression via Head Reordering and Offline Calibration
Xianglong Yan
Zhiteng Li
Tianao Zhang
Linghe Kong
Yulun Zhang
Xiaokang Yang
48
0
0
30 May 2025
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering
Runnan Lu
Yuxuan Zhang
Jailing Liu
Haifa Wang
Yiren Song
DiffM
30
0
0
30 May 2025
Identity resolution of software metadata using Large Language Models
Eva Martín del Pico
Josep Lluís Gelpí
Salvador Capella-Gutiérrez
20
0
0
29 May 2025
Leave it to the Specialist: Repair Sparse LLMs with Sparse Fine-Tuning via Sparsity Evolution
Q. Xiao
Alan Ansell
Boqian Wu
Lu Yin
Mykola Pechenizkiy
Shiwei Liu
Decebal Constantin Mocanu
32
0
0
29 May 2025
A New Deep-learning-Based Approach For mRNA Optimization: High Fidelity, Computation Efficiency, and Multiple Optimization Factors
Zheng Gong
Ziyi Jiang
Weihao Gao
Deng Zhuo
Lan Ma
33
0
0
29 May 2025
Navigating the Accuracy-Size Trade-Off with Flexible Model Merging
Akash Dhasade
Divyansh Jhunjhunwala
Milos Vujasinovic
Gauri Joshi
Anne-Marie Kermarrec
MoMe
52
0
0
29 May 2025
Beyond Zero Initialization: Investigating the Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics
Shiwei Li
Xiandi Luo
Xing Tang
Haozhao Wang
Hao Chen
Weihong Luo
Yuhua Li
Xiuqiang He
Ruixuan Li
AI4CE
45
0
0
29 May 2025
Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking
Yuatyong Chaichana
Thanapat Trachu
Peerat Limkonchotiwat
Konpat Preechakul
Tirasan Khandhawit
Ekapol Chuangsuwanich
MoMe
71
0
0
29 May 2025
Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts
Xuweiyi Chen
Wentao Zhou
Aruni RoyChowdhury
Zezhou Cheng
3DPC
54
0
0
29 May 2025