Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1910.10683
Cited By
v1
v2
v3
v4 (latest)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,870 papers shown
Title
MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures
Lucas Morin
Valéry Weber
A. Nassar
Gerhard Ingmar Meijer
Luc Van Gool
Yawei Li
Peter W. J. Staar
89
2
0
20 Mar 2025
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Liming Jiang
Qing Yan
Yumin Jia
Zichuan Liu
Hao Kang
Xin Lu
110
4
0
20 Mar 2025
What can Off-the-Shelves Large Multi-Modal Models do for Dynamic Scene Graph Generation?
Xuanming Cui
Jaiminkumar Ashokbhai Bhoi
Chionh Wei Peng
Adriel Kuek
Ser-Nam Lim
104
0
0
20 Mar 2025
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates
Ying Shen
Lifu Huang
102
2
0
20 Mar 2025
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content
Sai Kartheek Reddy Kasu
Shankar Biradar
Sunil Saumya
103
0
0
20 Mar 2025
A Review on Large Language Models for Visual Analytics
Navya Sonal Agarwal
Sanjay Kumar Sonbhadra
110
0
0
19 Mar 2025
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
Boshen Xu
Yuting Mei
Xinbi Liu
Sipeng Zheng
Qin Jin
VLM
MDE
108
0
0
19 Mar 2025
Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
M. Wong
C. Tan
ALM
125
5
0
19 Mar 2025
GenM
3
^3
3
: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation
Junyu Shi
Lijiang Liu
Yong Sun
Zhiyuan Zhang
Jinni Zhou
Qiang Nie
94
0
0
19 Mar 2025
Enforcing Consistency and Fairness in Multi-level Hierarchical Classification with a Mask-based Output Layer
Shijing Chen
Shoaib Jameel
Mohamed Reda Bouadjenek
Feilong Tang
Usman Naseem
Basem Suleiman
Hakim Hacid
Flora D. Salim
Imran Razzak
69
0
0
19 Mar 2025
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space
Lixing Xiao
Shunlin Lu
Huaijin Pi
Ke Fan
Liang Pan
Yueer Zhou
Ziyong Feng
Xiaowei Zhou
Sida Peng
Jingbo Wang
DiffM
VGen
105
7
0
19 Mar 2025
Empowering Smaller Models: Tuning LLaMA and Gemma with Chain-of-Thought for Ukrainian Exam Tasks
Mykyta Syromiatnikov
Victoria Ruvinskaya
Nataliia Komleva
ALM
LRM
95
0
0
18 Mar 2025
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation
Hongyu Zhang
Yufan Deng
Shenghai Yuan
Peng Jin
Zesen Cheng
Yian Zhao
Chang-Shu Liu
Jie Chen
DiffM
VGen
123
0
0
18 Mar 2025
Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence
Sophia Hager
David Mueller
Kevin Duh
Nicholas Andrews
152
1
0
18 Mar 2025
Striving for Simplicity: Simple Yet Effective Prior-Aware Pseudo-Labeling for Semi-Supervised Ultrasound Image Segmentation
Yaxiong Chen
Yujie Wang
Zixuan Zheng
Jingliang Hu
Yilei Shi
Shengwu Xiong
Xiao Xiang Zhu
Lichao Mou
148
1
0
18 Mar 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Dewei Zhou
Mingwei Li
Zongxin Yang
Yi Yang
182
3
0
17 Mar 2025
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning
Baohao Liao
Christian Herold
Seyyed Hadi Hashemi
Stefan Vasilev
Shahram Khadivi
Christof Monz
MQ
132
0
0
17 Mar 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo
Haodong Wen
Shengding Hu
Zhenbo Sun
Zhiyuan Liu
Maosong Sun
Kaifeng Lyu
Wenguang Chen
CLL
115
3
0
17 Mar 2025
A Survey on Transformer Context Extension: Approaches and Evaluation
Yijun Liu
Jinzheng Yu
Yang Xu
Zhongyang Li
Qingfu Zhu
LLMAG
128
3
0
17 Mar 2025
Valid Text-to-SQL Generation with Unification-based DeepStochLog
Ying Jiao
Luc de Raedt
G. Marra
NAI
92
2
0
17 Mar 2025
SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression
Xin Wang
Samiul Alam
Zhongwei Wan
Jikang Cheng
Hao Fei
MQ
112
4
0
16 Mar 2025
The Lucie-7B LLM and the Lucie Training Dataset: Open resources for multilingual language generation
Olivier Gouvert
Julie Hunter
Jérôme Louradour
Christophe Cerisara
Evan Dufraisse
Yaya Sy
Laura Rivière
Jean-Pierre Lorré
OpenLLM-France community
454
0
0
15 Mar 2025
Unified Modeling Language Code Generation from Diagram Images Using Multimodal Large Language Models
Averi Bates
Ryan Vavricka
Shane Carleton
Ruosi Shao
Chongle Pan
107
0
0
15 Mar 2025
MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling
R. Teo
T. Nguyen
MoE
149
2
0
14 Mar 2025
OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs
Ivan Kartáč
Mateusz Lango
Ondrej Dusek
ELM
85
1
0
14 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
185
0
0
14 Mar 2025
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models
Ziqin Zhou
Yifan Yang
Yue Yang
Tianyu He
Houwen Peng
Kai Qiu
Qi Dai
Lili Qiu
Chong Luo
Lingqiao Liu
DiffM
VGen
82
1
0
14 Mar 2025
Key, Value, Compress: A Systematic Exploration of KV Cache Compression Techniques
Neusha Javidnia
B. Rouhani
F. Koushanfar
551
0
0
14 Mar 2025
REGEN: A Dataset and Benchmarks with Natural Language Critiques and Narratives
Kun Su
Krishna Sayana
H. Pham
James Pine
Yuri Vasilevski
...
Marialena Kyriakidi
Liam Hebert
Ambarish Jash
Anushya Subbiah
Sukhdeep S. Sodhi
93
0
0
14 Mar 2025
FedALT: Federated Fine-Tuning through Adaptive Local Training with Rest-of-World LoRA
Jieming Bian
Lei Wang
Letian Zhang
Jie Xu
105
3
0
14 Mar 2025
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion
A. Nassar
Andres Marafioti
Matteo Omenetti
Maksym Lysak
Nikolaos Livathinos
...
Yusik Kim
A. Said Gurbuz
Michele Dolfi
Miquel Farré
Peter W. J. Staar
102
6
0
14 Mar 2025
Towards Extreme Pruning of LLMs with Plug-and-Play Mixed Sparsity
Chi Xu
Gefei Zhang
Yantong Zhu
Luca Benini
Guosheng Hu
Yawei Li
Zhihong Zhang
58
1
0
14 Mar 2025
A Hybrid Architecture with Efficient Fine Tuning for Abstractive Patent Document Summarization
Nevidu Jayatilleke
Ruvan Weerasinghe
AILaw
221
0
0
13 Mar 2025
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT)
Laurie Burchell
Ona de Gibert
Nikolay Arefyev
Mikko Aulamo
Marta Bañón
...
Pavel Stepachev
and Jörg Tiedemann
Dušan Variš
Tereza Vojtěchová
Jaume Zaragoza-Bernabeu
96
4
0
13 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Yanjie Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
154
3
0
13 Mar 2025
VideoMerge: Towards Training-free Long Video Generation
Siyang Zhang
Harry Yang
Ser-Nam Lim
DiffM
VGen
96
1
0
13 Mar 2025
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
Chenpeng Wu
Qiqi Gu
Heng Shi
Jianguo Yao
Haibing Guan
MoE
78
0
0
13 Mar 2025
From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM
Kshitij Ambilduke
Ben Peters
Sonal Sannigrahi
Anil Keshwani
Tsz Kin Lam
Bruno Martins
Marcely Zanon Boito
André F. T. Martins
110
2
0
13 Mar 2025
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Rongyao Fang
Chengqi Duan
Kun Wang
Linjiang Huang
Hao Li
...
Xingyu Zeng
R. Zhao
Jifeng Dai
Xihui Liu
Hongsheng Li
MLLM
ReLM
LRM
165
23
0
13 Mar 2025
Data Caricatures: On the Representation of African American Language in Pretraining Corpora
Nicholas Deas
Blake Vente
Amith Ananthram
Jessica A. Grieser
D. Patton
Shana Kleiner
James Shepard
Kathleen McKeown
79
0
0
13 Mar 2025
Enhancing Aviation Communication Transcription: Fine-Tuning Distil-Whisper with LoRA
Shokoufeh Mirzaei
Jesse Arzate
Yukti Vijay
51
0
0
13 Mar 2025
RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models
Yijing Lin
Mengqi Huang
Shuhan Zhuang
Zhendong Mao
VGen
99
3
0
13 Mar 2025
MultiConIR: Towards multi-condition Information Retrieval
Xuan Lu
Sifan Liu
Bochao Yin
Yiming Li
Xinghao Chen
Hui Su
Yaohui Jin
Wenjun Zeng
Xiaoyu Shen
117
0
0
13 Mar 2025
ARLED: Leveraging LED-based ARMAN Model for Abstractive Summarization of Persian Long Documents
Samira Zangooei
Amirhossein Darmani
Hossein Farahmand Nezhad
Laya Mahmoudi
86
0
0
13 Mar 2025
From Equations to Insights: Unraveling Symbolic Structures in PDEs with LLMs
Rohan Bhatnagar
Ling Liang
Krish Patel
Haizhao Yang
75
1
0
13 Mar 2025
AudioX: Diffusion Transformer for Anything-to-Audio Generation
Zeyue Tian
Yizhu Jin
Zhaoyang Liu
Ruibin Yuan
Xu Tan
Qifeng Chen
Wei Xue
Yu Guo
118
6
0
13 Mar 2025
Transformers without Normalization
Jiachen Zhu
Xinlei Chen
Kaiming He
Yann LeCun
Zhuang Liu
OffRL
ViT
160
20
0
13 Mar 2025
Autoregressive Image Generation with Randomized Parallel Decoding
Haopeng Li
Jinyue Yang
Guoqi Li
Huan Wang
100
1
0
13 Mar 2025
Numerical Error Analysis of Large Language Models
Stanislav Budzinskiy
Wenyi Fang
Longbin Zeng
Philipp Petersen
92
1
0
13 Mar 2025
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary B. Charles
Gabriel Teston
Lucio Dery
Keith Rush
Nova Fallen
Zachary Garrett
Arthur Szlam
Arthur Douillard
459
6
0
12 Mar 2025
Previous
1
2
3
...
15
16
17
...
196
197
198
Next