1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,866 papers shown
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
63
0
0
05 May 2025
Incentivizing Inclusive Contributions in Model Sharing Markets
Enpei Zhang
Jingyi Chai
Guangyi Liu
Yanfeng Wang
Siheng Chen
TDI
FedML
416
0
0
05 May 2025
An End-to-End Model For Logits Based Large Language Models Watermarking
Kahim Wong
Jicheng Zhou
Jiantao Zhou
Yain-Whar Si
WaLM
127
2
0
05 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
301
1
0
05 May 2025
An Empirical Study of Qwen3 Quantization
Xingyu Zheng
Yuye Li
Haoran Chu
Yue Feng
Xudong Ma
Jie Luo
Jinyang Guo
Haotong Qin
Michele Magno
Xianglong Liu
MQ
82
6
0
04 May 2025
Demystifying optimized prompts in language models
Rimon Melamed
Lucas H. McCabe
H. H. Huang
75
0
0
04 May 2025
Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale datasets for Responsible LLMs
Sai Krishna Mendu
Harish Yenala
Aditi Gulati
Shanu Kumar
Parag Agrawal
122
1
0
04 May 2025
R-Bench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation
Meng-Hao Guo
Jiajun Xu
Yi Zhang
Jiaxi Song
Haoyang Peng
...
Yongming Rao
Houwen Peng
Han Hu
Gordon Wetzstein
Shi-Min Hu
ELM
LRM
125
4
0
04 May 2025
Cannot See the Forest for the Trees: Invoking Heuristics and Biases to Elicit Irrational Choices of LLMs
Haoming Yang
Ke Ma
Xiaojun Jia
Yingfei Sun
Qianqian Xu
Qingming Huang
AAML
435
0
0
03 May 2025
Knowledge-Augmented Language Models Interpreting Structured Chest X-Ray Findings
Alexander Davis
Rafael Souza
Jia-Hao Lim
402
0
0
03 May 2025
OODTE: A Differential Testing Engine for the ONNX Optimizer
Nikolaos Louloudakis
Ajitha Rajan
86
0
0
03 May 2025
Compact Recurrent Transformer with Persistent Memory
Edison Mucllari
Z. Daniels
David C. Zhang
Qiang Ye
CLL
VLM
121
0
0
02 May 2025
MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance
Xing Hu
Zhixuan Chen
Dawei Yang
Zukang Xu
Chen Xu
Zhihang Yuan
Sifan Zhou
Jiangyong Yu
MoE
MQ
110
2
0
02 May 2025
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors
Nicy Scaria
Silvester John Joseph Kennedy
Diksha Seth
Ananya Thakur
Deepak N. Subramani
AI4Ed
99
0
0
02 May 2025
GENMO: A GENeralist Model for Human MOtion
Jiefeng Li
Jinkun Cao
Haotian Zhang
Davis Rempe
Jan Kautz
Umar Iqbal
Ye Yuan
DiffM
VGen
95
1
0
02 May 2025
Token-free Models for Sarcasm Detection
Sumit Mamtani
Maitreya Sonawane
Kanika Agarwal
Nishanth Sanjeev
90
0
0
02 May 2025
MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning
Murtadha Ahmed
Wenbo
Liu yunfeng
97
0
0
02 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
103
0
0
01 May 2025
Investigating Task Arithmetic for Zero-Shot Information Retrieval
Marco Braga
Pranav Kasela
Alessandro Raganato
G. Pasi
RALM
131
0
0
01 May 2025
ICQuant: Index Coding enables Low-bit LLM Quantization
Xinlin Li
Osama A. Hanna
Christina Fragouli
Suhas Diggavi
MQ
139
1
0
01 May 2025
Improving Routing in Sparse Mixture of Experts with Graph of Tokens
Tam Minh Nguyen
Ngoc N. Tran
Khai Nguyen
Richard G. Baraniuk
MoE
109
0
0
01 May 2025
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
Piotr Piekos
Róbert Csordás
Jürgen Schmidhuber
MoE
VLM
268
2
0
01 May 2025
Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing
Hong Zhang
Zhongjie Duan
Xingjun Wang
Yuze Zhao
Weiyi Lu
Zhipeng Di
Yongjun Xu
Yingda Chen
Yu Zhang
MLLM
179
6
0
30 Apr 2025
Robust Misinformation Detection by Visiting Potential Commonsense Conflict
Bing Wang
Ximing Li
C. Li
Bingrui Zhao
Bo Fu
Renchu Guan
Shengsheng Wang
83
0
0
30 Apr 2025
COSMOS: Predictable and Cost-Effective Adaptation of LLMs
Jiayu Wang
Aws Albarghouthi
Frederic Sala
97
0
0
30 Apr 2025
LLM-based Interactive Imitation Learning for Robotic Manipulation
Jonas Werner
Kun-Mo Chu
C. Weber
S. Wermter
171
1
0
30 Apr 2025
IP-CRR: Information Pursuit for Interpretable Classification of Chest Radiology Reports
Yuyan Ge
Kwan Ho Ryan Chan
Pablo Messina
René Vidal
62
0
0
30 Apr 2025
Improving Informally Romanized Language Identification
Adrian Benton
Alexander Gutkin
Christo Kirov
Brian Roark
70
0
0
30 Apr 2025
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training
Xinyi Liu
Yijiao Wang
Shenhan Zhu
Fangcheng Fu
Qingshuo Liu
Guangming Lin
Tengjiao Wang
GNN
296
0
0
30 Apr 2025
X-Fusion: Introducing New Modality to Frozen Large Language Models
Sicheng Mo
Thao Nguyen
Xun Huang
Siddharth Srinivasan Iyer
Yijun Li
...
Eli Shechtman
Krishna Kumar Singh
Yong Jae Lee
Bolei Zhou
Yuheng Li
135
2
0
29 Apr 2025
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
Miguel Nogales
Matteo Gambella
Manuel Roveri
102
0
0
29 Apr 2025
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
Zechuan Zhang
Ji Xie
Yu Lu
Zongxin Yang
Yue Yang
DiffM
146
11
0
29 Apr 2025
JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry
Anum Afzal
Alexandre Mercier
Florian Matthes
119
0
0
29 Apr 2025
Multimodal Large Language Models for Medicine: A Comprehensive Survey
Jiarui Ye
Hao Tang
LM&MA
183
0
0
29 Apr 2025
Small or Large? Zero-Shot or Finetuned? Guiding Language Model Choice for Specialized Applications in Healthcare
Lovedeep Gondara
Jonathan Simkin
Graham Sayle
Shebnum Devji
Gregory Arbour
Raymond Ng
LM&MA
53
0
0
29 Apr 2025
Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
Paul Kassianik
Baturay Saglam
Alexander Chen
Blaine Nelson
Anu Vellore
...
Hyrum Anderson
Kojin Oshiba
Omar Santos
Yaron Singer
Amin Karbasi
PILM
89
2
0
28 Apr 2025
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer
Junpeng Jiang
Gangyi Hong
Miao Zhang
Hengtong Hu
Kun Zhan
Rui Shao
Liqiang Nie
VGen
92
3
0
28 Apr 2025
FineQ: Software-Hardware Co-Design for Low-Bit Fine-Grained Mixed-Precision Quantization of LLMs
Xilong Xie
Liang Wang
Limin Xiao
Meng Han
Lin Sun
S. Zheng
Xiangrong Xu
MQ
84
0
0
28 Apr 2025
Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks
Kang Yang
Xinjun Mao
Shangwen Wang
Yanjie Wang
Tanghaoran Zhang
Bo Lin
Yihao Qin
Zhang Zhang
Yao Lu
Kamal Al-Sabahi
ALM
283
1
0
28 Apr 2025
RepText: Rendering Visual Text via Replicating
Haobo Wang
Yongjun Xu
Yongqian Li
Jiajun Li
Chaowei Zhang
Jingchao Wang
Kejia Yang
Z. Chen
VLM
120
1
0
28 Apr 2025
Coreference Resolution for Vietnamese Narrative Texts
Hieu-Dai Tran
Duc-Vu Nguyen
Ngan Luu-Thuy Nguyen
86
0
0
28 Apr 2025
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
164
1
0
28 Apr 2025
Enhancing Surgical Documentation through Multimodal Visual-Temporal Transformers and Generative AI
Hugo Georgenthum
Cristian Cosentino
Fabrizio Marozzo
Pietro Liò
MedIm
443
0
0
28 Apr 2025
Generative Product Recommendations for Implicit Superlative Queries
Kaustubh D. Dhole
Nikhita Vedula
Saar Kuzi
Giuseppe Castellucci
Eugene Agichtein
S. Malmasi
84
1
0
26 Apr 2025
Dynamic Fisher-weighted Model Merging via Bayesian Optimization
Sanwoo Lee
Jiahao Liu
Qifan Wang
Jiadong Wang
Xunliang Cai
Yunfang Wu
MoMe
462
1
0
26 Apr 2025
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
Jiabin Fan
Guoqing Luo
Michael Bowling
Lili Mou
OffRL
140
0
0
26 Apr 2025
GLaMoR: Consistency Checking of OWL Ontologies using Graph Language Models
Justin Mücke
A. Scherp
389
0
0
26 Apr 2025
Pushing the boundary on Natural Language Inference
Pablo Miralles-González
Javier Huertas-Tato
Alejandro Martín
David Camacho
LRM
227
0
0
25 Apr 2025
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs
Hongyu Wang
Shuming Ma
Furu Wei
MQ
96
4
0
25 Apr 2025
TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation
Shintaro Ozaki
Kazuki Hayashi
Yusuke Sakai
Jingun Kwon
Hidetaka Kamigaito
Katsuhiko Hayashi
Manabu Okumura
Taro Watanabe
VLM
132
0
0
25 Apr 2025