1910.10683
Cited By
v4 (latest)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,843 papers shown
Title
APE: Selective Fine-tuning with Acceptance Criteria for Language Model Adaptation
Javier Marín
47
0
0
26 May 2025
Graceful Forgetting in Generative Language Models
Chunyang Jiang
Chi-Min Chan
Yiyang Cai
Yulong Liu
Wei Xue
Yike Guo
MoMe
CLL
KELM
29
0
0
26 May 2025
ResSVD: Residual Compensated SVD for Large Language Model Compression
Haolei Bai
Siyong Jian
Tuo Liang
Yu Yin
Huan Wang
46
0
0
26 May 2025
Evaluating Large Language Models for Code Review
Umut Cihan
Arda İçöz
Vahid Haratian
Eray Tüzün
ALM
17
0
0
26 May 2025
WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference
Sihan Chen
Dan Zhao
Jongwoo Ko
Colby R. Banbury
Huiping Zhuang
Luming Liang
Tianyi Chen
42
0
0
26 May 2025
PreP-OCR: A Complete Pipeline for Document Image Restoration and Enhanced OCR Accuracy
Shuhao Guan
Moule Lin
Cheng Xu
Xinyi Liu
Jinman Zhao
Jiexin Fan
Qi Xu
Derek Greene
65
2
0
26 May 2025
Deconstructing Obfuscation: A four-dimensional framework for evaluating Large Language Models assembly code deobfuscation capabilities
Anton Tkachenko
Dmitrij Suskevic
Benjamin Adolphi
50
0
0
26 May 2025
LlamaSeg: Image Segmentation via Autoregressive Mask Generation
Jiru Deng
Tengjin Weng
Tianyu Yang
Wenhan Luo
Zhiheng Li
Wenhao Jiang
VLM
147
0
0
26 May 2025
Learning Extrapolative Sequence Transformations from Markov Chains
Sophia Hager
Aleem Khan
Andrew Wang
Nicholas Andrews
BDL
33
0
0
26 May 2025
ETS: Open Vocabulary Electroencephalography-To-Text Decoding and Sentiment Classification
Mohamed Masry
Mohamed Amen
Mohamed Elzyat
Mohamed Hamed
Norhan Magdy
Maram Khaled
10
0
0
26 May 2025
Enhancing Visual Reliance in Text Generation: A Bayesian Perspective on Mitigating Hallucination in Large Vision-Language Models
Nanxing Hu
Xiaoyue Duan
Jinchao Zhang
Guoliang Kang
MLLM
61
0
0
26 May 2025
MM-Prompt: Cross-Modal Prompt Tuning for Continual Visual Question Answering
Xu Li
Fan Lyu
LRM
20
0
0
26 May 2025
Concept Reachability in Diffusion Models: Beyond Dataset Constraints
Marta Aparicio Rodriguez
Xenia Miscouridou
Anastasia Borovykh
43
0
0
25 May 2025
Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning
Xinyao Liao
Wei Wei
Xiaoye Qu
Yu Cheng
EGVM
62
0
0
25 May 2025
CreatiDesign: A Unified Multi-Conditional Diffusion Transformer for Creative Graphic Design
H. Zhang
Dexiang Hong
Maoke Yang
Yutao Chen
Zhao Zhang
Jie Shao
Xinglong Wu
Zuxuan Wu
Yu Jiang
DiffM
AI4CE
168
0
0
25 May 2025
Semantic-enhanced Co-attention Prompt Learning for Non-overlapping Cross-Domain Recommendation
Lei Guo
Chenlong Song
Feng Guo
Xiaohui Han
Xiaojun Chang
Lei Zhu
31
0
0
25 May 2025
Efficient Data Selection at Scale via Influence Distillation
Mahdi Nikdan
Vincent Cohen-Addad
Dan Alistarh
Vahab Mirrokni
TDI
71
0
0
25 May 2025
Conventional Contrastive Learning Often Falls Short: Improving Dense Retrieval with Cross-Encoder Listwise Distillation and Synthetic Data
Manveer Singh Tamber
Suleman Kazi
Vivek Sourabh
Jimmy Lin
63
0
0
25 May 2025
CardioCoT: Hierarchical Reasoning for Multimodal Survival Analysis
Shaohao Rui
Haoyang Su
Jinyi Xiang
Lian-Ming Wu
Xiaosong Wang
41
0
0
25 May 2025
Why Do Some Inputs Break Low-Bit LLM Quantization?
Ting-Yun Chang
Muru Zhang
Jesse Thomason
Robin Jia
MQ
15
0
0
24 May 2025
EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models
G. Meng
Sunan He
Jinpeng Wang
Tao Dai
Letian Zhang
Jieming Zhu
Qing Li
Gang Wang
Rui Zhang
Yong Jiang
VLM
294
0
0
24 May 2025
LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning
Junyu Chen
Junzhuo Li
Zhen Peng
Wenjie Wang
Yuxiang Ren
Long Shi
Xuming Hu
MQ
33
0
0
24 May 2025
Synthesizing and Adapting Error Correction Data for Mobile Large Language Model Applications
Yanxiang Zhang
Zheng Xu
Shanshan Wu
Yuanbo Zhang
Daniel Ramage
KELM
46
0
0
24 May 2025
μ-MoE: Test-Time Pruning as Micro-Grained Mixture-of-Experts
T. Koike-Akino
Jing Liu
Ye Wang
MoE
34
0
0
24 May 2025
VISTA: Vision-Language Inference for Training-Free Stock Time-Series Analysis
Tina Khezresmaeilzadeh
Parsa Razmara
Seyedarmin Azizi
Mohammad Erfan Sadeghi
Erfan Baghaei Portaghloo
AI4TS
274
0
0
24 May 2025
Skip-Thinking: Chunk-wise Chain-of-Thought Distillation Enable Smaller Language Models to Reason Better and Faster
Xiao Chen
Sihang Zhou
K. Liang
Xiaoyu Sun
Xinwang Liu
LRM
30
1
0
24 May 2025
Localizing Knowledge in Diffusion Transformers
Arman Zarei
Samyadeep Basu
Keivan Rezaei
Zihao Lin
Sayan Nag
Soheil Feizi
38
0
0
24 May 2025
Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models
Jamie Hayes
Ilia Shumailov
Christopher A. Choquette-Choo
Matthew Jagielski
G. Kaissis
...
Matthieu Meeus
Yves-Alexandre de Montjoye
Franziska Boenisch
Adam Dziedzic
A. Feder Cooper
58
1
0
24 May 2025
SVD-Free Low-Rank Adaptive Gradient Optimization for Large Language Models
Ionut-Vlad Modoranu
M. Safaryan
Erik Schultheis
Dan Alistarh
36
0
0
23 May 2025
Explaining Sources of Uncertainty in Automated Fact-Checking
Jingyi Sun
Greta Warren
Irina Shklovski
Isabelle Augenstein
65
1
0
23 May 2025
Two-Stage Regularization-Based Structured Pruning for LLMs
Mingkuan Feng
Jinyang Wu
Siyuan Liu
Shuai Zhang
Hongjian Fang
Ruihan Jin
Feihu Che
Pengpeng Shao
Zhengqi Wen
28
0
0
23 May 2025
NeuroTrails: Training with Dynamic Sparse Heads as the Key to Effective Ensembling
Bram Grooten
Farid Hasanov
Chenxiang Zhang
Q. Xiao
Boqian Wu
...
Shiwei Liu
L. Yin
Elena Mocanu
Mykola Pechenizkiy
Decebal Constantin Mocanu
60
0
0
23 May 2025
LatentLLM: Attention-Aware Joint Tensor Compression
T. Koike-Akino
Xiangyu Chen
Jing Liu
Ye Wang
Wang
Matthew Brand
29
0
0
23 May 2025
Graph-Linguistic Fusion: Using Language Models for Wikidata Vandalism Detection
Mykola Trokhymovych
Lydia Pintscher
R. Baeza-Yates
Diego Sáez-Trumper
KELM
22
0
0
23 May 2025
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
Xinran Gu
Kaifeng Lyu
Jiazheng Li
Jingzhao Zhang
83
0
0
23 May 2025
DataRater: Meta-Learned Dataset Curation
Dan A. Calian
Gregory Farquhar
Iurii Kemaev
Luisa M. Zintgraf
Matteo Hessel
...
András Gyorgy
Tom Schaul
Jeffrey Dean
Hado van Hasselt
David Silver
162
1
0
23 May 2025
How Knowledge Popularity Influences and Enhances LLM Knowledge Boundary Perception
Shiyu Ni
Keping Bi
Jiafeng Guo
Xueqi Cheng
39
0
0
23 May 2025
Mutarjim: Advancing Bidirectional Arabic-English Translation with a Small Language Model
Khalil Hennara
Muhammad Hreden
Mohamed Motaism Hamed
Zeina Aldallal
Sara Chrouf
Safwan AlModhayan
52
0
0
23 May 2025
PLUMAGE: Probabilistic Low rank Unbiased Min Variance Gradient Estimator for Efficient Large Model Training
Matan Haroush
Daniel Soudry
186
0
0
23 May 2025
Curriculum Guided Reinforcement Learning for Efficient Multi Hop Retrieval Augmented Generation
Yuelyu Ji
Rui Meng
Zhuochun Li
Daqing He
181
1
0
23 May 2025
LCD: Advancing Extreme Low-Bit Clustering for Large Language Models via Knowledge Distillation
Fangxin Liu
Ning Yang
Junping Zhao
Tao Yang
Haibing Guan
Li Jiang
MQ
31
0
0
23 May 2025
Hard Negative Mining for Domain-Specific Retrieval in Enterprise Systems
Hansa Meghwani
Amit Agarwal
Priyaranjan Pattnayak
Hitesh Laxmichand Patel
Srikant Panda
54
0
0
23 May 2025
Wasserstein Transfer Learning
Kaicheng Zhang
Sinian Zhang
Doudou Zhou
Yidong Zhou
17
0
0
23 May 2025
UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification
Poojah Ganesan
Rajat Aayush Jha
Dan Roth
Vivek Gupta
77
0
0
23 May 2025
Power-Law Decay Loss for Large Language Model Finetuning: A Theory Perspective
Jintian Shao
52
0
0
22 May 2025
Bootstrapping your behavior: a new pretraining strategy for user behavior sequence data
Weichang Wu
Xiaolu Zhang
Jun Zhou
Yuchen Li
Wenwen Xia
15
0
0
22 May 2025
Small-to-Large Generalization: Data Influences Models Consistently Across Scale
Alaa Khaddaj
Logan Engstrom
Aleksander Madry
TDI
AI4CE
74
0
0
22 May 2025
PaTH Attention: Position Encoding via Accumulating Householder Transformations
Songlin Yang
Yikang Shen
Kaiyue Wen
Shawn Tan
Mayank Mishra
Liliang Ren
Rameswar Panda
Yoon Kim
66
1
0
22 May 2025
Improving Chemical Understanding of LLMs via SMILES Parsing
Yunhui Jang
Jaehyung Kim
SungSoo Ahn
43
0
0
22 May 2025
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding
Feilong Tang
Chengzhi Liu
Zhongxing Xu
Ming Hu
Zelin Peng
...
Minquan Lin
Yifan Peng
Xuelian Cheng
Imran Razzak
Zongyuan Ge
72
1
0
22 May 2025