Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1910.10683
Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 8,765 papers shown
Title
Emergent Specialization: Rare Token Neurons in Language Models
Jing Liu
Haozheng Wang
Yueheng Li
MILM
LRM
9
0
0
19 May 2025
Enhancing Channel-Independent Time Series Forecasting via Cross-Variate Patch Embedding
Donghwa Shin
Edwin Zhang
AI4TS
7
0
0
19 May 2025
A3 : an Analytical Low-Rank Approximation Framework for Attention
Jeffrey T. H. Wong
Cheng Zhang
Xinye Cao
Pedro Gimenes
George A. Constantinides
Wayne Luk
Yiren Zhao
OffRL
MQ
2
0
0
19 May 2025
Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
Yunseok Jang
Yeda Song
Sungryull Sohn
Lajanugen Logeswaran
Tiange Luo
Dong-Ki Kim
Kyunghoon Bae
Honglak Lee
VGen
4
0
0
19 May 2025
SounDiT: Geo-Contextual Soundscape-to-Landscape Generation
Junbo Wang
Haofeng Tan
Bowen Liao
Albert Jiang
Teng Fei
Qixing Huang
Zhengzhong Tu
Shan Ye
Yuhao Kang
14
0
0
19 May 2025
Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning
Shaobo Wang
Ziming Wang
Xiangqi Jin
J. T. Wang
J. A. Zhang
...
Zichen Wen
Zhong Li
Zeang Sheng
Xuming Hu
Linfeng Zhang
SyDa
2
0
0
18 May 2025
Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection
Yuwei Zhang
W. Yu
Shangbin Feng
Yifan Zhu
Letian Peng
Jayanth Srinivasa
Gaowen Liu
Jingbo Shang
KELM
2
0
0
18 May 2025
Hyperbolic Residual Quantization: Discrete Representations for Data with Latent Hierarchies
Piotr Piękos
Subhradeep Kayal
Alexandros Karatzoglou
4
0
0
18 May 2025
One-for-All Pruning: A Universal Model for Customized Compression of Large Language Models
Rongguang Ye
Ming Tang
2
0
0
18 May 2025
Hyperspectral Image Land Cover Captioning Dataset for Vision Language Models
Aryan Das
Tanishq Rachamalla
Pravendra Singh
Koushik Biswas
Vinay Kumar Verma
Swalpa Kumar Roy
VLM
2
0
0
18 May 2025
HybridServe: Efficient Serving of Large AI Models with Confidence-Based Cascade Routing
Leyang Xue
Yao Fu
Luo Mai
Mahesh K. Marina
2
0
0
18 May 2025
HBO: Hierarchical Balancing Optimization for Fine-Tuning Large Language Models
Weixuan Wang
Minghao Wu
Barry Haddow
Alexandra Birch
2
0
0
18 May 2025
AltLoRA: Towards Better Gradient Approximation in Low-Rank Adaptation with Alternating Projections
Xin Yu
Yujia Wang
Jinghui Chen
Lingzhou Xue
7
0
0
18 May 2025
ChemPile: A 250GB Diverse and Curated Dataset for Chemical Foundation Models
Adrian Mirza
Nawaf Alampara
Martiño Ríos-García
Mohamed Abdelalim
Jack Butler
...
Mark Worrall
Adamo Young
Philippe Schwaller
Michael Pieler
Kevin Maik Jablonka
7
0
0
18 May 2025
A Survey of Attacks on Large Language Models
Wenrui Xu
Keshab K. Parhi
AAML
ELM
14
0
0
18 May 2025
AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research
Renqi Chen
Haoyang Su
Shixiang Tang
Zhenfei Yin
Qi Wu
Hui Li
Ye Sun
Nanqing Dong
Wanli Ouyang
Philip Torr
AI4CE
7
0
0
17 May 2025
Learning to Highlight Audio by Watching Movies
Chao Huang
Ruohan Gao
J. M. F. Tsang
Jan Kurcius
Cagdas Bilen
Chenliang Xu
Anurag Kumar
Sanjeel Parekh
VGen
4
0
0
17 May 2025
Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks
Giyeong Oh
Woohyun Cho
Siyeol Kim
Suhwan Choi
Younjae Yu
4
0
0
17 May 2025
Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation
Wenyu Huang
Pavlos Vougiouklis
Mirella Lapata
Jeff Z. Pan
2
0
0
16 May 2025
The Ripple Effect: On Unforeseen Complications of Backdoor Attacks
Rui Zhang
Yun Shen
Hongwei Li
Wenbo Jiang
Hanxiao Chen
Yuan Zhang
Guowen Xu
Yang Zhang
SILM
AAML
17
0
0
16 May 2025
Qronos: Correcting the Past by Shaping the Future... in Post-Training Quantization
Shihao Zhang
Haoyu Zhang
Ian Colbert
Rayan Saab
MQ
12
0
0
16 May 2025
GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction
Mohammadtaha Bagherifard
Sahar Rajabi
Ali Edalat
Yadollah Yaghoobzadeh
KELM
32
0
0
16 May 2025
Low-Resource Language Processing: An OCR-Driven Summarization and Translation Pipeline
Hrishit Madhavi
Jacob Cherian
Yuvraj Khamkar
Dhananjay Bhagat
VLM
12
0
0
16 May 2025
Gaussian Weight Sampling for Scalable, Efficient and Stable Pseudo-Quantization Training
Myeonghwan Ahn
Sungjoo Yoo
MQ
17
0
0
16 May 2025
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
Yifei He
Siqi Zeng
Yuzheng Hu
Rui Yang
Tong Zhang
Han Zhao
MoMe
ALM
24
0
0
16 May 2025
Conditioning Matters: Training Diffusion Policies is Faster Than You Think
Zibin Dong
Yicheng Liu
Yinchuan Li
Hang Zhao
Haifeng Zhang
19
0
0
16 May 2025
AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges
Ranjan Sapkota
Konstantinos I Roumeliotis
Manoj Karkee
AI4TS
24
0
0
15 May 2025
Superposition Yields Robust Neural Scaling
Yizhou Liu
Ziming Liu
Jeff Gore
MILM
24
0
0
15 May 2025
ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention
Jintian Shao
Hongyi Huang
Jiayi Wu
Beiwen Zhang
ZhiYu Wu
You Shan
MingKai Zheng
29
0
0
15 May 2025
MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation
Yanbo Ding
Xirui Hu
Zhizhi Guo
Yansen Wang
DiffM
VGen
36
0
0
15 May 2025
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
Bingda Tang
Boyang Zheng
Xichen Pan
Sayak Paul
Saining Xie
31
0
0
15 May 2025
Multi-Token Prediction Needs Registers
Anastasios Gerontopoulos
Spyros Gidaris
N. Komodakis
24
0
0
15 May 2025
From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models
Yidan Wang
Yubing Ren
Yanan Cao
Binxing Fang
30
0
0
15 May 2025
Task-Core Memory Management and Consolidation for Long-term Continual Learning
Tianyu Huai
Jie Zhou
Yuxuan Cai
Qin Chen
Wen Wu
Xingjiao Wu
Xipeng Qiu
Liang He
CLL
33
0
0
15 May 2025
Comparing LLM Text Annotation Skills: A Study on Human Rights Violations in Social Media Data
Poli A. Nemkova
S. Ubani
Mark V. Albert
AILaw
35
0
0
15 May 2025
Multilingual Machine Translation with Quantum Encoder Decoder Attention-based Convolutional Variational Circuits
Subrit Dikshit
Ritu Tiwari
Priyank Jain
24
0
0
14 May 2025
A 2D Semantic-Aware Position Encoding for Vision Transformers
Xi Chen
Shiyang Zhou
Muqi Huang
Jiaxu Feng
Yun Xiong
...
Yuyao Zhang
Huishuai Bao
Sijia Peng
Chong Li
Feng Shi
ViT
31
0
0
14 May 2025
MorphMark: Flexible Adaptive Watermarking for Large Language Models
Zongqi Wang
Tianle Gu
Baoyuan Wu
Yujiu Yang
WaLM
16
0
0
14 May 2025
Variational Prefix Tuning for Diverse and Accurate Code Summarization Using Pre-trained Language Models
Junda Zhao
Yuliang Song
Eldan Cohen
21
0
0
14 May 2025
Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?
Anthony GX-Chen
Dongyan Lin
Mandana Samiei
Doina Precup
Blake A. Richards
Rob Fergus
Kenneth Marino
CML
LRM
34
0
0
14 May 2025
ALOHA: Empowering Multilingual Agent for University Orientation with Hierarchical Retrieval
Mingxu Tao
Bowen Tang
Mingxuan Ma
Yining Zhang
Hourun Li
Feifan Wen
Hao Ma
Jia-Qi Yang
22
0
0
13 May 2025
Controllable Image Colorization with Instance-aware Texts and Masks
Yanru An
Ling Gui
Qiang Hu
Chunlei Cai
Tianxiao Ye
Xiaoyun Zhang
Yanfeng Wang
DiffM
39
0
0
13 May 2025
RepCali: High Efficient Fine-tuning Via Representation Calibration in Latent Space for Pre-trained Language Models
Fujun Zhang
Xiangdong Su
34
0
0
13 May 2025
Large Language Models Meet Stance Detection: A Survey of Tasks, Methods, Applications, Challenges and Future Directions
Lata Pangtey
Anukriti Bhatnagar
Shubhi Bansal
Shahid Shafi Dar
Nagendra Kumar
34
0
0
13 May 2025
Evaluating LLM Metrics Through Real-World Capabilities
Justin K Miller
Wenjia Tang
ELM
ALM
44
0
0
13 May 2025
Fast Text-to-Audio Generation with Adversarial Post-Training
Zachary Novack
Zach Evans
Zack Zukowski
Josiah Taylor
CJ Carr
...
Adnan Al-Sinan
Gian Marco Iodice
Julian McAuley
Taylor Berg-Kirkpatrick
Jordi Pons
30
0
0
13 May 2025
Lost in Transliteration: Bridging the Script Gap in Neural IR
Andreas Chari
Iadh Ounis
Sean MacAvaney
24
0
0
13 May 2025
A Comparative Analysis of Static Word Embeddings for Hungarian
Máté Gedeon
39
0
0
12 May 2025
Circuit Partitioning Using Large Language Models for Quantum Compilation and Simulations
Pranav Sinha
Sumit Kumar Jha
Sunny Raj
39
0
0
12 May 2025
Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model
Wei Li
Ming Hu
Guoan Wang
Lihao Liu
Kaijin Zhou
Junzhi Ning
Xin Guo
Zongyuan Ge
Lixu Gu
Junjun He
28
0
0
12 May 2025
1
2
3
4
...
174
175
176
Next