1910.10683
Cited By
v1
v2
v3
v4 (latest)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,844 papers shown
Title
Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models
Ben Finkelshtein
İsmail İlkan Ceylan
Michael M. Bronstein
Ron Levie
21
0
0
01 Jul 2025
Progressive Binarization with Semi-Structured Pruning for LLMs
Xinyu Yan
Tianao Zhang
Zhiteng Li
Yulun Zhang
MQ
138
1
0
01 Jul 2025
A Pre-trained Sequential Recommendation Framework: Popularity Dynamics for Zero-shot Transfer
Junting Wang
Praneet Rathi
Hari Sundaram
HAI
VLM
34
5
0
01 Jul 2025
Emergent Temporal Correspondences from Video Diffusion Transformers
Jisu Nam
Soowon Son
Dahyun Chung
Jiyoung Kim
Siyoon Jin
Junhwa Hur
Seungryong Kim
VGen
23
0
0
20 Jun 2025
Private Training & Data Generation by Clustering Embeddings
Felix Y. Zhou
Samson Zhou
Vahab Mirrokni
Alessandro Epasto
Vincent Cohen-Addad
20
0
0
20 Jun 2025
Language Bottleneck Models: A Framework for Interpretable Knowledge Tracing and Beyond
Antonin Berthon
M. Schaar
12
0
0
20 Jun 2025
Mechanisms vs. Outcomes: Probing for Syntax Fails to Explain Performance on Targeted Syntactic Evaluations
Ananth Agarwal
Jasper Jian
Christopher D. Manning
Shikhar Murty
12
0
0
20 Jun 2025
A Minimalist Optimizer Design for LLM Pretraining
Athanasios Glentis
Jiaxiang Li
Andi Han
Mingyi Hong
14
0
0
20 Jun 2025
Better Language Model Inversion by Compactly Representing Next-Token Distributions
Murtaza Nazir
Matthew Finlayson
John X. Morris
Xiang Ren
Swabha Swayamdipta
12
0
0
20 Jun 2025
GeoGuess: Multimodal Reasoning based on Hierarchy of Visual Information in Street View
Fenghua Cheng
Jinxiang Wang
Sen Wang
Zi Huang
Xue Li
LRM
19
0
0
19 Jun 2025
Comparative Analysis of Abstractive Summarization Models for Clinical Radiology Reports
Anindita Bhattacharya
Tohida Rehman
Debarshi Kumar Sanyal
S. Chattopadhyay
15
0
0
19 Jun 2025
Analyzing the Influence of Knowledge Graph Information on Relation Extraction
Cedric Moller
Ricardo Usbeck
7
0
0
19 Jun 2025
DISCIE -- Discriminative Closed Information Extraction
Cedric Moller
Ricardo Usbeck
17
0
0
19 Jun 2025
Revela: Dense Retriever Learning via Language Modeling
Fengyu Cai
Tong Chen
Xinran Zhao
Sihao Chen
Hongming Zhang
Sherry Tongshuang Wu
Iryna Gurevych
Heinz Koeppl
RALM
VLM
16
0
0
19 Jun 2025
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Zhiyuan Liang
Dongwen Tang
Yuhao Zhou
Xuanlei Zhao
Mingjia Shi
...
Damian Borth
Michael M. Bronstein
Yang You
Zhangyang Wang
Kai Wang
OffRL
17
0
0
19 Jun 2025
Relational Deep Learning: Challenges, Foundations and Next-Generation Architectures
Vijay Prakash Dwivedi
Charilaos I. Kanatsoulis
Shenyang Huang
Jure Leskovec
GNN
3DV
35
0
0
19 Jun 2025
DiscRec: Disentangled Semantic-Collaborative Modeling for Generative Recommendation
Chang Liu
Yimeng Bai
Xiaoyan Zhao
Yang Zhang
Fuli Feng
Wenge Rong
40
0
0
18 Jun 2025
SecFwT: Efficient Privacy-Preserving Fine-Tuning of Large Language Models Using Forward-Only Passes
Jinglong Luo
Zhuo Zhang
Yehong Zhang
Shiyu Liu
Ye Dong
Xun Zhou
Hui Wang
Yue Yu
Zenglin Xu
12
0
0
18 Jun 2025
A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals
Andrea Cadeddu
Alessandro Chessa
Vincenzo De Leo
Gianni Fenu
Enrico Motta
Francesco Osborne
Diego Reforgiato Recupero
Angelo Salatino
Luca Secchi
14
0
0
18 Jun 2025
PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning
Yuhui Shi
Yehan Yang
Qiang Sheng
Hao Mi
Beizhe Hu
Chaoxi Xu
Juan Cao
DeLMO
59
0
0
18 Jun 2025
Cohort Discovery: A Survey on LLM-Assisted Clinical Trial Recruitment
Shrestha Ghosh
Moritz Schneider
Carina Reinicke
Carsten Eickhoff
12
0
0
18 Jun 2025
Demystifying the Visual Quality Paradox in Multimodal Large Language Models
Shuo Xing
Lanqing Guo
Hongyuan Hua
Seoyoung Lee
Peiran Li
Yufei Wang
Zhangyang Wang
Zhengzhong Tu
VLM
36
0
0
18 Jun 2025
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model
Anirud Aggarwal
Abhinav Shrivastava
M. Gwilliam
50
0
0
18 Jun 2025
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
Byung-Kwan Lee
Ryo Hachiuma
Yong Man Ro
Yu-Chun Wang
Yueh-Hua Wu
VLM
35
0
0
18 Jun 2025
Dense SAE Latents Are Features, Not Bugs
Xiaoqing Sun
Alessandro Stolfo
Joshua Engels
Ben Wu
Senthooran Rajamanoharan
Mrinmaya Sachan
Max Tegmark
57
0
0
18 Jun 2025
Managing Complex Failure Analysis Workflows with LLM-based Reasoning and Acting Agents
Aline Dobrovsky
Konstantin Schekotihin
Christian Burmer
LLMAG
20
0
0
18 Jun 2025
Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
Laura Kopf
Nils Feldhus
Kirill Bykov
P. Bommer
Anna Hedström
Marina M.-C. Höhne
Oliver Eberle
19
0
0
18 Jun 2025
AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs
Di He
Ajay Jaiswal
Songjun Tu
Li Shen
Ganzhao Yuan
Shiwei Liu
L. Yin
34
0
0
17 Jun 2025
Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models
Chenchen Yuan
Zheyu Zhang
Shuo Yang
Bardh Prenkaj
Gjergji Kasneci
26
0
0
17 Jun 2025
Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
Zhongzheng Qiao
Chenghao Liu
Y. Zhang
Ming Jin
Quang Pham
Qingsong Wen
P.N. Suganthan
Xudong Jiang
Savitha Ramasamy
AI4TS
AI4CE
17
0
0
17 Jun 2025
Adaptive Accompaniment with ReaLchords
Yusong Wu
Tim Cooijmans
Kyle Kastner
Adam Roberts
Ian Simon
...
Shayegan Omidshafiei
Aaron Courville
Pablo Samuel Castro
Natasha Jaques
Cheng-Zhi Anna Huang
19
0
0
17 Jun 2025
DisProtEdit: Exploring Disentangled Representations for Multi-Attribute Protein Editing
Max Ku
S. Sun
Hongyu Guo
Wenhu Chen
12
0
0
17 Jun 2025
Toward Rich Video Human-Motion2D Generation
Ruihao Xi
Xuekuan Wang
Yongcheng Li
Shuhua Li
Zichen Wang
Yiwei Wang
Feng Wei
Cairong Zhao
VGen
19
0
0
17 Jun 2025
Don't throw the baby out with the bathwater: How and why deep learning for ARC
Jack Cole
Mohamed Osman
LRM
38
0
0
17 Jun 2025
Fretting-Transformer: Encoder-Decoder Model for MIDI to Tablature Transcription
Anna Hamberger
Sebastian Murgul
Jochen Schmidt
Michael Heizmann
21
0
0
17 Jun 2025
Essential-Web v1.0: 24T tokens of organized web data
Essential AI
Andrew Hojel
Michael Pust
Tim Romanski
Yash Vanjani
...
Platon Mazarakis
Saad Jamal
Saurabh Srivastava
Somanshu Singla
Ashish Vaswani
20
0
0
17 Jun 2025
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
Zhongqian Fu
Ning Ding
Kai Han
Xianzhi Yu
Xiaosong Li
Xinghao Chen
Yehui Tang
Yunhe Wang
MQ
MoE
20
0
0
16 Jun 2025
Rectifying Privacy and Efficacy Measurements in Machine Unlearning: A New Inference Attack Perspective
Nima Naderloui
Shenao Yan
Binghui Wang
Jie Fu
Wendy Hui Wang
Weiran Liu
Yuan Hong
AAML
25
0
0
16 Jun 2025
CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model
Jiangtong Li
Yiyun Zhu
Dawei Cheng
Zhijun Ding
Changjun Jiang
25
0
0
16 Jun 2025
Antibody Foundational Model: Ab-RoBERTa
Eunna Huh
Hyeonsu Lee
Hyunjin Shin
12
0
0
16 Jun 2025
Equitable Electronic Health Record Prediction with FAME: Fairness-Aware Multimodal Embedding
Nikkie Hooman
Zhongjie Wu
Eric C. Larson
Mehak Gupta
25
0
0
16 Jun 2025
SPOT: Bridging Natural Language and Geospatial Search for Investigative Journalists
Lynn Khellaf
Ipek Baris Schlicht
Tilman Mirass
Julia Bayer
Tilman Wagner
Ruben Bouwmeester
9
0
0
16 Jun 2025
ASMR: Augmenting Life Scenario using Large Generative Models for Robotic Action Reflection
Shang-Chi Tsai
Seiya Kawano
Angel García Contreras
Koichiro Yoshino
Yun-Nung Chen
LM&Ro
20
2
0
16 Jun 2025
Flexible-length Text Infilling for Discrete Diffusion Models
Andrew Zhang
Anushka Sivakumar
Chiawei Tang
Chris Thomas
DiffM
16
0
0
16 Jun 2025
Few-Shot Learning for Industrial Time Series: A Comparative Analysis Using the Example of Screw-Fastening Process Monitoring
X. Tu
Haocheng Zhang
Tao Chengxu
Zuyi Chen
AI4TS
17
0
0
16 Jun 2025
Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs
Sayed Mohammad Vakilzadeh Hatefi
Maximilian Dreyer
Reduan Achtibat
Patrick Kahardipraja
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
28
0
0
16 Jun 2025
BOW: Bottlenecked Next Word Exploration
Ming Shen
Zhikun Xu
Xiao Ye
Jacob Dineen
Ben Zhou
OffRL
LRM
28
0
0
16 Jun 2025
Watermarking LLM-Generated Datasets in Downstream Tasks
Y. Liu
Tianshuo Cong
Michael Backes
Zheng Li
Yang Zhang
WaLM
36
0
0
16 Jun 2025
ArgHiTZ at ArchEHR-QA 2025: A Two-Step Divide and Conquer Approach to Patient Question Answering for Top Factuality
Adrián Cuadrón
Aimar Sagasti
Maitane Urruela
Iker de la Iglesia
Ane G Domingo-Aldama
Aitziber Atutxa
Josu Goikoetxea
Ander Barrena
19
0
0
15 Jun 2025
EraserDiT: Fast Video Inpainting with Diffusion Transformer Model
Jie Liu
Zheng Hui
DiffM
VGen
23
0
0
15 Jun 2025