1910.10683
v4 (latest)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,870 papers shown
Title
Enhancing Time Series Forecasting via Multi-Level Text Alignment with LLMs
Taibiao Zhao
Xiaobing Chen
Mingxuan Sun
AI4TS
120
1
0
10 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
Hongru Wang
Jie Cao
Huaibo Huang
Ran He
DiffM
114
0
0
10 Apr 2025
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt
Aaron Mueller
Leshem Choshen
E. Wilcox
Chengxu Zhuang
...
Rafael Mosquera
Bhargavi Paranjape
Adina Williams
Tal Linzen
Ryan Cotterell
200
121
0
10 Apr 2025
Between Linear and Sinusoidal: Rethinking the Time Encoder in Dynamic Graph Learning
Hsing-Huan Chung
Shravan Chaudhari
Xing Han
Yoav Wald
Suchi Saria
Joydeep Ghosh
AI4TS
76
0
0
10 Apr 2025
SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding
Yangliu Hu
Zikai Song
Na Feng
Yawei Luo
Junqing Yu
Yi-Ping Phoebe Chen
Wei Yang
71
2
0
10 Apr 2025
GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable
Jianqiao Wangni
45
0
0
10 Apr 2025
NLP Security and Ethics, in the Wild
Heather Lent
Erick Galinkin
Yiyi Chen
Jens Myrup Pedersen
Leon Derczynski
Johannes Bjerva
SILM
135
0
0
09 Apr 2025
EDIT: Enhancing Vision Transformers by Mitigating Attention Sink through an Encoder-Decoder Architecture
Wenfeng Feng
Guoying Sun
83
0
0
09 Apr 2025
Defending LLM Watermarking Against Spoofing Attacks with Contrastive Representation Learning
Li An
Yujian Liu
Y. Liu
Yang Zhang
Yuheng Bu
Shiyu Chang
AAML
407
1
0
09 Apr 2025
RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts
Natalia Loukachevitch
Natalia Tkachenko
Anna Lapanitsyna
M. Tikhomirov
Nicolay Rusnachenko
85
0
0
09 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
161
0
0
09 Apr 2025
Modeling Response Consistency in Multi-Agent LLM Systems: A Comparative Analysis of Shared and Separate Context Approaches
Tooraj Helmi
LLMAG
35
1
0
09 Apr 2025
A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty
Xiaohua Feng
Yuyuan Li
C. Wang
Junlin Liu
Lulu Zhang
Chaochao Chen
MU
57
0
0
09 Apr 2025
Lugha-Llama: Adapting Large Language Models for African Languages
Happy Buzaaba
Alexander Wettig
David Ifeoluwa Adelani
Christiane Fellbaum
94
0
0
09 Apr 2025
PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering
Y. Gao
Zihang Lin
Chuanbin Liu
Min Zhou
T. Ge
Bo Zheng
Hongtao Xie
DiffM
114
5
0
09 Apr 2025
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi
Monojit Choudhury
Shivam Chauhan
Rocktim Jyoti Das
Dhruv Sahnan
Xudong Han
...
Rituraj Joshi
Gurpreet Gosal
Avraham Sheinin
Natalia Vassilieva
Preslav Nakov
99
1
0
08 Apr 2025
From Superficial to Deep: Integrating External Knowledge for Follow-up Question Generation Using Knowledge Graph and LLM
Jianyu Liu
Yi Huang
Sheng Bi
Junlan Feng
Guilin Qi
137
2
0
08 Apr 2025
GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization
Bojana Ranković
P. Schwaller
BDL
483
1
0
08 Apr 2025
Multi-Sense Embeddings for Language Models and Knowledge Distillation
Qitong Wang
Mohammed J. Zaki
Georgios Kollias
Vasileios Kalantzis
KELM
60
1
0
08 Apr 2025
The Zero Body Problem: Probing LLM Use of Sensory Language
Rebecca M. M. Hicke
Sil Hamilton
David M. Mimno
92
0
0
08 Apr 2025
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Biao Zhang
Fedor Moiseev
Joshua Ainslie
Paul Suganthan
Min Ma
Surya Bhupatiraju
Fede Lebron
Orhan Firat
Armand Joulin
Zhe Dong
AI4CE
44
0
0
08 Apr 2025
Query Understanding in LLM-based Conversational Information Seeking
Yifei Yuan
Zahra Abbasiantaeb
Yang Deng
Mohammad Aliannejadi
73
1
0
08 Apr 2025
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
Dongyang Fan
Vinko Sabolčec
Matin Ansaripour
Ayush Kumar Tarun
Martin Jaggi
Antoine Bosselut
Imanol Schlag
63
1
0
08 Apr 2025
Speech-to-Trajectory: Learning Human-Like Verbal Guidance for Robot Motion
Eran Bamani
Eden Nissinman
Rotem Atari
Nevo Heimann Saadon
A. Sintov
422
0
0
07 Apr 2025
AccLLM: Accelerating Long-Context LLM Inference Via Algorithm-Hardware Co-Design
Yanbiao Liang
Huihong Shi
Haikuo Shao
Zhongfeng Wang
94
0
0
07 Apr 2025
Bidirectional Hierarchical Protein Multi-Modal Representation Learning
Xuefeng Liu
Songhao Jiang
Chih-chan Tien
Jinfeng Xu
Rick L. Stevens
56
0
0
07 Apr 2025
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models
Ruikang Liu
Yuxuan Sun
Manyi Zhang
Haoli Bai
Xianzhi Yu
Tiezheng Yu
C. Yuan
Lu Hou
MQ
LRM
126
11
0
07 Apr 2025
Achieving binary weight and activation for LLMs using Post-Training Quantization
Siqing Song
Chuang Wang
Ruiqi Wang
Yi Yang
Xuyao Zhang
MQ
132
0
0
07 Apr 2025
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
Will Cai
Tianneng Shi
Xuandong Zhao
Dawn Song
78
6
0
07 Apr 2025
Pre-training Generative Recommender with Multi-Identifier Item Tokenization
Bowen Zheng
Enze Liu
Zhongfu Chen
Zhongrui Ma
Yue Wang
Wayne Xin Zhao
Ji-Rong Wen
164
0
0
06 Apr 2025
Universal Item Tokenization for Transferable Generative Recommendation
Bowen Zheng
Hongyu Lu
Yu Chen
Wayne Xin Zhao
Ji-Rong Wen
138
0
0
06 Apr 2025
Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression
Ivan Ilin
Peter Richtárik
50
0
0
06 Apr 2025
Activation Patching for Interpretable Steering in Music Generation
Simone Facchiano
Giorgio Strano
Donato Crisostomi
Irene Tallini
Tommaso Mencattini
Fabio Galasso
Emanuele Rodolà
LLMSV
67
1
0
06 Apr 2025
On the Spatial Structure of Mixture-of-Experts in Transformers
Daniel Bershatsky
Ivan Oseledets
MoE
75
0
0
06 Apr 2025
Enabling Collaborative Parametric Knowledge Calibration for Retrieval-Augmented Vision Question Answering
Jiaqi Deng
Kaize Shi
Zonghan Wu
Huan Huo
Dingxian Wang
Guandong Xu
42
0
0
05 Apr 2025
A Comprehensive Survey of Challenges and Opportunities of Few-Shot Learning Across Multiple Domains
Andrea Gajic
Sudip Vhaduri
OOD
VLM
123
0
0
05 Apr 2025
Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models
Yuheng Wu
Wentao Guo
Zirui Liu
Heng Ji
Zhaozhuo Xu
Denghui Zhang
84
0
0
05 Apr 2025
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov
Di Chang
Minh Tran
Hongkun Gong
Ashutosh Chaubey
Mohammad Soleymani
DiffM
VGen
112
0
0
05 Apr 2025
Investigating and Mitigating Stereotype-aware Unfairness in LLM-based Recommendations
Zihuai Zhao
Wenqi Fan
Yao Wu
Qing Li
143
1
0
05 Apr 2025
Sigma: A dataset for text-to-code semantic parsing with statistical analysis
Saleh Almohaimeed
Shenyang Liu
May Alsofyani
Saad Almohaimeed
Liqiang Wang
115
0
0
05 Apr 2025
Transformer representation learning is necessary for dynamic multi-modal physiological data on small-cohort patients
Bingxu Wang
Kunzhi Cai
Yuqi Zhang
Yachong Guo
Zeyi Zhou
Wenjiao Li
Yachong Guo
Wei Wang
Qing Zhou
MedIm
107
0
0
05 Apr 2025
Foundation Models for Time Series: A Survey
Siva Rama Krishna Kottapalli
Karthik Hubli
Sandeep Chandrashekhara
Garima Jain
Sunayana Hubli
Gayathri Botla
Ramesh Doddaiah
AI4TS
AI4CE
113
0
0
05 Apr 2025
STEP: Staged Parameter-Efficient Pre-training for Large Language Models
Kazuki Yano
Takumi Ito
Jun Suzuki
LRM
132
1
0
05 Apr 2025
Entropy-Based Block Pruning for Efficient Large Language Models
Liangwei Yang
Yuhui Xu
Juntao Tan
Doyen Sahoo
Siyang Song
Caiming Xiong
Han Wang
Shelby Heinecke
AAML
64
0
0
04 Apr 2025
Safe Screening Rules for Group OWL Models
Runxue Bao
Quanchao Lu
Yanfu Zhang
110
0
0
04 Apr 2025
MegaMath: Pushing the Limits of Open Math Corpora
Fan Zhou
Zengzhi Wang
Nikhil Ranjan
Zhoujun Cheng
Liping Tang
Guowei He
Zhengzhong Liu
Eric P. Xing
LRM
135
3
0
03 Apr 2025
Efficient Model Editing with Task-Localized Sparse Fine-tuning
Leonardo Iurada
Marco Ciccone
Tatiana Tommasi
KELM
MoMe
148
3
0
03 Apr 2025
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence
Hao Li
Hao Fei
Zechao Hu
Zhengwei Yang
Zheng Wang
64
1
0
03 Apr 2025
Leveraging LLM For Synchronizing Information Across Multilingual Tables
Siddharth Khincha
Tushar Kataria
Ankita Anand
Dan Roth
Vivek Gupta
137
0
0
03 Apr 2025
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration
Yuhang Li
Ruokai Yin
Donghyun Lee
Shiting Xiao
Priyadarshini Panda
MQ
124
0
0
03 Apr 2025