1910.10683
Cited By
v1
v2
v3
v4 (latest)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,870 papers shown
Title
TimesBERT: A BERT-Style Foundation Model for Time Series Understanding
Haoran Zhang
Yong Liu
Yunzhong Qiu
Haixuan Liu
Zhongyi Pei
Jianmin Wang
Mingsheng Long
AI4TS
68
1
0
28 Feb 2025
Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition
Yifei Duan
Raphael Shang
Deng Liang
Yongqiang Cai
128
0
0
28 Feb 2025
CoSMoEs: Compact Sparse Mixture of Experts
Patrick Huber
Akshat Shrivastava
Ernie Chang
Chinnadhurai Sankar
Ahmed Aly
Adithya Sagar
MoE
65
0
0
28 Feb 2025
UoR-NCL at SemEval-2025 Task 1: Using Generative LLMs and CLIP Models for Multilingual Multimodal Idiomaticity Representation
Thanet Markchom
Tong Wu
Liting Huang
Huizhi Liang
161
1
0
28 Feb 2025
Learning to Substitute Components for Compositional Generalization
Zechao Li
Gangwei Jiang
Chenwang Wu
Ying Wei
Defu Lian
Enhong Chen
114
0
0
28 Feb 2025
Multimodal Learning for Just-In-Time Software Defect Prediction in Autonomous Driving Systems
Faisal Mohammad
Duksan Ryu
92
0
0
28 Feb 2025
Teach-to-Reason with Scoring: Self-Explainable Rationale-Driven Multi-Trait Essay Scoring
Heejin Do
Sangwon Ryu
Gary Geunbae Lee
LRM
88
0
0
28 Feb 2025
Identifying Emerging Concepts in Large Corpora
Sibo Ma
Julian Nyarko
69
0
0
28 Feb 2025
Optimizing Large Language Models for ESG Activity Detection in Financial Texts
Mattia Birti
Francesco Osborne
Andrea Maurino
86
0
0
28 Feb 2025
Advancements in Natural Language Processing for Automatic Text Summarization
Nevidu Jayatilleke
Ruvan Weerasinghe
Nipuna Senanayake
370
1
0
27 Feb 2025
EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models
Che Hyun Lee
Heeseung Kim
Jiheum Yeom
Sungroh Yoon
DiffM
121
1
0
27 Feb 2025
FedMentalCare: Towards Privacy-Preserving Fine-Tuned LLMs to Analyze Mental Health Status Using Federated Learning Framework
S M Sarwar
AI4MH
75
1
0
27 Feb 2025
Probabilistic Federated Prompt-Tuning with Non-IID and Imbalanced Data
Pei-Yau Weng
Minh Hoang
Lam M. Nguyen
My T. Thai
Tsui-Wei Weng
T. Hoang
FedML
153
4
0
27 Feb 2025
XCOMPS: A Multilingual Benchmark of Conceptual Minimal Pairs
Linyang He
Ercong Nie
Sukru Samet Dindar
Arsalan Firoozi
Adrian Nicolas Florea
...
Haotian Ye
Jonathan R. Brennan
Helmut Schmid
Hinrich Schütze
Nima Mesgarani
107
1
0
27 Feb 2025
From Retrieval to Generation: Comparing Different Approaches
Abdelrahman Abdallah
Jamshid Mozafari
Bhawna Piryani
Mohammed Ali
Adam Jatowt
RALM
106
0
0
27 Feb 2025
DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models
Weihao Wu
Zhiwei Lin
Yixuan Zhou
Jingbei Li
Rui Niu
Qinghua Wu
Songjun Cao
Long Ma
Zhiyong Wu
DiffM
84
0
0
27 Feb 2025
HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration
Rohan Juneja
Shivam Aggarwal
Safeen Huda
Tulika Mitra
L. Peh
80
0
0
27 Feb 2025
Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation
Manveer Singh Tamber
Suleman Kazi
Vivek Sourabh
Jimmy J. Lin
115
2
0
27 Feb 2025
LiGT: Layout-infused Generative Transformer for Visual Question Answering on Vietnamese Receipts
Thanh-Phong Le
Trung Le Chi Phan
Nghia Hieu Nguyen
Kiet Van Nguyen
ViT
89
1
0
26 Feb 2025
LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM
Yehonathan Refael
Iftach Arbel
Ofir Lindenbaum
Tom Tirer
169
1
0
26 Feb 2025
CLLoRA: An Approach to Measure the Effects of the Context Length for LLM Fine-Tuning
Ping Zhang
Zhaorui Zhang
Sheng Di
Yao Xin
Benben Liu
101
2
0
26 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
152
2
0
26 Feb 2025
Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets
Tohida Rehman
Soumabha Ghosh
Kuntal Das
Souvik Bhattacharjee
Debarshi Kumar Sanyal
S. Chattopadhyay
121
0
0
26 Feb 2025
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Michael Y. Hu
Jackson Petty
Chuan Shi
William Merrill
Tal Linzen
AI4CE
135
2
0
26 Feb 2025
Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation
Yansen Wang
Xinnan Dai
Wenqi Fan
Yao Ma
144
2
0
26 Feb 2025
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Jiawei Huang
Bingcong Li
Christoph Dann
Niao He
OffRL
266
3
0
26 Feb 2025
CAMEx: Curvature-aware Merging of Experts
Dung V. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
R. Teo
T. Nguyen
Linh Duy Tran
MoMe
177
4
0
26 Feb 2025
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
140
4
0
26 Feb 2025
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura
Takuya Akiba
Kazuki Fujii
Yusuke Oda
Rio Yokota
Jun Suzuki
MoMe
MoE
134
2
0
26 Feb 2025
Deciphering the complaint aspects: Towards an aspect-based complaint identification model with video complaint dataset in finance
Sarmistha Das
Basha Mujavarsheik
R E Zera Lyngkhoi
Sriparna Saha
Alka Maurya
50
0
0
26 Feb 2025
Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in QA Agents
A. Lewis
Michael White
Jing Liu
T. Koike-Akino
K. Parsons
Yanjie Wang
HILM
167
0
0
26 Feb 2025
Towards Sustainable Web Agents: A Plea for Transparency and Dedicated Metrics for Energy Consumption
Lars Krupp
Daniel Geißler
P. Lukowicz
Jakob Karolus
LLMAG
130
0
0
25 Feb 2025
Data Augmentation for Instruction Following Policies via Trajectory Segmentation
Niklas Höpner
Ilaria Tiddi
H. V. Hoof
82
0
0
25 Feb 2025
Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Lucy Farnik
Tim Lawson
Conor Houghton
Laurence Aitchison
107
1
0
25 Feb 2025
Self-Adjust Softmax
Chuanyang Zheng
Yihang Gao
Guoxuan Chen
Han Shi
Jing Xiong
Xiaozhe Ren
Chao Huang
Xin Jiang
Zhiyu Li
Yu Li
81
1
0
25 Feb 2025
Compressing Language Models for Specialized Domains
Miles Williams
G. Chrysostomou
Vitor Jeronymo
Nikolaos Aletras
MQ
118
0
0
25 Feb 2025
Predicting Through Generation: Why Generation Is Better for Prediction
Md. Kowsher
Nusrat Jahan Prottasha
Prakash Bhat
Chun-Nam Yu
Mojtaba Soltanalian
Ivan Garibay
O. Garibay
Chen Chen
Niloofar Yousefi
AI4TS
244
1
0
25 Feb 2025
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
Yashan Wang
Shangda Wu
Jianhuai Hu
Xingjian Du
Yueqi Peng
Yongxin Huang
Shuai Fan
Xiaobing Li
Feng Yu
Maosong Sun
225
2
0
25 Feb 2025
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Pengzhi Li
Pengfei Yu
Zide Liu
Wei He
Xuhao Pan
Xudong Rao
Tao Wei
Wei Chen
VLM
157
0
0
25 Feb 2025
On Synthetic Data Strategies for Domain-Specific Generative Retrieval
Haoyang Wen
Jiang Guo
Yi Zhang
Jiarong Jiang
Ziyi Wang
SyDa
125
1
0
25 Feb 2025
Opus: A Workflow Intention Framework for Complex Workflow Generation
Phillip Kingston
Théo Fagnoni
Mahsun Altin
33
0
0
25 Feb 2025
Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification
Vasilii Feofanov
Songkang Wen
Marius Alonso
Romain Ilbert
Hongbo Guo
Malik Tiomoko
Lujia Pan
Jianfeng Zhang
I. Redko
AI4TS
VLM
137
4
0
24 Feb 2025
Encryption-Friendly LLM Architecture
Donghwan Rho
Taeseong Kim
Minje Park
Jung Woo Kim
Hyunsik Chae
Jung Hee Cheon
Ernest K. Ryu
230
6
0
24 Feb 2025
Corrections Meet Explanations: A Unified Framework for Explainable Grammatical Error Correction
Jingheng Ye
Shang Qin
Hai-Tao Zheng
Shen Wang
Qingsong Wen
109
0
0
24 Feb 2025
Model Lakes
Koyena Pal
David Bau
Renée J. Miller
176
2
0
24 Feb 2025
Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models
Anirudh Sundar
Sinead Williamson
Katherine Metcalf
B. Theobald
Skyler Seto
Masha Fedzechkina
LLMSV
135
1
0
24 Feb 2025
Optimizing Singular Spectrum for Large Language Model Compression
Dengjie Li
Tiancheng Shen
Yao Zhou
Baisong Yang
Zhongying Liu
Masheng Yang
Guohao Li
Yibo Yang
Yujie Zhong
Ming-Hsuan Yang
79
1
0
24 Feb 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Viktor Moskvoretskii
M. Lysyuk
Mikhail Salnikov
Nikolay Ivanov
Sergey Pletenev
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Irina Nikishina
Alexander Panchenko
RALM
146
7
0
24 Feb 2025
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
Liming Liu
Zhenghao Xu
Zixuan Zhang
Hao Kang
Zichong Li
Chen Liang
Weizhu Chen
T. Zhao
409
3
0
24 Feb 2025
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
83
0
0
24 Feb 2025