ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1910.10683
  4. Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
v1v2v3v4 (latest)

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat
ArXiv (abs)PDFHTML

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,870 papers shown
Title
TimesBERT: A BERT-Style Foundation Model for Time Series Understanding
TimesBERT: A BERT-Style Foundation Model for Time Series Understanding
Haoran Zhang
Yong Liu
Yunzhong Qiu
Haixuan Liu
Zhongyi Pei
Jianmin Wang
Mingsheng Long
AI4TS
68
1
0
28 Feb 2025
Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition
Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition
Yifei Duan
Raphael Shang
Deng Liang
Yongqiang Cai
128
0
0
28 Feb 2025
CoSMoEs: Compact Sparse Mixture of Experts
Patrick Huber
Akshat Shrivastava
Ernie Chang
Chinnadhurai Sankar
Ahmed Aly
Adithya Sagar
MoE
65
0
0
28 Feb 2025
UoR-NCL at SemEval-2025 Task 1: Using Generative LLMs and CLIP Models for Multilingual Multimodal Idiomaticity Representation
UoR-NCL at SemEval-2025 Task 1: Using Generative LLMs and CLIP Models for Multilingual Multimodal Idiomaticity Representation
Thanet Markchom
Tong Wu
Liting Huang
Huizhi Liang
161
1
0
28 Feb 2025
Learning to Substitute Components for Compositional Generalization
Learning to Substitute Components for Compositional Generalization
Zechao Li
Gangwei Jiang
Chenwang Wu
Ying Wei
Defu Lian
Enhong Chen
114
0
0
28 Feb 2025
Multimodal Learning for Just-In-Time Software Defect Prediction in Autonomous Driving Systems
Multimodal Learning for Just-In-Time Software Defect Prediction in Autonomous Driving Systems
Faisal Mohammad
Duksan Ryu
92
0
0
28 Feb 2025
Teach-to-Reason with Scoring: Self-Explainable Rationale-Driven Multi-Trait Essay Scoring
Teach-to-Reason with Scoring: Self-Explainable Rationale-Driven Multi-Trait Essay Scoring
Heejin Do
Sangwon Ryu
Gary Geunbae Lee
LRM
88
0
0
28 Feb 2025
Identifying Emerging Concepts in Large Corpora
Identifying Emerging Concepts in Large Corpora
Sibo Ma
Julian Nyarko
69
0
0
28 Feb 2025
Optimizing Large Language Models for ESG Activity Detection in Financial Texts
Optimizing Large Language Models for ESG Activity Detection in Financial Texts
Mattia Birti
Francesco Osborne
Andrea Maurino
86
0
0
28 Feb 2025
Advancements in Natural Language Processing for Automatic Text Summarization
Advancements in Natural Language Processing for Automatic Text Summarization
Nevidu Jayatilleke
Ruvan Weerasinghe
Nipuna Senanayake
370
1
0
27 Feb 2025
EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models
EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models
Che Hyun Lee
Heeseung Kim
Jiheum Yeom
Sungroh Yoon
DiffM
121
1
0
27 Feb 2025
FedMentalCare: Towards Privacy-Preserving Fine-Tuned LLMs to Analyze Mental Health Status Using Federated Learning Framework
S M Sarwar
AI4MH
75
1
0
27 Feb 2025
Probabilistic Federated Prompt-Tuning with Non-IID and Imbalanced Data
Probabilistic Federated Prompt-Tuning with Non-IID and Imbalanced Data
Pei-Yau Weng
Minh Hoang
Lam M. Nguyen
My T. Thai
Tsui-Wei Weng
T. Hoang
FedML
153
4
0
27 Feb 2025
XCOMPS: A Multilingual Benchmark of Conceptual Minimal Pairs
XCOMPS: A Multilingual Benchmark of Conceptual Minimal Pairs
Linyang He
Ercong Nie
Sukru Samet Dindar
Arsalan Firoozi
Adrian Nicolas Florea
...
Haotian Ye
Jonathan R. Brennan
Helmut Schmid
Hinrich Schütze
Nima Mesgarani
107
1
0
27 Feb 2025
From Retrieval to Generation: Comparing Different Approaches
From Retrieval to Generation: Comparing Different Approaches
Abdelrahman Abdallah
Jamshid Mozafari
Bhawna Piryani
Mohammed Ali
Adam Jatowt
RALM
106
0
0
27 Feb 2025
DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models
DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models
Weihao Wu
Zhiwei Lin
Yixuan Zhou
Jingbei Li
Rui Niu
Qinghua Wu
Songjun Cao
Long Ma
Zhiyong Wu
DiffM
84
0
0
27 Feb 2025
HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration
HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration
Rohan Juneja
Shivam Aggarwal
Safeen Huda
Tulika Mitra
L. Peh
80
0
0
27 Feb 2025
Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation
Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation
Manveer Singh Tamber
Suleman Kazi
Vivek Sourabh
Jimmy J. Lin
115
2
0
27 Feb 2025
LiGT: Layout-infused Generative Transformer for Visual Question Answering on Vietnamese Receipts
LiGT: Layout-infused Generative Transformer for Visual Question Answering on Vietnamese Receipts
Thanh-Phong Le
Trung Le Chi Phan
Nghia Hieu Nguyen
Kiet Van Nguyen
ViT
89
1
0
26 Feb 2025
LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM
LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM
Yehonathan Refael
Iftach Arbel
Ofir Lindenbaum
Tom Tirer
169
1
0
26 Feb 2025
CLLoRA: An Approach to Measure the Effects of the Context Length for LLM Fine-Tuning
CLLoRA: An Approach to Measure the Effects of the Context Length for LLM Fine-Tuning
Ping Zhang
Zhaorui Zhang
Sheng Di
Yao Xin
Benben Liu
101
2
0
26 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
152
2
0
26 Feb 2025
Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets
Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets
Tohida Rehman
Soumabha Ghosh
Kuntal Das
Souvik Bhattacharjee
Debarshi Kumar Sanyal
S. Chattopadhyay
121
0
0
26 Feb 2025
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Michael Y. Hu
Jackson Petty
Chuan Shi
William Merrill
Tal Linzen
AI4CE
135
2
0
26 Feb 2025
Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation
Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation
Yansen Wang
Xinnan Dai
Wenqi Fan
Yao Ma
144
2
0
26 Feb 2025
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Jiawei Huang
Bingcong Li
Christoph Dann
Niao He
OffRL
266
3
0
26 Feb 2025
CAMEx: Curvature-aware Merging of Experts
CAMEx: Curvature-aware Merging of Experts
Dung V. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
R. Teo
T. Nguyen
Linh Duy Tran
MoMe
177
4
0
26 Feb 2025
(Mis)Fitting: A Survey of Scaling Laws
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
140
4
0
26 Feb 2025
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
Taishi Nakamura
Takuya Akiba
Kazuki Fujii
Yusuke Oda
Rio Yokota
Jun Suzuki
MoMeMoE
134
2
0
26 Feb 2025
Deciphering the complaint aspects: Towards an aspect-based complaint identification model with video complaint dataset in finance
Sarmistha Das
Basha Mujavarsheik
R E Zera Lyngkhoi
Sriparna Saha
Alka Maurya
50
0
0
26 Feb 2025
Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in QA Agents
Winning Big with Small Models: Knowledge Distillation vs. Self-Training for Reducing Hallucination in QA Agents
A. Lewis
Michael White
Jing Liu
T. Koike-Akino
K. Parsons
Yanjie Wang
HILM
167
0
0
26 Feb 2025
Towards Sustainable Web Agents: A Plea for Transparency and Dedicated Metrics for Energy Consumption
Towards Sustainable Web Agents: A Plea for Transparency and Dedicated Metrics for Energy Consumption
Lars Krupp
Daniel Geißler
P. Lukowicz
Jakob Karolus
LLMAG
130
0
0
25 Feb 2025
Data Augmentation for Instruction Following Policies via Trajectory Segmentation
Niklas Höpner
Ilaria Tiddi
H. V. Hoof
82
0
0
25 Feb 2025
Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations
Lucy Farnik
Tim Lawson
Conor Houghton
Laurence Aitchison
107
1
0
25 Feb 2025
Self-Adjust Softmax
Self-Adjust Softmax
Chuanyang Zheng
Yihang Gao
Guoxuan Chen
Han Shi
Jing Xiong
Xiaozhe Ren
Chao Huang
Xin Jiang
Zhiyu Li
Yu Li
81
1
0
25 Feb 2025
Compressing Language Models for Specialized Domains
Compressing Language Models for Specialized Domains
Miles Williams
G. Chrysostomou
Vitor Jeronymo
Nikolaos Aletras
MQ
118
0
0
25 Feb 2025
Predicting Through Generation: Why Generation Is Better for Prediction
Predicting Through Generation: Why Generation Is Better for Prediction
Md. Kowsher
Nusrat Jahan Prottasha
Prakash Bhat
Chun-Nam Yu
Mojtaba Soltanalian
Ivan Garibay
O. Garibay
Chen Chen
Niloofar Yousefi
AI4TS
244
1
0
25 Feb 2025
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
Yashan Wang
Shangda Wu
Jianhuai Hu
Xingjian Du
Yueqi Peng
Yongxin Huang
Shuai Fan
Xiaobing Li
Feng Yu
Maosong Sun
225
2
0
25 Feb 2025
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Pengzhi Li
Pengfei Yu
Zide Liu
Wei He
Xuhao Pan
Xudong Rao
Tao Wei
Wei Chen
VLM
157
0
0
25 Feb 2025
On Synthetic Data Strategies for Domain-Specific Generative Retrieval
On Synthetic Data Strategies for Domain-Specific Generative Retrieval
Haoyang Wen
Jiang Guo
Yi Zhang
Jiarong Jiang
Ziyi Wang
SyDa
125
1
0
25 Feb 2025
Opus: A Workflow Intention Framework for Complex Workflow Generation
Opus: A Workflow Intention Framework for Complex Workflow Generation
Phillip Kingston
Théo Fagnoni
Mahsun Altin
33
0
0
25 Feb 2025
Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification
Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification
Vasilii Feofanov
Songkang Wen
Marius Alonso
Romain Ilbert
Hongbo Guo
Malik Tiomoko
Lujia Pan
Jianfeng Zhang
I. Redko
AI4TSVLM
137
4
0
24 Feb 2025
Encryption-Friendly LLM Architecture
Encryption-Friendly LLM Architecture
Donghwan Rho
Taeseong Kim
Minje Park
Jung Woo Kim
Hyunsik Chae
Jung Hee Cheon
Ernest K. Ryu
230
6
0
24 Feb 2025
Corrections Meet Explanations: A Unified Framework for Explainable Grammatical Error Correction
Corrections Meet Explanations: A Unified Framework for Explainable Grammatical Error Correction
Jingheng Ye
Shang Qin
Hai-Tao Zheng
Hai-Tao Zheng
Shen Wang
Qingsong Wen
109
0
0
24 Feb 2025
Model Lakes
Model Lakes
Koyena Pal
David Bau
Renée J. Miller
176
2
0
24 Feb 2025
Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models
Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models
Anirudh Sundar
Sinead Williamson
Katherine Metcalf
B. Theobald
Skyler Seto
Masha Fedzechkina
LLMSV
135
1
0
24 Feb 2025
Optimizing Singular Spectrum for Large Language Model Compression
Dengjie Li
Tiancheng Shen
Yao Zhou
Baisong Yang
Zhongying Liu
Masheng Yang
Guohao Li
Yibo Yang
Yujie Zhong
Ming-Hsuan Yang
79
1
0
24 Feb 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Viktor Moskvoretskii
M. Lysyuk
Mikhail Salnikov
Nikolay Ivanov
Sergey Pletenev
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Irina Nikishina
Alexander Panchenko
RALM
146
7
0
24 Feb 2025
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
Liming Liu
Zhenghao Xu
Zixuan Zhang
Hao Kang
Zichong Li
Chen Liang
Weizhu Chen
T. Zhao
409
3
0
24 Feb 2025
The Role of Sparsity for Length Generalization in Transformers
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
83
0
0
24 Feb 2025
Previous
123...181920...196197198
Next