Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,843 papers shown
Title
Towards Universal Offline Black-Box Optimization via Learning Language Model Embeddings
Rong-Xi Tan
Ming Chen
Ke Xue
Yao Wang
Yaoyuan Wang
Sheng Fu
Chao Qian
OffRL
15
0
0
08 Jun 2025
ConfQA: Answer Only If You Are Confident
Yin Huang
Yifan Ethan Xu
Kai Sun
Vera Yan
Alicia Sun
...
Yue Liu
Aaron Colak
Anuj Kumar
Wen-tau Yih
Xin Luna Dong
HILM
13
0
0
08 Jun 2025
DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains
Zhihui Chen
Kai He
Yucheng Huang
Yunxiao Zhu
Mengling Feng
DeLMO
MedIm
19
0
0
07 Jun 2025
EndoARSS: Adapting Spatially-Aware Foundation Model for Efficient Activity Recognition and Semantic Segmentation in Endoscopic Surgery
Guankun Wang
Rui Tang
Mengya Xu
Long Bai
Huxin Gao
Hongliang Ren
23
0
0
07 Jun 2025
Hybrid Extractive Abstractive Summarization for Multilingual Sentiment Analysis
Mikhail Krasitskii
Grigori Sidorov
Olga Kolesnikova
Liliana Chanona Hernandez
Alexander Gelbukh
13
0
0
07 Jun 2025
Spark Transformer: Reactivating Sparsity in FFN and Attention
Chong You
Kan Wu
Zhipeng Jia
Lin Chen
Srinadh Bhojanapalli
...
Felix X. Yu
Prateek Jain
David Culler
Henry M. Levy
Sanjiv Kumar
19
0
0
07 Jun 2025
Can In-Context Reinforcement Learning Recover From Reward Poisoning Attacks?
Paulius Sasnauskas
Yiğit Yalın
Goran Radanović
15
0
0
07 Jun 2025
SAFE: Finding Sparse and Flat Minima to Improve Pruning
Dongyeop Lee
Kwanhee Lee
Jinseok Chung
Namhoon Lee
28
0
0
07 Jun 2025
Advancing Question Generation with Joint Narrative and Difficulty Control
Bernardo Leite
Henrique Lopes Cardoso
15
0
0
07 Jun 2025
Label-semantics Aware Generative Approach for Domain-Agnostic Multilabel Classification
Subhendu Khatuya
Shashwat Naidu
Saptarshi Ghosh
Pawan Goyal
Niloy Ganguly
VLM
22
0
0
07 Jun 2025
Mitigating Catastrophic Forgetting with Adaptive Transformer Block Expansion in Federated Fine-Tuning
Yujia Huo
Jianchun Liu
Hongli Xu
Zhenguo Ma
Shilong Wang
Liusheng Huang
CLL
43
0
0
06 Jun 2025
Zero-Shot Detection of LLM-Generated Code via Approximated Task Conditioning
Maor Ashkenazi
Ofir Brenner
Tal Furman Shohet
Eran Treister
48
0
0
06 Jun 2025
Hey, That's My Data! Label-Only Dataset Inference in Large Language Models
Chen Xiong
Zihao Wang
Rui Zhu
Tsung-Yi Ho
Pin-Yu Chen
Jingwei Xiong
Haixu Tang
Lucila Ohno-Machado
52
0
0
06 Jun 2025
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Jiatao Gu
Tianrong Chen
David Berthelot
Huangjie Zheng
Yuyang Wang
Ruixiang Zhang
Laurent Dinh
Miguel Angel Bautista
Josh Susskind
Shuangfei Zhai
43
0
0
06 Jun 2025
Contextually Guided Transformers via Low-Rank Adaptation
A. Zhmoginov
Jihwan Lee
Max Vladymyrov
Mark Sandler
OffRL
55
0
0
06 Jun 2025
Large Language Models are Good Relational Learners
Fang Wu
Vijay Prakash Dwivedi
Jure Leskovec
38
0
0
06 Jun 2025
(LiFT) Lightweight Fitness Transformer: A language-vision model for Remote Monitoring of Physical Training
A. Postlmayr
P. Cosman
S. Dey
20
0
0
06 Jun 2025
Text-to-LoRA: Instant Transformer Adaption
Rujikorn Charakorn
Edoardo Cetin
Yujin Tang
Robert Tjarko Lange
AI4CE
51
0
0
06 Jun 2025
Projectable Models: One-Shot Generation of Small Specialized Transformers from Large Ones
A. Zhmoginov
Jihwan Lee
Mark Sandler
37
0
0
06 Jun 2025
Generating Grounded Responses to Counter Misinformation via Learning Efficient Fine-Grained Critiques
Xiaofei Xu
Xiuzhen Zhang
Ke Deng
HILM
39
0
0
06 Jun 2025
BAQ: Efficient Bit Allocation Quantization for Large Language Models
Chao Zhang
Li Wang
S. Lasaulce
Mérouane Debbah
MQ
62
0
0
06 Jun 2025
Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques
Adarsh Prasad Behera
J. Champati
Roberto Morabito
Sasu Tarkoma
J. Gross
21
0
0
06 Jun 2025
SoK: Are Watermarks in LLMs Ready for Deployment?
Kieu Dang
Phung Lai
Nhathai Phan
Yelong Shen
Ruoming Jin
Abdallah Khreishah
My T. Thai
27
0
0
05 Jun 2025
Seamless Dysfluent Speech Text Alignment for Disordered Speech Analysis
Zongli Ye
Jiachen Lian
Xuanru Zhou
Jinming Zhang
Haodong Li
...
Rian Bogley
Lisa Wauters
Zachary Miller
M. G. Tempini
Gopala Anumanchipalli
17
0
0
05 Jun 2025
ContentV: Efficient Training of Video Generation Models with Limited Compute
Wenfeng Lin
Renjie Chen
Boyuan Liu
Shiyue Yan
Ruoyu Feng
...
Chao Feng
Jiao Ran
Qi Wu
Zuotao Liu
Mingyu Guo
VGen
104
0
0
05 Jun 2025
Power Law Guided Dynamic Sifting for Efficient Attention
Nirav Koley
Prajwal Singhania
A. Bhatele
160
0
0
05 Jun 2025
Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models
Taha Entesari
Arman Hatami
Rinat Khaziev
Anil Ramakrishna
Mahyar Fazlyab
MU
112
0
0
05 Jun 2025
SECNEURON: Reliable and Flexible Abuse Control in Local LLMs via Hybrid Neuron Encryption
Zhiqiang Wang
Haohua Du
Junyang Wang
Haifeng Sun
Kaiwen Guo
Haikuo Yu
Chao Liu
Xiang-Yang Li
AAML
127
0
0
05 Jun 2025
Improving Low-Resource Morphological Inflection via Self-Supervised Objectives
Adam Wiemerslage
Katharina von der Wense
97
0
0
05 Jun 2025
Controlling Summarization Length Through EOS Token Weighting
Zeno Belligoli
Emmanouil Stergiadis
Eran Fainman
Ilya Gusev
83
0
0
05 Jun 2025
Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models
Thao Nguyen
Yang Li
O. Yu. Golovneva
Luke Zettlemoyer
Sewoong Oh
Ludwig Schmidt
Xian Li
OnRL
146
0
0
05 Jun 2025
StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models
Ya Jiang
Chuxiong Wu
Massieh Kordi Boroujeny
Brian L. Mark
Kai Zeng
WaLM
33
0
0
05 Jun 2025
Theoretical Analysis of Positional Encodings in Transformer Models: Impact on Expressiveness and Generalization
Yin Li
13
0
0
05 Jun 2025
RELIC: Evaluating Compositional Instruction Following via Language Recognition
Jackson Petty
Michael Y. Hu
Wentao Wang
Shauli Ravfogel
William Merrill
Tal Linzen
75
0
0
05 Jun 2025
Intelligibility of Text-to-Speech Systems for Mathematical Expressions
Sujoy Roychowdhury
H. G. Ranjani
Sumit Soman
Nishtha Paul
Subhadip Bandyopadhyay
Siddhanth Iyengar
17
0
0
05 Jun 2025
MANBench: Is Your Multimodal Model Smarter than Human?
Han Zhou
Qitong Xu
Yiheng Dong
Xin Yang
17
0
0
04 Jun 2025
Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order
Egor Petrov
Grigoriy Evseev
Aleksey Antonov
Andrey Veprikov
Pavel Plyusnin
Nikolay Bushkov
Stanislav Moiseev
Aleksandr Beznosikov
72
0
0
04 Jun 2025
AhaKV: Adaptive Holistic Attention-Driven KV Cache Eviction for Efficient Inference of Large Language Models
Yifeng Gu
Zicong Jiang
Jianxiu Jin
K. Guo
Ziyang Zhang
Xiangmin Xu
101
0
0
04 Jun 2025
Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models
Seungcheol Park
Jeongin Bae
Beomseok Kwon
Minjun Kim
Byeongwook Kim
S. Kwon
U. Kang
Dongsoo Lee
MQ
129
0
0
04 Jun 2025
ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations
Quang Hieu Pham
T. Nguyen
Tung Pham
Anh Tuan Luu
Dat Quoc Nguyen
ReLM
LRM
136
0
0
04 Jun 2025
Evaluating Apple Intelligence's Writing Tools for Privacy Against Large Language Model-Based Inference Attacks: Insights from Early Datasets
Mohd. Farhan Israk Soumik
Syed Mhamudul Hasan
Abdur R. Shahid
92
0
0
04 Jun 2025
LayerFlow: A Unified Model for Layer-aware Video Generation
S. Ji
Hao Luo
Xi Chen
Yuanpeng Tu
Yiyang Wang
Hengshuang Zhao
VGen
OffRL
79
0
0
04 Jun 2025
Structured Pruning for Diverse Best-of-N Reasoning Optimization
Hieu Trung Nguyen
Bao Nguyen
Viet Anh Nguyen
LRM
69
0
0
04 Jun 2025
ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices
Hao Yu
Tangyu Jiang
Shuning Jia
Shannan Yan
Shunning Liu
Haolong Qian
Guanghao Li
Shuting Dong
Huaisong Zhang
Chun Yuan
96
0
0
04 Jun 2025
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Ziyi Wu
Anil Kag
Ivan Skorokhodov
Willi Menapace
Ashkan Mirzaei
Igor Gilitschenski
Sergey Tulyakov
Aliaksandr Siarohin
DiffM
VGen
62
0
0
04 Jun 2025
MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP
Kurt Micallef
Claudia Borg
27
0
0
04 Jun 2025
LexTime: A Benchmark for Temporal Ordering of Legal Events
Claire Barale
Leslie Barrett
Vikram Sunil Bajaj
Michael Rovatsos
AILaw
112
0
0
04 Jun 2025
HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring
Zhixiong Su
Yichen Wang
Herun Wan
Zhaohan Zhang
Minnan Luo
DeLMO
52
0
0
03 Jun 2025
Trajectory Prediction Meets Large Language Models: A Survey
Yi Xu
Ruining Yang
Yitian Zhang
Yizhou Wang
Jianglin Lu
M. Zhang
Lili Su
Y. Fu
LM&Ro
LRM
16
0
0
03 Jun 2025
Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs
Jaydip Sen
Saptarshi Sengupta
S. Dasgupta
20
0
0
03 Jun 2025