ResearchTrend.AI
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,877 papers shown
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
Zhenxing Mi
Kuan-Chieh Wang
Guocheng Qian
Hanrong Ye
Runtao Liu
Sergey Tulyakov
Kfir Aberman
Dan Xu
LRM
97
2
0
12 Feb 2025
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation
H. Seo
Wongi Jeong
Jae-sun Seo
Se Young Chun
140
0
0
12 Feb 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
91
0
0
11 Feb 2025
Tractable Transformers for Flexible Conditional Generation
Hoang Trung-Dung
Xuejie Liu
Dayuan Zhao
Mathias Niepert
Yitao Liang
Guy Van den Broeck
80
0
0
11 Feb 2025
Hallucination, Monofacts, and Miscalibration: An Empirical Investigation
Miranda Muqing Miao
Michael Kearns
160
0
0
11 Feb 2025
When Incentives Backfire, Data Stops Being Human
Sebastin Santy
Prasanta Bhattacharya
Manoel Horta Ribeiro
Kelsey Allen
Sewoong Oh
200
0
0
11 Feb 2025
Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining
Daouda Sow
Herbert Woisetschläger
Saikiran Bulusu
Shiqiang Wang
Hans-Arno Jacobsen
Yingbin Liang
142
6
0
10 Feb 2025
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths
Weijia Mao
Zhiyong Yang
Mike Zheng Shou
MoE
198
1
0
10 Feb 2025
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset
Naome A. Etori
Maria Gini
169
3
0
10 Feb 2025
MoETuner: Optimized Mixture of Expert Serving with Balanced Expert Placement and Token Routing
Seokjin Go
Divya Mahajan
MoE
127
0
0
10 Feb 2025
Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection
Maximilian Spliethöver
Tim Knebler
Fabian Fumagalli
Maximilian Muschalik
Barbara Hammer
Eyke Hüllermeier
Henning Wachsmuth
188
1
0
10 Feb 2025
Gradient Multi-Normalization for Stateless and Scalable LLM Training
M. Scetbon
Chao Ma
Wenbo Gong
Edward Meeds
179
1
0
10 Feb 2025
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru
Yuxin Xie
Xianwei Zhuang
Yuguo Yin
Zhihui Guo
Zhiming Liu
Qianli Ren
Yuexian Zou
193
6
0
10 Feb 2025
CORRECT: Context- and Reference-Augmented Reasoning and Prompting for Fact-Checking
Delvin Ce Zhang
Dongwon Lee
HILM LRM
171
0
0
09 Feb 2025
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer
Xinyu Liu
Ailing Zeng
Wei Xue
Harry Yang
Wenhan Luo
Qifeng Liu
Yike Guo
VGen
406
1
0
09 Feb 2025
MixLLM: Dynamic Routing in Mixed Large Language Models
Xinyuan Wang
Yanchi Liu
Wei Cheng
Xujiang Zhao
Zhe Chen
Wenchao Yu
Yanjie Fu
Haifeng Chen
143
6
0
09 Feb 2025
Towards Sustainable NLP: Insights from Benchmarking Inference Energy in Large Language Models
S. Poddar
Paramita Koley
Janardan Misra
Niloy Ganguly
Saptarshi Ghosh
149
0
0
08 Feb 2025
Design Considerations in Offline Preference-based RL
Alekh Agarwal
Christoph Dann
T. V. Marinov
OffRL
108
1
0
08 Feb 2025
OntoTune: Ontology-Driven Self-training for Aligning Large Language Models
Zhiqiang Liu
Chengtao Gan
Junjie Wang
Yanzhe Zhang
Zhongpu Bo
Mengshu Sun
Ningyu Zhang
Wen Zhang
120
2
0
08 Feb 2025
Towards the Development of Balanced Synthetic Data for Correcting Grammatical Errors in Arabic: An Approach Based on Error Tagging Model and Synthetic Data Generating Model
Ahlam Alrehili
Areej Alhothali
130
0
0
07 Feb 2025
MindAligner: Explicit Brain Functional Alignment for Cross-Subject Visual Decoding from Limited fMRI Data
Yuqin Dai
Zhouheng Yao
Chunfeng Song
Qihao Zheng
Weijian Mai
Kunyu Peng
Shuai Lu
Wanli Ouyang
Jian Yang
Jiamin Wu
487
2
0
07 Feb 2025
Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks
Jing Yang
Max Glockner
Anderson de Rezende Rocha
Iryna Gurevych
LRM
159
1
0
07 Feb 2025
BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation
Omnilingual MT Team
Pierre Yves Andrews
Mikel Artetxe
Mariano Coria Meglioli
Marta R. Costa-jussá
...
Eduardo Sánchez
Ioannis Tsiamas
Arina Turkatenko
Albert Ventayol-Boada
Shireen Yates
181
0
0
06 Feb 2025
FairT2I: Mitigating Social Bias in Text-to-Image Generation via Large Language Model-Assisted Detection and Attribute Rebalancing
Jinya Sakurai
Issei Sato
151
1
0
06 Feb 2025
Decoder-Only LLMs are Better Controllers for Diffusion Models
Ziyi Dong
Yao Xiao
Pengxu Wei
Liang Lin
DiffM
214
0
0
06 Feb 2025
Teaching Large Language Models Number-Focused Headline Generation With Key Element Rationales
Zhen Qian
Xiuzhen Zhang
Xiaofei Xu
Xiwei Xu
LRM
72
0
0
05 Feb 2025
FuXi-$\alpha$: Scaling Recommendation Model with Feature Interaction Enhanced Transformer
Yufei Ye
Wei Guo
Jin Yao Chin
Hao Wang
Hong Zhu
...
Yuyang Ye
Yixiao Liu
Ruiming Tang
Defu Lian
Enhong Chen
149
2
0
05 Feb 2025
ALPET: Active Few-shot Learning for Citation Worthiness Detection in Low-Resource Wikipedia Languages
Aida Halitaj
Arkaitz Zubiaga
115
0
0
05 Feb 2025
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache
Rishabh Tiwari
Haocheng Xi
Aditya Tomar
Coleman Hooper
Sehoon Kim
Maxwell Horton
Mahyar Najibi
Michael W. Mahoney
Kemal Kurniawan
Amir Gholami
MQ
112
5
0
05 Feb 2025
IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates
Aissatou Diallo
Antonis Bikakis
Luke Dickens
Anthony Hunter
Rob Miller
LRM
104
0
0
05 Feb 2025
Fine-tuning Language Models for Recipe Generation: A Comparative Analysis and Benchmark Study
Anneketh Vij
Changhao Liu
Rahul Anil Nair
Theo Ho
Edward Shi
Ayan Bhowmick
144
1
0
04 Feb 2025
Learning the RoPEs: Better 2D and 3D Position Encodings with STRING
Connor Schenck
Isaac Reid
M. Jacob
Alex Bewley
Joshua Ainslie
...
Matthias Minderer
Dmitry Kalashnikov
Jonathan Tompson
Vikas Sindhwani
Krzysztof Choromanski
100
1
0
04 Feb 2025
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Xiang Liu
Zhenheng Tang
Hong Chen
Peijie Dong
Zeyu Li
Xiuze Zhou
Bo Li
Xuming Hu
Xiaowen Chu
473
7
0
04 Feb 2025
DERMARK: A Dynamic, Efficient and Robust Multi-bit Watermark for Large Language Models
Qihao Lin
Chen Tang
Lan Zhang
Junyang Zhang
Xiangyang Li
WaLM
131
0
0
04 Feb 2025
Process-Supervised Reinforcement Learning for Code Generation
Yufan Ye
Ting Zhang
Wenbin Jiang
Hua Huang
OffRL LRM SyDa
114
1
0
03 Feb 2025
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding
Ahmed Masry
Juan A. Rodriguez
Tianyu Zhang
Suyuchen Wang
Chao Wang
...
I. Laradji
David Vazquez
Perouz Taslakian
Spandana Gella
Sai Rajeswar
96
0
0
03 Feb 2025
Explaining Context Length Scaling and Bounds for Language Models
Jingzhe Shi
Qinwei Ma
Hongyi Liu
Hang Zhao
Jeng-Neng Hwang
Lei Li
LRM
258
3
0
03 Feb 2025
Choose Your Model Size: Any Compression by a Single Gradient Descent
Martin Genzel
Patrick Putzky
Pengfei Zhao
Siyang Song
Mattes Mollenhauer
Robert Seidel
Stefan Dietzel
Thomas Wollmann
110
0
0
03 Feb 2025
Harmful Terms and Where to Find Them: Measuring and Modeling Unfavorable Financial Terms and Conditions in Shopping Websites at Scale
Elisa Tsai
Neal Mangaokar
Boyuan Zheng
Haizhong Zheng
A. Prakash
80
0
0
03 Feb 2025
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Borui Xu
Yao Chen
Zeyi Wen
Weiguo Liu
Bingsheng He
188
2
0
02 Feb 2025
SecPE: Secure Prompt Ensembling for Private and Robust Large Language Models
Jiawen Zhang
Kejia Chen
Zunlei Feng
Jian Lou
Mingli Song
Qingbin Liu
Xiaoyu Yang
AAML SILM FedML
171
1
0
02 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Wenlin Yao
212
1
0
01 Feb 2025
FinchGPT: a Transformer based language model for birdsong analysis
Kosei Kobayashi
Kosuke Matsuzaki
Masaya Taniguchi
Keisuke Sakaguchi
Kentaro Inui
Kentaro Abe
99
1
0
01 Feb 2025
On the Impact of Noise in Differentially Private Text Rewriting
Stephen Meisenbacher
Maulik Chevli
Florian Matthes
107
0
0
31 Jan 2025
Symmetric Pruning of Large Language Models
Kai Yi
Peter Richtárik
AAML VLM
112
0
0
31 Jan 2025
Memory-Efficient Fine-Tuning of Transformers via Token Selection
Antoine Simoulin
Namyong Park
Xiaoyi Liu
Grey Yang
195
1
0
31 Jan 2025
Reverse Probing: Evaluating Knowledge Transfer via Finetuned Task Embeddings for Coreference Resolution
Tatiana Anikina
Arne Binder
David Harbecke
Stalin Varanasi
Leonhard Hennig
Simon Ostermann
Sebastian Möller
Josef van Genabith
154
0
0
31 Jan 2025
Hierarchical Multi-field Representations for Two-Stage E-commerce Retrieval
Niklas Freymuth
Dong Liu
Thomas Ricatte
Saab Mansour
107
0
0
30 Jan 2025
Fake News Detection After LLM Laundering: Measurement and Explanation
Rupak Kumar Das
Jonathan Dodge
191
1
0
29 Jan 2025
AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing
P. Pak
A. Farimani
AI4CE
135
1
0
29 Jan 2025