Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Versions: v1, v2, v3, v4 (latest)

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,916 papers shown
Title
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan
Tianhong Li
Siyang Qin
Yuanzhen Li
Chen Sun
Michael Rubinstein
Deqing Sun
Kaiming He
Yonglong Tian
VLM, DiffM
131
57
0
17 Oct 2024
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
Runsen Xu
Zhiwei Huang
Tai Wang
Yuxiao Chen
Jiangmiao Pang
Dahua Lin
VGen
97
18
0
17 Oct 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
107
4
0
17 Oct 2024
Enhancing Fact Retrieval in PLMs through Truthfulness
Paul Youssef
Jorg Schlotterer
C. Seifert
KELM, HILM
56
0
0
17 Oct 2024
Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in Switzerland
Luca Rolshoven
Vishvaksenan Rasiah
Srinanda Brügger Bose
Matthias Sturmer
Joel Niklaus
ELM, AILaw
77
2
0
17 Oct 2024
Instruction-Driven Game Engine: A Poker Case Study
Hongqiu Wu
Xingyuan Liu
Yan Wang
Hai Zhao
65
2
0
17 Oct 2024
Fine-Tuning Language Models on Multiple Datasets for Citation Intention Classification
Zeren Shui
Petros Karypis
Daniel S. Karls
Mingjian Wen
Saurav Manchanda
E. Tadmor
George Karypis
46
1
0
17 Oct 2024
LLM-Rank: A Graph Theoretical Approach to Pruning Large Language Models
David Hoffmann
Kailash Budhathoki
Matthaeus Kleindessner
61
0
0
17 Oct 2024
FRAG: Toward Federated Vector Database Management for Collaborative and Secure Retrieval-Augmented Generation
Dongfang Zhao
76
3
0
17 Oct 2024
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models
Shangda Wu
Yashan Wang
Ruibin Yuan
Zhancheng Guo
Xu Tan
...
Yuanliang Dong
Jiafeng Liu
Xiaobing Li
Feng Yu
Maosong Sun
215
5
0
17 Oct 2024
From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition
Qiyuan Yang
Pengda Wang
Luke D. Plonsky
Frederick L. Oswald
Hanjie Chen
ELM
77
2
0
17 Oct 2024
Data Defenses Against Large Language Models
William Agnew
Harry H. Jiang
Cella Sum
Maarten Sap
Sauvik Das
AAML
127
0
0
17 Oct 2024
Sound Check: Auditing Audio Datasets
William Agnew
Julia Barnett
Annie Chu
Rachel Hong
Michael Feffer
Robin Netzorg
Harry H. Jiang
Ezra Awumey
Sauvik Das
127
1
0
17 Oct 2024
Sparse Mixture-of-Experts for Compositional Generalization: Empirical Evidence and Theoretical Foundations of Optimal Sparsity
Jinze Zhao
Peihao Wang
J. Yang
Ruisi Cai
Gaowen Liu
Jayanth Srinivasa
Ramana Rao Kompella
Yingbin Liang
Zhangyang Wang
MoE
75
0
0
17 Oct 2024
Estimating the Probabilities of Rare Outputs in Language Models
Gabriel Wu
Jacob Hilton
AAML, UQCV
137
3
0
17 Oct 2024
Self-Comparison for Dataset-Level Membership Inference in Large (Vision-)Language Models
J. Ren
Kangrui Chen
Chen Chen
Vikash Sehwag
Yue Xing
Jiliang Tang
Lingjuan Lyu
66
2
0
16 Oct 2024
Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization
Xingqi Wang
Xiaoyuan Yi
Xing Xie
Jia Jia
60
1
0
16 Oct 2024
Learning to Predict Usage Options of Product Reviews with LLM-Generated Labels
Leo Kohlenberg
Leonard Horns
Frederic Sadrieh
Nils Kiele
Matthis Clausen
Konstantin Ketterer
Avetis Navasardyan
Tamara Czinczoll
Gerard de Melo
Ralf Herbrich
34
0
0
16 Oct 2024
RapidDock: Unlocking Proteome-scale Molecular Docking
Rafał Powalski
Bazyli Klockiewicz
Maciej Jaśkowski
Bartosz Topolski
Paweł Dąbrowski-Tumański
Maciej Wiśniewski
Łukasz Kuciński
Piotr Miłoś
Dariusz Plewczynski
57
0
0
16 Oct 2024
Context-Infused Visual Grounding for Art
Selina Khan
Nanne van Noord
ObjD
71
1
0
16 Oct 2024
Optimizing Low-Resource Language Model Training: Comprehensive Analysis of Multi-Epoch, Multi-Lingual, and Two-Stage Approaches
Kosuke Akimoto
Masafumi Oyamada
64
0
0
16 Oct 2024
Off-dynamics Conditional Diffusion Planners
Wen Zheng Terence Ng
Jianda Chen
Tianwei Zhang
DiffM, OffRL
140
0
0
16 Oct 2024
Enhancing LLM Agents for Code Generation with Possibility and Pass-rate Prioritized Experience Replay
Yuyang Chen
Kaiyan Zhao
Yiming Wang
Ming Yang
Jian Zhang
Yan Li
157
1
0
16 Oct 2024
Exploring Large Language Models for Hate Speech Detection in Rioplatense Spanish
Juan Manuel Pérez
Paula Miguel
Viviana Cotik
41
1
0
16 Oct 2024
On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation
Xiaonan Jing
Srinivas Billa
Danny Godbout
HILM
129
0
0
16 Oct 2024
Channel-Wise Mixed-Precision Quantization for Large Language Models
Zihan Chen
Bike Xie
Jundong Li
Cong Shen
MQ
116
3
0
16 Oct 2024
StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples
Ajay Patel
Jiacheng Zhu
Justin Qiu
Zachary Horvitz
Marianna Apidianaki
Kathleen McKeown
Chris Callison-Burch
165
4
0
16 Oct 2024
MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
Bokai Lin
Zihao Zeng
Zipeng Xiao
Siqi Kou
Tianqi Hou
Xiaofeng Gao
Hao Zhang
Zhijie Deng
88
6
0
16 Oct 2024
Towards Neural Scaling Laws for Time Series Foundation Models
Qingren Yao
Chao-Han Huck Yang
Renhe Jiang
Yuxuan Liang
Ming Jin
Shirui Pan
AI4TS, AI4CE
162
9
0
16 Oct 2024
LegalLens Shared Task 2024: Legal Violation Identification in Unstructured Text
Ben Hagag
Liav Harpaz
Gil Semo
Dor Bernsohn
Rohit Saha
Pashootan Vaezipoor
Kyryl Truskovskyi
Gerasimos Spanakis
AILaw
65
6
0
15 Oct 2024
MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router
Yanyue Xie
Zhi Zhang
Ding Zhou
Cong Xie
Ziang Song
Xin Liu
Yanzhi Wang
Xue Lin
An Xu
LLMAG
89
5
0
15 Oct 2024
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models
Kushal Tatariya
Vladimir Araujo
Thomas Bauwens
Miryam de Lhoneux
VLM
74
1
0
15 Oct 2024
From promise to practice: realizing high-performance decentralized training
Zesen Wang
Jiaojiao Zhang
Xuyang Wu
M. Johansson
110
0
0
15 Oct 2024
DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure
Yunfan Xiong
Ruoyu Zhang
Yanzeng Li
Tianhao Wu
Lei Zou
83
6
0
15 Oct 2024
Tokenization and Morphology in Multilingual Language Models: A Comparative Analysis of mT5 and ByT5
Thao Anh Dang
Limor Raviv
Lukas Galke
52
1
0
15 Oct 2024
Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL
Qihuang Zhong
Kunfeng Chen
Liang Ding
Juhua Liu
Di Lin
Dacheng Tao
56
1
0
15 Oct 2024
Enhance Graph Alignment for Large Language Models
Haitong Luo
Xuying Meng
Suhang Wang
Tianxiang Zhao
Fali Wang
Hanyun Cao
Yujun Zhang
438
2
0
15 Oct 2024
ChatHouseDiffusion: Prompt-Guided Generation and Editing of Floor Plans
Sizhong Qin
Chengyu He
Qiaoyun Chen
Sen Yang
Wenjie Liao
Yi Gu
Xinzheng Lu
81
0
0
15 Oct 2024
SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments
Syed Abdul Gaffar Shakhadri
Kruthika KR
Rakshit Aralimatti
VLM
47
0
0
15 Oct 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Luping Liu
Chao Du
Tianyu Pang
Zehan Wang
Chongxuan Li
Dong Xu
VLM
123
8
0
15 Oct 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
105
18
0
15 Oct 2024
LLM Unlearning via Loss Adjustment with Only Forget Data
Yaxuan Wang
Jiaheng Wei
Chris Yuhao Liu
Jinlong Pang
Qiang Liu
A. Shah
Yujia Bao
Yang Liu
Wei Wei
KELM, MU
165
20
0
14 Oct 2024
Enhancing AI Assisted Writing with One-Shot Implicit Negative Feedback
Benjamin Towle
Ke Zhou
56
0
0
14 Oct 2024
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Rory Young
Nicolas Pugeault
AAML
136
0
0
14 Oct 2024
Tübingen-CL at SemEval-2024 Task 1: Ensemble Learning for Semantic Relatedness Estimation
Leixin Zhang
Çağrı Çöltekin
85
2
0
14 Oct 2024
Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling
Wenze Liu
Le Zhuo
Yi Xin
Sheng Xia
Peng Gao
Xiangyu Yue
129
9
0
14 Oct 2024
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue
Longteng Guo
Jie Cheng
Xuange Gao
Qingbin Liu
MoE
67
3
0
14 Oct 2024
BookWorm: A Dataset for Character Description and Analysis
Argyrios Papoudakis
Mirella Lapata
Frank Keller
54
2
0
14 Oct 2024
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification
J. Yao
Wang Cheng
Wenyu Liu
Xinggang Wang
93
13
0
14 Oct 2024
BanglaQuAD: A Bengali Open-domain Question Answering Dataset
Md. Rony
Sudipto Kumar Shaha
Rakib Al Hasan
Sumon Kanti Dey
Amzad Hossain Rafi
Ashraf Hasan Sirajee
Jens Lehmann
103
1
0
14 Oct 2024