
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,903 papers shown
Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models
Jonas Zausinger
Lars Pennig
Anamarija Kozina
Sean Sdahl
Julian Sikora
...
Anna Ketteler
Thorben Prein
Vishwa Mohan Singh
Michael Morris Danziger
Jannis Born
82
3
0
04 Nov 2024
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Yuqi Luo
Chenyang Song
Xu Han
Yuxiao Chen
Chaojun Xiao
Zhiyuan Liu
Maosong Sun
Jiansheng Wei
147
7
0
04 Nov 2024
Sing-On-Your-Beat: Simple Text-Controllable Accompaniment Generations
Quoc-Huy Trinh
Minh-Van Nguyen
Trong-Hieu Nguyen-Mau
Khoa Tran
Thanh Do
56
0
0
03 Nov 2024
Enriching Tabular Data with Contextual LLM Embeddings: A Comprehensive Ablation Study for Ensemble Classifiers
Gjergji Kasneci
Enkelejda Kasneci
LMTD
116
2
0
03 Nov 2024
Achieving Domain-Independent Certified Robustness via Knowledge Continuity
Alan Sun
Chiyu Ma
Kenneth Ge
Soroush Vosoughi
63
1
0
03 Nov 2024
Domain-specific Guided Summarization for Mental Health Posts
Lu Qian
Yuqi Wang
Zehua Wang
H. Zhang
Wei Wang
Ting Yu
Anh Nguyen
AI4MH
130
3
0
03 Nov 2024
Activating Self-Attention for Multi-Scene Absolute Pose Regression
Miso Lee
Jihwan Kim
Jae-Pil Heo
ViT
69
1
0
03 Nov 2024
Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision
Xiangzhong Luo
Di Liu
Hao Kong
Shuo Huai
Hui Chen
Guochu Xiong
Weichen Liu
67
6
0
03 Nov 2024
$B^4$: A Black-Box Scrubbing Attack on LLM Watermarks
Baizhou Huang
Xiao Pu
Xiaojun Wan
81
1
0
02 Nov 2024
PRIMO: Progressive Induction for Multi-hop Open Rule Generation
Jianyu Liu
Sheng Bi
Guilin Qi
60
0
0
02 Nov 2024
CmdCaliper: A Semantic-Aware Command-Line Embedding Model and Dataset for Security Research
Sian-Yao Huang
Cheng-Lin Yang
Hongpeng Zhou
Chun-Ying Huang
83
2
0
02 Nov 2024
Music Foundation Model as Generic Booster for Music Downstream Tasks
Weihsiang Liao
Yuhta Takida
Yukara Ikemiya
Zhi-Wei Zhong
Chieh-Hsin Lai
...
Stefan Uhlich
Taketo Akama
Woosung Choi
Yuichiro Koyama
Yuki Mitsufuji
237
1
0
02 Nov 2024
MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
Yuanlin Duan
Wenqi Jia
Miao Yin
Yu Cheng
Bo Yuan
MoE
143
7
0
01 Nov 2024
GameGen-X: Interactive Open-world Game Video Generation
Haoxuan Che
Xuanhua He
Quande Liu
Cheng Jin
Hao Chen
VGen
141
25
0
01 Nov 2024
Learning to Rank Salient Content for Query-focused Summarization
Sajad Sotudeh
Nazli Goharian
100
1
0
01 Nov 2024
TextDestroyer: A Training- and Annotation-Free Diffusion Method for Destroying Anomal Text from Images
Mengcheng Li
Mingbao Lin
Yong Li
Chia-Wen Lin
DiffM
94
0
0
01 Nov 2024
Generative Emotion Cause Explanation in Multimodal Conversations
Lin Wang
Xiaocui Yang
Shi Feng
Daling Wang
Yifei Zhang
Zhitao Zhang
107
0
0
01 Nov 2024
LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction
Andre Niyongabo Rubungo
Kangming Li
Jason Hattrick-Simpers
Adji Bousso Dieng
106
9
0
31 Oct 2024
P-Masking: Power Law Masking Improves Multi-attribute Controlled Generation
Mohamed Elgaar
Hadi Amiri
AI4CE
71
0
0
31 Oct 2024
Scaling Concept With Text-Guided Diffusion Models
Chao Huang
Susan Liang
Yunlong Tang
Yapeng Tian
Anurag Kumar
Chenliang Xu
DiffM
92
6
0
31 Oct 2024
Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts
Xiang Deng
Youxin Pang
Xiaochen Zhao
Chao Xu
Lizhen Wang
Hongjiang Xiao
Shi Yan
Hongwen Zhang
Yebin Liu
DiffM
VGen
68
1
0
31 Oct 2024
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
Amir Hossein Kargaran
François Yvon
Hinrich Schütze
VLM
121
8
0
31 Oct 2024
DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios
Junchao Wu
Runzhe Zhan
Derek F. Wong
Shu Yang
Xinyi Yang
Yulin Yuan
Lidia S. Chao
DeLMO
198
2
0
31 Oct 2024
Efficient and Interpretable Grammatical Error Correction with Mixture of Experts
Muhammad Reza Qorib
Alham Fikri Aji
Hwee Tou Ng
KELM
MoE
72
0
0
30 Oct 2024
Graph-Augmented Relation Extraction Model with LLMs-Generated Support Document
Vicky Dong
Hao Yu
Yao Chen
67
0
0
30 Oct 2024
MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts
Jie Zhu
Yukang Chen
Mingyu Ding
Ping Luo
Leye Wang
Jingdong Wang
DiffM
69
5
0
30 Oct 2024
EMMA: End-to-End Multimodal Model for Autonomous Driving
Jyh-Jing Hwang
Runsheng Xu
Hubert Lin
Wei-Chih Hung
Jingwei Ji
...
Benjamin Sapp
Yin Zhou
James Guo
Dragomir Anguelov
Mingxing Tan
VLM
LM&Ro
110
38
0
30 Oct 2024
ProTransformer: Robustify Transformers via Plug-and-Play Paradigm
Zhichao Hou
Weizhi Gao
Yuchen Shen
Feiyi Wang
Xiaorui Liu
VLM
74
2
0
30 Oct 2024
FlexTSF: A Universal Forecasting Model for Time Series with Variable Regularities
Jingge Xiao
Yile Chen
Gao Cong
Wolfgang Nejdl
Simon Gottschalk
AI4TS
75
0
0
30 Oct 2024
Controlling Language and Diffusion Models by Transporting Activations
P. Rodríguez
Arno Blaas
Michal Klein
Luca Zappella
N. Apostoloff
Marco Cuturi
Xavier Suau
LLMSV
128
6
0
30 Oct 2024
Smaller Large Language Models Can Do Moral Self-Correction
Guangliang Liu
Zhiyu Xue
Rongrong Wang
K. Johnson
Kristen Marie Johnson
LRM
98
0
0
30 Oct 2024
Toxicity of the Commons: Curating Open-Source Pre-Training Data
Catherine Arnett
Eliot Jones
Ivan P. Yamshchikov
Pierre-Carl Langlais
73
4
0
29 Oct 2024
Class-Aware Contrastive Optimization for Imbalanced Text Classification
Grigorii Khvatskii
Nuno Moniz
Khoa D. Doan
Nitesh Chawla
79
1
0
29 Oct 2024
Multi-aspect Depression Severity Assessment via Inductive Dialogue System
C. Lee
Seungyeon Seo
Heejin Do
Gary Geunbae Lee
53
0
0
29 Oct 2024
Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models
Donghoon Kim
Gusang Lee
Kyuhong Shim
B. Shim
102
1
0
29 Oct 2024
MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding
Yuan Wang
Di Huang
Yaqi Zhang
Wanli Ouyang
J. Jiao
Xuetao Feng
Yan Zhou
Pengfei Wan
Shixiang Tang
Dan Xu
VGen
115
16
0
29 Oct 2024
On the Role of Depth and Looping for In-Context Learning with Task Diversity
Khashayar Gatmiry
Nikunj Saunshi
Sashank J. Reddi
Stefanie Jegelka
Sanjiv Kumar
89
2
0
29 Oct 2024
Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
Nate Gillman
Daksh Aggarwal
Michael Freeman
Saurabh Singh
Chen Sun
AI4TS
110
4
0
29 Oct 2024
Online Detection of LLM-Generated Texts via Sequential Hypothesis Testing by Betting
Can Chen
Jun-Kun Wang
DeLMO
169
0
0
29 Oct 2024
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
194
18
0
29 Oct 2024
Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce
Zhantao Yang
Han Zhang
Fangyi Chen
Anudeepsekhar Bolimera
Marios Savvides
63
0
0
28 Oct 2024
HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation
Yuhan Chen
Ang Lv
Jian Luan
Bin Wang
Wen Liu
66
5
0
28 Oct 2024
BongLLaMA: LLaMA for Bangla Language
Abdullah Khan Zehady
Safi Al Mamun
Naymul Islam
Santu Karmaker
ALM
50
1
0
28 Oct 2024
DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning
Xun Guo
Shan Zhang
Yongxin He
Ting Zhang
Wanquan Feng
Haibin Huang
Chongyang Ma
DeLMO
86
10
0
28 Oct 2024
AutoRAG: Automated Framework for optimization of Retrieval Augmented Generation Pipeline
Dongkyu Kim
Byoungwook Kim
Donggeon Han
Matouš Eibich
88
15
0
28 Oct 2024
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
104
4
0
28 Oct 2024
Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training
Michael Pieler
Marco Bellagente
H. Teufel
Duy Phung
Nathan Cooper
...
Reshinth Adithyan
Zaid Alyafeai
Nikhil Pinnaparaju
Maksym Zhuravinskyi
Carlos Riquelme
74
1
0
28 Oct 2024
Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models
Yilun Jin
Zheng Li
Chenwei Zhang
Tianyu Cao
Yifan Gao
...
Yi Xu
Kai Chen
Qiang Yang
Meng Jiang
Bing Yin
RALM
107
3
0
28 Oct 2024
Relation-based Counterfactual Data Augmentation and Contrastive Learning for Robustifying Natural Language Inference Models
H. Yang
Seung-won Hwang
Jungmin So
70
0
0
28 Oct 2024
Segmenting Watermarked Texts From Language Models
Xingchi Li
Guanxun Li
Xianyang Zhang
WaLM
70
0
0
28 Oct 2024