ResearchTrend.AI
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,920 papers shown
Implicit to Explicit Entropy Regularization: Benchmarking ViT Fine-tuning under Noisy Labels
Maria Marrium
Arif Mahmood
Mohammed Bennamoun
NoLa
AAML
102
0
0
05 Oct 2024
Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia
Tomás Feith
Akhil Arora
Martin Gerlach
Debjit Paul
Robert West
KELM
66
2
0
05 Oct 2024
Persona Knowledge-Aligned Prompt Tuning Method for Online Debate
Chunkit Chan
Cheng Jiayang
Xin Liu
Yauwai Yim
Yuxin Jiang
Zheye Deng
Haoran Li
Yangqiu Song
Ginny Wong
Simon See
114
0
0
05 Oct 2024
Deep Transfer Learning Based Peer Review Aggregation and Meta-review Generation for Scientific Articles
Md. Tarek Hasan
Mohammad Nazmush Shamael
H. M. Mutasim Billah
Arifa Akter
M. Hossain
Sumayra Islam
Salekul Islam
Swakkhar Shatabda
67
0
0
05 Oct 2024
LongGenBench: Long-context Generation Benchmark
Xiang Liu
Peijie Dong
Xuming Hu
Xiaowen Chu
RALM
105
9
0
05 Oct 2024
Understanding the Effect of Algorithm Transparency of Model Explanations in Text-to-SQL Semantic Parsing
Daking Rai
Rydia R. Weiland
Kayla Margaret Gabriella Herrera
Tyler H. Shaw
Ziyu Yao
90
2
0
05 Oct 2024
Large Language Models can be Strong Self-Detoxifiers
Ching-Yun Ko
Pin-Yu Chen
Payel Das
Youssef Mroueh
Soham Dan
Georgios Kollias
Subhajit Chaudhury
Tejaswini Pedapati
Luca Daniel
73
3
0
04 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Joey Tianyi Zhou
Tsendsuren Munkhdalai
MoMe
107
22
0
04 Oct 2024
Explicit, Implicit, and Scattered: Revisiting Event Extraction to Capture Complex Arguments
Omar Sharif
Joseph Gatto
Madhusudan Basak
S. Preum
76
4
0
04 Oct 2024
Exploring the Benefit of Activation Sparsity in Pre-training
Zhengyan Zhang
Chaojun Xiao
Qiujieli Qin
Yankai Lin
Zhiyuan Zeng
Xu Han
Zhiyuan Liu
Ruobing Xie
Maosong Sun
Jie Zhou
MoE
124
4
0
04 Oct 2024
Team MTS @ AutoMin 2021: An Overview of Existing Summarization Approaches and Comparison to Unsupervised Summarization Techniques
Olga Iakovenko
Anna Andreeva
Anna Lapidus
Liana Mikaelyan
46
2
0
04 Oct 2024
Metadata Matters for Time Series: Informative Forecasting with Transformers
Jiaxiang Dong
Haixu Wu
Yuxuan Wang
Li Zhang
Jianmin Wang
Mingsheng Long
AI4TS
46
0
0
04 Oct 2024
Text-guided Diffusion Model for 3D Molecule Generation
Yanchen Luo
Sihang Li
Changhao Nai
Zhiyuan Liu
Jiancan Wu
An Zhang
Wenjie Du
Xiang Wang
87
7
0
04 Oct 2024
Towards a Benchmark for Large Language Models for Business Process Management Tasks
Kiran Busch
Henrik Leopold
96
1
0
04 Oct 2024
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Aiwei Liu
Sheng Guan
Yang Liu
Leyi Pan
Yifei Zhang
Liancheng Fang
Lijie Wen
Philip S. Yu
Xuming Hu
WaLM
397
5
0
04 Oct 2024
MELODI: Exploring Memory Compression for Long Contexts
Yinpeng Chen
DeLesley Hutchins
Aren Jansen
Andrey Zhmoginov
David Racz
Jesper Andersen
72
2
0
04 Oct 2024
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Zhiteng Li
Xinyu Yan
Tianao Zhang
Haotong Qin
Dong Xie
Jiang Tian
Zhongchao Shi
Linghe Kong
Yulun Zhang
Xiaokang Yang
MQ
97
8
0
04 Oct 2024
Enhancing Short-Text Topic Modeling with LLM-Driven Context Expansion and Prefix-Tuned VAEs
Pritom Saha Akash
Kevin Chen-Chuan Chang
99
1
0
04 Oct 2024
Towards an Improved Metric for Evaluating Disentangled Representations
Sahib Julka
Yashu Wang
Michael Granitzer
69
0
0
04 Oct 2024
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Haoran Xu
Kenton W. Murray
Philipp Koehn
Hieu T. Hoang
Akiko Eriguchi
Huda Khayrallah
145
15
0
04 Oct 2024
Efficiently Identifying Watermarked Segments in Mixed-Source Texts
Xuandong Zhao
Chenwen Liao
Yu-Xiang Wang
Lei Li
WaLM
100
1
0
04 Oct 2024
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs
Wei Wu
Chao Wang
L. Chen
Mingze Yin
Yiheng Zhu
Kun Fu
Jieping Ye
Hui Xiong
Zheng Wang
147
1
0
04 Oct 2024
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
Doohyuk Jang
Sihwan Park
J. Yang
Yeonsung Jung
Jihun Yun
Souvik Kundu
Sung-Yub Kim
Eunho Yang
142
10
0
04 Oct 2024
Guided Stream of Search: Learning to Better Search with Language Models via Optimal Path Guidance
Seungyong Moon
Bumsoo Park
Hyun Oh Song
RALM
AIFin
73
2
0
03 Oct 2024
Coal Mining Question Answering with LLMs
Antonio Carlos Rivera
Anthony Moore
Steven Robinson
76
1
0
03 Oct 2024
CorPipe at CRAC 2024: Predicting Zero Mentions from Raw Text
Milan Straka
LRM
60
1
0
03 Oct 2024
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Zhengfeng Lai
Vasileios Saveris
Chen Chen
Hong-You Chen
Haotian Zhang
...
Wenze Hu
Zhe Gan
Peter Grasch
Meng Cao
Yinfei Yang
VLM
72
4
0
03 Oct 2024
Parameter Competition Balancing for Model Merging
Guodong DU
Junlin Lee
Jing Li
Runhua Jiang
Yifei Guo
...
Hanting Liu
Sim Kuan Goh
Jing Li
Daojing He
Min Zhang
MoMe
99
24
0
03 Oct 2024
Make Compound Sentences Simple to Analyze: Learning to Split Sentences for Aspect-based Sentiment Analysis
Yongsik Seo
Sungwon Song
Ryang Heo
Jieyong Kim
Dongha Lee
CoGe
62
1
0
03 Oct 2024
Can Language Models Take A Hint? Prompting for Controllable Contextualized Commonsense Inference
Pedro Colon-Hernandez
Nanxi Liu
Chelsea Joe
Peter Chin
Claire Yin
H. Lieberman
Yida Xin
C. Breazeal
ReLM
LRM
65
1
0
03 Oct 2024
POSIX: A Prompt Sensitivity Index For Large Language Models
Anwoy Chatterjee
H. S. V. N. S. K. Renduchintala
S. Bhatia
Tanmoy Chakraborty
AAML
82
10
0
03 Oct 2024
Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models
Guobin Shen
Dongcheng Zhao
Yiting Dong
Xiang He
Yi Zeng
AAML
120
4
0
03 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
163
35
0
03 Oct 2024
Selective Attention Improves Transformer
Yaniv Leviathan
Matan Kalman
Yossi Matias
119
12
0
03 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
200
7
0
03 Oct 2024
A Survey on Point-of-Interest Recommendation: Models, Architectures, and Security
Qianru Zhang
Peng Yang
Junliang Yu
Haixin Wang
Xingwei He
Siu-Ming Yiu
Hongzhi Yin
145
3
0
03 Oct 2024
MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences
Genta Indra Winata
David Anugraha
Lucky Susanto
Garry Kuwanto
Derry Wijaya
178
11
0
03 Oct 2024
ControlAR: Controllable Image Generation with Autoregressive Models
Zongming Li
Tianheng Cheng
Shoufa Chen
Peize Sun
Haocheng Shen
Longjin Ran
Xiaoxin Chen
Wenyu Liu
Xinggang Wang
DiffM
253
19
0
03 Oct 2024
Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions
Qian Ruan
Ilia Kuznetsov
Iryna Gurevych
75
3
0
02 Oct 2024
Auto-Demo Prompting: Leveraging Generated Outputs as Demonstrations for Enhanced Batch Prompting
Longyu Feng
Mengze Hong
Chen Jason Zhang
80
2
0
02 Oct 2024
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
Xi Chen
Kaituo Feng
Changsheng Li
Xunhao Lai
Xiangyu Yue
Ye Yuan
Guoren Wang
94
15
0
02 Oct 2024
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
Jingcun Wang
Yu-Guang Chen
Ing-Chao Lin
Bing Li
Grace Li Zhang
90
4
0
02 Oct 2024
Effective Tuning Strategies for Generalist Robot Manipulation Policies
Wenbo Zhang
Yang Li
Yanyuan Qiao
Siyuan Huang
Jiajun Liu
Feras Dayoub
Xiao Ma
Lingqiao Liu
69
0
0
02 Oct 2024
Long-range gene expression prediction with token alignment of large language model
Edouardo Honig
Huixin Zhan
Ying Nian Wu
Zijun Zhang
23
0
0
02 Oct 2024
GADFA: Generator-Assisted Decision-Focused Approach for Opinion Expressing Timing Identification
Chung-Chi Chen
Hiroya Takamura
Ichiro Kobayashi
Yusuke Miyao
66
0
0
02 Oct 2024
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
Haotian Sun
Tao Lei
Bowen Zhang
Yanghao Li
Haoshuo Huang
Ruoming Pang
Bo Dai
Nan Du
DiffM
MoE
197
9
0
02 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
172
3
0
02 Oct 2024
KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
Pouyan Navard
Amin Karimi Monsefi
Mengxi Zhou
Wei-Lun Chao
Alper Yilmaz
R. Ramnath
DiffM
133
3
0
02 Oct 2024
FlashMask: Efficient and Rich Mask Extension of FlashAttention
Guoxia Wang
Jinle Zeng
Xiyuan Xiao
Siming Wu
Jiabin Yang
Lujing Zheng
Zeyu Chen
Jiang Bian
Dianhai Yu
Haifeng Wang
386
3
0
02 Oct 2024
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
Kemal Kurniawan
Bernhard Schölkopf
Michael Muehlebach
201
1
0
02 Oct 2024