arXiv:1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,920 papers shown
Language-Guided Joint Audio-Visual Editing via One-Shot Adaptation
Susan Liang
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
DiffM
121
8
0
09 Oct 2024
Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning
Abhinav Bandari
L. Yin
Cheng-Yu Hsieh
Ajay Kumar Jaiswal
Tianlong Chen
Li Shen
Ranjay Krishna
Shiwei Liu
79
8
0
09 Oct 2024
SAGE: Scalable Ground Truth Evaluations for Large Sparse Autoencoders
Constantin Venhoff
Anisoara Calinescu
Philip Torr
Christian Schroeder de Witt
74
0
0
09 Oct 2024
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
115
6
0
09 Oct 2024
An Approach for Auto Generation of Labeling Functions for Software Engineering Chatbots
Ebube Alor
Ahmad Abdellatif
SayedHassan Khatoonabadi
Emad Shihab
114
0
0
09 Oct 2024
Answering Questions in Stages: Prompt Chaining for Contract QA
Adam Roegiest
Radha Chitta
AILaw ELM
58
0
0
09 Oct 2024
Generative Model for Less-Resourced Language with 1 billion parameters
Domen Vreš
Martin Božič
Aljaž Potočnik
Tomaž Martinčič
Marko Robnik-Šikonja
49
1
0
09 Oct 2024
CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models
Zi Gong
Hang Yu
Cong Liao
Bingchang Liu
Chaoyu Chen
Jianguo Li
MoMe
59
5
0
09 Oct 2024
The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Language Models
Yanjun Chen
Dawei Zhu
Yirong Sun
Xinghao Chen
Wei Zhang
Xiaoyu Shen
ALM
72
4
0
09 Oct 2024
FreqMark: Frequency-Based Watermark for Sentence-Level Detection of LLM-Generated Text
Zhenyu Xu
Kun Zhang
Victor S. Sheng
WaLM
78
2
0
09 Oct 2024
Signal Watermark on Large Language Models
Zhenyu Xu
Victor S. Sheng
WaLM
35
0
0
09 Oct 2024
A Novel LLM-based Two-stage Summarization Approach for Long Dialogues
Yuan-Jhe Yin
Bo-Yu Chen
Berlin Chen
66
6
0
09 Oct 2024
WAPITI: A Watermark for Finetuned Open-Source LLMs
Lingjie Chen
Ruizhong Qiu
Siyu Yuan
Zhining Liu
Tianxin Wei
Hyunsik Yoo
Zhichen Zeng
Deqing Yang
Hanghang Tong
WaLM
104
7
0
09 Oct 2024
Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance?
Fumiya Uchiyama
Takeshi Kojima
Andrew Gambardella
Qi Cao
Yusuke Iwasawa
Y. Matsuo
LRM ReLM
68
3
0
09 Oct 2024
TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training
Wanchao Liang
Tianyu Liu
Less Wright
Will Constable
Andrew Gu
...
Howard Huang
Junjie Wang
Sanket Purandare
Gokul Nadathur
Stratos Idreos
OffRL
126
19
0
09 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Joey Tianyi Zhou
Tianlong Chen
MoMe MoE
94
2
0
09 Oct 2024
Tackling the Abstraction and Reasoning Corpus with Vision Transformers: the Importance of 2D Representation, Positions, and Objects
Wenhao Li
Yudong Xu
Scott Sanner
Elias Boutros Khalil
ViT
100
5
0
08 Oct 2024
Trajectory Improvement and Reward Learning from Comparative Language Feedback
Zhaojing Yang
Miru Jun
J. Tien
Stuart J. Russell
Anca Dragan
Erdem Bıyık
92
7
0
08 Oct 2024
Language-Assisted Human Part Motion Learning for Skeleton-Based Temporal Action Segmentation
Bowen Chen
Haoyu Ji
Zhiyong Wang
Benjamin Filtjens
C. Wang
Weihong Ren
Bart Vanrumste
Honghai Liu
107
0
0
08 Oct 2024
Manual Verbalizer Enrichment for Few-Shot Text Classification
Quang Anh Nguyen
Nadi Tomeh
M. Lebbah
Thierry Charnois
Hanene Azzag
Santiago Cordoba Muñoz
VLM
88
0
0
08 Oct 2024
Automatic Summarization of Long Documents
Naman Chhibbar
Jugal Kalita
120
0
0
08 Oct 2024
CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept
YuXuan Wu
Bonaventure F. P. Dossou
Dianbo Liu
MU
49
0
0
08 Oct 2024
DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing
June Suk Choi
Kyungmin Lee
Jongheon Jeong
Saining Xie
Jinwoo Shin
Kimin Lee
DiffM AAML
65
4
0
08 Oct 2024
Does RoBERTa Perform Better than BERT in Continual Learning: An Attention Sink Perspective
Xueying Bai
Yifan Sun
Niranjan Balasubramanian
CLL
68
0
0
08 Oct 2024
DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models
Ranchi Zhao
Zhen Leng Thai
Yifan Zhang
Shengding Hu
Yunqi Ba
Jie Zhou
Jie Cai
Zhiyuan Liu
Maosong Sun
147
1
0
08 Oct 2024
Generating Synthetic Datasets for Few-shot Prompt Tuning
Xu Guo
Zilin Du
Boyang Li
Chunyan Miao
95
2
0
08 Oct 2024
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang
Yue Liao
Jianhui Liu
Ruifei He
Haoru Tan
Shiming Zhang
Hongsheng Li
Si Liu
Xiaojuan Qi
MoE
114
4
0
08 Oct 2024
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
Muhammad Jehanzeb Mirza
Mengjie Zhao
Zhuoyuan Mao
Sivan Doveh
Wei Lin
...
Yuki Mitsufuji
Horst Possegger
Rogerio Feris
Leonid Karlinsky
James Glass
VLM
222
1
0
08 Oct 2024
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin
Zhicheng Sun
Ningyuan Li
Kun Xu
K. Xu
...
Nan Zhuang
Quzhe Huang
Yang Song
Yadong Mu
Zhouchen Lin
VGen
168
87
0
08 Oct 2024
PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches
Rana Muhammad Shahroz Khan
Pingzhi Li
Sukwon Yun
Zhenyu Wang
S. Nirjon
Chau-Wai Wong
Tianlong Chen
KELM
120
3
0
08 Oct 2024
Chain and Causal Attention for Efficient Entity Tracking
Erwan Fagnou
Paul Caillon
Blaise Delattre
Alexandre Allauzen
95
5
0
07 Oct 2024
LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Wei Wu
Kecheng Zheng
Shuailei Ma
Fan Lu
Yuxin Guo
Yifei Zhang
Wei Chen
Qingpei Guo
Yujun Shen
Zheng-Jun Zha
VLM
135
9
0
07 Oct 2024
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction
Leheng Li
Weichao Qiu
Xu Yan
Jing He
Kaiqiang Zhou
Yingjie Cai
Qing Lian
Bingbing Liu
Ying-Cong Chen
SyDa DiffM
85
1
0
07 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao
Francesco A. Aprile
Barbara Bravi
77
0
0
07 Oct 2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Chuanyang Zheng
Yihang Gao
Han Shi
Jing Xiong
Jiankai Sun
...
Xiaozhe Ren
Michael Ng
Xin Jiang
Zhenguo Li
Yu Li
83
3
0
07 Oct 2024
Rule-based Data Selection for Large Language Models
Xiaomin Li
Mingye Gao
Zhiwei Zhang
Chang Yue
Hong Hu
75
7
0
07 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
628
1
0
07 Oct 2024
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting
Yong Liu
Guo Qin
Xiangdong Huang
Jianmin Wang
Mingsheng Long
AI4TS
101
12
0
07 Oct 2024
On Evaluating LLMs' Capabilities as Functional Approximators: A Bayesian Perspective
Shoaib Ahmed Siddiqui
Yanzhi Chen
Juyeon Heo
Menglin Xia
Adrian Weller
47
0
0
06 Oct 2024
RevMUX: Data Multiplexing with Reversible Adapters for Efficient LLM Batch Inference
Yige Xu
Xu Guo
Zhiwei Zeng
Chunyan Miao
65
0
0
06 Oct 2024
Realizing Video Summarization from the Path of Language-based Semantic Understanding
Kuan-Chen Mu
Zhi-Yi Chin
Wei-Chen Chiu
49
0
0
06 Oct 2024
Inner-Probe: Discovering Copyright-related Data Generation in LLM Architecture
Qichao Ma
Rui-Jie Zhu
Peiye Liu
Renye Yan
Fahong Zhang
...
Meng Li
Zhaofei Yu
Zongwei Wang
Yimao Cai
Tiejun Huang
80
1
0
06 Oct 2024
A Reflection on the Impact of Misspecifying Unidentifiable Causal Inference Models in Surrogate Endpoint Evaluation
Gokce Deliorman
Florian Stijven
Wim Van der Elst
Maria del Carmen Pardo
Ariel Alonso
CML
87
4
0
06 Oct 2024
An evaluation of LLM code generation capabilities through graded exercises
Álvaro Barbero Jiménez
ELM
66
1
0
06 Oct 2024
Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
Alex Cloud
Jacob Goldman-Wetzler
Evžen Wybitul
Joseph Miller
Alexander Matt Turner
65
3
0
06 Oct 2024
OD-Stega: LLM-Based Near-Imperceptible Steganography via Optimized Distributions
Yu-Shin Huang
Peter Just
Krishna Narayanan
Chao Tian
132
7
0
06 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
199
21
0
06 Oct 2024
Where are we in audio deepfake detection? A systematic analysis over generative and detection models
Xiang Li
Pin-Yu Chen
Wenqi Wei
110
2
0
06 Oct 2024
Regularized Neural Ensemblers
Sebastian Pineda Arango
Maciej Janowski
Lennart Purucker
Arber Zela
Frank Hutter
Josif Grabocka
UQCV
95
0
0
06 Oct 2024
Taylor Unswift: Secured Weight Release for Large Language Models via Taylor Expansion
Guanchu Wang
Yu-Neng Chuang
Ruixiang Tang
Shaochen Zhong
Jiayi Yuan
...
Zirui Liu
Vipin Chaudhary
Shuai Xu
James Caverlee
Helen Zhou
PILM
165
2
0
06 Oct 2024