1910.10683
Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,959 papers shown
Title
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
Yixiao Zhang
Yukara Ikemiya
Woosung Choi
Naoki Murata
Marco A. Martínez-Ramírez
Liwei Lin
Gus Xia
Wei-Hsiang Liao
Yuki Mitsufuji
Simon Dixon
111
12
0
28 May 2024
Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning
Phakphum Artkaew
LRM
60
0
0
28 May 2024
VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers
Jun Zheng
Fuwei Zhao
Youjiang Xu
Xin Dong
Xiaodan Liang
VGen
DiffM
71
7
0
28 May 2024
4-bit Shampoo for Memory-Efficient Network Training
Sike Wang
Jia Li
Pan Zhou
Hua Huang
MQ
163
9
0
28 May 2024
Instruction Tuning with Retrieval-based Examples Ranking for Aspect-based Sentiment Analysis
Guangmin Zheng
Jin Wang
Liang-Chih Yu
Xuejie Zhang
73
5
0
28 May 2024
MultiADE: A Multi-domain Benchmark for Adverse Drug Event Extraction
Xiang Dai
Sarvnaz Karimi
Abeed Sarker
Ben Hachey
Cécile Paris
84
3
0
28 May 2024
VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections
Roy Miles
Pradyumna Reddy
Ismail Elezi
Jiankang Deng
VLM
73
7
0
28 May 2024
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Xing Hu
Yuan Cheng
Dawei Yang
Zhihang Yuan
Jiangyong Yu
Chen Xu
Sifan Zhou
MQ
86
8
0
28 May 2024
Detection-Correction Structure via General Language Model for Grammatical Error Correction
Wei Li
Houfeng Wang
107
6
0
28 May 2024
Exploring Activation Patterns of Parameters in Language Models
Yudong Wang
Damai Dai
Zhifang Sui
54
2
0
28 May 2024
XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference
Shengnan Wang
Youhui Bai
Lin Zhang
Pingyi Zhou
Shixiong Zhao
Gong Zhang
Sen Wang
Renhai Chen
Hua Xu
Hongwei Sun
128
5
0
28 May 2024
C$^{3}$Bench: A Comprehensive Classical Chinese Understanding Benchmark for Large Language Models
Jiahuan Cao
Yongxin Shi
Dezhi Peng
Yang Liu
Lianwen Jin
ELM
77
0
0
28 May 2024
MockLLM: A Multi-Agent Behavior Collaboration Framework for Online Job Seeking and Recruiting
Hongda Sun
Hongzhan Lin
Haiyu Yan
Chen Zhu
Yang Song
Xin Gao
82
8
0
28 May 2024
Generative Query Reformulation Using Ensemble Prompting, Document Fusion, and Relevance Feedback
Kaustubh D. Dhole
Ramraj Chandradevan
Eugene Agichtein
71
1
0
27 May 2024
QUB-Cirdan at "Discharge Me!": Zero shot discharge letter generation by open-source LLM
Rui Guo
Greg Farnan
Niall McLaughlin
Barry Devereux
47
4
0
27 May 2024
Transformers Can Do Arithmetic with the Right Embeddings
Sean McLeish
Arpit Bansal
Alex Stein
Neel Jain
John Kirchenbauer
...
B. Kailkhura
A. Bhatele
Jonas Geiping
Avi Schwarzschild
Tom Goldstein
83
37
0
27 May 2024
DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution
Yulong Mao
Kaiyu Huang
Changhao Guan
Ganglin Bao
Fengran Mo
Jinan Xu
98
17
0
27 May 2024
Unisolver: PDE-Conditional Transformers Are Universal PDE Solvers
Zhou Hang
Yuezhou Ma
Haixu Wu
Haowen Wang
Mingsheng Long
AI4CE
83
11
0
27 May 2024
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs
Haoyu Wang
Bei Liu
Hang Shao
Bo Xiao
Ke Zeng
Guanglu Wan
Yanmin Qian
MQ
55
1
0
27 May 2024
Empowering Character-level Text Infilling by Eliminating Sub-Tokens
Houxing Ren
Mingjie Zhan
Zhongyuan Wu
Hongsheng Li
AI4CE
72
1
0
27 May 2024
Leveraging small language models for Text2SPARQL tasks to improve the resilience of AI assistance
Felix Brei
Johannes Frey
Lars-Peter Meyer
63
13
0
27 May 2024
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation
Chengxing Jia
Pengyuan Wang
Ziniu Li
Yi-Chen Li
Zhilong Zhang
Nan Tang
Yang Yu
OffRL
64
2
0
27 May 2024
Position: Foundation Agents as the Paradigm Shift for Decision Making
Xiaoqian Liu
Xingzhou Lou
Jianbin Jiao
Junge Zhang
OffRL
LLMAG
105
7
0
27 May 2024
Recent advances in text embedding: A Comprehensive Review of Top-Performing Methods on the MTEB Benchmark
Hongliu Cao
AI4TS
116
15
0
27 May 2024
UIT-DarkCow team at ImageCLEFmedical Caption 2024: Diagnostic Captioning for Radiology Images Efficiency with Transformer Models
Quan Van Nguyen
Huy Quang Pham
Dan Quang Tran
Thang Kien-Bao Nguyen
Nhat-Hao Nguyen-Dang
Bao-Thien Nguyen-Tat
MedIm
74
2
0
27 May 2024
Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks
Yunqi Zhang
Songda Li
Chunyuan Deng
Luyi Wang
Hui Zhao
133
0
0
27 May 2024
Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models
C. N. Vasconcelos
Abdullah Rashwan
Austin Waters
Trevor Walker
Keyang Xu
Jimmy Yan
...
Wenlei Zhou
Kevin Swersky
David J. Fleet
Jason Baldridge
Oliver Wang
122
3
0
27 May 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
312
205
0
27 May 2024
A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
Kai Wang
Yukun Zhou
Mingjia Shi
Zhihang Yuan
Yuzhang Shang
Hanwang Zhang
Yang You
166
14
0
27 May 2024
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation
Houxing Ren
Mingjie Zhan
Zhongyuan Wu
Aojun Zhou
Junting Pan
Hongsheng Li
SyDa
134
7
0
27 May 2024
ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance
Jiannan Huang
Jun Hao Liew
Hanshu Yan
Yuyang Yin
Yao Zhao
Yunchao Wei
DiffM
209
7
0
27 May 2024
Zamba: A Compact 7B SSM Hybrid Model
Paolo Glorioso
Quentin G. Anthony
Yury Tokpanov
James Whittington
Jonathan Pilault
Adam Ibrahim
Beren Millidge
91
49
0
26 May 2024
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
152
45
0
26 May 2024
A Preliminary Empirical Study on Prompt-based Unsupervised Keyphrase Extraction
Mingyang Song
Yi Feng
Liping Jing
93
2
0
26 May 2024
Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration
Sunhao Dai
Weihao Liu
Yuqi Zhou
Liang Pang
Rongju Ruan
Gang Wang
Zhenhua Dong
Jun Xu
Jirong Wen
131
12
0
26 May 2024
User-Friendly Customized Generation with Multi-Modal Prompts
Linhao Zhong
Yan Hong
Wentao Chen
Binglin Zhou
Yiyi Zhang
Jianfu Zhang
Liqing Zhang
DiffM
82
1
0
26 May 2024
M$^{3}$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation
Mingshuang Luo
Ruibing Hou
Hong Chang
Zimo Liu
Yaowei Wang
Shiguang Shan
92
10
0
25 May 2024
Uncovering LLM-Generated Code: A Zero-Shot Synthetic Code Detector via Code Rewriting
Tong Ye
Yangkai Du
Tengfei Ma
Lingfei Wu
Xuhong Zhang
Shouling Ji
Wenhai Wang
DeLMO
80
11
0
25 May 2024
How Well Do Deep Learning Models Capture Human Concepts? The Case of the Typicality Effect
Siddhartha K. Vemuri
Raj Sanjay Shah
Sashank Varma
VLM
80
5
0
25 May 2024
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
Xudong Lu
Aojun Zhou
Yuhui Xu
Renrui Zhang
Peng Gao
Hongsheng Li
79
8
0
25 May 2024
MoEUT: Mixture-of-Experts Universal Transformers
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
Christopher Potts
Christopher D. Manning
MoE
88
11
0
25 May 2024
Streaming Long Video Understanding with Large Language Models
Rui Qian
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Shuangrui Ding
Dahua Lin
Jiaqi Wang
VLM
142
49
0
25 May 2024
Optimizing Large Language Models for OpenAPI Code Completion
Bohdan Petryshyn
M. Lukoševičius
LLMAG
ALM
74
0
0
24 May 2024
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon
Hamza Tahir Chaudhry
Cengiz Pehlevan
AI4CE
125
14
0
24 May 2024
The Impact of Geometric Complexity on Neural Collapse in Transfer Learning
Michael Munn
Benoit Dherin
Javier Gonzalvo
AAML
85
2
0
24 May 2024
GECKO: Generative Language Model for English, Code and Korean
Sungwoo Oh
Donggyu Kim
VLM
82
0
0
24 May 2024
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach
Huy V. Vo
Vasil Khalidov
Timothée Darcet
Théo Moutakanni
Nikita Smetanin
...
Maxime Oquab
Armand Joulin
Hervé Jégou
Patrick Labatut
Piotr Bojanowski
SSL
171
23
0
24 May 2024
ChatGPT Code Detection: Techniques for Uncovering the Source of Code
Marc Oedingen
Raphael C. Engelhardt
Robin Denz
Maximilian Hammer
Wolfgang Konen
DeLMO
100
9
0
24 May 2024
Language-Driven Interactive Traffic Trajectory Generation
Junkai Xia
Chenxin Xu
Qingyao Xu
Chen Xie
Yanfeng Wang
Siheng Chen
99
12
0
24 May 2024
BiSup: Bidirectional Quantization Error Suppression for Large Language Models
Minghui Zou
Ronghui Guo
Sai Zhang
Xiaowang Zhang
Zhiyong Feng
MQ
85
1
0
24 May 2024