1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,903 papers shown
Title
Sparse Upcycling: Inference Inefficient Finetuning
Sasha Doubov
Nikhil Sardana
Vitaliy Chiley
MoE
71
1
0
13 Nov 2024
Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle
Hui Dai
Ryan Teehan
Mengye Ren
KELM
AIFin
ELM
47
1
0
13 Nov 2024
Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer's Disease
Francesco Chiumento
Mingming Liu
LM&MA
71
0
0
12 Nov 2024
FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training
Philip Zmushko
Aleksandr Beznosikov
Martin Takáč
Samuel Horváth
78
2
0
12 Nov 2024
ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization
Weibo Zhao
Yubin Shi
Xinyu Lyu
Wanchen Sui
Shen Li
MQ
78
1
0
12 Nov 2024
New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook
Meng Yang
Tianqing Zhu
Chi Liu
Wanlei Zhou
Shui Yu
Philip S. Yu
AAML
ELM
PILM
112
1
0
12 Nov 2024
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
Nvidia:
Yuval Atzmon
Maciej Bala
Yogesh Balaji
...
Ting-Chun Wang
Shuran Song
Fangyin Wei
Yu Zeng
Qinsheng Zhang
89
9
0
11 Nov 2024
Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training
Elia Cunegatti
Leonardo Lucio Custode
Giovanni Iacca
166
0
0
11 Nov 2024
Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition
Yoshiki Masuyama
Koichi Miyazaki
Masato Murata
Mamba
81
0
0
11 Nov 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Yizeng Han
Jiayi Guo
Zhiyuan Liu
Yuan Yao
Gao Huang
105
5
0
11 Nov 2024
Model Fusion through Bayesian Optimization in Language Model Fine-Tuning
Chaeyun Jang
Hyungi Lee
Jungtaek Kim
Juho Lee
MoMe
147
4
0
11 Nov 2024
The Super Weight in Large Language Models
Mengxia Yu
De Wang
Qi Shan
Colorado Reed
Alvin Wan
MQ
MILM
88
13
0
11 Nov 2024
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
Zhennan Chen
Yajie Li
Haofan Wang
Zheyu Chen
Zhengkai Jiang
Jun Yu Li
Qian Wang
Jian Yang
Ying Tai
DiffM
112
9
0
10 Nov 2024
Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation
Yu-Liang Zhan
Zhong-Yi Lu
Hao Sun
Ze-Feng Gao
82
0
0
10 Nov 2024
Robust Detection of LLM-Generated Text: A Comparative Analysis
Yongye Su
Yuqing Wu
DeLMO
78
1
0
09 Nov 2024
BreakGPT: Leveraging Large Language Models for Predicting Asset Price Surges
Aleksandr Simonyan
AI4TS
AIFin
57
0
0
09 Nov 2024
Zyda-2: a 5 Trillion Token High-Quality Dataset
Yury Tokpanov
Paolo Glorioso
Quentin Anthony
Beren Millidge
74
5
0
09 Nov 2024
Sentiment Analysis of Cyberbullying Data in Social Media
Arvapalli Sai Susmitha
Pradeep Pujari
87
0
0
08 Nov 2024
Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024
Christopher Malon
LRM
65
2
0
08 Nov 2024
Improving Multi-Domain Task-Oriented Dialogue System with Offline Reinforcement Learning
Dharmendra Prajapat
Durga Toshniwal
OffRL
55
0
0
08 Nov 2024
Revisiting the Robustness of Watermarking to Paraphrasing Attacks
Saksham Rastogi
Danish Pruthi
WaLM
AAML
84
1
0
08 Nov 2024
Few-Shot Task Learning through Inverse Generative Modeling
Aviv Netanyahu
Yilun Du
Antonia Bronars
Jyothish Pari
J. Tenenbaum
Tianmin Shu
Pulkit Agrawal
142
4
0
07 Nov 2024
TAP-VL: Text Layout-Aware Pre-training for Enriched Vision-Language Models
Jonathan Fhima
Elad Ben Avraham
Oren Nuriel
Yair Kittenplon
Roy Ganz
Aviad Aberdam
Ron Litman
VLM
67
1
0
07 Nov 2024
TrajGPT: Controlled Synthetic Trajectory Generation Using a Multitask Transformer-Based Spatiotemporal Model
Shang-Ling Hsu
Emmanuel Tung
John Krumm
Cyrus Shahabi
Khurram Hassan-Shafique
43
5
0
07 Nov 2024
Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation
Ayan Sengupta
Vaibhav Seth
Arinjay Pathak
Natraj Raman
Sriram Gopalakrishnan
Tanmoy Chakraborty
BDL
75
2
0
07 Nov 2024
Scaling Laws for Precision
Tanishq Kumar
Zachary Ankner
Benjamin Spector
Blake Bordelon
Niklas Muennighoff
Mansheej Paul
Cengiz Pehlevan
Christopher Ré
Aditi Raghunathan
AIFin
MoMe
106
29
0
07 Nov 2024
VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models
Ming Cheng
Jiaying Gong
Chenhan Yuan
William A. Ingram
Edward A. Fox
Hoda Eldardiry
240
1
0
07 Nov 2024
Summarization of Opinionated Political Documents with Varied Perspectives
Nicholas Deas
Kathleen McKeown
63
1
0
06 Nov 2024
Deploying Multi-task Online Server with Large Language Model
Yincen Qu
Chao Ma
Xiangying Dai
Hui Zhou
Yiting Wu
Hengyue Liu
58
0
0
06 Nov 2024
NeurIPS 2023 Competition: Privacy Preserving Federated Learning Document VQA
Marlon Tobaben
Mohamed Ali Souibgui
Rubèn Pérez Tito
Khanh Nguyen
Raouf Kerkouche
...
Josep Lladós
Ernest Valveny
Antti Honkela
Mario Fritz
Dimosthenis Karatzas
FedML
94
0
0
06 Nov 2024
Long-Form Text-to-Music Generation with Adaptive Prompts: A Case Study in Tabletop Role-Playing Games Soundtracks
Felipe Marra
Lucas N. Ferreira
109
0
0
06 Nov 2024
Beemo: Benchmark of Expert-edited Machine-generated Outputs
Ekaterina Artemova
Jason Samuel Lucas
Saranya Venkatraman
Jooyoung Lee
Sergei Tilga
Adaku Uchendu
Vladislav Mikhailov
DeLMO
MoE
158
8
0
06 Nov 2024
Two-Stage Pretraining for Molecular Property Prediction in the Wild
Kevin Tirta Wijaya
Minghao Guo
Michael Sun
Hans-Peter Seidel
Wojciech Matusik
Vahid Babaei
AI4CE
67
0
0
05 Nov 2024
Personalized Video Summarization by Multimodal Video Understanding
Brian Chen
Xiangyuan Zhao
Yingnan Zhu
77
1
0
05 Nov 2024
LASER: Attention with Exponential Transformation
Sai Surya Duvvuri
Inderjit Dhillon
55
1
0
05 Nov 2024
The Future of Intelligent Healthcare: A Systematic Analysis and Discussion on the Integration and Impact of Robots Using Large Language Models for Healthcare
Souren Pashangpour
Goldie Nejat
LM&MA
93
9
0
05 Nov 2024
Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation
Junchen Fu
Xuri Ge
Xin Xin
Alexandros Karatzoglou
Ioannis Arapakis
Kaiwen Zheng
Yongxin Ni
J. Jose
64
3
0
05 Nov 2024
Confidence Calibration of Classifiers with Many Classes
Adrien LeCoz
Stéphane Herbin
Faouzi Adjed
UQCV
82
1
0
05 Nov 2024
Photon: Federated LLM Pre-Training
Lorenzo Sani
Alex Iacob
Zeyu Cao
Royson Lee
Bill Marino
...
Dongqi Cai
Zexi Li
Wanru Zhao
Xinchi Qiu
Nicholas D. Lane
AI4CE
88
9
0
05 Nov 2024
Mixtures of In-Context Learners
Giwon Hong
Emile van Krieken
Edoardo Ponti
Nikolay Malkin
Pasquale Minervini
67
1
0
05 Nov 2024
Language Models and Cycle Consistency for Self-Reflective Machine Translation
Jianqiao Wangni
HILM
LRM
49
0
0
05 Nov 2024
Grounding Natural Language to SQL Translation with Data-Based Self-Explanations
Yuankai Fan
Tonghui Ren
Can Huang
Zhenying He
Xinyu Wang
LRM
130
2
0
05 Nov 2024
From Twitter to Reasoner: Understand Mobility Travel Modes and Sentiment Using Large Language Models
Kangrui Ruan
Xinyang Wang
Xuan Di
85
5
0
04 Nov 2024
Training-free Regional Prompting for Diffusion Transformers
Anthony Chen
Jianjin Xu
Wenzhao Zheng
Gaole Dai
Yun Wang
Renrui Zhang
Haofan Wang
Shanghang Zhang
VLM
102
5
0
04 Nov 2024
SIRA: Scalable Inter-frame Relation and Association for Radar Perception
Ryoma Yataka
Peng Wang
P. Boufounos
R. Takahashi
97
5
0
04 Nov 2024
Training Compute-Optimal Protein Language Models
Xingyi Cheng
Bo Chen
Pan Li
Jing Gong
Jie Tang
Le Song
123
17
0
04 Nov 2024
Model Integrity when Unlearning with T2I Diffusion Models
Andrea Schioppa
Emiel Hoogeboom
Jonathan Heek
140
3
0
04 Nov 2024
Enhancing ID-based Recommendation with Large Language Models
Lei Chen
Chen Gao
Xiaoyi Du
Hengliang Luo
Depeng Jin
Yong Li
Ming Wang
92
3
0
04 Nov 2024
Shortcut Learning in In-Context Learning: A Survey
Rui Song
Yingji Li
Fausto Giunchiglia
Hao Xu
128
3
0
04 Nov 2024
Can Language Models Enable In-Context Database?
Yu Pan
Hongfeng Yu
Tianjiao Zhao
Jianxin Sun
KELM
SyDa
LMTD
67
0
0
04 Nov 2024