1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,959 papers shown
Title
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Tianci Liu
Haoyu Wang
Shiyang Wang
Yu Cheng
Jing Gao
ALM
86
1
0
01 Jun 2024
A Survey on Large Language Models for Code Generation
Juyong Jiang
Fan Wang
Jiasi Shen
Sungju Kim
Sunghun Kim
167
204
0
01 Jun 2024
DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration
Nhi Ngoc-Yen Nguyen
Le-Huy Tu
Dieu-Phuong Nguyen
Nhat-Tan Do
Minh Triet Thai
Bao-Thien Nguyen-Tat
MedIm
86
2
0
01 Jun 2024
Beyond Metrics: Evaluating LLMs' Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios
Millicent Ochieng
Varun Gumma
Sunayana Sitaram
Jindong Wang
Vishrav Chaudhary
K. Ronen
Kalika Bali
Jacki O'Neill
65
4
0
01 Jun 2024
KGLink: A column type annotation method that combines knowledge graph and pre-trained language model
Yubo Wang
Hao Xin
Lei Chen
LMTD
130
3
0
01 Jun 2024
Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning
Sangwon Ryu
Heejin Do
Yunsu Kim
Gary Geunbae Lee
Jungseul Ok
109
3
0
01 Jun 2024
Cross-Table Pretraining towards a Universal Function Space for Heterogeneous Tabular Data
Jintai Chen
Zhen Lin
Qiyuan Chen
Jimeng Sun
LMTD
96
1
0
01 Jun 2024
SNED: Superposition Network Architecture Search for Efficient Video Diffusion Model
Zhengang Li
Yan Kang
Yuchen Liu
Difan Liu
Tobias Hinz
Feng Liu
Yanzhi Wang
DiffM
77
1
0
31 May 2024
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
Jiatao Gu
Ying Shen
Shuangfei Zhai
Yizhe Zhang
Navdeep Jaitly
J. Susskind
127
10
0
31 May 2024
Fast yet Safe: Early-Exiting with Risk Control
Metod Jazbec
Alexander Timans
Tin Hadži Veljković
K. Sakmann
Dan Zhang
C. A. Naesseth
Eric T. Nalisnick
112
5
0
31 May 2024
ABodyBuilder3: Improved and scalable antibody structure predictions
Henry Kenlay
Frédéric A. Dreyer
Daniel Cutting
Daniel A. Nissley
Charlotte M. Deane
43
10
0
31 May 2024
Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning
Cheng Tan
Jingxuan Wei
Linzhuang Sun
Zhangyang Gao
Siyuan Li
Bihui Yu
Ruifeng Guo
Stan Z. Li
ReLM
LRM
3DV
127
7
0
31 May 2024
FinGen: A Dataset for Argument Generation in Finance
Chung-Chi Chen
Hiroya Takamura
Ichiro Kobayashi
Yusuke Miyao
65
0
0
31 May 2024
Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models
Xuyang Wu
Zhiyuan Peng
Sravanthi Rajanala
Hsin-Tai Wu
Yi Fang
LRM
RALM
86
0
0
31 May 2024
GAMedX: Generative AI-based Medical Entity Data Extractor Using Large Language Models
Mohammed-Khalil Ghali
Abdelrahman Farrag
Hajar Sakai
Hicham El Baz
Yu Jin
Sarah Lam
LM&MA
MedIm
88
9
0
31 May 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai
Wu-Jun Li
MQ
125
0
0
31 May 2024
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
Zachary Ankner
Cody Blakeney
Kartik K. Sreenivasan
Max Marion
Matthew L. Leavitt
Mansheej Paul
120
34
0
30 May 2024
RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text
Jiaben Chen
Xin Yan
Yihang Chen
Siyuan Cen
Qinwei Ma
Haoyu Zhen
Kaizhi Qian
Lie Lu
Chuang Gan
70
0
0
30 May 2024
Large Language Models Can Self-Improve At Web Agent Tasks
Ajay Patel
M. Hofmarcher
Claudiu Leoveanu-Condrei
Marius-Constantin Dinu
Chris Callison-Burch
Sepp Hochreiter
LLMAG
122
31
0
30 May 2024
ROAST: Review-level Opinion Aspect Sentiment Target Joint Detection for ABSA
Siva Uday Sampreeth Chebolu
Franck Dernoncourt
Nedim Lipka
Thamar Solorio
82
1
0
30 May 2024
Heidelberg-Boston @ SIGTYP 2024 Shared Task: Enhancing Low-Resource Language Analysis With Character-Aware Hierarchical Transformers
Frederick Riemenschneider
Kevin Krahn
67
2
0
30 May 2024
A Structure-Aware Lane Graph Transformer Model for Vehicle Trajectory Prediction
Zhanbo Sun
Caiyin Dong
Ang Ji
Ruibin Zhao
Yu Zhao
84
0
0
30 May 2024
GenKubeSec: LLM-Based Kubernetes Misconfiguration Detection, Localization, Reasoning, and Remediation
Ehud Malul
Yair Meidan
D. Mimran
Yuval Elovici
A. Shabtai
108
5
0
30 May 2024
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
Jianliang He
Siyu Chen
Fengzhuo Zhang
Zhuoran Yang
LM&Ro
LLMAG
99
3
0
30 May 2024
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
Chaochen Gao
Xing Wu
Qingfang Fu
Songlin Hu
SyDa
120
7
0
30 May 2024
Just Rewrite It Again: A Post-Processing Method for Enhanced Semantic Similarity and Privacy Preservation of Differentially Private Rewritten Text
Stephen Meisenbacher
Florian Matthes
89
4
0
30 May 2024
Instruction-Guided Visual Masking
Jinliang Zheng
Jianxiong Li
Si Cheng
Yinan Zheng
Jiaming Li
Jihao Liu
Yu Liu
Jingjing Liu
Xianyuan Zhan
143
7
0
30 May 2024
Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback
Jingwei Sun
Zhixu Du
Yiran Chen
KELM
62
2
0
30 May 2024
Large Language Model Watermark Stealing With Mixed Integer Programming
Zhaoxi Zhang
Xiaomei Zhang
Yanjun Zhang
Leo Yu Zhang
Chao Chen
Shengshan Hu
Asif Gill
Shirui Pan
AAML
66
7
0
30 May 2024
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models
Yutao Zhu
Zhaoheng Huang
Zhicheng Dou
Ji-Rong Wen
RALM
96
6
0
30 May 2024
SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors
Vijay Lingam
Atula Tejaswi
Aditya Vavre
Aneesh Shetty
Gautham Krishna Gudur
Joydeep Ghosh
Alexandros G. Dimakis
Eunsol Choi
Aleksandar Bojchevski
Sujay Sanghavi
122
18
0
30 May 2024
Cascade-Aware Training of Language Models
Congchao Wang
Sean Augenstein
Keith Rush
Wittawat Jitkrittum
Harikrishna Narasimhan
A. S. Rawat
A. Menon
Alec Go
101
4
0
29 May 2024
X-VILA: Cross-Modality Alignment for Large Language Model
Hanrong Ye
De-An Huang
Yao Lu
Zhiding Yu
Ming-Yu Liu
...
Jan Kautz
Song Han
Dan Xu
Pavlo Molchanov
Hongxu Yin
MLLM
VLM
88
35
0
29 May 2024
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Ge Zhang
Scott Qu
Jiaheng Liu
Chenchen Zhang
Chenghua Lin
...
Zi-Kai Zhao
Jiajun Zhang
Wanli Ouyang
Wenhao Huang
Wenhu Chen
ELM
126
46
0
29 May 2024
Faster Cascades via Speculative Decoding
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
Seungyeon Kim
Neha Gupta
A. Menon
Sanjiv Kumar
LRM
118
10
0
29 May 2024
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Pierre Harvey Richemond
Yunhao Tang
Daniel Guo
Daniele Calandriello
M. G. Azar
...
Gil Shamir
Rishabh Joshi
Tianqi Liu
Rémi Munos
Bilal Piot
OffRL
126
29
0
29 May 2024
Faithful Chart Summarization with ChaTS-Pi
Syrine Krichene
Francesco Piccinno
Fangyu Liu
Julian Martin Eisenschlos
130
2
0
29 May 2024
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Jiaqi Xu
Xinyi Zou
Kunzhe Huang
Yunkuo Chen
Bo Liu
Mengli Cheng
Xing Shi
Jun Huang
VGen
110
51
0
29 May 2024
Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende
Luciano Loris Viteritti
100
7
0
29 May 2024
MoNDE: Mixture of Near-Data Experts for Large-Scale Sparse Models
Taehyun Kim
Kwanseok Choi
Youngmock Cho
Jaehoon Cho
Hyukzae Lee
Jaewoong Sim
MoE
56
6
0
29 May 2024
On the Role of Attention Masks and LayerNorm in Transformers
Xinyi Wu
A. Ajorlou
Yifei Wang
Stefanie Jegelka
Ali Jadbabaie
98
12
0
29 May 2024
Contextual Position Encoding: Learning to Count What's Important
O. Yu. Golovneva
Tianlu Wang
Jason Weston
Sainbayar Sukhbaatar
116
35
0
29 May 2024
Learning from Litigation: Graphs and LLMs for Retrieval and Reasoning in eDiscovery
Sounak Lahiri
Sumit Pai
Tim Weninger
Sanmitra Bhattacharya
AILaw
102
0
0
29 May 2024
MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification
Laura Fieback
Jakob Spiegelberg
Hanno Gottschalk
MLLM
240
5
0
29 May 2024
Understanding Intrinsic Socioeconomic Biases in Large Language Models
Mina Arzaghi
Florian Carichon
G. Farnadi
48
0
0
28 May 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
109
5
0
28 May 2024
Low-rank finetuning for LLMs: A fairness perspective
Saswat Das
Marco Romanelli
Cuong Tran
Zarreen Reza
B. Kailkhura
Ferdinando Fioretto
74
2
0
28 May 2024
Understanding Transformer Reasoning Capabilities via Graph Algorithms
Clayton Sanford
Bahare Fatemi
Ethan Hall
Anton Tsitsulin
Seyed Mehran Kazemi
Jonathan J. Halcrow
Bryan Perozzi
Vahab Mirrokni
114
39
0
28 May 2024
QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation
Gonçalo R. A. Faria
Sweta Agrawal
António Farinhas
Ricardo Rei
José G. C. de Souza
André F. T. Martins
69
5
0
28 May 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele
Elie Bakouch
Atli Kosson
Loubna Ben Allal
Leandro von Werra
Martin Jaggi
127
45
0
28 May 2024