1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,942 papers shown
Title
Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities
Shaltiel Shmidman
Avi Shmidman
Amir DN Cohen
Moshe Koppel
76
3
0
09 Jul 2024
Learn and Don't Forget: Adding a New Language to ASR Foundation Models
Mengjie Qian
Siyuan Tang
Rao Ma
Kate Knill
Mark Gales
CLL
85
7
0
09 Jul 2024
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
Rulin Shao
Jacqueline He
Akari Asai
Weijia Shi
Tim Dettmers
Sewon Min
Luke Zettlemoyer
Pang Wei Koh
RALM
102
26
0
09 Jul 2024
Entropy Law: The Story Behind Data Compression and LLM Performance
Mingjia Yin
Chuhan Wu
Yufei Wang
Hao Wang
Wei Guo
Yasheng Wang
Yong Liu
Ruiming Tang
Defu Lian
Enhong Chen
125
27
0
09 Jul 2024
Combining Knowledge Graphs and Large Language Models
Amanda Kau
Xuzeng He
Aishwarya Nambissan
Aland Astudillo
Hui Yin
Amir Aryani
98
17
0
09 Jul 2024
Automated Justification Production for Claim Veracity in Fact Checking: A Survey on Architectures and Approaches
Islam Eldifrawi
Shengrui Wang
Amine Trabelsi
87
9
0
09 Jul 2024
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
Guanqiao Qu
Qiyuan Chen
Wei Wei
Zheng Lin
Xianhao Chen
Kaibin Huang
185
56
0
09 Jul 2024
Data, Data Everywhere: A Guide for Pretraining Dataset Construction
Jupinder Parmar
Shrimai Prabhumoye
Pritam Gundecha
Bo Liu
Aastha Jhunjhunwala
Zhilin Wang
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
126
10
0
08 Jul 2024
Noise-Free Explanation for Driving Action Prediction
Hongbo Zhu
Theodor Wulff
R. S. Maharjan
Jinpei Han
Angelo Cangelosi
AAML
FAtt
64
0
0
08 Jul 2024
JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation
Yu Zeng
Vishal M. Patel
Haochen Wang
Xun Huang
Ting-Chun Wang
Xuan Li
Yogesh Balaji
DiffM
73
23
0
08 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
119
27
0
08 Jul 2024
Vision-Braille: An End-to-End Tool for Chinese Braille Image-to-Text Translation
Alan Wu
Ye Yuan
Ming Zhang
37
0
0
08 Jul 2024
Large Language Models Understand Layout
Weiming Li
Manni Duan
Dong An
Yan Shao
89
3
0
08 Jul 2024
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
136
19
0
08 Jul 2024
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang
Yiwen Hu
Bingqian Li
Wenyang Luo
Zijing Qin
...
Chunxuan Xia
Junyi Li
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
65
2
0
08 Jul 2024
LEVOS: Leveraging Vocabulary Overlap with Sanskrit to Generate Technical Lexicons in Indian Languages
Karthika N J
Krishnakant Bhatt
Ganesh Ramakrishnan
Preethi Jyothi
85
1
0
08 Jul 2024
The infrastructure powering IBM's Gen AI model development
Talia Gershon
Seetharami Seelam
Brian M. Belgodere
Milton Bonilla
Lan Hoang
...
Ruchir Puri
Dakshi Agrawal
Drew Thorstensen
Joel Belog
Brent Tang
VLM
108
6
0
07 Jul 2024
See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition
Chongjie Si
Xiaokang Yang
Wei Shen
84
8
0
07 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
102
19
0
06 Jul 2024
Recent Advancements and Challenges of Turkic Central Asian Language Processing
Yana Veitsman
75
2
0
06 Jul 2024
LoRA-GA: Low-Rank Adaptation with Gradient Approximation
Shaowen Wang
Linxi Yu
Jian Li
ALM
AI4CE
115
47
0
06 Jul 2024
TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs
Cheng Wang
Xinyang Lu
Szu Hui Ng
Bryan Kian Hsiang Low
74
0
0
06 Jul 2024
Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression
Zhichao Xu
Ashim Gupta
Tao Li
Oliver Bentham
Vivek Srikumar
111
13
0
06 Jul 2024
Looking into Black Box Code Language Models
Muhammad Umair Haider
Umar Farooq
A.B. Siddique
Mark Marron
92
3
0
05 Jul 2024
Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs
Mihir Parmar
Hanieh Deilamsalehy
Franck Dernoncourt
Seunghyun Yoon
Ryan Rossi
Trung Bui
99
2
0
05 Jul 2024
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
Sungkyun Chang
Emmanouil Benetos
Holger Kirchhoff
Simon Dixon
96
3
0
05 Jul 2024
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge
Yuanze Lin
Yunsheng Li
Dongdong Chen
Weijian Xu
Ronald Clark
Philip Torr
Lu Yuan
LRM
VLM
81
8
0
05 Jul 2024
Strengthening Structural Inductive Biases by Pre-training to Perform Syntactic Transformations
Matthias Lindemann
Alexander Koller
Ivan Titov
AI4CE
NAI
78
4
0
05 Jul 2024
Leveraging Graph Structures to Detect Hallucinations in Large Language Models
Noa Nonkes
Sergei Agaronian
Evangelos Kanoulas
Roxana Petcu
55
1
0
05 Jul 2024
Waterfall: Framework for Robust and Scalable Text Watermarking
Gregory Kang Ruey Lau
Xinyuan Niu
Hieu Dao
Jiangwei Chen
Chuan-Sheng Foo
Bryan Kian Hsiang Low
WaLM
82
6
0
05 Jul 2024
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han
Qifan Wang
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Yi Fang
Qiang Guan
Lifu Huang
Dongfang Liu
VLM
86
5
0
05 Jul 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
Li Du
Guoqi Li
Jiajun Zhang
170
6
0
05 Jul 2024
Mixture of A Million Experts
Xu Owen He
MoE
105
33
0
04 Jul 2024
Hadamard Adapter: An Extreme Parameter-Efficient Adapter Tuning Method for Pre-trained Language Models
Yuyan Chen
Qiang Fu
Ge Fan
Lun Du
Jian-Guang Lou
Shi Han
Dongmei Zhang
Zhixu Li
Yanghua Xiao
MoE
81
18
0
04 Jul 2024
Can Pre-trained Language Models Understand Chinese Humor?
Yuyan Chen
Zhixu Li
Jiaqing Liang
Yanghua Xiao
Bang Liu
Yunwen Chen
89
23
0
04 Jul 2024
Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models
Vorakit Vorakitphan
Milos Basic
Guilhaume Leroy-Meline
35
1
0
04 Jul 2024
Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks
Amit Parekh
Nikolas Vitsakis
Alessandro Suglia
Ioannis Konstas
AAML
94
6
0
04 Jul 2024
Meta-prompting Optimized Retrieval-augmented Generation
João Rodrigues
António Branco
RALM
89
0
0
04 Jul 2024
Uncertainty-Guided Optimization on Large Language Model Search Trees
Julia Grosse
Ruotian Wu
Ahmad Rashid
Philipp Hennig
Pascal Poupart
Agustinus Kristiadi
109
3
0
04 Jul 2024
TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language Models
Jiahuan Cao
Dezhi Peng
Peirong Zhang
Yongxin Shi
Yang Liu
Kai Ding
Lianwen Jin
53
1
0
04 Jul 2024
On the Benchmarking of LLMs for Open-Domain Dialogue Evaluation
John Mendonça
A. Lavie
Isabel Trancoso
ELM
55
3
0
04 Jul 2024
Functional Faithfulness in the Wild: Circuit Discovery with Differentiable Computation Graph Pruning
Lei Yu
Jingcheng Niu
Zining Zhu
Gerald Penn
84
7
0
04 Jul 2024
Text2TimeSeries: Enhancing Financial Forecasting through Time Series Prediction Updates with Event-Driven Insights from Large Language Models
Litton J. Kurisinkel
Pruthwik Mishra
Yue Zhang
101
5
0
04 Jul 2024
DSLR: Document Refinement with Sentence-Level Re-ranking and Reconstruction to Enhance Retrieval-Augmented Generation
Taeho Hwang
Soyeong Jeong
Sukmin Cho
SeungYoon Han
Jong C. Park
RALM
126
1
0
04 Jul 2024
MSfusion: A Dynamic Model Splitting Approach for Resource-Constrained Machines to Collaboratively Train Larger Models
Jin Xie
Songze Li
FedML
90
0
0
04 Jul 2024
Universal Length Generalization with Turing Programs
Kaiying Hou
David Brandfonbrener
Sham Kakade
Samy Jelassi
Eran Malach
121
11
0
03 Jul 2024
Single Character Perturbations Break LLM Alignment
Leon Lin
Hannah Brown
Kenji Kawaguchi
Michael Shieh
AAML
429
2
0
03 Jul 2024
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment
Janghwan Lee
Seongmin Park
S. Hong
Minsoo Kim
Du-Seong Chang
Jungwook Choi
44
6
0
03 Jul 2024
SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
Lukas Mauch
Marzieh Edraki
Aaron Courville
OODD
CLL
VLM
131
5
0
03 Jul 2024
LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models
Hayder Elesedy
Pedro M. Esperança
Silviu Vlad Oprea
Mete Ozay
KELM
94
4
0
03 Jul 2024