1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,942 papers shown
Title
Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
Jinrui Zhang
Teng Wang
Haigang Zhang
Ping Lu
Feng Zheng
MLLM
LRM
VLM
92
4
0
16 Jul 2024
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
Jung Hyun Lee
Jeonghoon Kim
J. Yang
S. Kwon
Eunho Yang
Kang Min Yoo
Dongsoo Lee
MQ
143
3
0
16 Jul 2024
OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
Zijian Zhou
Zheng Zhu
Holger Caesar
Miaojing Shi
VLM
100
3
0
15 Jul 2024
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Hongyu Wang
Shuming Ma
Ruiping Wang
Furu Wei
MoE
88
13
0
15 Jul 2024
Benchmarking Vision Language Models for Cultural Understanding
Shravan Nayak
Kanishk Jain
Rabiul Awal
Siva Reddy
Sjoerd van Steenkiste
Lisa Anne Hendricks
Karolina Stańczak
Aishwarya Agrawal
VLM
CoGe
124
38
0
15 Jul 2024
GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework
Hannah Sansford
Nicholas Richardson
Hermina Petric Maretic
Juba Nait Saada
89
17
0
15 Jul 2024
TCM-FTP: Fine-Tuning Large Language Models for Herbal Prescription Prediction
Xingzhi Zhou
Xin Dong
Chunhao Li
Yuning Bai
Yulong Xu
...
Simon See
Xinpeng Song
Runshun Zhang
Xuezhong Zhou
Nevin L. Zhang
LM&MA
69
5
0
15 Jul 2024
SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation
Jordan Juravsky
Yunrong Guo
Sanja Fidler
Xue Bin Peng
AI4CE
77
11
0
15 Jul 2024
DeepGate3: Towards Scalable Circuit Representation Learning
Zhengyuan Shi
Ziyang Zheng
Sadaf Khan
Jianyuan Zhong
Min Li
Qiang Xu
GNN
AI4CE
98
11
0
15 Jul 2024
Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model
Xunyu Zhu
Jian Li
Can Ma
Weiping Wang
LRM
46
0
0
14 Jul 2024
Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation
Ge Gao
Jongin Kim
Sejin Paik
Ekaterina Novozhilova
Yi Liu
Sarah Bonna
Margrit Betke
Derry Wijaya
91
1
0
14 Jul 2024
AutoGRAMS: Autonomous Graphical Agent Modeling Software
Ben Krause
Lucia Chen
Emmanuel Kahembwe
72
1
0
14 Jul 2024
STGFormer: Spatio-Temporal GraphFormer for 3D Human Pose Estimation in Video
Yang Liu
Zhiyong Zhang
3DH
153
0
0
14 Jul 2024
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
Sukjun Hwang
Aakash Lahoti
Tri Dao
Albert Gu
Mamba
134
16
0
13 Jul 2024
An Autonomous GIS Agent Framework for Geospatial Data Retrieval
H. Ning
Zhenlong Li
Temitope Akinboyewa
M. Lessani
78
13
0
13 Jul 2024
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts
Zhenpeng Su
Zijia Lin
Xue Bai
Xing Wu
Yizhe Xiong
...
Guangyuan Ma
Hui Chen
Guiguang Ding
Wei Zhou
Songlin Hu
MoE
96
5
0
13 Jul 2024
Human-like Episodic Memory for Infinite Context LLMs
Zafeirios Fountas
Martin A Benfeghoul
Adnan Oomerjee
Fenia Christopoulou
Gerasimos Lampouras
Haitham Bou-Ammar
Jun Wang
93
21
0
12 Jul 2024
Instruction Following with Goal-Conditioned Reinforcement Learning in Virtual Environments
Zoya Volovikova
A. Skrynnik
Petr Kuderov
Aleksandr I. Panov
LLMAG
LM&Ro
88
1
0
12 Jul 2024
The Sociolinguistic Foundations of Language Modeling
Jack Grieve
Sara Bartl
Matteo Fuoli
Jason Grafmiller
Weihang Huang
A. Jawerbaum
Akira Murakami
Marcus Perlman
Dana Roemling
Bodo Winter
105
12
0
12 Jul 2024
A Survey on Symbolic Knowledge Distillation of Large Language Models
Kamal Acharya
Alvaro Velasquez
Haoze Song
SyDa
78
7
0
12 Jul 2024
Does Incomplete Syntax Influence Korean Language Model? Focusing on Word Order and Case Markers
Jong Myoung Kim
Young-Jun Lee
Yong-jin Han
Sangkeun Jung
Ho-Jin Choi
72
2
0
12 Jul 2024
Exploring the Effectiveness of Methods for Persona Extraction
Konstantin Zaitsev
61
0
0
12 Jul 2024
Domain-adaptive Video Deblurring via Test-time Blurring
Jin-Ting He
Fu-Jen Tsai
Jia-Hao Wu
Yan-Tsung Peng
Chung-Chi Tsai
Chia-Wen Lin
Yen-Yu Lin
107
1
0
12 Jul 2024
Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification
Ke Ji
Peng Wang
Wenjun Ke
Guozheng Li
Jiajun Liu
Jingsheng Gao
Ziyu Shang
BDL
84
2
0
12 Jul 2024
Bora: Biomedical Generalist Video Generation Model
Weixiang Sun
Xiaocao You
Ruizhe Zheng
Zhengqing Yuan
Xiang Li
Lifang He
Quanzheng Li
Lichao Sun
VGen
MedIm
86
9
0
12 Jul 2024
Surgical Text-to-Image Generation
C. Nwoye
Rupak Bose
K. Elgohary
Lorenzo Arboit
Giorgio Carlino
Joël L. Lavanchy
Pietro Mascagni
N. Padoy
MedIm
153
4
0
12 Jul 2024
AirSketch: Generative Motion to Sketch
Hui Xian Grace Lim
Xuanming Cui
Yogesh S Rawat
Ser-Nam Lim
DiffM
VGen
79
0
0
12 Jul 2024
Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts
Zeliang Zhang
Xiaodong Liu
Hao Cheng
Chenliang Xu
Jianfeng Gao
MoE
163
11
0
12 Jul 2024
Characterizing Prompt Compression Methods for Long Context Inference
Siddharth Jha
Lutfi Eren Erdogan
Sehoon Kim
Kurt Keutzer
A. Gholami
119
7
0
11 Jul 2024
Automatic Pruning of Fine-tuning Datasets for Transformer-based Language Models
Mohammadreza Tayaranian
S. H. Mozafari
Brett H. Meyer
J. Clark
Warren J. Gross
73
1
0
11 Jul 2024
GPT-4 is judged more human than humans in displaced and inverted Turing tests
Ishika Rathi
Sydney Taylor
Benjamin K. Bergen
Cameron R. Jones
DeLMO
55
6
0
11 Jul 2024
UICrit: Enhancing Automated Design Evaluation with a UICritique Dataset
Peitong Duan
Chin-Yi Cheng
Gang Li
Bjoern Hartmann
Yang Li
114
9
0
11 Jul 2024
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Valentin Hofmann
Tomasz Limisiewicz
Yulia Tsvetkov
Noah A. Smith
105
5
0
11 Jul 2024
Extracting Training Data from Document-Based VQA Models
Francesco Pinto
N. Rauschmayr
F. Tramèr
Philip Torr
Federico Tombari
92
6
0
11 Jul 2024
Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents
Haoyi Xiong
Zhiyuan Wang
Xuhong Li
Jiang Bian
Zeke Xie
Shahid Mumtaz
Laura E. Barnes
LLMAG
134
8
0
11 Jul 2024
HDT: Hierarchical Document Transformer
Haoyu He
Markus Flicke
Jan Buchmann
Iryna Gurevych
Andreas Geiger
88
0
0
11 Jul 2024
Intelligent Multi-Document Summarisation for Extracting Insights on Racial Inequalities from Maternity Incident Investigation Reports
Georgina Cosma
Mohit Kumar Singh
Patrick Waterson
G. T. Jun
Jonathan Back
52
0
0
11 Jul 2024
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
Xiaotong Li
Fan Zhang
Haiwen Diao
Yueze Wang
Xinlong Wang
Ling-yu Duan
VLM
124
32
0
11 Jul 2024
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Zhenyu Zhang
Ajay Jaiswal
L. Yin
Shiwei Liu
Jiawei Zhao
Yuandong Tian
Zhangyang Wang
VLM
77
23
0
11 Jul 2024
Privacy-Preserving Data Deduplication for Enhancing Federated Learning of Language Models
Aydin Abadi
Vishnu Asutosh Dasu
Sumanta Sarkar
83
3
0
11 Jul 2024
Learning Program Behavioral Models from Synthesized Input-Output Pairs
Tural Mammadov
Dietrich Klakow
Alexander Koller
Andreas Zeller
99
3
0
11 Jul 2024
Uncovering Layer-Dependent Activation Sparsity Patterns in ReLU Transformers
Cody Wild
Jesper Anderson
MoE
65
0
0
10 Jul 2024
Fine-Tuning Large Language Models with User-Level Differential Privacy
Zachary Charles
Arun Ganesh
Ryan McKenna
H. B. McMahan
Nicole Mitchell
Krishna Pillutla
Keith Rush
88
14
0
10 Jul 2024
A Review of the Challenges with Massive Web-mined Corpora Used in Large Language Models Pre-Training
Michał Perełkiewicz
Rafał Poświata
81
3
0
10 Jul 2024
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Haiwen Diao
Bo Wan
Xu Jia
Yunzhi Zhuge
Ying Zhang
Huchuan Lu
Long Chen
VLM
95
4
0
10 Jul 2024
Secondary Structure-Guided Novel Protein Sequence Generation with Latent Graph Diffusion
Yutong Hu
Y. Tan
Andi Han
Lirong Zheng
Liang Hong
Bingxin Zhou
DiffM
83
2
0
10 Jul 2024
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
K. Kenthapadi
M. Sameki
Ankur Taly
HILM
ELM
AILaw
83
15
0
10 Jul 2024
Deconstructing What Makes a Good Optimizer for Language Models
Rosie Zhao
Depen Morwani
David Brandfonbrener
Nikhil Vyas
Sham Kakade
152
25
0
10 Jul 2024
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
Wentao Zhang
Junliang Guo
Tianyu He
Li Zhao
Linli Xu
Jiang Bian
120
4
0
10 Jul 2024
Training on the Test Task Confounds Evaluation and Emergence
Ricardo Dominguez-Olmedo
Florian E. Dorner
Moritz Hardt
ELM
164
9
1
10 Jul 2024