1910.10683
Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,923 papers shown
Title
Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models
Angela Lopez-Cardona
Carlos Segura
Alexandros Karatzoglou
Sergi Abadal
Ioannis Arapakis
ALM
175
4
0
02 Oct 2024
ENTP: Encoder-only Next Token Prediction
Ethan Ewer
Daewon Chae
Thomas Zeng
Jinkyu Kim
Kangwook Lee
112
4
0
02 Oct 2024
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Jiyeon Kim
Hyunji Lee
Hyowon Cho
Joel Jang
Hyeonbin Hwang
Seungpil Won
Youbin Ahn
Dohaeng Lee
Minjoon Seo
KELM
422
5
0
02 Oct 2024
House of Cards: Massive Weights in LLMs
Jaehoon Oh
Seungjun Shin
Dokwan Oh
125
1
0
02 Oct 2024
A Watermark for Black-Box Language Models
Dara Bahri
John Wieting
WaLM
151
6
0
02 Oct 2024
Do Music Generation Models Encode Music Theory?
Megan Wei
Michael Freeman
Chris Donahue
Chen Sun
MGen
68
6
0
01 Oct 2024
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
Keivan Alizadeh
Iman Mirzadeh
Hooman Shahrokhi
Dmitry Belenko
Frank Sun
Minsik Cho
Mohammad Hossein Sekhavat
Moin Nabi
Mehrdad Farajtabar
MoE
92
1
0
01 Oct 2024
AlignSum: Data Pyramid Hierarchical Fine-tuning for Aligning with Human Summarization Preference
Yang Han
Yiming Wang
Rui Wang
Lu Chen
Kai Yu
AI4TS
ALM
59
2
0
01 Oct 2024
FedPT: Federated Proxy-Tuning of Large Language Models on Resource-Constrained Edge Devices
Zhidong Gao
Yu Zhang
Zhenxiao Zhang
Yanmin Gong
Yuanxiong Guo
69
1
0
01 Oct 2024
Self-controller: Controlling LLMs with Multi-round Step-by-step Self-awareness
Xiao Peng
Xufan Geng
LLMAG
92
1
0
01 Oct 2024
Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces
Lilac Atassi
84
0
0
01 Oct 2024
Semantic Parsing with Candidate Expressions for Knowledge Base Question Answering
Daehwan Nam
Gary Geunbae Lee
121
0
0
01 Oct 2024
Causal Representation Learning with Generative Artificial Intelligence: Application to Texts as Treatments
Kosuke Imai
Kentaro Nakamura
CML
131
6
0
01 Oct 2024
Delving Deep into Engagement Prediction of Short Videos
Dasong Li
Wenjie Li
Baili Lu
Hongsheng Li
Sizhuo Ma
Gurunandan Krishnan
Jian Wang
106
0
0
30 Sep 2024
Comprehensive Performance Modeling and System Design Insights for Foundation Models
Shashank Subramanian
Ermal Rrapaj
Peter Harrington
Smeet Chheda
S. Farrell
Brian Austin
Samuel Williams
N. Wright
W. Bhimji
85
0
0
30 Sep 2024
ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer
Zhen Han
Zeyinzi Jiang
Yulin Pan
Jingfeng Zhang
Chaojie Mao
Chenwei Xie
Yu Liu
Jingren Zhou
DiffM
102
21
0
30 Sep 2024
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
Lirui Wang
Xinlei Chen
Jialiang Zhao
Kaiming He
73
44
0
30 Sep 2024
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs
Mehdi Ali
Michael Fromm
Klaudia Thellmann
Jan Ebert
Alexander Arno Weber
...
René Jäkel
Georg Rehm
Stefan Kesselheim
Joachim Köhler
Nicolas Flores-Herr
102
7
0
30 Sep 2024
Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation
Shan Chen
Mingye Gao
Kuleen Sasse
Thomas Hartvigsen
Brian Anthony
Lizhou Fan
Hugo J. W. L. Aerts
Jack Gallifant
Danielle S. Bitterman
LM&MA
86
1
0
30 Sep 2024
The Perfect Blend: Redefining RLHF with Mixture of Judges
Tengyu Xu
Eryk Helenowski
Karthik Abinav Sankararaman
Di Jin
Kaiyan Peng
...
Gabriel Cohen
Yuandong Tian
Hao Ma
Sinong Wang
Han Fang
142
14
0
30 Sep 2024
Analysing Zero-Shot Readability-Controlled Sentence Simplification
Abdullah Barayan
Jose Camacho-Collados
Fernando Alva-Manchego
83
3
0
30 Sep 2024
Aggressive Post-Training Compression on Extremely Large Language Models
Zining Zhang
Yao Chen
Bingsheng He
Zhenjie Zhang
38
0
0
30 Sep 2024
Beyond Scores: A Modular RAG-Based System for Automatic Short Answer Scoring with Feedback
Menna Fateen
Bo Wang
Tsunenori Mine
AI4Ed
97
6
0
30 Sep 2024
Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges
Qin Liu
Wenjie Mo
Terry Tong
Lyne Tchapmi
Fei Wang
Chaowei Xiao
Muhao Chen
AAML
94
4
0
30 Sep 2024
CONTESTS: a Framework for Consistency Testing of Span Probabilities in Language Models
Eitan Wagner
Yuli Slavutsky
Omri Abend
78
1
0
30 Sep 2024
Enhancing High-order Interaction Awareness in LLM-based Recommender Model
Xinfeng Wang
Jin Cui
Fumiyo Fukumoto
Yoshimi Suzuki
65
4
0
30 Sep 2024
Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
Chenyi Zhuang
Ying Hu
Pan Gao
DiffM
VLM
113
11
0
30 Sep 2024
JaPOC: Japanese Post-OCR Correction Benchmark using Vouchers
Masato Fujitake
70
0
0
30 Sep 2024
Illustrious: an Open Advanced Illustration Model
Sang Hyun Park
Jun Young Koh
Junha Lee
Joy Song
Dongha Kim
Hoyeon Moon
Hyunju Lee
Min Song
VLM
56
1
0
30 Sep 2024
MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation
Wenchao Chen
Liqiang Niu
Ziyao Lu
Fandong Meng
Jie Zhou
Mamba
96
4
0
30 Sep 2024
Realtime, multimodal invasive ventilation risk monitoring using language models and BoXHED
Arash Pakbin
Aaron Su
Donald K. K. Lee
B. Mortazavi
47
0
0
29 Sep 2024
Natural Language Generation for Visualizations: State of the Art, Challenges and Future Directions
Enamul Hoque
Mohammed Saidul Islam
75
3
0
29 Sep 2024
A Certified Robust Watermark For Large Language Models
Xianheng Feng
Jian Liu
Kui Ren
Chun Chen
AAML
WaLM
77
0
0
29 Sep 2024
CERD: A Comprehensive Chinese Rhetoric Dataset for Rhetorical Understanding and Generation in Essays
Nuowei Liu
Xinhao Chen
Hongyi Wu
Changzhi Sun
Man Lan
Yuanbin Wu
Xiaopeng Bai
Shaoguang Mao
Yan Xia
70
1
0
29 Sep 2024
Learning Attentional Mixture of LoRAs for Language Model Continual Learning
Jialin Liu
Jianhua Wu
Jie Liu
Yutai Duan
KELM
CLL
MoMe
43
2
0
29 Sep 2024
Mitigating the Negative Impact of Over-association for Conversational Query Production
Ante Wang
Linfeng Song
Zijun Min
Ge Xu
Xiaoli Wang
Junfeng Yao
Jinsong Su
125
1
0
29 Sep 2024
Transforming Scholarly Landscapes: Influence of Large Language Models on Academic Fields beyond Computer Science
Aniket Pramanick
Yufang Hou
Saif M. Mohammad
Iryna Gurevych
110
2
0
29 Sep 2024
A Critical Look at Meta-evaluating Summarisation Evaluation Metrics
Xiang Dai
Sarvnaz Karimi
Biaoyan Fang
66
0
0
29 Sep 2024
'Simulacrum of Stories': Examining Large Language Models as Qualitative Research Participants
Shivani Kapania
William Agnew
Motahhare Eslami
Hoda Heidari
Sarah E Fox
81
5
0
28 Sep 2024
Designing Domain-Specific Large Language Models: The Critical Role of Fine-Tuning in Public Opinion Simulation
Haocheng Lin
ALM
57
1
0
28 Sep 2024
Scalable Fine-tuning from Multiple Data Sources: A First-Order Approximation Approach
Dongyue Li
Ziniu Zhang
Lu Wang
Hongyang R. Zhang
105
1
0
28 Sep 2024
Conditional Image Synthesis with Diffusion Models: A Survey
Zheyuan Zhan
Defang Chen
Jian-Ping Mei
Zhenghe Zhao
Jiawei Chen
Chun-Yen Chen
Siwei Lyu
Can Wang
VLM
116
10
0
28 Sep 2024
Show and Guide: Instructional-Plan Grounded Vision and Language Model
Diogo Glória-Silva
David Semedo
João Magalhães
48
0
0
27 Sep 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
Nikunj Saunshi
Stefani Karp
Shankar Krishnan
Sobhan Miryoosefi
Sashank J. Reddi
Sanjiv Kumar
LRM
AI4CE
90
7
0
27 Sep 2024
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding
Heqing Zou
Tianze Luo
Guiyang Xie
Victor Zhang
...
Guangcong Wang
Juanyang Chen
Zhuochen Wang
Hansheng Zhang
Huaijian Zhang
VLM
120
7
0
27 Sep 2024
Emu3: Next-Token Prediction is All You Need
Xinlong Wang
Xiaosong Zhang
Zhengxiong Luo
Quan-Sen Sun
Yufeng Cui
...
Xi Yang
Jingjing Liu
Yonghua Lin
Tiejun Huang
Zhongyuan Wang
MLLM
116
233
0
27 Sep 2024
Harmonizing knowledge Transfer in Neural Network with Unified Distillation
Yaomin Huang
Zaomin Yan
Yaxin Peng
Faming Fang
Guixu Zhang
95
0
0
27 Sep 2024
MIMII-Gen: Generative Modeling Approach for Simulated Evaluation of Anomalous Sound Detection System
Harsh Purohit
Tomoya Nishida
Kota Dohi
Takashi Endo
Yohei Kawaguchi
DiffM
70
1
0
27 Sep 2024
Leveraging Long-Context Large Language Models for Multi-Document Understanding and Summarization in Enterprise Applications
Aditi Godbole
Jabin Geevarghese George
Smita Shandilya
84
5
0
27 Sep 2024
Multimodal Pragmatic Jailbreak on Text-to-image Models
Tong Liu
Zhixin Lai
Jiawen Wang
Gengyuan Zhang
Shuo Chen
Philip Torr
Vera Demberg
Volker Tresp
Jindong Gu
73
5
0
27 Sep 2024