1910.10683
Cited By
v4 (latest)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,891 papers shown
Title
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao
Xiaoyu Li
Zhao Song
ODL
213
3
0
22 Dec 2024
DragonVerseQA: Open-Domain Long-Form Context-Aware Question-Answering
A. Lahiri
Qinmin Vivian Hu
116
1
0
21 Dec 2024
Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent Diffusion Transformer
Yangqiu Song
Xihua Wang
Ruihua Song
Wenbing Huang
DiffM
VGen
122
1
0
21 Dec 2024
Transformer-based toxin-protein interaction analysis prioritizes airborne particulate matter components with potential adverse health effects
Yan Zhu
Shihao Wang
Yong Han
Yao Lu
Shulan Qiu
Ling Jin
Xiangdong Li
Weixiong Zhang
115
1
0
21 Dec 2024
Automated CVE Analysis: Harnessing Machine Learning In Designing Question-Answering Models For Cybersecurity Information Extraction
Tanjim Bin Faruk
81
0
0
21 Dec 2024
Identifying Cyberbullying Roles in Social Media
Manuel Sandoval
Mohammed Abuhamad
Patrick Furman
Mujtaba Nazari
Deborah L. Hall
Yasin N. Silva
85
0
0
21 Dec 2024
AlzheimerRAG: Multimodal Retrieval Augmented Generation for Clinical Use Cases using PubMed articles
A. Lahiri
Qinmin Vivian Hu
139
10
0
21 Dec 2024
Overview of the First Workshop on Language Models for Low-Resource Languages (LoResLM 2025)
Hansi Hettiarachchi
Tharindu Ranasinghe
Paul Rayson
R. Mitkov
M. Gaber
Damith Premasiri
Fiona Anting Tan
Lasitha Uyangodage
AI4CE
162
1
0
20 Dec 2024
Personalized Representation from Personalized Generation
Shobhita Sundaram
Julia Chae
Yonglong Tian
Sara Beery
Phillip Isola
116
1
0
20 Dec 2024
Continual Learning Using a Kernel-Based Method Over Foundation Models
Saleh Momeni
Sahisnu Mazumder
Bing-Quan Liu
CLL
145
2
0
20 Dec 2024
PreNeT: Leveraging Computational Features to Predict Deep Neural Network Training Time
Alireza Pourali
Arian Boukani
Hamzeh Khazaei
115
0
0
20 Dec 2024
ADEQA: A Question Answer based approach for joint ADE-Suspect Extraction using Sequence-To-Sequence Transformers
Vinayak Arannil
Tomal Deb
Atanu Roy
160
1
0
20 Dec 2024
GCA-3D: Towards Generalized and Consistent Domain Adaptation of 3D Generators
Hengjia Li
Yang Liu
Yibo Zhao
Haoran Cheng
Yang Yang
...
Qibo Qiu
Boxi Wu
Tu Zheng
Zheng Yang
D. Cai
149
0
0
20 Dec 2024
Self-Evolution Knowledge Distillation for LLM-based Machine Translation
Yuncheng Song
Liang Ding
Changtong Zan
Shujian Huang
192
0
0
19 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
255
10
0
19 Dec 2024
Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report
Markus Dablander
164
0
0
18 Dec 2024
Compositional Generalization Across Distributional Shifts with Sparse Tree Operations
Paul Soulos
Henry Conklin
Mattia Opper
P. Smolensky
Jianfeng Gao
Roland Fernandez
138
5
0
18 Dec 2024
Hansel: Output Length Controlling Framework for Large Language Models
Seoha Song
Junhyun Lee
Hyeonmok Ko
137
0
0
18 Dec 2024
SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor
Chenyu Yang
Shuai Wang
Hangting Chen
Jianwei Yu
Wei Tan
Rongzhi Gu
Yongjun Xu
Yizhi Zhou
Haina Zhu
Haoyang Li
KELM
425
2
0
18 Dec 2024
On the Compression of Language Models for Code: An Empirical Study on CodeBERT
Giordano d'Aloisio
Luca Traini
Federica Sarro
A. Marco
89
1
0
18 Dec 2024
MedCoT: Medical Chain of Thought via Hierarchical Expert
Jiaxiang Liu
Yuan Wang
Jiawei Du
Qiufeng Wang
Zuozhu Liu
LRM
153
20
0
18 Dec 2024
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Benjamin Warner
Antoine Chaffin
Benjamin Clavié
Orion Weller
Oskar Hallström
...
Tom Aarsen
Nathan Cooper
Griffin Adams
Jeremy Howard
Iacopo Poli
162
130
0
18 Dec 2024
Information-Theoretic Generative Clustering of Documents
Xin Du
Kumiko Tanaka-Ishii
109
0
0
18 Dec 2024
Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu
Jinyu Chen
Peirong Zheng
Xiaoquan Yi
Tianyi Tian
...
Quan Wan
Yining Qi
Yunfeng Fan
Qinliang Su
Xuemin Shen
AI4CE
181
2
0
18 Dec 2024
Do Language Models Understand Time?
Xi Ding
Lei Wang
335
2
0
18 Dec 2024
Prompt Categories Cluster for Weakly Supervised Semantic Segmentation
Wangyu Wu
Xianglin Qiu
Siqi Song
Xiaowei Huang
Fei Ma
Jimin Xiao
VLM
202
6
0
18 Dec 2024
CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers
Dimitrios Mallis
Ahmet Serdar Karadeniz
Sebastian Cavada
Danila Rukhovich
Niki Maria Foteinopoulou
K. Cherenkova
Anis Kacem
Djamila Aouada
182
7
0
18 Dec 2024
Extending LLMs to New Languages: A Case Study of Llama and Persian Adaptation
Samin Mahdizadeh Sani
Pouya Sadeghi
Thuy-Trang Vu
Yadollah Yaghoobzadeh
Gholamreza Haffari
183
2
0
17 Dec 2024
MOPO: Multi-Objective Prompt Optimization for Affective Text Generation
Yarik Menchaca Resendiz
Roman Klinger
122
1
0
17 Dec 2024
LLMs are Also Effective Embedding Models: An In-depth Overview
Chongyang Tao
Tao Shen
Shen Gao
Junshuo Zhang
Zhen Li
Zhengwei Tao
Shuai Ma
143
11
0
17 Dec 2024
Understanding Emotional Body Expressions via Large Language Models
Haifeng Lu
Jiuyi Chen
Feng Liang
Mingkui Tan
Runhao Zeng
Xiping Hu
111
0
0
17 Dec 2024
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models
Yuxi Sun
Wei Gao
Jing Ma
Hongzhan Lin
Ziyang Luo
Wenxuan Zhang
ELM
175
0
0
17 Dec 2024
IDEA-Bench: How Far are Generative Models from Professional Designing?
C. Liang
Lianghua Huang
Jingwu Fang
Huanzhang Dou
Wei Wang
Zhi-Fan Wu
Yupeng Shi
Junge Zhang
Xin Zhao
Yu Liu
3DV
142
1
0
16 Dec 2024
QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs
Mohammad Aflah Khan
Neemesh Yadav
Sarah Masud
Md. Shad Akhtar
169
0
0
16 Dec 2024
Context Filtering with Reward Modeling in Question Answering
Sangryul Kim
James Thorne
155
0
0
16 Dec 2024
A comprehensive GeoAI review: Progress, Challenges and Outlooks
Anasse Boutayeb
Iyad Lahsen-cherif
Ahmed El Khadimi
107
0
0
16 Dec 2024
Embodied CoT Distillation From LLM To Off-the-shelf Agents
Wonje Choi
Woo Kyung Kim
Minjong Yoo
Honguk Woo
OffRL
LM&Ro
163
3
0
16 Dec 2024
Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs
Yuchen Fu
Zifeng Cheng
Zhiwei Jiang
Zhonghui Wang
Yafeng Yin
Zhengliang Li
Qing Gu
LLMAG
127
2
0
16 Dec 2024
LLMs Can Simulate Standardized Patients via Agent Coevolution
Zhuoyun Du
Lujie Zheng
Renjun Hu
Yuyang Xu
Xiaochen Li
Ying Sun
Wei Chen
Jian Wu
Haolei Cai
Haohao Ying
LM&MA
117
5
0
16 Dec 2024
Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang
Issei Sato
185
0
0
16 Dec 2024
RoLargeSum: A Large Dialect-Aware Romanian News Dataset for Summary, Headline, and Keyword Generation
Andrei-Marius Avram
Mircea Timpuriu
Andreea Iuga
Vlad-Cristian Matei
Iulian-Marius Taiatu
Tudor Găină
Dumitru-Clementin Cercel
Florin-Catalin Pop
Mihaela-Claudia Cercel
189
1
0
15 Dec 2024
LAW: Legal Agentic Workflows for Custody and Fund Services Contracts
William Watson
Nicole Cho
Nishan Srishankar
Zhen Zeng
Lucas Cecchi
Daniel Scott
S. Siddagangappa
Rachneet Kaur
T. Balch
Manuela Veloso
AILaw
118
0
0
15 Dec 2024
An Enhanced Text Compression Approach Using Transformer-based Language Models
C. M. Rahman
Mahbub E Sobhani
Anika Tasnim Rodela
Swakkhar Shatabda
131
1
0
15 Dec 2024
SceneLLM: Implicit Language Reasoning in LLM for Dynamic Scene Graph Generation
Hang Zhang
Zhuoling Li
Jun Liu
LRM
178
1
0
15 Dec 2024
Large Language Models for Medical Forecasting -- Foresight 2
Z. Kraljevic
Joshua Au Yeung
D. Bean
James T. Teo
Richard J. B. Dobson
LM&MA
113
0
0
14 Dec 2024
Are Language Models Agnostic to Linguistically Grounded Perturbations? A Case Study of Indic Languages
Poulami Ghosh
Raj Dabre
Pushpak Bhattacharyya
AAML
120
0
0
14 Dec 2024
VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation
Saksham Singh Kushwaha
Yapeng Tian
DiffM
VGen
127
2
0
14 Dec 2024
Rebalanced Vision-Language Retrieval Considering Structure-Aware Distillation
Yang Yang
Wenjuan Xi
Luping Zhou
Jinhui Tang
148
0
0
14 Dec 2024
Accelerating Retrieval-Augmented Generation
Derrick Quinn
Mohammad Nouri
Neel Patel
John Salihu
Alireza Salemi
Sukhan Lee
Hamed Zamani
Mohammad Alian
RALM
3DV
142
8
0
14 Dec 2024
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen
DiffM
414
3
0
14 Dec 2024