Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,948 papers shown
What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs
Raeid Saqur
82
3
0
20 Jun 2024
Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation
Eyal Michaeli
Ohad Fried
125
1
0
20 Jun 2024
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Luxi He
Yangsibo Huang
Weijia Shi
Tinghao Xie
Haotian Liu
Yue Wang
Luke Zettlemoyer
Chiyuan Zhang
Danqi Chen
Peter Henderson
120
12
0
20 Jun 2024
Unmasking Database Vulnerabilities: Zero-Knowledge Schema Inference Attacks in Text-to-SQL Systems
Đorđe Klisura
Anthony Rios
AAML
118
2
0
20 Jun 2024
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Assaf Ben-Kish
Itamar Zimerman
Shady Abu Hussein
Nadav Cohen
Amir Globerson
Lior Wolf
Raja Giryes
Mamba
223
20
0
20 Jun 2024
Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
Han Jiang
Xiaoyuan Yi
Zhihua Wei
Ziang Xiao
Shu Wang
Xing Xie
ELM, ALM
175
8
0
20 Jun 2024
Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever
Hang Li
Tianlong Xu
Jiliang Tang
Qingsong Wen
57
6
0
19 Jun 2024
Knowledge Graph-Enhanced Large Language Models via Path Selection
Haochen Liu
Song Wang
Yaochen Zhu
Yushun Dong
Jundong Li
KELM
77
24
0
19 Jun 2024
A Primal-Dual Framework for Transformers and Neural Networks
Tan M. Nguyen
Tam Nguyen
Nhat Ho
Andrea L. Bertozzi
Richard G. Baraniuk
Stanley J. Osher
ViT
76
14
0
19 Jun 2024
Elliptical Attention
Stefan K. Nielsen
Laziz U. Abdullaev
R. Teo
Tan M. Nguyen
87
4
0
19 Jun 2024
Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis
R. Teo
Tan M. Nguyen
93
4
0
19 Jun 2024
GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation
Baiqi Li
Zhiqiu Lin
Deepak Pathak
Jiayao Li
Yixin Fei
...
Tiffany Ling
Xide Xia
Pengchuan Zhang
Graham Neubig
Deva Ramanan
EGVM
144
39
0
19 Jun 2024
Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation
Jirui Qi
Gabriele Sarti
Raquel Fernández
Arianna Bisazza
RALM
90
8
0
19 Jun 2024
Improving Visual Commonsense in Language Models via Multiple Image Generation
Guy Yariv
Idan Schwartz
Yossi Adi
Sagie Benaim
VLM, LRM
50
0
0
19 Jun 2024
Enhancing Distractor Generation for Multiple-Choice Questions with Retrieval Augmented Pretraining and Knowledge Graph Integration
Han-Cheng Yu
Yu-An Shih
Kin-Man Law
Kai-Yu Hsieh
Yu-Chen Cheng
Hsin-Chih Ho
Zih-An Lin
Wen-Chuan Hsu
Yao-Chung Fan
79
10
0
19 Jun 2024
ALiiCE: Evaluating Positional Fine-grained Citation Generation
Yilong Xu
Jinhua Gao
Xiaoming Yu
Baolong Bi
Huawei Shen
Xueqi Cheng
HILM
77
6
0
19 Jun 2024
ARDuP: Active Region Video Diffusion for Universal Policies
Shuaiyi Huang
Mara Levy
Zhenyu Jiang
Anima Anandkumar
Yuke Zhu
Linxi Fan
De-An Huang
Abhinav Shrivastava
VGen
123
4
0
19 Jun 2024
Enhancing Collaborative Semantics of Language Model-Driven Recommendations via Graph-Aware Learning
Zhong Guan
Likang Wu
Hongke Zhao
Ming He
Jianpin Fan
90
3
0
19 Jun 2024
Towards Robust Evaluation: A Comprehensive Taxonomy of Datasets and Metrics for Open Domain Question Answering in the Era of Large Language Models
Akchay Srivastava
Atif Memon
ELM
85
1
0
19 Jun 2024
APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts
Honghua Dong
Qidong Su
Yubo Gao
Zhaoyu Li
Yangjun Ruan
Gennady Pekhimenko
Chris J. Maddison
Xujie Si
LLMAG
66
1
0
19 Jun 2024
BoA: Attention-aware Post-training Quantization without Backpropagation
Junhan Kim
Ho-Young Kim
Eulrang Cho
Chungman Lee
Joonyoung Kim
Yongkweon Jeon
MQ
126
0
0
19 Jun 2024
Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates
Cristian Meo
Ksenia Sycheva
Anirudh Goyal
Justin Dauwels
MQ
78
5
0
18 Jun 2024
From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP
Marius Mosbach
Vagrant Gautam
Tomás Vergara-Browne
Dietrich Klakow
Mor Geva
AI4CE
82
10
0
18 Jun 2024
Applying Ensemble Methods to Model-Agnostic Machine-Generated Text Detection
Ivan Ong
Boon King Quek
DeLMO
64
2
0
18 Jun 2024
RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation
Shuting Wang
Xin Yu
Mang Wang
Weipeng Chen
Yutao Zhu
Zhicheng Dou
RALM
105
9
0
18 Jun 2024
MMUTF: Multimodal Multimedia Event Argument Extraction with Unified Template Filling
Philipp Seeberger
Dominik Wagner
Korbinian Riedhammer
85
0
0
18 Jun 2024
QOG:Question and Options Generation based on Language Model
Jincheng Zhou
86
3
0
18 Jun 2024
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory
Haoze Wu
Zihan Qiu
Zili Wang
Hang Zhao
Jie Fu
MoE
100
3
0
18 Jun 2024
Cross-Lingual Unlearning of Selective Knowledge in Multilingual Language Models
Minseok Choi
Kyunghyun Min
Jaegul Choo
MU, AAML
90
2
0
18 Jun 2024
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models
Dongwon Jo
Taesu Kim
Yulhwa Kim
Jae-Joon Kim
137
5
0
18 Jun 2024
Knowledge Fusion By Evolving Weights of Language Models
Guodong DU
Yiyao Cao
Hanting Liu
Runhua Jiang
Shuyang Yu
Yifei Guo
Sim Kuan Goh
Jing Li
MoMe
91
15
0
18 Jun 2024
Improving Text-To-Audio Models with Synthetic Captions
Zhifeng Kong
Sang-gil Lee
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Rafael Valle
Soujanya Poria
Bryan Catanzaro
115
13
0
18 Jun 2024
PromptDSI: Prompt-based Rehearsal-free Continual Learning for Document Retrieval
Tuan-Luc Huynh
Thuy-Trang Vu
Weiqing Wang
Yinwei Wei
T. Le
D. Gašević
Yuan-Fang Li
Thanh-Toan Do
VLM, CLL
117
1
0
18 Jun 2024
Generative Artificial Intelligence-Guided User Studies: An Application for Air Taxi Services
Shengdi Xiao
Jingjing Li
Tatsuki Fushimi
Yoichi Ochiai
106
0
0
18 Jun 2024
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions
Huanxuan Liao
Yao Xu
Shizhu He
Yuanzhe Zhang
Yanchao Hao
Shengping Liu
Kang Liu
Jun Zhao
163
1
0
18 Jun 2024
AI "News" Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian
Giovanni Puccetti
Anna Rogers
Chiara Alzetta
F. Dell’Orletta
Andrea Esuli
97
11
0
17 Jun 2024
Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks
Jack Gallifant
Shan Chen
Pedro Moreira
Nikolaj Munch
Mingye Gao
Jackson Pond
Leo Anthony Celi
Hugo J. W. L. Aerts
Thomas Hartvigsen
Danielle S. Bitterman
115
13
0
17 Jun 2024
Large Scale Transfer Learning for Tabular Data via Language Modeling
Josh Gardner
Juan C. Perdomo
Ludwig Schmidt
LMTD
107
24
0
17 Jun 2024
LiLiuM: eBay's Large Language Models for e-commerce
Christian Herold
Michael Kozielski
Leonid Ekimov
Pavel Petrushkov
P. Vandenbussche
Shahram Khadivi
98
3
0
17 Jun 2024
Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
Seungwoo Son
Wonpyo Park
Woohyun Han
Kyuyeun Kim
Jaeho Lee
MQ
78
13
0
17 Jun 2024
LLaNA: Large Language and NeRF Assistant
Andrea Amaduzzi
Pierluigi Zama Ramirez
Giuseppe Lisanti
Samuele Salti
Luigi Di Stefano
111
4
0
17 Jun 2024
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models
Bingqi Ma
Zhuofan Zong
Guanglu Song
Hongsheng Li
Yu Liu
91
23
0
17 Jun 2024
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Hoyeon Chang
Jinho Park
Seonghyeon Ye
Sohee Yang
Youngkyung Seo
Du-Seong Chang
Minjoon Seo
KELM
100
46
0
17 Jun 2024
Measuring memorization in RLHF for code completion
Aneesh Pappu
Billy Porter
Ilia Shumailov
Jamie Hayes
101
3
0
17 Jun 2024
Nemotron-4 340B Technical Report
Nvidia
:
Bo Adler
Niket Agarwal
Ashwath Aithal
...
Jimmy Zhang
Jing Zhang
Vivienne Zhang
Yian Zhang
Chen Zhu
128
69
0
17 Jun 2024
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Pala Tej Deep
Rishabh Bhardwaj
Soujanya Poria
MoMe
94
31
0
17 Jun 2024
CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG
Boyi Deng
Wenjie Wang
Fengbin Zhu
Qifan Wang
Fuli Feng
103
9
0
17 Jun 2024
Promises, Outlooks and Challenges of Diffusion Language Modeling
Justin Deschenaux
Çağlar Gülçehre
DiffM
86
3
0
17 Jun 2024
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen
Xinyu Zhao
Tianlong Chen
Yu Cheng
MoE
116
5
0
17 Jun 2024
An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers
Ashim Gupta
Sina Mahdipour Saravani
P. Sadayappan
Vivek Srikumar
57
2
0
17 Jun 2024