arXiv:1910.10683, v4 (latest)

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

50 / 9,929 papers shown
Title
GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding
Ziyin Zhang
Hang Yu
Shijie Li
Peng Di
Jianguo Li
Rui Wang
138
3
0
06 Sep 2024
CACER: Clinical Concept Annotations for Cancer Events and Relations
Yujuan Fu
Giridhar Kaushik Ramachandran
Ahmad Halwani
Bridget T. McInnes
Fei Xia
K. Lybarger
Meliha Yetisgen
Özlem Uzuner
64
2
0
05 Sep 2024
LAST: Language Model Aware Speech Tokenization
A. Turetzky
Yossi Adi
83
3
0
05 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
151
23
0
05 Sep 2024
Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?
Yixuan Tang
Yi Yang
78
4
0
04 Sep 2024
Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL
Mohammad Reshadati
78
0
0
04 Sep 2024
Diversify-verify-adapt: Efficient and Robust Retrieval-Augmented Ambiguous Question Answering
Yeonjun In
Sungchul Kim
Ryan Rossi
Md Mehrab Tanjim
Tong Yu
Ritwik Sinha
Chanyoung Park
96
3
0
04 Sep 2024
Initial Development and Evaluation of the Creative Artificial Intelligence through Recurring Developments and Determinations (CAIRDD) System
Jeremy Straub
Zach Johnson
94
0
0
03 Sep 2024
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation
Ingo Ziegler
Abdullatif Köksal
Desmond Elliott
Hinrich Schütze
82
6
0
03 Sep 2024
Foundations of Large Language Model Compression -- Part 1: Weight Quantization
Sean I. Young
MQ
70
1
0
03 Sep 2024
A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks
Nicholas Monath
Will Grathwohl
Michael Boratko
Rob Fergus
Andrew McCallum
Manzil Zaheer
64
0
0
03 Sep 2024
Can we only use guideline instead of shot in prompt?
Jiaxiang Chen
Song Wang
Zhucong Li
Wayne Xiong
Zhuang Li
Zenglin Xu
Yuan Qi
62
1
0
03 Sep 2024
DiVE: DiT-based Video Generation with Enhanced Control
Junpeng Jiang
Gangyi Hong
Lijun Zhou
Enhui Ma
Hengtong Hu
...
Kaicheng Yu
Haiyang Sun
Kun Zhan
Peng Jia
Miao Zhang
VGen
DiffM
54
14
0
03 Sep 2024
Dynamic Motion Synthesis: Masked Audio-Text Conditioned Spatio-Temporal Transformers
Sohan Anisetty
James Hays
77
0
0
03 Sep 2024
VProChart: Answering Chart Question through Visual Perception Alignment Agent and Programmatic Solution Reasoning
Muye Huang
L. Zhang
Lai Han
Wenjun Wu
Xinyu Zhang
Jun Liu
86
1
0
03 Sep 2024
Efficient LLM Context Distillation
Rajesh Upadhayayaya
Zachary Smith
Chritopher Kottmyer
Manish Raj Osti
145
2
0
03 Sep 2024
Membership Inference Attacks Against In-Context Learning
Rui Wen
Hui Yuan
Michael Backes
Yang Zhang
126
14
0
02 Sep 2024
Imitating Language via Scalable Inverse Reinforcement Learning
Markus Wulfmeier
Michael Bloesch
Nino Vieillard
Arun Ahuja
Jorg Bornschein
...
Jost Tobias Springenberg
Nikola Momchev
Olivier Bachem
Matthieu Geist
Martin Riedmiller
112
10
0
02 Sep 2024
CHESS: Optimizing LLM Inference via Channel-Wise Thresholding and Selective Sparsification
Junhui He
Shangyu Wu
Weidong Wen
Chun Jason Xue
Qingan Li
48
5
0
02 Sep 2024
CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models
Rui Zeng
Xi Chen
Yuwen Pu
Xuhong Zhang
Tianyu Du
Shouling Ji
115
6
0
02 Sep 2024
Pre-Trained Language Models for Keyphrase Prediction: A Review
Muhammad Umair
Tangina Sultana
Young-Koo Lee
80
4
0
02 Sep 2024
Unveiling the Vulnerability of Private Fine-Tuning in Split-Based Frameworks for Large Language Models: A Bidirectionally Enhanced Attack
Guanzhong Chen
Zhenghan Qin
Mingxin Yang
Yajie Zhou
Tao Fan
Tianyu Du
Zenglin Xu
AAML
125
6
0
02 Sep 2024
LuWu: An End-to-End In-Network Out-of-Core Optimizer for 100B-Scale Model-in-Network Data-Parallel Training on Distributed GPUs
Mo Sun
Zihan Yang
Changyue Liao
Yingtao Li
Leilei Gan
Zeke Wang
110
1
0
02 Sep 2024
User-Specific Dialogue Generation with User Profile-Aware Pre-Training Model and Parameter-Efficient Fine-Tuning
Atsushi Otsuka
Kazuya Matsuo
Ryo Ishii
Narichika Nomoto
Hiroaki Sugiyama
53
0
0
02 Sep 2024
Dissecting Temporal Understanding in Text-to-Audio Retrieval
Andreea-Maria Oncescu
João F. Henriques
A. Sophia Koepke
91
2
0
01 Sep 2024
Hound: Hunting Supervision Signals for Few and Zero Shot Node Classification on Text-attributed Graph
Yuxiang Wang
Xiao Yan
Shiyu Jin
Quanqing Xu
Chuanhui Yang
Yuanyuan Zhu
Chuang Hu
Bo Du
Jiawei Jiang
VLM
63
0
0
01 Sep 2024
FLUX that Plays Music
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Junshi Huang
143
9
0
01 Sep 2024
Enhancing Source Code Security with LLMs: Demystifying The Challenges and Generating Reliable Repairs
Nafis Tanveer Islam
Joseph Khoury
Andrew Seong
E. Bou-Harb
Peyman Najafirad
AAML
118
4
0
01 Sep 2024
Self-evolving Agents with reflective and memory-augmented abilities
Xuechen Liang
Yangfan He
Yinghui Xia
Xinyuan Song
Jianhui Wang
...
Keqin Li
Jiaqi Chen
Jinsong Yang
Siyuan Chen
Tianyu Shi
LLMAG
KELM
CLL
153
4
0
01 Sep 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
99
8
0
31 Aug 2024
Training-Free Sketch-Guided Diffusion with Latent Optimization
Sandra Zhang Ding
Jiafeng Mao
Kiyoharu Aizawa
DiffM
188
3
0
31 Aug 2024
Building Better Datasets: Seven Recommendations for Responsible Design from Dataset Creators
Will Orr
Kate Crawford
102
3
0
30 Aug 2024
Pre-Training Multimodal Hallucination Detectors with Corrupted Grounding Data
Spencer Whitehead
Jacob Phillips
Sean Hendryx
75
0
0
30 Aug 2024
Unintentional Security Flaws in Code: Automated Defense via Root Cause Analysis
Nafis Tanveer Islam
Mazal Bethany
Dylan Manuel
Murtuza Jadliwala
Peyman Najafirad
102
0
0
30 Aug 2024
NDP: Next Distribution Prediction as a More Broad Target
Junhao Ruan
Abudukeyumu Abudula
Xinyu Liu
Bei Li
Yinqiao Li
Chenglong Wang
Yuchun Fan
Yuan Ge
Tong Xiao
Jingbo Zhu
57
1
0
30 Aug 2024
MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models
Yujing Wang
Hainan Zhang
Liang Pang
Liang Pang
Hongwei Zheng
Zhiming Zheng
79
2
0
30 Aug 2024
An Empirical Study of Scaling Laws for Transfer
Matthew Barnett
64
3
0
30 Aug 2024
A Survey of the Self Supervised Learning Mechanisms for Vision Transformers
Asifullah Khan
A. Sohail
Mustansar Fiaz
Mehdi Hassan
Tariq Habib Afridi
...
Muhammad Zaigham Zaheer
Kamran Ali
Tangina Sultana
Ziaurrehman Tanoli
Naeem Akhter
284
5
0
30 Aug 2024
Event Extraction for Portuguese: A QA-driven Approach using ACE-2005
L. F. Cunha
Ricardo Campos
A. Jorge
56
1
0
29 Aug 2024
LLaVA-Chef: A Multi-modal Generative Model for Food Recipes
Fnu Mohbat
Mohammed J. Zaki
84
8
0
29 Aug 2024
Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models
Alec Solway
ALM
85
0
0
29 Aug 2024
A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models
Yi-Lin Tuan
William Yang Wang
98
1
0
29 Aug 2024
A Survey for Large Language Models in Biomedicine
Chong Wang
Mengyao Li
Junjun He
Zhongruo Wang
Erfan Darzi
...
Yi Yu
Pietro Liò
Tianyun Wang
Yu Guang Wang
Yiqing Shen
LM&MA
136
13
0
29 Aug 2024
M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation
Jonggwon Park
Soobum Kim
Byungmu Yoon
Jihun Hyun
Kyoyun Choi
LM&MA
96
6
0
29 Aug 2024
Revisit Micro-batch Clipping: Adaptive Data Pruning via Gradient Manipulation
Lun Wang
61
1
0
29 Aug 2024
Learning from Negative Samples in Generative Biomedical Entity Linking
Chanhwi Kim
Hyunjae Kim
Sihyeon Park
Jiwoo Lee
Mujeen Sung
Jaewoo Kang
MedIm
79
0
0
29 Aug 2024
Evaluating Computational Representations of Character: An Austen Character Similarity Benchmark
Funing Yang
Carolyn Jane Anderson
44
0
0
28 Aug 2024
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
Nikolas Gritsch
Qizhen Zhang
Acyr Locatelli
Sara Hooker
Ahmet Üstün
MoE
91
3
0
28 Aug 2024
BELT-2: Bootstrapping EEG-to-Language representation alignment for multi-task brain decoding
Jinzhao Zhou
Yiqun Duan
Fred Chang
T. Do
Yu-Kai Wang
Chin-Teng Lin
72
5
0
28 Aug 2024
Autoregressive model path dependence near Ising criticality
Yi Hong Teoh
R. Melko
AI4CE
72
3
0
28 Aug 2024