1910.10683
Cited By
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,944 papers shown
Title
Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs
Tamzeed Mahfuz
Satak Kumar Dey
Ruwad Naswan
Hasnaen Adil
Khondker Salman Sayeed
Haz Sameen Shahgir
84
1
0
29 Jun 2024
Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP
Omer Goldman
Alon Jacovi
Aviv Slobodkin
Aviya Maimon
Ido Dagan
Reut Tsarfaty
133
11
0
29 Jun 2024
Prompt Refinement with Image Pivot for Text-to-Image Generation
Jingtao Zhan
Qingyao Ai
Yiqun Liu
Yingwei Pan
Ting Yao
Jiaxin Mao
Shaoping Ma
Tao Mei
EGVM
102
4
0
28 Jun 2024
PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration
Yuxuan Sun
Yunlong Zhang
Yixuan Si
Chenglu Zhu
Zhongyi Shui
Kai Zhang
Jingxiong Li
Xingheng Lyu
Tao Lin
Lin Yang
LM&MA
VLM
MedIm
119
12
0
28 Jun 2024
PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators
Kuo-Hao Zeng
Zichen Zhang
Kiana Ehsani
Rose Hendrix
Jordi Salvador
Alvaro Herrasti
Ross Girshick
Aniruddha Kembhavi
Luca Weihs
LM&Ro
OffRL
83
25
0
28 Jun 2024
Into the Unknown: Generating Geospatial Descriptions for New Environments
Tzuf Paz-Argaman
John Palowitch
Sayali Kulkarni
Reut Tsarfaty
Jason Baldridge
117
1
0
28 Jun 2024
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Zhehuai Chen
He Huang
Oleksii Hrinchuk
Krishna Puvvada
Nithin Rao Koluguri
Piotr Żelasko
Jagadeesh Balam
Boris Ginsburg
AuLLM
RALM
92
11
0
28 Jun 2024
Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood
Yang Xu
Yu Wang
Hao An
Zhichen Liu
Yongyuan Li
78
6
0
28 Jun 2024
Scalable and Domain-General Abstractive Proposition Segmentation
Mohammad Javad Hosseini
Yang Gao
Tim Baumgärtner
Alex Fabrikant
Reinald Kim Amplayo
85
0
0
28 Jun 2024
Protein Representation Learning with Sequence Information Embedding: Does it Always Lead to a Better Performance?
Y. Tan
Lirong Zheng
Bozitao Zhong
Liang Hong
Bingxin Zhou
62
5
0
28 Jun 2024
IDT: Dual-Task Adversarial Attacks for Privacy Protection
Pedro Faustini
Shakila Mahjabin Tonni
Annabelle McIver
Xingliang Yuan
Mark Dras
SILM
AAML
88
0
0
28 Jun 2024
Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in Conversations
Ritam Dutt
Zhen Wu
Kelly Shi
Divyanshu Sheth
Prakhar Gupta
Carolyn Rose
106
2
0
27 Jun 2024
LoPT: Low-Rank Prompt Tuning for Parameter Efficient Language Models
Shouchang Guo
Sonam Damani
Keng-hao Chang
VLM
51
1
0
27 Jun 2024
The Odyssey of Commonsense Causality: From Foundational Benchmarks to Cutting-Edge Reasoning
Shaobo Cui
Zhijing Jin
Bernhard Schölkopf
Boi Faltings
CML
LRM
96
4
0
27 Jun 2024
RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs
Ekaterina Taktasheva
Maxim Bazhukov
Kirill Koncha
Alena Fenogenova
Ekaterina Artemova
Vladislav Mikhailov
114
13
0
27 Jun 2024
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents
Zihao Wang
Shaofei Cai
Zhancun Mu
Haowei Lin
Ceyao Zhang
Xuejie Liu
Qing Li
Hoang Trung-Dung
Xiaojian Ma
Yitao Liang
LM&Ro
115
14
0
27 Jun 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
177
26
0
27 Jun 2024
Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs
Lokesh Mishra
Sohayl Dhibi
Yusik Kim
Cesar Berrospi Ramis
Shubham Gupta
Michele Dolfi
Peter W. J. Staar
81
2
0
27 Jun 2024
Fairness and Bias in Multimodal AI: A Survey
Tosin Adewumi
Lama Alkhaled
Namrata Gurung
G. V. Boven
Irene Pagliai
119
10
0
27 Jun 2024
Fine-tuned network relies on generic representation to solve unseen cognitive task
Dongyan Lin
86
0
0
27 Jun 2024
On Discrete Prompt Optimization for Diffusion Models
Ruochen Wang
Ting Liu
Cho-Jui Hsieh
Boqing Gong
DiffM
97
8
0
27 Jun 2024
Learning Retrieval Augmentation for Personalized Dialogue Generation
Qiushi Huang
Shuai Fu
Xubo Liu
Wenwu Wang
Tom Ko
Yu Zhang
Lilian H. Y. Tang
RALM
119
18
0
27 Jun 2024
OutlierTune: Efficient Channel-Wise Quantization for Large Language Models
Jinguang Wang
Yuexi Yin
Haifeng Sun
Qi Qi
Jingyu Wang
Zirui Zhuang
Tingting Yang
Jianxin Liao
81
2
0
27 Jun 2024
Implicit Discourse Relation Classification For Nigerian Pidgin
Muhammed Saeed
Peter Bourgonje
Vera Demberg
41
2
0
26 Jun 2024
Fast Optimizer Benchmark
Simon Blauth
Tobias Bürger
Zacharias Häringer
Jörg Franke
Frank Hutter
53
0
0
26 Jun 2024
Unveiling and Controlling Anomalous Attention Distribution in Transformers
Ruiqing Yan
Xingbo Du
Haoyu Deng
Linghan Zheng
Qiuzhuang Sun
Jifang Hu
Yuhang Shao
Penghao Jiang
Jinrong Jiang
Lian Zhao
69
1
0
26 Jun 2024
Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems
Italo Luis da Silva
Hanqi Yan
Lin Gui
Yulan He
CML
109
0
0
26 Jun 2024
Zero-shot prompt-based classification: topic labeling in times of foundation models in German Tweets
Simon Münker
Kai Kugler
Achim Rettinger
VLM
79
1
0
26 Jun 2024
ConvoCache: Smart Re-Use of Chatbot Responses
Conor Atkins
Ian D. Wood
M. Kâafar
Hassan Jameel Asghar
Nardine Basta
Michal Kepkowski
103
0
0
26 Jun 2024
Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction
Yice Zhang
Jie Zeng
Weiming Hu
Ziyi Wang
Shiwei Chen
Ruifeng Xu
60
6
0
26 Jun 2024
PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry
Linqing Chen
Weilei Wang
Zilong Bai
Peng Xu
Yan Fang
...
Lisha Zhang
Fu Bian
Zhongkai Ye
Lidong Pei
Changyang Tu
AI4MH
LM&MA
107
3
0
26 Jun 2024
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models
Haibo Jin
Leyang Hu
Xinuo Li
Peiyan Zhang
Chonghan Chen
Jun Zhuang
Haohan Wang
PILM
115
33
0
26 Jun 2024
Catching Chameleons: Detecting Evolving Disinformation Generated using Large Language Models
Bohan Jiang
Chengshuai Zhao
Zhen Tan
Huan Liu
88
2
0
26 Jun 2024
MALSIGHT: Exploring Malicious Source Code and Benign Pseudocode for Iterative Binary Malware Summarization
Haolang Lu
Hongrui Peng
Guoshun Nan
Jiaoyang Cui
Cheng Wang
Weifei Jin
Songtao Wang
Shengli Pan
Xiaofeng Tao
69
4
0
26 Jun 2024
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment
Paarth Neekhara
Shehzeen Samarah Hussain
Subhankar Ghosh
Jason Chun Lok Li
Rafael Valle
Rohan Badlani
Boris Ginsburg
83
14
0
25 Jun 2024
Text-Animator: Controllable Visual Text Video Generation
Lin Liu
Quande Liu
Shengju Qian
Yuan Zhou
Wengang Zhou
Houqiang Li
Lingxi Xie
Qi Tian
VGen
101
1
0
25 Jun 2024
Interpreting Attention Layer Outputs with Sparse Autoencoders
Connor Kissane
Robert Krzyzanowski
Joseph Isaac Bloom
Arthur Conmy
Neel Nanda
MILM
87
24
0
25 Jun 2024
Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning
Arijit Sehanobish
Avinava Dubey
Krzysztof Choromanski
Somnath Basu Roy Chowdhury
Deepali Jain
Vikas Sindhwani
Snigdha Chaturvedi
ALM
91
3
0
25 Jun 2024
ViANLI: Adversarial Natural Language Inference for Vietnamese
Tin Van Huynh
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
73
0
0
25 Jun 2024
Data curation via joint example selection further accelerates multimodal learning
Talfan Evans
Nikhil Parthasarathy
Hamza Merzic
Olivier J. Hénaff
127
15
0
25 Jun 2024
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients
Aashiq Muhamed
Oscar Li
David Woodruff
Mona Diab
Virginia Smith
101
13
0
25 Jun 2024
MoESD: Mixture of Experts Stable Diffusion to Mitigate Gender Bias
Guorun Wang
Lucia Specia
DiffM
MoE
88
0
0
25 Jun 2024
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Guilherme Penedo
Hynek Kydlíček
Loubna Ben Allal
Anton Lozhkov
Margaret Mitchell
Colin Raffel
Leandro von Werra
Thomas Wolf
144
265
0
25 Jun 2024
Improving Grammatical Error Correction via Contextual Data Augmentation
Yixuan Wang
Baoxin Wang
Yijun Liu
Qingfu Zhu
Dayong Wu
Wanxiang Che
73
5
0
25 Jun 2024
BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks
A. Ramesh
Vignesh Ganapathiraman
I. Laradji
Mark Schmidt
141
3
0
25 Jun 2024
SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries
Quan Mai
Susan Gauch
Douglas Adams
AAML
110
4
0
25 Jun 2024
Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation
Yingting Li
Ambuj Mehrish
Bryan Chew
Bo Cheng
Soujanya Poria
73
0
0
25 Jun 2024
MPCODER: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning
Zhenlong Dai
Chang Yao
Wenkang Han
Ying Yuan
Zhipeng Gao
Jingyuan Chen
67
16
0
25 Jun 2024
Unlocking Continual Learning Abilities in Language Models
Wenyu Du
Shuang Cheng
Tongxu Luo
Zihan Qiu
Zeyu Huang
Ka Chun Cheung
Reynold Cheng
Jie Fu
KELM
CLL
120
10
0
25 Jun 2024
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
Jikai Wang
Yi Su
Juntao Li
Qingrong Xia
Zi Ye
Xinyu Duan
Zhefeng Wang
Min Zhang
150
19
0
25 Jun 2024