arXiv:1910.10683
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
50 / 9,877 papers shown
Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis
Jiaqi Zhao
Ming Wang
Miao Zhang
Yuzhang Shang
Xuebo Liu
Yaowei Wang
Min Zhang
Liqiang Nie
MQ
246
2
0
18 Feb 2025
Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options
Lakshmi Nair
Ian Trase
Mark Kim
AIFin
LRM
AI4CE
118
2
0
18 Feb 2025
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
Ziqiang Liu
Shuangrui Ding
Zhixiong Zhang
Xiaoyi Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Dahua Lin
Jiaqi Wang
132
3
0
18 Feb 2025
UniGenCoder: Merging Seq2Seq and Seq2Tree Paradigms for Unified Code Generation
Liangying Shao
Yanfu Yan
Denys Poshyvanyk
Jinsong Su
115
2
0
18 Feb 2025
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
Zhiyuan Liu
Yanchen Luo
Han Huang
Enzhi Zhang
Changhao Nai
Sihang Li
Yaorui Shi
Xiang Wang
Kenji Kawaguchi
Tat-Seng Chua
198
4
0
18 Feb 2025
Q-STRUM Debate: Query-Driven Contrastive Summarization for Recommendation Comparison
George Saad
Scott Sanner
19
0
0
18 Feb 2025
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation
Xinlong Chen
Yuanxing Zhang
Chongling Rao
Yushuo Guan
Qingbin Liu
Fuzheng Zhang
Chengru Song
Qiang Liu
Di Zhang
Tieniu Tan
113
2
0
18 Feb 2025
Savaal: Scalable Concept-Driven Question Generation to Enhance Human Learning
Kimia Noorbakhsh
Joseph Chandler
Pantea Karimi
M. Alizadeh
H. Balakrishnan
LRM
109
1
0
18 Feb 2025
Oreo: A Plug-in Context Reconstructor to Enhance Retrieval-Augmented Generation
Sha Li
Naren Ramakrishnan
RALM
KELM
265
2
0
18 Feb 2025
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
Jiaqi Zhao
Miao Zhang
Ming Wang
Yuzhang Shang
Kaihao Zhang
Weili Guan
Yaowei Wang
Min Zhang
MQ
114
1
0
18 Feb 2025
Lightweight Online Adaption for Time Series Foundation Model Forecasts
Thomas L. Lee
William Toner
Rajkarn Singh
Artjom Joosem
Martin Asenov
AI4TS
123
1
0
18 Feb 2025
Positional Encoding in Transformer-Based Time Series Models: A Survey
Habib Irani
Vangelis Metsis
AI4TS
80
2
0
17 Feb 2025
Object-Centric Image to Video Generation with Language Guidance
Angel Villar-Corrales
Gjergj Plepi
Sven Behnke
DiffM
VGen
OCL
252
1
0
17 Feb 2025
What Are They Filtering Out? A Survey of Filtering Strategies for Harm Reduction in Pretraining Datasets
Marco Antonio Stranisci
Christian Hardmeier
165
1
0
17 Feb 2025
DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
Yingli Shen
Wen Lai
Shuo Wang
Xueren Zhang
Kangyang Luo
Alexander Fraser
Maosong Sun
210
1
0
17 Feb 2025
Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey
Ruiyao Xu
Kaize Ding
123
7
0
17 Feb 2025
Learning to Sample Effective and Diverse Prompts for Text-to-Image Generation
Taeyoung Yun
Dinghuai Zhang
Jinkyoo Park
Ling Pan
DiffM
108
6
0
17 Feb 2025
SuperMerge: An Approach For Gradient-Based Model Merging
Haoyu Yang
Zheng Zhang
Saket Sathe
MoMe
221
0
0
17 Feb 2025
Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control
Jinyan Su
Jennifer Healey
Preslav Nakov
Claire Cardie
3DV
115
1
0
17 Feb 2025
Unknown Word Detection for English as a Second Language (ESL) Learners Using Gaze and Pre-trained Language Models
Jiexin Ding
Bowen Zhao
Yuntao Wang
Xinyun Liu
Rui Hao
Ishan Chatterjee
Yuanchun Shi
110
0
0
17 Feb 2025
On Creating a Causally Grounded Usable Rating Method for Assessing the Robustness of Foundation Models Supporting Time Series
Kausik Lakkaraju
Rachneet Kaur
Parisa Zehtabi
Sunandita Patra
Siva Likitha Valluru
Zhen Zeng
Biplav Srivastava
Marco Valtorta
AI4TS
118
0
0
17 Feb 2025
Idiosyncrasies in Large Language Models
Mingjie Sun
Yida Yin
Zhiqiu Xu
J. Zico Kolter
Zhuang Liu
119
7
0
17 Feb 2025
STRIVE: Structured Reasoning for Self-Improvement in Claim Verification
Haisong Gong
Jing Li
Junfei Wu
Qiang Liu
Shu Wu
Liang Wang
LRM
80
0
0
17 Feb 2025
Multi-Turn Multi-Modal Question Clarification for Enhanced Conversational Understanding
Kimia Ramezan
Alireza Amiri Bavandpour
Yifei Yuan
Clemencia Siro
Mohammad Aliannejadi
89
0
0
17 Feb 2025
Precise Parameter Localization for Textual Generation in Diffusion Models
Łukasz Staniszewski
Bartosz Cywiński
Franziska Boenisch
Kamil Deja
Adam Dziedzic
DiffM
471
1
0
17 Feb 2025
An Efficient Large Recommendation Model: Towards a Resource-Optimal Scaling Law
Songpei Xu
Shijia Wang
Da Guo
Xianwen Guo
Qiang Xiao
Fangjian Li
Chuanjiang Luo
113
0
0
17 Feb 2025
Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System
Ziyou Jiang
Mingyang Li
Guowei Yang
Junjie Wang
Yuekai Huang
Zhiyuan Chang
Qing Wang
AAML
76
1
0
17 Feb 2025
Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
Fan Zhou
Zengzhi Wang
Qian Liu
Junlong Li
Pengfei Liu
ALM
219
15
0
17 Feb 2025
Factual Inconsistency in Data-to-Text Generation Scales Exponentially with LLM Size: A Statistical Validation
Joy Mahapatra
Soumyajit Roy
Utpal Garain
HILM
ALM
145
0
0
17 Feb 2025
REAL-MM-RAG: A Real-World Multi-Modal Retrieval Benchmark
Navve Wasserman
Roi Pony
O. Naparstek
Adi Raz Goldfarb
Eli Schwartz
Udi Barzelay
Leonid Karlinsky
3DV
VLM
141
3
0
17 Feb 2025
Diversity-oriented Data Augmentation with Large Language Models
Zaitian Wang
Jinghan Zhang
Xinhao Zhang
Kunpeng Liu
Pengfei Wang
Yuanchun Zhou
123
3
0
17 Feb 2025
RA-MTR: A Retrieval Augmented Multi-Task Reader based Approach for Inspirational Quote Extraction from Long Documents
Sayantan Adak
Animesh Mukherjee
RALM
47
0
0
17 Feb 2025
CL-MFAP: A Contrastive Learning-Based Multimodal Foundation Model for Molecular Property Prediction and Antibiotic Screening
Gen Zhou
Sugitha Janarthanan
Yutong Lu
Pingzhao Hu
106
0
0
16 Feb 2025
FinMTEB: Finance Massive Text Embedding Benchmark
Yixuan Tang
Yi Yang
AIFin
160
2
0
16 Feb 2025
PlanGenLLMs: A Modern Survey of LLM Planning Capabilities
Hui Wei
Zihao Zhang
Shenghua He
Tian Xia
Shijia Pan
Fei Liu
199
9
0
16 Feb 2025
TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking
Shahriar Kabir Nahin
R. N. Nandi
Sagor Sarker
Quazi Sarwar Muhtaseem
Md. Kowsher
Apu Chandraw Shill
Md Ibrahim
Mehadi Hasan Menon
Tareq Al Muntasir
Firoj Alam
189
0
0
16 Feb 2025
Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training
Yao-Ching Yu
Tsun-Han Chiang
Cheng-Wei Tsai
Chien-Ming Huang
Wen-Kwang Tsao
119
7
0
16 Feb 2025
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Hieu Nguyen
Zihao He
Shoumik Atul Gandre
Ujjwal Pasupulety
Sharanya Kumari Shivakumar
Kristina Lerman
HILM
130
2
0
16 Feb 2025
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
173
0
0
15 Feb 2025
Order-agnostic Identifier for Large Language Model-based Generative Recommendation
Xinyu Lin
Haihan Shi
Wenjie Wang
Fuli Feng
Qifan Wang
See-Kiong Ng
Tat-Seng Chua
52
3
0
15 Feb 2025
A Tutorial on LLM Reasoning: Relevant Methods behind ChatGPT o1
Jun Wang
LRM
KELM
158
8
0
15 Feb 2025
MixMin: Finding Data Mixtures via Convex Minimization
Anvith Thudi
Evianne Rovers
Yangjun Ruan
Tristan Thrush
Chris J. Maddison
111
0
0
14 Feb 2025
MorphNLI: A Stepwise Approach to Natural Language Inference Using Text Morphing
Vlad-Andrei Negru
Robert Vacareanu
Camelia Lemnaru
Mihai Surdeanu
Rodica Potolea
165
2
0
13 Feb 2025
GoRA: Gradient-driven Adaptive Low Rank Adaptation
Haonan He
Peng Ye
Yuchen Ren
Yuan Yuan
Luyang Zhou
Shucun Ju
Lei Chen
AI4TS
AI4CE
472
1
0
13 Feb 2025
Matina: A Large-Scale 73B Token Persian Text Corpus
Sara Bourbour Hosseinbeigi
Fatemeh Taherinezhad
Heshaam Faili
Hamed Baghbani
Fatemeh Nadi
Mostafa Amiri
164
0
0
13 Feb 2025
AttentionSmithy: A Modular Framework for Rapid Transformer Development and Customization
Caleb Cranney
Jesse G. Meyer
170
0
0
13 Feb 2025
PathFinder: A Multi-Modal Multi-Agent System for Medical Diagnostic Decision-Making Applied to Histopathology
Fatemeh Ghezloo
M. S. Seyfioglu
Rustin Soraki
Wisdom O. Ikezogwo
Beibin Li
Tejoram Vivekanandan
J. Elmore
Ranjay Krishna
Linda G. Shapiro
175
7
0
13 Feb 2025
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation
H. Seo
Wongi Jeong
Jae-sun Seo
Se Young Chun
140
0
0
12 Feb 2025
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
Siddharth Singh
Prajwal Singhania
Aditya K. Ranjan
John Kirchenbauer
Jonas Geiping
...
Abhimanyu Hans
Manli Shu
Aditya Tomar
Tom Goldstein
A. Bhatele
180
3
0
12 Feb 2025
Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning
Qifan Yu
Zhenyu He
Sijie Li
Xun Zhou
Jun Zhang
Jingjing Xu
Di He
OffRL
LRM
139
5
0
12 Feb 2025