arXiv:2311.00522
Text Rendering Strategies for Pixel Language Models
Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott
1 November 2023 · VLM
Papers citing "Text Rendering Strategies for Pixel Language Models"
20 / 20 papers shown. Each entry lists title, authors, topic tags (if any), the three counters from the original listing, and publication date.
Vision-centric Token Compression in Large Language Model
Ling Xing, Alex Jinpeng Wang, Rui Yan, Xiangbo Shu, Jinhui Tang · VLM · 108 / 0 / 0 · 02 Feb 2025

Bytes Are All You Need: Transformers Operating Directly On File Bytes
Maxwell Horton, Sachin Mehta, Ali Farhadi, Mohammad Rastegari · VLM · 54 / 6 / 0 · 31 May 2023

Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer
Elizabeth Salesky, Neha Verma, Philipp Koehn, Matt Post · 48 / 16 / 0 · 23 May 2023

Towards Climate Awareness in NLP Research
Daniel Hershcovich, Nicolas Webersinke, Mathias Kraus, J. Bingler, Markus Leippold · 67 / 33 / 0 · 10 May 2022

Masked Autoencoders Are Scalable Vision Learners
Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick · ViT, TPM · 427 / 7,705 / 0 · 11 Nov 2021

Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, ..., Olivier J. Hénaff, M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira · MLLM, VLM, GNN · 52 / 579 / 0 · 30 Jul 2021

ByT5: Towards a token-free future with pre-trained byte-to-byte models
Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel · 83 / 502 / 0 · 28 May 2021

SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao, Xingcheng Yao, Danqi Chen · AILaw, SSL · 251 / 3,371 / 0 · 18 Apr 2021

Robust Open-Vocabulary Translation from Visual Text Representations
Elizabeth Salesky, David Etter, Matt Post · VLM · 36 / 42 / 0 · 16 Apr 2021

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark, Dan Garrette, Iulia Turc, John Wieting · 85 / 218 / 0 · 11 Mar 2021

Training data-efficient image transformers & distillation through attention
Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou · ViT · 357 / 6,731 / 0 · 23 Dec 2020

Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection
Joakim Nivre, M. Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, S. Pyysalo, Sebastian Schuster, Francis M. Tyers, Daniel Zeman · VLM · 46 / 513 / 0 · 22 Apr 2020

The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita, Rico Sennrich, Ivan Titov · 266 / 186 / 0 · 03 Sep 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov · AIMat · 534 / 24,351 / 0 · 26 Jul 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang, Zihang Dai, Yiming Yang, J. Carbonell, Ruslan Salakhutdinov, Quoc V. Le · AI4CE · 220 / 8,415 / 0 · 19 Jun 2019

FRAGE: Frequency-Agnostic Word Representation
Chengyue Gong, Di He, Xu Tan, Tao Qin, Liwei Wang, Tie-Yan Liu · OOD · 59 / 144 / 0 · 18 Sep 2018

WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations
Mohammad Taher Pilehvar, Jose Camacho-Collados · 171 / 485 / 0 · 28 Aug 2018

Deep contextualized word representations
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer · NAI · 194 / 11,542 / 0 · 15 Feb 2018

Enriching Word Vectors with Subword Information
Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov · NAI, SSL, VLM · 220 / 9,957 / 0 · 15 Jul 2016

Charagram: Embedding Words and Sentences via Character n-grams
John Wieting, Joey Tianyi Zhou, Kevin Gimpel, Karen Livescu · NAI, GNN · 143 / 193 / 0 · 10 Jul 2016