Language Modelling with Pixels


14 July 2022
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
    VLM

Papers citing "Language Modelling with Pixels"

26 / 26 papers shown
Joint Low-level and High-level Textual Representation Learning with Multiple Masking Strategies
Zhengmi Tang
Yuto Mitsui
Tomo Miyazaki
S. Omachi
34
0
0
11 May 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
Longxu Dou
Qian Liu
Fan Zhou
Changyu Chen
Zili Wang
...
Tianyu Pang
Chao Du
Xinyi Wan
Wei Lu
Min Lin
101
1
0
18 Feb 2025
PixelWorld: Towards Perceiving Everything as Pixels
Zhiheng Lyu
Xueguang Ma
Wenhu Chen
143
0
0
31 Jan 2025
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
37
2
0
28 Oct 2024
Tur[k]ingBench: A Challenge Benchmark for Web Agents
Kevin Xu
Yeganeh Kordi
Kate Sanders
Yizhong Wang
Adam Byerly
Jingyu Zhang
Benjamin Van Durme
Daniel Khashabi
LLMAG
69
6
0
18 Mar 2024
Text as Images: Can Multimodal Large Language Models Follow Printed Instructions in Pixels?
Xiujun Li
Yujie Lu
Zhe Gan
Jianfeng Gao
William Yang Wang
Yejin Choi
VLM
MLLM
33
1
0
29 Nov 2023
Byte-Level Grammatical Error Correction Using Synthetic and Curated Corpora
Svanhvít Lilja Ingólfsdóttir
Pétur Orri Ragnarsson
H. Jónsson
Haukur Barri Símonarson
Vilhjálmur Þorsteinsson
Vésteinn Snæbjarnarson
SyDa
30
9
0
29 May 2023
OneCAD: One Classifier for All image Datasets using multimodal learning
S. Wadekar
Eugenio Culurciello
34
0
0
11 May 2023
The MiniPile Challenge for Data-Efficient Language Models
Jean Kaddour
MoE
ALM
24
40
0
17 Apr 2023
Efficient OCR for Building a Diverse Digital History
Jacob Carlson
Tom Bryan
Melissa Dell
23
11
0
05 Apr 2023
Incorporating Context into Subword Vocabularies
Shaked Yehezkel
Yuval Pinter
39
8
0
13 Oct 2022
Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust?
Doan Nam Long Vu
N. Moosavi
Steffen Eger
21
9
0
06 Sep 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang
Yangguang Li
Diana Marculescu
SSL
TPM
ViT
51
22
0
28 May 2022
Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models
Terra Blevins
Hila Gonen
Luke Zettlemoyer
LRM
54
26
0
24 May 2022
Does Transliteration Help Multilingual Language Modeling?
Ibraheem Muhammad Moosa
Mahmud Elahi Akhter
Ashfia Binte Habib
40
11
0
29 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,434
0
11 Nov 2021
IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization
Fajri Koto
Jey Han Lau
Timothy Baldwin
VLM
55
82
0
10 Sep 2021
Subword Mapping and Anchoring across Languages
Giorgos Vernikos
Andrei Popescu-Belis
62
12
0
09 Sep 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
253
1,989
0
31 Dec 2020
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
74
235
0
31 Dec 2020
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
Benjamin Muller
Antonis Anastasopoulos
Benoît Sagot
Djamé Seddah
LRM
126
165
0
24 Oct 2020
Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung
Dan Garrette
Kiat Chuan Tan
Jason Riesa
VLM
72
65
0
24 Oct 2020
Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality
Gustavo Aguilar
Bryan McCann
Tong Niu
Nazneen Rajani
N. Keskar
Thamar Solorio
47
12
0
24 Oct 2020
Towards End-to-End In-Image Neural Machine Translation
Elman Mansimov
Mitchell Stern
M. Chen
Orhan Firat
Jakob Uszkoreit
Puneet Jain
24
25
0
20 Oct 2020
CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
Hicham El Boukkouri
Olivier Ferret
Thomas Lavergne
Hiroshi Noji
Pierre Zweigenbaum
Junichi Tsujii
71
156
0
20 Oct 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018