Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.01449
Cited By
v1
v2
v3
v4
v5 (latest)
ColPali: Efficient Document Retrieval with Vision Language Models
27 June 2024
Manuel Faysse
Hugues Sibille
Tony Wu
Bilel Omrani
Gautier Viaud
C´eline Hudelot
Pierre Colombo
VLM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"ColPali: Efficient Document Retrieval with Vision Language Models"
18 / 68 papers shown
Title
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang
Tengchao Lv
Lei Cui
Yutong Lu
Furu Wei
123
464
0
18 Apr 2022
OCR-free Document Understanding Transformer
Geewook Kim
Teakgyu Hong
Moonbin Yim
Jeongyeon Nam
Jinyoung Park
Jinyeong Yim
Wonseok Hwang
Sangdoo Yun
Dongyoon Han
Seunghyun Park
ViT
144
274
0
30 Nov 2021
FILIP: Fine-grained Interactive Language-Image Pre-Training
Lewei Yao
Runhu Huang
Lu Hou
Guansong Lu
Minzhe Niu
Hang Xu
Xiaodan Liang
Zhenguo Li
Xin Jiang
Chunjing Xu
VLM
CLIP
113
643
0
09 Nov 2021
YOLOX: Exceeding YOLO Series in 2021
Zheng Ge
Songtao Liu
Feng Wang
Zeming Li
Jian Sun
ObjD
203
4,123
0
18 Jul 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
100
281
0
22 Jun 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
589
10,625
0
17 Jun 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rucklé
Abhishek Srivastava
Iryna Gurevych
VLM
435
1,059
0
17 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
1.0K
30,032
0
26 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
723
41,796
0
22 Oct 2020
DocVQA: A Dataset for VQA on Document Images
Minesh Mathew
Dimosthenis Karatzas
C. V. Jawahar
163
748
0
01 Jul 2020
ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Omar Khattab
Matei A. Zaharia
142
1,383
0
27 Apr 2020
Dense Passage Retrieval for Open-Domain Question Answering
Vladimir Karpukhin
Barlas Oğuz
Sewon Min
Patrick Lewis
Ledell Yu Wu
Sergey Edunov
Danqi Chen
Wen-tau Yih
RALM
227
3,810
0
10 Apr 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
221
1,285
0
25 Feb 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,348
0
27 Aug 2019
Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval
Li Deng
Shuo Zhang
K. Balog
LMTD
DRL
60
114
0
31 May 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.9K
95,531
0
11 Oct 2018
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
Payal Bajaj
Daniel Fernando Campos
Nick Craswell
Li Deng
Jianfeng Gao
...
Mir Rosenberg
Xia Song
Alina Stoica
Saurabh Tiwary
Tong Wang
RALM
205
2,750
0
28 Nov 2016
Microsoft COCO: Common Objects in Context
Nayeon Lee
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
468
43,961
0
01 May 2014
Previous
1
2