Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2206.09852
Cited By
M&M Mix: A Multimodal Multiview Transformer Ensemble
20 June 2022
Xuehan Xiong
Anurag Arnab
Arsha Nagrani
Cordelia Schmid
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"M&M Mix: A Multimodal Multiview Transformer Ensemble"
10 / 10 papers shown
Title
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization
Anna Kukleva
Fadime Sener
Edoardo Remelli
Bugra Tekin
Eric Sauser
Bernt Schiele
Shugao Ma
VLM
EgoV
42
1
0
28 Mar 2024
Training a Large Video Model on a Single Machine in a Day
Yue Zhao
Philipp Krahenbuhl
VLM
34
15
0
28 Sep 2023
IndGIC: Supervised Action Recognition under Low Illumination
Jing-Teng Zeng
29
1
0
29 Aug 2023
An Outlook into the Future of Egocentric Vision
Chiara Plizzari
Gabriele Goletto
Antonino Furnari
Siddhant Bansal
Francesco Ragusa
G. Farinella
Dima Damen
Tatiana Tommasi
EgoV
40
38
0
14 Aug 2023
Multimodal Distillation for Egocentric Action Recognition
Gorjan Radevski
Dusan Grujicic
Marie-Francine Moens
Matthew Blaschko
Tinne Tuytelaars
EgoV
23
23
0
14 Jul 2023
Epic-Sounds: A Large-scale Dataset of Actions That Sound
Jaesung Huh
Jacob Chalk
Evangelos Kazakos
Dima Damen
Andrew Zisserman
EgoV
18
41
0
01 Feb 2023
Vision Transformers for Action Recognition: A Survey
Anwaar Ulhaq
Naveed Akhtar
Ganna Pogrebna
Ajmal Saeed Mian
ViT
19
44
0
13 Sep 2022
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
L. V. D. van der Maaten
Armand Joulin
Ishan Misra
223
225
0
20 Jan 2022
SCENIC: A JAX Library for Computer Vision Research and Beyond
Mostafa Dehghani
A. Gritsenko
Anurag Arnab
Matthias Minderer
Yi Tay
46
68
0
18 Oct 2021
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
424
596
0
21 Jul 2020
1