Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2103.03206
Cited By
Perceiver: General Perception with Iterative Attention
4 March 2021
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Perceiver: General Perception with Iterative Attention"
50 / 682 papers shown
Title
3DShape2VecSet: A 3D Shape Representation for Neural Fields and Generative Diffusion Models
Biao Zhang
Jiapeng Tang
Matthias Niessner
Peter Wonka
DiffM
25
196
0
26 Jan 2023
Modelling Long Range Dependencies in
N
N
N
D: From Task-Specific to a General Purpose CNN
David M. Knigge
David W. Romero
Albert Gu
E. Gavves
Erik J. Bekkers
Jakub M. Tomczak
Mark Hoogendoorn
J. Sonke
3DV
27
21
0
25 Jan 2023
Zorro: the masked multimodal transformer
Adrià Recasens
Jason Lin
João Carreira
Drew Jaegle
Luyu Wang
...
Pauline Luc
Antoine Miech
Lucas Smaira
Ross Hemsley
Andrew Zisserman
39
20
0
23 Jan 2023
Multiview Compressive Coding for 3D Reconstruction
Chaozheng Wu
Justin Johnson
Jitendra Malik
Christoph Feichtenhofer
Georgia Gkioxari
26
71
0
19 Jan 2023
Laser: Latent Set Representations for 3D Generative Modeling
Pol Moreno
Adam R. Kosiorek
Heiko Strathmann
Daniel Zoran
Rosália G. Schneider
Bjorn Winckler
L. Markeeva
T. Weber
Danilo Jimenez Rezende
BDL
3DV
DRL
32
5
0
13 Jan 2023
TarViS: A Unified Approach for Target-based Video Segmentation
A. Athar
Alexander Hermans
Jonathon Luiten
Deva Ramanan
Bastian Leibe
VOS
23
29
0
06 Jan 2023
All in Tokens: Unifying Output Space of Visual Tasks via Soft Token
Jia Ning
Chen Li
Zheng-Wei Zhang
Zigang Geng
Qi Dai
Kun He
Han Hu
35
44
0
05 Jan 2023
Test of Time: Instilling Video-Language Models with a Sense of Time
Piyush Bagad
Makarand Tapaswi
Cees G. M. Snoek
83
36
0
05 Jan 2023
Transformers in Action Recognition: A Review on Temporal Modeling
Elham Shabaninia
Hossein Nezamabadi-pour
Fatemeh Shafizadegan
ViT
24
8
0
29 Dec 2022
Scalable Adaptive Computation for Iterative Generation
Allan Jabri
David Fleet
Ting-Li Chen
DiffM
35
107
0
22 Dec 2022
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios
Yiren Lu
Justin Fu
George Tucker
Xinlei Pan
Eli Bronstein
...
Brandyn White
Aleksandra Faust
Shimon Whiteson
Drago Anguelov
Sergey Levine
OffRL
28
92
0
21 Dec 2022
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering
Difei Gao
Luowei Zhou
Lei Ji
Linchao Zhu
Yezhou Yang
Mike Zheng Shou
44
60
0
19 Dec 2022
Medical Diagnosis with Large Scale Multimodal Transformers: Leveraging Diverse Data for More Accurate Diagnosis
Firas Khader
Gustav Mueller-Franzes
Tian Wang
T. Han
Soroosh Tayebi Arasteh
...
Keno Bressem
Christiane Kuhl
S. Nebelung
Jakob Nikolas Kather
Daniel Truhn
16
6
0
18 Dec 2022
Inductive Attention for Video Action Anticipation
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Simon See
Oswald Lanz
39
1
0
17 Dec 2022
MAViL: Masked Audio-Video Learners
Po-Yao (Bernie) Huang
Vasu Sharma
Hu Xu
Chaitanya K. Ryali
Haoqi Fan
Yanghao Li
Shang-Wen Li
Gargi Ghosh
Jitendra Malik
Christoph Feichtenhofer
26
51
0
15 Dec 2022
Vision Transformers are Parameter-Efficient Audio-Visual Learners
Yan-Bo Lin
Yi-Lin Sung
Jie Lei
Joey Tianyi Zhou
Gedas Bertasius
28
73
0
15 Dec 2022
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
Alexei Baevski
Arun Babu
Wei-Ning Hsu
Michael Auli
VLM
SSL
32
92
0
14 Dec 2022
Structured 3D Features for Reconstructing Controllable Avatars
Enric Corona
M. Zanfir
Thiemo Alldieck
Eduard Gabriel Bazavan
Andrei Zanfir
C. Sminchisescu
3DH
44
16
0
13 Dec 2022
Egocentric Video Task Translation
Zihui Xue
Yale Song
Kristen Grauman
Lorenzo Torresani
EgoV
29
13
0
13 Dec 2022
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Ziniu Hu
Ahmet Iscen
Chen Sun
Zirui Wang
Kai-Wei Chang
Yizhou Sun
Cordelia Schmid
David A. Ross
Alireza Fathi
RALM
VLM
40
89
0
10 Dec 2022
Audiovisual Masked Autoencoders
Mariana-Iuliana Georgescu
Eduardo Fonseca
Radu Tudor Ionescu
Mario Lucic
Cordelia Schmid
Anurag Arnab
SSL
37
43
0
09 Dec 2022
VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners
Shen Yan
Tao Zhu
Zirui Wang
Yuan Cao
Mi Zhang
Soham Ghosh
Yonghui Wu
Jiahui Yu
VLM
VGen
32
46
0
09 Dec 2022
A Flexible Nadaraya-Watson Head Can Offer Explainable and Calibrated Classification
Alan Q. Wang
M. Sabuncu
30
5
0
07 Dec 2022
Framework-agnostic Semantically-aware Global Reasoning for Segmentation
Mir Rayat Imtiaz Hossain
Leonid Sigal
James J. Little
ViT
25
0
0
06 Dec 2022
Images Speak in Images: A Generalist Painter for In-Context Visual Learning
Xinlong Wang
Wen Wang
Yue Cao
Chunhua Shen
Tiejun Huang
VLM
MLLM
66
244
0
05 Dec 2022
Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula
Eli Bronstein
S. Srinivasan
Supratik Paul
Aman Sinha
Matthew O'Kelly
Payam Nikdel
Shimon Whiteson
OffRL
8
18
0
02 Dec 2022
Survey on Self-Supervised Multimodal Representation Learning and Foundation Models
Sushil Thapa
AI4TS
SSL
18
1
0
29 Nov 2022
A Light Touch Approach to Teaching Transformers Multi-view Geometry
Yash Bhalgat
Joao F. Henriques
Andrew Zisserman
ViT
27
6
0
28 Nov 2022
Continuous diffusion for categorical data
Sander Dieleman
Laurent Sartran
Arman Roshannai
Nikolay Savinov
Yaroslav Ganin
...
Conor Durkan
Curtis Hawthorne
Rémi Leblond
Will Grathwohl
J. Adler
DiffM
26
98
0
28 Nov 2022
Interaction Region Visual Transformer for Egocentric Action Anticipation
Debaditya Roy
Ramanathan Rajendiran
Basura Fernando
36
15
0
25 Nov 2022
A Self-Attention Ansatz for Ab-initio Quantum Chemistry
Ingrid von Glehn
J. Spencer
David Pfau
21
60
0
24 Nov 2022
Event Transformer+. A multi-purpose solution for efficient event data processing
Alberto Sabater
Luis Montesano
Ana C. Murillo
ViT
31
8
0
22 Nov 2022
Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention
Zineng Tang
Jaemin Cho
Jie Lei
Joey Tianyi Zhou
VLM
24
9
0
21 Nov 2022
Discovering Evolution Strategies via Meta-Black-Box Optimization
R. T. Lange
Tom Schaul
Yutian Chen
Tom Zahavy
Valenti Dallibard
Chris Xiaoxuan Lu
Satinder Singh
Sebastian Flennerhag
44
47
0
21 Nov 2022
PointResNet: Residual Network for 3D Point Cloud Segmentation and Classification
Aadesh Desai
Saagar Parikh
S. Kumari
Shanmuganathan Raman
3DPC
3DV
21
2
0
20 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
30
107
0
17 Nov 2022
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
Hyeong-Seok Choi
Jinhyeok Yang
Juheon Lee
Hyeongju Kim
20
46
0
17 Nov 2022
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
27
21
0
16 Nov 2022
Latent Bottlenecked Attentive Neural Processes
Leo Feng
Hossein Hajimirsadeghi
Yoshua Bengio
Mohamed Osama Ahmed
BDL
19
19
0
15 Nov 2022
NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research
J. Bornschein
Alexandre Galashov
Ross Hemsley
Amal Rannen-Triki
Yutian Chen
...
Angeliki Lazaridou
Yee Whye Teh
Andrei A. Rusu
Razvan Pascanu
MarcÁurelio Ranzato
OOD
VLM
AI4TS
39
16
0
15 Nov 2022
The ProfessionAl Go annotation datasEt (PAGE)
Yifan Gao
Danni Zhang
Haoyue Li
20
0
0
03 Nov 2022
Efficient Speech Translation with Dynamic Latent Perceivers
Ioannis Tsiamas
Gerard I. Gállego
José A. R. Fonollosa
Marta R. Costa-jussá
17
2
0
28 Oct 2022
A single-cell gene expression language model
Will Connell
Umair W Khan
Michael J. Keiser
14
8
0
25 Oct 2022
Solving Reasoning Tasks with a Slot Transformer
Ryan Faulkner
Daniel Zoran
LRM
26
1
0
20 Oct 2022
Play It Back: Iterative Attention for Audio Recognition
Alexandros Stergiou
Dima Damen
37
4
0
20 Oct 2022
Coordinates Are NOT Lonely -- Codebook Prior Helps Implicit Neural 3D Representations
Fukun Yin
Wen Liu
Zilong Huang
Pei Cheng
Tao Chen
Gang Yu
19
19
0
20 Oct 2022
Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving
Eli Bronstein
Mark Palatucci
Dominik Notz
Brandyn White
Alex Kuefler
...
Punit Shah
Evan Racah
Benjamin Frenkel
Shimon Whiteson
Drago Anguelov
45
58
0
18 Oct 2022
Improving Object-centric Learning with Query Optimization
Baoxiong Jia
Yu Liu
Siyuan Huang
OCL
26
49
0
17 Oct 2022
Linear Video Transformer with Feature Fixation
Kaiyue Lu
Zexia Liu
Jianyuan Wang
Weixuan Sun
Zhen Qin
...
Xuyang Shen
Huizhong Deng
Xiaodong Han
Yuchao Dai
Yiran Zhong
30
4
0
15 Oct 2022
Neural Attentive Circuits
Nasim Rahaman
M. Weiß
Francesco Locatello
C. Pal
Yoshua Bengio
Bernhard Schölkopf
Erran L. Li
Nicolas Ballas
32
6
0
14 Oct 2022
Previous
1
2
3
...
10
11
12
13
14
9
Next