2103.03206
Perceiver: General Perception with Iterative Attention
4 March 2021
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
Papers citing
"Perceiver: General Perception with Iterative Attention"
50 / 682 papers shown
Title
RecipeMind: Guiding Ingredient Choices from Food Pairing to Recipe Completion using Cascaded Set Transformer
Mogan Gim
Donghee Choi
Kana Maruyama
Jihun Choi
Hajung Kim
Donghyeon Park
Jaewoo Kang
40
5
0
14 Oct 2022
Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors
Vladimir E. Iashin
Weidi Xie
Esa Rahtu
Andrew Zisserman
28
20
0
13 Oct 2022
A Generalist Framework for Panoptic Segmentation of Images and Videos
Ting-Li Chen
Lala Li
Saurabh Saxena
Geoffrey E. Hinton
David J. Fleet
VGen
MLLM
43
102
0
12 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
25
17
0
11 Oct 2022
Turbo Training with Token Dropout
Tengda Han
Weidi Xie
Andrew Zisserman
ViT
31
10
0
10 Oct 2022
SCAM! Transferring humans between images with Semantic Cross Attention Modulation
Nicolas Dufour
David Picard
Vicky Kalogeiton
51
13
0
10 Oct 2022
ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval
A. Fragomeni
Michael Wray
Dima Damen
CLIP
ViT
25
3
0
09 Oct 2022
Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling
Hsin-Ying Lee
Hung-Ting Su
Bing-Chen Tsai
Tsung-Han Wu
Jia-Fong Yeh
Winston H. Hsu
27
2
0
08 Oct 2022
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
28
335
0
06 Oct 2022
SPARC: Sparse Render-and-Compare for CAD model alignment in a single RGB image
Florian Langer
Gwangbin Bae
Ignas Budvytis
R. Cipolla
3DPC
47
10
0
03 Oct 2022
Benign Autoencoders
Semyon Malamud
Teng Andrea Xu
Antoine Didisheim
DRL
AI4CE
14
0
0
02 Oct 2022
Contrastive Audio-Visual Masked Autoencoder
Yuan Gong
Andrew Rouditchenko
Alexander H. Liu
David Harwath
Leonid Karlinsky
Hilde Kuehne
James R. Glass
35
120
0
02 Oct 2022
Construction and Evaluation of a Self-Attention Model for Semantic Understanding of Sentence-Final Particles
Shuhei Mandokoro
N. Oka
Akane Matsushima
Chie Fukada
Yuko Yoshimura
Koji Kawahara
Kazuaki Tanaka
20
1
0
01 Oct 2022
Cascaded Multi-Modal Mixing Transformers for Alzheimer's Disease Classification with Incomplete Data
Linfeng Liu
Siyu Liu
Lu Zhang
X. To
F. Nasrallah
Shekhar S. Chandra
MedIm
34
52
0
01 Oct 2022
Real-time Online Video Detection with Temporal Smoothing Transformers
Yue Zhao
Philipp Krahenbuhl
ViT
69
57
0
19 Sep 2022
Distribution Aware Metrics for Conditional Natural Language Generation
David M. Chan
Yiming Ni
David A. Ross
Sudheendra Vijayanarasimhan
Austin Myers
John F. Canny
45
4
0
15 Sep 2022
Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?
Yi Wang
Zhiwen Fan
Tianlong Chen
Hehe Fan
Zhangyang Wang
ViT
53
9
0
15 Sep 2022
A patch-based architecture for multi-label classification from single label annotations
Warren Jouanneau
Aurélie Bugeau
Marc Palyart
Nicolas Papadakis
Laurent Vézard
28
0
0
14 Sep 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
D. Fox
LM&Ro
163
457
0
12 Sep 2022
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
18
60
0
07 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
28
109
0
31 Aug 2022
A Circular Window-based Cascade Transformer for Online Action Detection
Shuyuan Cao
Weihua Luo
Bairui Wang
Wei Emma Zhang
Lin Ma
42
6
0
30 Aug 2022
Improving Small Molecule Generation using Mutual Information Machine
Daniel A. Reidenbach
M. Livne
Rajesh Ilango
M. Gill
Johnny Israeli
28
14
0
18 Aug 2022
Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis
Guoying Zhao
Zheng Lian
B. Liu
Jianhua Tao
32
47
0
16 Aug 2022
Teacher Guided Training: An Efficient Framework for Knowledge Transfer
Manzil Zaheer
A. S. Rawat
Seungyeon Kim
Chong You
Himanshu Jain
Andreas Veit
Rob Fergus
Surinder Kumar
VLM
16
2
0
14 Aug 2022
Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter
Aleksandar Stanić
Yujin Tang
David R Ha
Jürgen Schmidhuber
ELM
29
13
0
05 Aug 2022
COPER: Continuous Patient State Perceiver
V. Chauhan
Anshul Thakur
Odhran O'Donoghue
David A. Clifton
AI4TS
OOD
30
5
0
05 Aug 2022
Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations
Xufeng Zhao
C. Weber
Muhammad Burhan Hafez
S. Wermter
18
8
0
04 Aug 2022
CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning
Mahdi Saleh
Yige Wang
Nassir Navab
Benjamin Busam
F. Tombari
3DPC
26
3
0
31 Jul 2022
UAVM: Towards Unifying Audio and Visual Models
Yuan Gong
Alexander H. Liu
Andrew Rouditchenko
James R. Glass
30
21
0
29 Jul 2022
Depth Field Networks for Generalizable Multi-view Scene Representation
Vitor Campagnolo Guizilini
Igor Vasiljevic
Jiading Fang
Rares Ambrus
G. Shakhnarovich
Matthew R. Walter
Adrien Gaidon
3DV
MDE
32
15
0
28 Jul 2022
Temporal and cross-modal attention for audio-visual zero-shot learning
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
38
25
0
20 Jul 2022
Residual and Attentional Architectures for Vector-Symbols
W. Olin-Ammentorp
22
3
0
18 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
24
41
0
14 Jul 2022
Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection
Zhe Chen
Jing Zhang
Yufei Xu
Dacheng Tao
ViT
19
11
0
14 Jul 2022
MM-ALT: A Multimodal Automatic Lyric Transcription System
Xiangming Gu
Longshen Ou
Danielle Ong
Ye Wang
11
13
0
13 Jul 2022
Wayformer: Motion Forecasting via Simple & Efficient Attention Networks
Nigamaa Nayakanti
Rami Al-Rfou
Aurick Zhou
Kratarth Goel
Khaled S. Refaat
Benjamin Sapp
AI4TS
42
235
0
12 Jul 2022
MaiT: Leverage Attention Masks for More Efficient Image Transformers
Ling Li
Ali Shafiee Ardestani
Joseph Hassoun
14
1
0
06 Jul 2022
Pure Transformers are Powerful Graph Learners
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
43
189
0
06 Jul 2022
Softmax-free Linear Transformers
Jiachen Lu
Junge Zhang
Xiatian Zhu
Jianfeng Feng
Tao Xiang
Li Zhang
ViT
16
7
0
05 Jul 2022
Conditioned Human Trajectory Prediction using Iterative Attention Blocks
A. Postnikov
A. Gamayunov
Gonzalo Ferrer
10
3
0
29 Jun 2022
Deformable Graph Transformer
Jinyoung Park
Seongjun Yun
Hyeon-ju Park
Jaewoo Kang
Jisu Jeong
KyungHyun Kim
Jung-Woo Ha
Hyunwoo J. Kim
90
7
0
29 Jun 2022
A Unified Sequence Interface for Vision Tasks
Ting-Li Chen
Saurabh Saxena
Lala Li
Nayeon Lee
David J. Fleet
Geoffrey E. Hinton
VLM
MLLM
16
148
0
15 Jun 2022
Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises
Minkyu Choi
Yizhen Zhang
Kuan Han
Xiaokai Wang
Zhongming Liu
AAML
GAN
35
4
0
15 Jun 2022
It's Time for Artistic Correspondence in Music and Video
Dídac Surís
Carl Vondrick
Bryan C. Russell
Justin Salamon
16
37
0
14 Jun 2022
Peripheral Vision Transformer
Juhong Min
Yucheng Zhao
Chong Luo
Minsu Cho
ViT
MDE
32
30
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
60
527
0
13 Jun 2022
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens
Elad Ben-Avraham
Roei Herzig
K. Mangalam
Amir Bar
Anna Rohrbach
Leonid Karlinsky
Trevor Darrell
Amir Globerson
19
0
0
13 Jun 2022
ChordMixer: A Scalable Neural Attention Model for Sequences with Different Lengths
Ruslan Khalitov
Tong Yu
Lei Cheng
Zhirong Yang
25
12
0
12 Jun 2022
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs
Jinguo Zhu
Xizhou Zhu
Wenhai Wang
Xiaohua Wang
Hongsheng Li
Xiaogang Wang
Jifeng Dai
MoMe
MoE
21
66
0
09 Jun 2022