Perceiver: General Perception with Iterative Attention

4 March 2021
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
    VLM
    ViT
    MDE

Papers citing "Perceiver: General Perception with Iterative Attention"

50 / 683 papers shown
Title
Does Visual Pretraining Help End-to-End Reasoning?
Chen Sun
Calvin Luo
Xingyi Zhou
Anurag Arnab
Cordelia Schmid
OCL
LRM
ViT
38
3
0
17 Jul 2023
Transformers are Universal Predictors
Sourya Basu
Moulik Choraria
L. Varshney
28
4
0
15 Jul 2023
Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based Tumor Classification
Simon Holdenried-Krafft
Peter Somers
Ivonne A. Montes-Majarro
Diana Silimon
Cristina Tarín
F. Fend
Hendrik P. A. Lensch
MedIm
33
3
0
14 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
54
19
0
13 Jul 2023
PolyLM: An Open Source Polyglot Large Language Model
Xiangpeng Wei
Hao-Ran Wei
Huan Lin
Tianhao Li
Pei Zhang
...
Yu Bowen
Dayiheng Liu
Baosong Yang
Fei Huang
Jun Xie
LRM
48
55
0
12 Jul 2023
One-Versus-Others Attention: Scalable Multimodal Integration for Clinical Data
Michal Golovanevsky
Eva Schiller
Akira Nair
Ritambhara Singh
Carsten Eickhoff
19
2
0
11 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
41
151
0
05 Jul 2023
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
Yan Zeng
Hanbo Zhang
Jiani Zheng
Jiangnan Xia
Guoqiang Wei
Yang Wei
Yuchen Zhang
Tao Kong
MLLM
27
71
0
05 Jul 2023
Act3D: 3D Feature Field Transformers for Multi-Task Robotic Manipulation
Théophile Gervet
Zhou Xian
N. Gkanatsios
Katerina Fragkiadaki
43
63
0
30 Jun 2023
An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training
Z. Chen
Mingyu Ding
Songlin Yang
Wei Zhan
Masayoshi Tomizuka
Erik Learned-Miller
Chuang Gan
MoE
24
8
0
29 Jun 2023
Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
Zibo Zhao
Wen Liu
Xin Chen
Xi Zeng
Rui Wang
Pei Cheng
Bin-Bin Fu
Tao Chen
Gang Yu
Shenghua Gao
DiffM
28
87
0
29 Jun 2023
Semi-supervised Multimodal Representation Learning through a Global Workspace
Benjamin Devillers
Léopold Maytié
R. V. Rullen
SSL
27
5
0
27 Jun 2023
BatchGFN: Generative Flow Networks for Batch Active Learning
Shreshth A. Malik
Salem Lahlou
Andrew Jesson
Moksh Jain
Nikolay Malkin
T. Deleu
Yoshua Bengio
Y. Gal
AI4CE
26
2
0
26 Jun 2023
RVT: Robotic View Transformer for 3D Object Manipulation
Ankit Goyal
Jie Xu
Yijie Guo
Valts Blukis
Yu-Wei Chao
Dieter Fox
LM&Ro
42
120
0
26 Jun 2023
AR2-D2: Training a Robot Without a Robot
Jiafei Duan
Yi Ru Wang
Mohit Shridhar
Dieter Fox
Ranjay Krishna
38
28
0
23 Jun 2023
ProRes: Exploring Degradation-aware Visual Prompt for Universal Image Restoration
Jiaqi Ma
Tianheng Cheng
Guoli Wang
Qian Zhang
Xinggang Wang
Lefei Zhang
DiffM
VLM
14
43
0
23 Jun 2023
LightGlue: Local Feature Matching at Light Speed
Philipp Lindenberger
Paul-Edouard Sarlin
Marc Pollefeys
3DV
VLM
25
396
0
23 Jun 2023
Learning Unseen Modality Interaction
Yunhua Zhang
Hazel Doughty
Cees G. M. Snoek
27
3
0
22 Jun 2023
Constant Memory Attention Block
Leo Feng
Frederick Tung
Hossein Hajimirsadeghi
Yoshua Bengio
Mohamed Osama Ahmed
22
0
0
21 Jun 2023
Exploring the Role of Audio in Video Captioning
Yuhan Shen
Linjie Yang
Longyin Wen
Haichao Yu
Ehsan Elhamifar
Heng Wang
18
2
0
21 Jun 2023
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Hugo Laurençon
Lucile Saulnier
Léo Tronchon
Stas Bekman
Amanpreet Singh
...
Siddharth Karamcheti
Alexander M. Rush
Douwe Kiela
Matthieu Cord
Victor Sanh
25
230
0
21 Jun 2023
Dynamic Perceiver for Efficient Visual Recognition
Yizeng Han
Dongchen Han
Zeyu Liu
Yulin Wang
Xuran Pan
Yifan Pu
Chaorui Deng
Junlan Feng
S. Song
Gao Huang
32
29
0
20 Jun 2023
Sparse Modular Activation for Efficient Sequence Modeling
Liliang Ren
Yang Liu
Shuohang Wang
Yichong Xu
Chenguang Zhu
Chengxiang Zhai
45
13
0
19 Jun 2023
Multitrack Music Transcription with a Time-Frequency Perceiver
Weiyi Lu
Ju-Chiang Wang
Yun-Ning Hung
ViT
AI4TS
31
24
0
19 Jun 2023
RedMotion: Motion Prediction via Redundancy Reduction
Royden Wagner
Omer Sahin Tas
Marvin Klemp
Carlos Fernandez Lopez
Christoph Stiller
48
6
0
19 Jun 2023
The Big Data Myth: Using Diffusion Models for Dataset Generation to Train Deep Detection Models
Roy Voetman
Maya Aghaei
K. Dijkstra
DiffM
19
11
0
16 Jun 2023
FedMultimodal: A Benchmark For Multimodal Federated Learning
Tiantian Feng
Digbalay Bose
Tuo Zhang
Rajat Hebbar
Anil Ramakrishna
Rahul Gupta
Mi Zhang
Salman Avestimehr
Shrikanth Narayanan
32
48
0
15 Jun 2023
High-performance deep spiking neural networks with 0.3 spikes per neuron
A. Stanojević
Stanislaw Woźniak
G. Bellec
G. Cherubini
A. Pantazi
W. Gerstner
31
15
0
14 Jun 2023
A Survey of Vision-Language Pre-training from the Lens of Multimodal Machine Translation
Jeremy Gwinnup
Kevin Duh
VLM
22
3
0
12 Jun 2023
Learning Probabilistic Symmetrization for Architecture Agnostic Equivariance
Jinwoo Kim
Tien Dat Nguyen
Ayhan Suleymanzade
Hyeokjun An
Seunghoon Hong
50
23
0
05 Jun 2023
Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents
David Kreuzer
M. Munz
ViT
MedIm
21
0
0
05 Jun 2023
Systematic Visual Reasoning through Object-Centric Relational Abstraction
Taylor Webb
S. S. Mondal
Jonathan D. Cohen
OCL
30
24
0
04 Jun 2023
A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics
Hong-Yu Zhou
Yizhou Yu
Chengdi Wang
Shu Zhen Zhang
Yuanxu Gao
Jia-Yu Pan
Jun Shao
Guangming Lu
Kang Zhang
Weimin Li
MedIm
19
150
0
01 Jun 2023
Bytes Are All You Need: Transformers Operating Directly On File Bytes
Maxwell Horton
Sachin Mehta
Ali Farhadi
Mohammad Rastegari
VLM
22
6
0
31 May 2023
Joint Adaptive Representations for Image-Language Learning
A. Piergiovanni
A. Angelova
VLM
34
0
0
31 May 2023
Gemtelligence: Accelerating Gemstone classification with Deep Learning
Tommaso Bendinelli
Luca Biggio
D. Nyfeler
Abhigyan Ghosh
P. Tollan
M. Kirschmann
Olga Fink
25
1
0
31 May 2023
Blockwise Parallel Transformer for Large Context Models
Hao Liu
Pieter Abbeel
49
11
0
30 May 2023
NetHack is Hard to Hack
Ulyana Piterbarg
Lerrel Pinto
Rob Fergus
21
7
0
30 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurelien Lucchi
Thomas Hofmann
39
53
0
25 May 2023
Lucy-SKG: Learning to Play Rocket League Efficiently Using Deep Reinforcement Learning
V. Moschopoulos
Pantelis Kyriakidis
A. Lazaridis
I. Vlahavas
18
0
0
25 May 2023
Concept-Centric Transformers: Enhancing Model Interpretability through Object-Centric Concept Learning within a Shared Global Workspace
Jinyung Hong
Keun Hee Park
Theodore P. Pavlic
29
5
0
25 May 2023
Pento-DIARef: A Diagnostic Dataset for Learning the Incremental Algorithm for Referring Expression Generation from Examples
P. Sadler
David Schlangen
23
2
0
24 May 2023
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator
Ziwei He
Meng-Da Yang
Minwei Feng
Jingcheng Yin
Xinbing Wang
Jingwen Leng
Zhouhan Lin
ViT
35
13
0
24 May 2023
Memory Efficient Neural Processes via Constant Memory Attention Block
Leo Feng
Frederick Tung
Hossein Hajimirsadeghi
Yoshua Bengio
Mohamed Osama Ahmed
31
5
0
23 May 2023
Training Transitive and Commutative Multimodal Transformers with LoReTTa
Manuel Tran
Yashin Dicente Cid
Amal Lahiani
Fabian J. Theis
Tingying Peng
Eldad Klaiman
26
2
0
23 May 2023
VideoLLM: Modeling Video Sequence with Large Language Models
Guo Chen
Yin-Dong Zheng
Jiahao Wang
Jilan Xu
Yifei Huang
...
Yi Wang
Yali Wang
Yu Qiao
Tong Lu
Limin Wang
MLLM
103
77
0
22 May 2023
RWKV: Reinventing RNNs for the Transformer Era
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
...
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
90
562
0
22 May 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
29
12
0
22 May 2023
What Makes for Good Visual Tokenizers for Large Language Models?
Guangzhi Wang
Yixiao Ge
Xiaohan Ding
Mohan S. Kankanhalli
Ying Shan
MLLM
VLM
30
38
0
20 May 2023
ACA-Net: Towards Lightweight Speaker Verification using Asymmetric Cross Attention
J. Yip
Tuan Truong
Dianwen Ng
Chong Zhang
Yukun Ma
Trung Hieu Nguyen
Chongjia Ni
Shengkui Zhao
Chng Eng Siong
Bin Ma
17
2
0
20 May 2023