ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.01311
  4. Cited By
High-Modality Multimodal Transformer: Quantifying Modality & Interaction
  Heterogeneity for High-Modality Representation Learning

High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning

2 March 2022
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Jeffrey Tsaw
Yudong Liu
Shentong Mo
Dani Yogatama
Louis-Philippe Morency
Ruslan Salakhutdinov
ArXivPDFHTML

Papers citing "High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning"

34 / 34 papers shown
Title
A survey of multimodal deep generative models
A survey of multimodal deep generative models
Masahiro Suzuki
Y. Matsuo
SyDa
DRL
65
78
0
05 Jul 2022
A Generalist Agent
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
193
814
0
12 May 2022
FLAVA: A Foundational Language And Vision Alignment Model
FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh
Ronghang Hu
Vedanuj Goswami
Guillaume Couairon
Wojciech Galuba
Marcus Rohrbach
Douwe Kiela
CLIP
VLM
88
706
0
08 Dec 2021
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov
Anurag Arnab
K. Choromanski
Mario Lucic
Yi Tay
Adrian Weller
Mostafa Dehghani
ViT
55
75
0
25 Nov 2021
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle
Sebastian Borgeaud
Jean-Baptiste Alayrac
Carl Doersch
Catalin Ionescu
...
Olivier J. Hénaff
M. Botvinick
Andrew Zisserman
Oriol Vinyals
João Carreira
MLLM
VLM
GNN
59
579
0
30 Jul 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
68
168
0
15 Jul 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw
  Video, Audio and Text
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Huayu Chen
Boqing Gong
ViT
305
588
0
22 Apr 2021
Pretrained Transformers as Universal Computation Engines
Pretrained Transformers as Universal Computation Engines
Kevin Lu
Aditya Grover
Pieter Abbeel
Igor Mordatch
50
221
0
09 Mar 2021
Perceiver: General Perception with Iterative Attention
Perceiver: General Perception with Iterative Attention
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
173
1,014
0
04 Mar 2021
Zero-Shot Text-to-Image Generation
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
387
4,937
0
24 Feb 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal
  Transformers
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
108
114
0
31 Jan 2021
Multimodal Sensor Fusion with Differentiable Filters
Multimodal Sensor Fusion with Differentiable Filters
Michelle A. Lee
Brent Yi
Roberto Martín-Martín
Silvio Savarese
Jeannette Bohg
46
59
0
25 Oct 2020
Explaining Black Box Predictions and Unveiling Data Artifacts through
  Influence Functions
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions
Xiaochuang Han
Byron C. Wallace
Yulia Tsvetkov
MILM
FAtt
AAML
TDI
58
171
0
14 May 2020
Temporal Fusion Transformers for Interpretable Multi-horizon Time Series
  Forecasting
Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting
Bryan Lim
Sercan O. Arik
Nicolas Loeff
Tomas Pfister
AI4TS
108
1,451
0
19 Dec 2019
Supervised Multimodal Bitransformers for Classifying Images and Text
Supervised Multimodal Bitransformers for Classifying Images and Text
Douwe Kiela
Suvrat Bhooshan
Hamed Firooz
Ethan Perez
Davide Testuggine
115
247
0
06 Sep 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
147
1,663
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from
  Transformers
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Hao Tan
Joey Tianyi Zhou
VLM
MLLM
231
2,477
0
20 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
132
1,950
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for
  Vision-and-Language Tasks
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
219
3,674
0
06 Aug 2019
Making Sense of Vision and Touch: Learning Multimodal Representations
  for Contact-Rich Tasks
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
Michelle A. Lee
Yuke Zhu
Peter Zachares
Matthew Tan
K. Srinivasan
Silvio Savarese
Fei-Fei Li
Animesh Garg
Jeannette Bohg
SSL
59
211
0
28 Jul 2019
Which Tasks Should Be Learned Together in Multi-task Learning?
Which Tasks Should Be Learned Together in Multi-task Learning?
Trevor Scott Standley
Amir Zamir
Dawn Chen
Leonidas Guibas
Jitendra Malik
Silvio Savarese
95
517
0
18 May 2019
UR-FUNNY: A Multimodal Language Dataset for Understanding Humor
UR-FUNNY: A Multimodal Language Dataset for Understanding Humor
M. Hasan
Wasifur Rahman
Amir Zadeh
Jianyuan Zhong
Md. Iftekhar Tanveer
Louis-Philippe Morency
Mohammed
Ehsan Hoque
59
185
0
14 Apr 2019
Adaptive Cross-Modal Few-Shot Learning
Adaptive Cross-Modal Few-Shot Learning
Chen Xing
Negar Rostamzadeh
Boris N. Oreshkin
Pedro H. O. Pinheiro
132
272
0
19 Feb 2019
Characterizing and Avoiding Negative Transfer
Characterizing and Avoiding Negative Transfer
Zirui Wang
Zihang Dai
Barnabás Póczós
J. Carbonell
85
415
0
24 Nov 2018
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal
  Representations for Contact-Rich Tasks
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
Michelle A. Lee
Yuke Zhu
K. Srinivasan
Parth Shah
Silvio Savarese
Li Fei-Fei
Animesh Garg
Jeannette Bohg
SSL
71
369
0
24 Oct 2018
CentralNet: a Multilayer Approach for Multimodal Fusion
CentralNet: a Multilayer Approach for Multimodal Fusion
Valentin Vielzeuf
Alexis Lechervy
S. Pateux
F. Jurie
63
169
0
22 Aug 2018
Taskonomy: Disentangling Task Transfer Learning
Taskonomy: Disentangling Task Transfer Learning
Amir Zamir
Alexander Sax
Bokui (William) Shen
Leonidas Guibas
Jitendra Malik
Silvio Savarese
118
1,217
0
23 Apr 2018
A Survey on Multi-Task Learning
A Survey on Multi-Task Learning
Yu Zhang
Qiang Yang
AIMat
598
2,225
0
25 Jul 2017
Tensor Fusion Network for Multimodal Sentiment Analysis
Tensor Fusion Network for Multimodal Sentiment Analysis
Amir Zadeh
Minghai Chen
Soujanya Poria
Min Zhang
Louis-Philippe Morency
73
1,233
0
23 Jul 2017
One Model To Learn Them All
One Model To Learn Them All
Lukasz Kaiser
Aidan Gomez
Noam M. Shazeer
Ashish Vaswani
Niki Parmar
Llion Jones
Jakob Uszkoreit
VLM
ViT
77
334
0
16 Jun 2017
An Overview of Multi-Task Learning in Deep Neural Networks
An Overview of Multi-Task Learning in Deep Neural Networks
Sebastian Ruder
CVBM
144
2,826
0
15 Jun 2017
Multimodal Machine Learning: A Survey and Taxonomy
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
77
2,928
0
26 May 2017
Zero-Shot Learning Through Cross-Modal Transfer
Zero-Shot Learning Through Cross-Modal Transfer
R. Socher
M. Ganjoo
Hamsa Sridhar
Osbert Bastani
Christopher D. Manning
A. Ng
BDL
VLM
119
1,468
0
16 Jan 2013
Frustratingly Easy Domain Adaptation
Frustratingly Easy Domain Adaptation
Hal Daumé
117
1,799
0
10 Jul 2009
1