Multimodal Representation Learning by Alternating Unimodal Adaptation

17 November 2023
Xiaohui Zhang
Jaehong Yoon
Mohit Bansal
Huaxiu Yao
arXiv: 2311.10707

Papers citing "Multimodal Representation Learning by Alternating Unimodal Adaptation"

27 / 27 papers shown
Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
Yiyang Zhou
Chenhang Cui
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
VLM, MLLM
88
120
0
18 Feb 2024
Multimodal Clinical Trial Outcome Prediction with Large Language Models
Wenhao Zheng
Dongsheng Peng
Hongxia Xu
Yun Li
Hongtu Zhu
Tianfan Fu
Huaxiu Yao
181
5
0
09 Feb 2024
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
Shoubin Yu
Jaehong Yoon
Mohit Bansal
122
6
0
08 Feb 2024
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences
Xiyao Wang
Yuhang Zhou
Xiaoyu Liu
Hongjin Lu
Yuancheng Xu
...
Taixi Lu
Gedas Bertasius
Mohit Bansal
Huaxiu Yao
Furong Huang
LRM, VLM
133
77
0
19 Jan 2024
Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges
Chenhang Cui
Yiyang Zhou
Xinyu Yang
Shirley Wu
Linjun Zhang
James Zou
Huaxiu Yao
MLLM
66
91
0
06 Nov 2023
Read, Look or Listen? What's Needed for Solving a Multimodal Dataset
Netta Madvil
Yonatan Bitton
Roy Schwartz
55
3
0
06 Jul 2023
Provable Dynamic Fusion for Low-Quality Multimodal Data
Qingyang Zhang
Haitao Wu
Changqing Zhang
Qinghua Hu
Huazhu Fu
Qiufeng Wang
Xi Peng
91
61
0
03 Jun 2023
Self-Chained Image-Language Model for Video Localization and Question Answering
Shoubin Yu
Jaemin Cho
Prateek Yadav
Mohit Bansal
130
139
0
11 May 2023
An Empirical Study of Multimodal Model Merging
Yi-Lin Sung
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Mohit Bansal
Lijuan Wang
MoMe
98
41
0
28 Apr 2023
M$^3$Care: Learning with Missing Modalities in Multimodal Healthcare Data
Chaohe Zhang
Xu Chu
Liantao Ma
Yinghao Zhu
Yasha Wang
Jiangtao Wang
Junfeng Zhao
54
87
0
28 Oct 2022
Contrastive Audio-Visual Masked Autoencoder
Yuan Gong
Andrew Rouditchenko
Alexander H. Liu
David Harwath
Leonid Karlinsky
Hilde Kuehne
James R. Glass
75
128
0
02 Oct 2022
Multimodal Masked Autoencoders Learn Transferable Representations
Xinyang Geng
Hao Liu
Lisa Lee
Dale Schuurmans
Sergey Levine
Pieter Abbeel
75
118
0
27 May 2022
Are Multimodal Transformers Robust to Missing Modality?
Mengmeng Ma
Jian Ren
Long Zhao
Davide Testuggine
Xi Peng
ViT
98
154
0
12 Apr 2022
Balanced Multimodal Learning via On-the-fly Gradient Modulation
Xiaokang Peng
Yake Wei
Andong Deng
Dong Wang
Di Hu
72
213
0
29 Mar 2022
Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
Yu Huang
Junyang Lin
Chang Zhou
Hongxia Yang
Longbo Huang
60
96
0
23 Mar 2022
GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation
Zheng Lian
Lang Chen
Guoying Zhao
B. Liu
J. Tao
92
101
0
04 Mar 2022
Modality-aware Mutual Learning for Multi-modal Medical Image Segmentation
Yao Zhang
Jiawei Yang
Jiang Tian
Zhongchao Shi
Cheng Zhong
Yang Zhang
Zhiqiang He
71
95
0
21 Jul 2021
SMIL: Multimodal Learning with Severely Missing Modality
Mengmeng Ma
Jian Ren
Long Zhao
Sergey Tulyakov
Cathy H. Wu
Xi Peng
98
263
0
09 Mar 2021
On Modality Bias in the TVQA Dataset
T. Winterbottom
S. Xiao
A. McLean
Noura Al Moubayed
63
35
0
18 Dec 2020
Deep Partial Multi-View Learning
Changqing Zhang
Yajie Cui
Zongbo Han
Qiufeng Wang
Huazhu Fu
Q. Hu
87
229
0
12 Nov 2020
Improving Multimodal Accuracy Through Modality Pre-training and Attention
Aya Abdelsalam Ismail
Mahmudul Hasan
F. Ishtiaq
57
17
0
11 Nov 2020
Self-Supervised Learning by Cross-Modal Audio-Video Clustering
Humam Alwassel
D. Mahajan
Bruno Korbar
Lorenzo Torresani
Bernard Ghanem
Du Tran
SSL
97
431
0
28 Nov 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Hao Tan
Mohit Bansal
VLM, MLLM
247
2,488
0
20 Aug 2019
Continual Learning of Context-dependent Processing in Neural Networks
Guanxiong Zeng
Yang Chen
Bo Cui
Shan Yu
CLL
82
310
0
29 Sep 2018
Efficient Large-Scale Multi-Modal Classification
D. Kiela
Edouard Grave
Armand Joulin
Tomas Mikolov
84
148
0
06 Feb 2018
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
Harm de Vries
Vincent Dumoulin
Aaron Courville
FAtt, AIMat, OffRL, AI4CE
356
2,230
0
22 Sep 2017
Look, Listen and Learn
Relja Arandjelović
Andrew Zisserman
SSL
125
906
0
23 May 2017