MixGen: A New Multi-Modal Data Augmentation

16 June 2022
Xiaoshuai Hao
Yi Zhu
Srikar Appalaraju
Aston Zhang
Wanqian Zhang
Boyang Li
Mu Li
    VLM

Papers citing "MixGen: A New Multi-Modal Data Augmentation"

19 / 19 papers shown
Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning
Yuting Li
Lai Wei
Kaipeng Zheng
Jingyuan Huang
Linghe Kong
Lichao Sun
Weiran Huang
AAML LRM VLM
84
0
0
11 Jun 2025
Video-CoT: A Comprehensive Dataset for Spatiotemporal Understanding of Videos Based on Chain-of-Thought
Shuyi Zhang
Xiaoshuai Hao
Yingbo Tang
Lingfeng Zhang
Pengwei Wang
Zhongyuan Wang
Hongxuan Ma
Shanghang Zhang
VGen AI4TS
61
0
0
10 Jun 2025
Uneven Event Modeling for Partially Relevant Video Retrieval
Sa Zhu
Huashan Chen
Wanqian Zhang
Jinchao Zhang
Zexian Yang
Xiaoshuai Hao
Bo Li
48
1
0
01 Jun 2025
SynRES: Towards Referring Expression Segmentation in the Wild via Synthetic Data
Dong-Hee Kim
Hyunjee Song
Donghyun Kim
292
0
0
23 May 2025
MIDAS: Mixing Ambiguous Data with Soft Labels for Dynamic Facial Expression Recognition
Ryosuke Kawamura
Hideaki Hayashi
Noriko Takemura
Hajime Nagahara
CVBM 3DH
102
4
0
28 Feb 2025
RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete
Yuheng Ji
Huajie Tan
Jiayu Shi
Xiaoshuai Hao
Yuan Zhang
...
Huaihai Lyu
Xiaolong Zheng
Jiaming Liu
Zhongyuan Wang
Shanghang Zhang
189
15
0
28 Feb 2025
Contrastive Visual Data Augmentation
Yu Zhou
B. Li
Mohan Tang
Xiaomeng Jin
Te-Lin Wu
Kuan-Hao Huang
Heng Ji
Kai-Wei Chang
Nanyun Peng
117
0
0
24 Feb 2025
Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification
Raja Kumar
Raghav Singhal
Pranamya Kulkarni
Deval Mehta
Kshitij S. Jadhav
83
0
0
26 Sep 2024
FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning
Jinhui Pang
Changqing Lin
Xiaoshuai Hao
Rong Yin
Zixuan Wang
Zhihui Zhang
Jinglin He
Huang Tai Sheng
83
4
0
28 Jul 2024
BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation
Peng Hao
Xiaobing Wang
Yingying Jiang
Hanchao Jia
Xiaoshuai Hao
Shaowei Cui
Junhang Wei
Xiaoshuai Hao
152
3
0
26 Jul 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao
Kunyu Shi
Pengkai Zhu
Edouard Belval
Oren Nuriel
Srikar Appalaraju
Shabnam Ghadar
Vijay Mahadevan
Zhuowen Tu
Stefano Soatto
VLM CLIP
168
12
0
05 Mar 2024
3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek
Leonid Karlinsky
Raja Giryes
CoGe VLM
298
3
0
28 Dec 2023
Team AcieLee: Technical Report for EPIC-SOUNDS Audio-Based Interaction Recognition Challenge 2023
Yuqi Li
Yi-Jhen Luo
Xiaoshuai Hao
Chuanguang Yang
Zhulin An
Dantong Song
Wei Yi
76
0
0
15 Jun 2023
Learning Multimodal Data Augmentation in Feature Space
Zichang Liu
Zhiqiang Tang
Xingjian Shi
Aston Zhang
Mu Li
Anshumali Shrivastava
A. Wilson
98
23
0
29 Dec 2022
Teaching Structured Vision&Language Concepts to Vision&Language Models
Sivan Doveh
Assaf Arbelle
Sivan Harary
Yikang Shen
Roei Herzig
...
Donghyun Kim
Raja Giryes
Rogerio Feris
S. Ullman
Leonid Karlinsky
VLM CoGe
126
72
0
21 Nov 2022
Unifying Vision-Language Representation Space with Single-tower Transformer
Jiho Jang
Chaerin Kong
D. Jeon
Seonhoon Kim
Nojun Kwak
113
21
0
21 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
60
22
0
15 Nov 2022
Beyond Instance Discrimination: Relation-aware Contrastive Self-supervised Learning
Yifei Zhang
Chang-rui Liu
Yu Zhou
Weiping Wang
QiXiang Ye
Xiangyang Ji
SSL ISeg BDL
88
7
0
02 Nov 2022
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment
Mustafa Shukor
Guillaume Couairon
Matthieu Cord
VLM CLIP
100
27
0
29 Aug 2022