ResearchTrend.AI
OpenViVQA: Task, Dataset, and Multimodal Fusion Models for Visual Question Answering in Vietnamese

7 May 2023
Nghia Hieu Nguyen
Duong T.D. Vo
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen

Papers citing "OpenViVQA: Task, Dataset, and Multimodal Fusion Models for Visual Question Answering in Vietnamese"

14 / 14 papers shown
Enhancing Vietnamese VQA through Curriculum Learning on Raw and Augmented Text Representations
Khoi Anh Nguyen
Linh Yen Vu
Thang Dinh Duong
Thuan Nguyen Duong
Huy Thanh Nguyen
V. Q. Dinh
91
3
0
05 Mar 2025
LiGT: Layout-infused Generative Transformer for Visual Question Answering on Vietnamese Receipts
Thanh-Phong Le
Trung Le Chi Phan
Nghia Hieu Nguyen
Kiet Van Nguyen
ViT
89
1
0
26 Feb 2025
ViConsFormer: Constituting Meaningful Phrases of Scene Texts using Transformer-based Method in Vietnamese Text-based Visual Question Answering
Nghia Hieu Nguyen
Tho Thanh Quan
Ngan Luu-Thuy Nguyen
75
0
0
18 Oct 2024
Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese
Khang T. Doan
Bao G. Huynh
D. T. Hoang
Thuc D. Pham
Nhat H. Pham
Quan T.M. Nguyen
Bang Q. Vo
Suong N. Hoang
MLLM
72
6
0
22 Aug 2024
Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion
Peiyuan Chen
Zecheng Zhang
Yiping Dong
Li Zhou
Han Wang
70
12
0
14 Aug 2024
Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery
Long Bai
Guankun Wang
Mobarakol Islam
Lalithkumar Seenivasan
An-Chi Wang
Hongliang Ren
107
17
0
09 Aug 2024
Advancing Vietnamese Visual Question Answering with Transformer and Convolutional Integration
Ngoc Son Nguyen
Van Nguyen
Tung Le
ViT
91
1
0
30 Jul 2024
New Benchmark Dataset and Fine-Grained Cross-Modal Fusion Framework for Vietnamese Multimodal Aspect-Category Sentiment Analysis
Quy Hoang Nguyen
Minh-Van Truong Nguyen
Kiet Van Nguyen
88
2
0
01 May 2024
ViOCRVQA: Novel Benchmark Dataset and Vision Reader for Visual Question Answering by Understanding Vietnamese Text in Images
Huy Quang Pham
Thang Kien-Bao Nguyen
Quan Van Nguyen
Dan Quang Tran
Nghia Hieu Nguyen
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
97
4
0
29 Apr 2024
ViCLEVR: A Visual Reasoning Dataset and Hybrid Multimodal Fusion Model for Visual Question Answering in Vietnamese
Khiem Vinh Tran
Hao Phu Phan
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
54
7
0
27 Oct 2023
Generative Pre-trained Transformer for Vietnamese Community-based COVID-19 Question Answering
T. M. Vo
Khiem Vinh Tran
58
1
0
23 Oct 2023
Expanding Frozen Vision-Language Models without Retraining: Towards Improved Robot Perception
Riley Tavassoli
Mani Amani
Reza Akhavian
76
1
0
31 Aug 2023
PAT: Parallel Attention Transformer for Visual Question Answering in Vietnamese
Nghia Hieu Nguyen
Kiet Van Nguyen
47
2
0
17 Jul 2023
EVJVQA Challenge: Multilingual Visual Question Answering
Ngan Luu-Thuy Nguyen
Nghia Hieu Nguyen
Duong T.D. Vo
K. Tran
Kiet Van Nguyen
84
7
0
23 Feb 2023