ResearchTrend.AI
arXiv: 2401.09454
Voila-A: Aligning Vision-Language Models with User's Gaze Attention

22 December 2023
Kun Yan
Lei Ji
Zeyu Wang
Yuntao Wang
Nan Duan
Shuai Ma

Papers citing "Voila-A: Aligning Vision-Language Models with User's Gaze Attention"

5 / 5 papers shown
DWARF: Disease-weighted network for attention map refinement
Haozhe Luo
Aurélie Pahud de Mortanges
Oana Inel
Abraham Bernstein
Mauricio Reyes
MedIm
31
3
0
24 Jun 2024
G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios
Zeyu Wang
Yuanchun Shi
Yuntao Wang
Yuchen Yao
Kun Yan
Yuhan Wang
Lei Ji
Xuhai Xu
Chun Yu
40
7
0
13 May 2024
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
208
900
0
27 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,244
0
30 Jan 2023
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Joey Tianyi Zhou
MLLM
256
525
0
04 Feb 2021