Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents

18 August 2020
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
arXiv: 2008.07935

Papers citing "Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents"

3 / 3 papers shown
CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering
Yuanyuan Jiang
Jianqin Yin
45
1
0
13 May 2024
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
29
136
0
26 Mar 2022
A Metamodel and Framework for Artificial General Intelligence From Theory to Practice
Hugo Latapie
Özkan Kiliç
Gaowen Liu
Yan Yan
Ramana Rao Kompella
Pei Wang
K. Thórisson
Adam Lawrence
Yuhong Sun
Jayanth Srinivasa
AI4CE
20
9
0
11 Feb 2021