ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2404.06309
  4. Cited By
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large
  Multi-Modal Models

Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models

9 April 2024
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
    VLM
    CLIP
ArXivPDFHTML

Papers citing "Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models"

8 / 8 papers shown
Title
Extremely Simple Out-of-distribution Detection for Audio-visual Generalized Zero-shot Learning
Extremely Simple Out-of-distribution Detection for Audio-visual Generalized Zero-shot Learning
Yang Liu
X. Zhang
Jiale Du
Xinbo Gao
Jungong Han
OODD
49
0
0
28 Mar 2025
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound
  Classification and Detection
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
118
264
0
02 Feb 2022
V-SlowFast Network for Efficient Visual Sound Separation
V-SlowFast Network for Efficient Visual Sound Separation
Lingyu Zhu
Esa Rahtu
44
10
0
18 Sep 2021
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
Yanbei Chen
Yongqin Xian
A. Sophia Koepke
Ying Shan
Zeynep Akata
80
80
0
22 Apr 2021
Detection of Audio-Video Synchronization Errors Via Event Detection
Detection of Audio-Video Synchronization Errors Via Event Detection
Joshua Peter Ebenezer
Yongjun Wu
Hai Wei
S. Sethuraman
Z. Liu
31
12
0
20 Apr 2021
End-to-end Audio-visual Speech Recognition with Conformers
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
81
224
0
12 Feb 2021
Audiovisual SlowFast Networks for Video Recognition
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
194
206
0
23 Jan 2020
Efficient Estimation of Word Representations in Vector Space
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
239
31,257
0
16 Jan 2013
1