ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2104.02687
  4. Cited By
Strumming to the Beat: Audio-Conditioned Contrastive Video Textures

Strumming to the Beat: Audio-Conditioned Contrastive Video Textures

6 April 2021
Medhini Narasimhan
Shiry Ginosar
Andrew Owens
Alexei A. Efros
Trevor Darrell
    DiffM
ArXivPDFHTML

Papers citing "Strumming to the Beat: Audio-Conditioned Contrastive Video Textures"

14 / 14 papers shown
Title
Seeing Soundscapes: Audio-Visual Generation and Separation from Soundscapes Using Audio-Visual Separator
Seeing Soundscapes: Audio-Visual Generation and Separation from Soundscapes Using Audio-Visual Separator
Minjae Kang
Martim Brandão
64
0
0
25 Apr 2025
Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal
  Latent Alignment
Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal Latent Alignment
Kim Sung-Bin
Arda Senocak
Hyunwoo Ha
Tae-Hyun Oh
DiffM
83
0
0
09 Dec 2024
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large
  Multi-Modal Models
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
33
2
0
09 Apr 2024
Generative Image Dynamics
Generative Image Dynamics
Zhengqi Li
Richard Tucker
Noah Snavely
Aleksander Holynski
DiffM
46
63
0
14 Sep 2023
Text-to-feature diffusion for audio-visual few-shot learning
Text-to-feature diffusion for audio-visual few-shot learning
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
VLM
27
2
0
07 Sep 2023
Learning to Taste: A Multimodal Wine Dataset
Learning to Taste: A Multimodal Wine Dataset
Thoranna Bender
Simon Moe Sorensen
A. Kashani
K. E. Hjorleifsson
Grethe Hyldig
Søren Hauberg
Serge Belongie
Frederik Warburg
CoGe
32
2
0
31 Aug 2023
Objects do not disappear: Video object detection by single-frame object
  location anticipation
Objects do not disappear: Video object detection by single-frame object location anticipation
X. Liu
F. Karimi Nejadasl
Jan van Gemert
Olaf Booij
S. Pintea
32
5
0
09 Aug 2023
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
Kim Sung-Bin
Arda Senocak
H. Ha
Andrew Owens
Tae-Hyun Oh
DiffM
VGen
46
37
0
30 Mar 2023
3D Video Loops from Asynchronous Input
3D Video Loops from Asynchronous Input
Li Ma
Xiaoyu Li
Jing Liao
Pedro Sander
44
5
0
09 Mar 2023
Temporal and cross-modal attention for audio-visual zero-shot learning
Temporal and cross-modal attention for audio-visual zero-shot learning
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
38
25
0
20 Jul 2022
Audio-visual Generalised Zero-shot Learning with Cross-modal Attention
  and Language
Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language
Otniel-Bogdan Mercea
Lukas Riesch
A. Sophia Koepke
Zeynep Akata
33
48
0
07 Mar 2022
Taming Visually Guided Sound Generation
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
32
122
0
17 Oct 2021
Improved Baselines with Momentum Contrastive Learning
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
281
3,378
0
09 Mar 2020
A Style-Based Generator Architecture for Generative Adversarial Networks
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
306
10,378
0
12 Dec 2018
1