ResearchTrend.AI

Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation

26 June 2022
Jinxiang Liu
Chen Ju
Weidi Xie
Ya Zhang

Papers citing "Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation"

14 / 14 papers shown
Improving Sound Source Localization with Joint Slot Attention on Image and Audio
Inho Kim
Youngkil Song
Jicheol Park
Won Hwa Kim
Suha Kwak
22
0
0
21 Apr 2025
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance
Haicheng Wang
Zhemeng Yu
Gabriele Spadaro
Chen Ju
Victor Quétu
Enzo Tartaglione
VLM
185
3
0
05 Jan 2025
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio
Xavier Juanola
Gloria Haro
Magdalena Fuentes
36
2
0
01 Oct 2024
Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
38
3
0
18 Jul 2024
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju
Haicheng Wang
Haozhe Cheng
Xu Chen
Zhonghua Zhai
Weilin Huang
Jinsong Lan
Shuai Xiao
Bo Zheng
VLM
56
5
0
16 Jul 2024
SAVE: Segment Audio-Visual Easy way using Segment Anything Model
Khanh-Binh Nguyen
Chae Jung Park
VLM
VOS
44
1
0
02 Jul 2024
Made to Order: Discovering monotonic temporal changes via self-supervised video ordering
Charig Yang
Weidi Xie
Andrew Zisserman
39
2
0
25 Apr 2024
Audio-Visual Segmentation via Unlabeled Frame Exploitation
Jinxiang Liu
Yikun Liu
Fei Zhang
Chen Ju
Ya Zhang
Yanfeng Wang
46
10
0
17 Mar 2024
Sound Source Localization is All about Cross-Modal Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
36
18
0
19 Sep 2023
AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation
Chaofan Ma
Yu-Hao Yang
Chen Ju
Fei Zhang
Ya Zhang
Yanfeng Wang
VLM
48
17
0
31 Aug 2023
MarginNCE: Robust Sound Localization with a Negative Margin
Sooyoung Park
Arda Senocak
Joon Son Chung
SSL
27
13
0
03 Nov 2022
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
383
5,818
0
29 Apr 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
196
199
0
08 Jan 2021
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
197
207
0
23 Jan 2020