ResearchTrend.AI
Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation

26 June 2022
Jinxiang Liu
Chen Ju
Weidi Xie
Ya Zhang

Papers citing "Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation"

29 / 29 papers shown
Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization
Sooyoung Park
Arda Senocak
Joon Son Chung
VLM
88
0
0
08 May 2025
Improving Sound Source Localization with Joint Slot Attention on Image and Audio
Inho Kim
Youngkil Song
Jicheol Park
Won Hwa Kim
Suha Kwak
194
0
0
21 Apr 2025
Squeeze Out Tokens from Sample for Finer-Grained Data Governance
Weixiong Lin
Chen Ju
Haicheng Wang
Shengchao Hu
Shuai Xiao
...
Yuheng Jiao
Mingshuai Yao
Jinsong Lan
Qingwen Liu
Ying Chen
84
0
0
18 Mar 2025
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance
Haicheng Wang
Zhemeng Yu
Gabriele Spadaro
Chen Ju
Victor Quétu
Enzo Tartaglione
VLM
441
6
0
05 Jan 2025
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
Haicheng Wang
Chen Ju
Weixiong Lin
Shuai Xiao
Mengting Chen
...
Mingshuai Yao
Jinsong Lan
Ying Chen
Qingwen Liu
Yanfeng Wang
VLM CLIP
123
4
0
30 Nov 2024
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio
Xavier Juanola
Gloria Haro
Magdalena Fuentes
88
2
0
01 Oct 2024
Enhancing Sound Source Localization via False Negative Elimination
Zengjie Song
Jiangshe Zhang
Yuxi Wang
Junsong Fan
Zhaoxiang Zhang
92
0
0
29 Aug 2024
Unveiling Visual Biases in Audio-Visual Localization Benchmarks
Liangyu Chen
Zihao Yue
Boshen Xu
Qin Jin
SSL
99
0
0
25 Aug 2024
Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
111
4
0
18 Jul 2024
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju
Haicheng Wang
Haozhe Cheng
Xu Chen
Zhonghua Zhai
Weilin Huang
Jinsong Lan
Shuai Xiao
Bo Zheng
VLM
100
6
0
16 Jul 2024
SAVE: Segment Audio-Visual Easy way using Segment Anything Model
Khanh-Binh Nguyen
Chae Jung Park
VLM VOS
116
1
0
02 Jul 2024
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
Sanjoy Chowdhury
Sayan Nag
Subhrajyoti Dasgupta
Jun Chen
Mohamed Elhoseiny
Ruohan Gao
Dinesh Manocha
VLM MLLM
98
15
0
01 Jul 2024
Made to Order: Discovering monotonic temporal changes via self-supervised video ordering
Charig Yang
Weidi Xie
Andrew Zisserman
82
2
0
25 Apr 2024
DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition
Haozhe Cheng
Chen Ju
Haicheng Wang
Jinxiang Liu
Mengting Chen
Qiang Hu
Xiaoyun Zhang
Yanfeng Wang
DiffM VLM
84
6
0
23 Apr 2024
Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge
Dongjin Kim
Sung-Jin Um
Sangmin Lee
Jung Uk Kim
72
6
0
26 Mar 2024
Audio-Visual Segmentation via Unlabeled Frame Exploitation
Jinxiang Liu
Yikun Liu
Fei Zhang
Chen Ju
Ya Zhang
Yanfeng Wang
98
13
0
17 Mar 2024
Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization
Yuxin Guo
Shijie Ma
Hu Su
Zhiqing Wang
Yuhao Zhao
Wei Zou
Siyang Sun
Yun Zheng
SSL
82
12
0
05 Mar 2024
Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization
Yuxin Guo
Shijie Ma
Yuhao Zhao
Hu Su
Wei Zou
76
4
0
05 Mar 2024
Segment Beyond View: Handling Partially Missing Modality for Audio-Visual Semantic Segmentation
Renjie Wu
Hu Wang
Feras Dayoub
Hsiang-Ting Chen
68
5
0
14 Dec 2023
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Models
Chen Ju
Haicheng Wang
Zeqian Li
Xu Chen
Zhonghua Zhai
Weilin Huang
Shuai Xiao
VLM
125
8
0
12 Dec 2023
Can CLIP Help Sound Source Localization?
Sooyoung Park
Arda Senocak
Joon Son Chung
85
9
0
07 Nov 2023
Sound Source Localization is All about Cross-Modal Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
88
19
0
19 Sep 2023
AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation
Chaofan Ma
Yu-Hao Yang
Chen Ju
Fei Zhang
Ya Zhang
Yanfeng Wang
VLM
127
19
0
31 Aug 2023
Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics
Chen Liu
Peike Li
Xingqun Qi
Hu Zhang
Lincheng Li
Dadong Wang
Xin Yu
VOS
91
34
0
31 Jul 2023
Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation
Jinxiang Liu
Chen Ju
Chaofan Ma
Yanfeng Wang
Yu Wang
Ya Zhang
VOS
129
24
0
25 Jul 2023
Annotation-free Audio-Visual Segmentation
Jinxiang Liu
Yu Wang
Chen Ju
Chaofan Ma
Ya Zhang
Weidi Xie
VOS VLM
111
30
0
18 May 2023
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning
Weixuan Sun
Jiayi Zhang
Jianyuan Wang
Zheyuan Liu
Yiran Zhong
Tianpeng Feng
Yandong Guo
Yanhao Zhang
Nick Barnes
SSL
79
48
0
20 Mar 2023
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
Chen Ju
Kunhao Zheng
Jinxiang Liu
Peisen Zhao
Ya Zhang
Jianlong Chang
Yanfeng Wang
Qi Tian
68
11
0
19 Dec 2022
MarginNCE: Robust Sound Localization with a Negative Margin
Sooyoung Park
Arda Senocak
Joon Son Chung
SSL
73
14
0
03 Nov 2022