Audio-Adaptive Activity Recognition Across Video Domains
arXiv:2203.14240, v2 (latest)

27 March 2022
Yunhua Zhang
Hazel Doughty
Ling Shao
Cees G. M. Snoek

Papers citing "Audio-Adaptive Activity Recognition Across Video Domains"

46 / 46 papers shown
Audio-visual cross-modality knowledge transfer for machine learning-based in-situ monitoring in laser additive manufacturing
Jiarui Xie
Mutahar Safdar
Lequn Chen
Seung Ki Moon
Y. Zhao
90
1
0
09 Aug 2024
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
Asmar Nadeem
Faegheh Sardari
R. Dawes
Syed Sameed Husain
Adrian Hilton
Armin Mustafa
92
4
0
10 Jun 2024
SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization
David Pujol-Perich
Albert Clapés
Sergio Escalera
90
0
0
20 Dec 2023
Audio-Visual Instance Segmentation
Ruohao Guo
Yaru Chen
Yanyu Qi
Wenzhen Yue
Dantong Niu
...
Wenzhen Yue
Ji Shi
Qixun Wang
Peiliang Zhang
Buwen Liang
VLM VOS
77
2
0
28 Oct 2023
Team VI-I2R Technical Report on EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021
Yi Cheng
Fen Fang
Ying Sun
EgoV
37
5
0
03 Jun 2022
Learning Cross-modal Contrastive Features for Video Domain Adaptation
Donghyun Kim
Yi-Hsuan Tsai
Bingbing Zhuang
Xiang Yu
Stan Sclaroff
Kate Saenko
Manmohan Chandraker
75
72
0
26 Aug 2021
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
98
567
0
30 Jun 2021
EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021: Team M3EM Technical Report
Lijin Yang
Yifei Huang
Yusuke Sugano
Yoichi Sato
30
5
0
18 Jun 2021
Cross-Domain First Person Audio-Visual Action Recognition through Relative Norm Alignment
M. Planamente
Chiara Plizzari
Emanuele Alberti
Barbara Caputo
EgoV
111
12
0
03 Jun 2021
Home Action Genome: Cooperative Compositional Action Understanding
Nishant Rai
Haofeng Chen
Jingwei Ji
Rishi Desai
Kazuki Kozuka
Shun Ishizaka
Ehsan Adeli
Juan Carlos Niebles
43
76
0
11 May 2021
Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos
Yanghao Li
Tushar Nagarajan
Bo Xiong
Kristen Grauman
EgoV
89
91
0
16 Apr 2021
DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation
Alexandre Ramé
Matthieu Cord
FedML
66
52
0
14 Jan 2021
Deep Visual Domain Adaptation
G. Csurka
OOD
201
185
0
28 Dec 2020
ActBERT: Learning Global-Local Video-Text Representations
Linchao Zhu
Yi Yang
ViT
122
422
0
14 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
657
41,103
0
22 Oct 2020
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
533
610
0
21 Jul 2020
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng Tian
Dingzeyu Li
Chenliang Xu
97
184
0
21 Jul 2020
Labelling unlabelled videos from scratch with multi-modal self-supervision
Yuki M. Asano
Mandela Patrick
Christian Rupprecht
Andrea Vedaldi
SSL
68
152
0
24 Jun 2020
VGGSound: A Large-scale Audio-Visual Dataset
Honglie Chen
Weidi Xie
Andrea Vedaldi
Andrew Zisserman
89
577
0
29 Apr 2020
Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
Min-Hung Chen
Baopu Li
Sid Ying-Ze Bao
G. Al-Regib
Z. Kira
TTA
117
122
0
05 Mar 2020
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
Jonathan Munro
Dima Damen
EgoV
57
194
0
27 Jan 2020
Adversarial Cross-Domain Action Recognition with Co-Attention
Boxiao Pan
Zhangjie Cao
Ehsan Adeli
Juan Carlos Niebles
ViT
70
104
0
22 Dec 2019
Listen to Look: Action Recognition by Previewing Audio
Ruohan Gao
Tae-Hyun Oh
Kristen Grauman
Lorenzo Torresani
VLM
83
252
0
10 Dec 2019
A Comprehensive Survey on Transfer Learning
Fuzhen Zhuang
Zhiyuan Qi
Keyu Duan
Dongbo Xi
Yongchun Zhu
Hengshu Zhu
Hui Xiong
Qing He
183
4,449
0
07 Nov 2019
EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition
Evangelos Kazakos
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
62
337
0
22 Aug 2019
Temporal Attentive Alignment for Large-Scale Video Domain Adaptation
Min-Hung Chen
Z. Kira
G. Al-Regib
Jaekwon Yoo
Ruxin Chen
Jian Zheng
TTA AI4TS
63
179
0
30 Jul 2019
What Makes Training Multi-Modal Classification Networks Hard?
Weiyao Wang
Du Tran
Matt Feiszli
111
453
0
29 May 2019
SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition
Bruno Korbar
Du Tran
Lorenzo Torresani
66
224
0
08 Apr 2019
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun
Austin Myers
Carl Vondrick
Kevin Patrick Murphy
Cordelia Schmid
VLM SSL
79
1,246
0
03 Apr 2019
Class-Balanced Loss Based on Effective Number of Samples
Yin Cui
Menglin Jia
Tsung-Yi Lin
Yang Song
Serge J. Belongie
200
2,281
0
16 Jan 2019
Domain Adaptation for Structured Output via Discriminative Patch Representations
Yi-Hsuan Tsai
Kihyuk Sohn
S. Schulter
Manmohan Chandraker
OOD
81
320
0
16 Jan 2019
SlowFast Networks for Video Recognition
Christoph Feichtenhofer
Haoqi Fan
Jitendra Malik
Kaiming He
166
3,274
0
10 Dec 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg
1.8K
94,891
0
11 Oct 2018
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization
Bruno Korbar
Du Tran
Lorenzo Torresani
97
475
0
30 Jun 2018
Actor and Observer: Joint Modeling of First and Third-Person Videos
Gunnar Sigurdsson
Abhinav Gupta
Cordelia Schmid
Ali Farhadi
Alahari Karteek
EgoV
106
158
0
25 Apr 2018
Scaling Egocentric Vision: The EPIC-KITCHENS Dataset
Dima Damen
Hazel Doughty
G. Farinella
Sanja Fidler
Antonino Furnari
...
Davide Moltisanti
Jonathan Munro
Toby Perrett
Will Price
Michael Wray
EgoV
123
1,030
0
08 Apr 2018
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
99
435
0
23 Mar 2018
Domain Adaptive Faster R-CNN for Object Detection in the Wild
Yuhua Chen
Wen Li
Christos Sakaridis
Dengxin Dai
Luc Van Gool
OOD ObjD
107
1,301
0
08 Mar 2018
Deep Visual Domain Adaptation: A Survey
Mei Wang
Weihong Deng
OOD
73
2,013
0
10 Feb 2018
MINE: Mutual Information Neural Estimation
Mohamed Ishmael Belghazi
A. Baratin
Sai Rajeswar
Sherjil Ozair
Yoshua Bengio
Aaron Courville
R. Devon Hjelm
DRL
194
1,279
0
12 Jan 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
701
131,652
0
12 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
232
8,019
0
22 May 2017
Adversarial Discriminative Domain Adaptation
Eric Tzeng
Judy Hoffman
Kate Saenko
Trevor Darrell
GAN OOD
262
4,667
0
17 Feb 2017
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,020
0
10 Dec 2015
Unsupervised Domain Adaptation by Backpropagation
Yaroslav Ganin
Victor Lempitsky
OOD
233
6,030
0
26 Sep 2014
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
K. Soomro
Amir Zamir
M. Shah
CLIP VGen
152
6,148
0
03 Dec 2012