Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings

IEEE International Conference on Computer Vision (ICCV), 2019

9 August 2019

Dima Damen

Papers citing "Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings"

50 / 103 papers shown

UNIV: Unified Foundation Model for Infrared and Visible Modalities

160

19 Sep 2025

Repeating Words for Video-Language Retrieval with Coarse-to-Fine Objectives

197

20 Aug 2025

Zero-Shot Skeleton-Based Action Recognition With Prototype-Guided Feature Alignment

IEEE Transactions on Image Processing (IEEE TIP), 2025

303

01 Jul 2025

EVA02-AT: Egocentric Video-Language Understanding with Spatial-Temporal Rotary Positional Embeddings and Symmetric Optimization

Xiaoqi Wang

Yi Wang

Lap-Pui Chau

223

17 Jun 2025

Leveraging Auxiliary Information in Text-to-Video Retrieval: A Review

A. Fragomeni

Dima Damen

Michael Wray

268

29 May 2025

Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?

International Conference on Learning Representations (ICLR), 2024

210

21 Feb 2025

Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition

Computer Vision and Pattern Recognition (CVPR), 2024

405

18 Nov 2024

Bridging the Skeleton-Text Modality Gap: Diffusion-Powered Modality Alignment for Zero-shot Skeleton-based Action Recognition

Jeonghyeok Do

Munchurl Kim

696

16 Nov 2024

Beyond Coarse-Grained Matching in Video-Text Retrieval

Asian Conference on Computer Vision (ACCV), 2024

Aozhu Chen

Hazel Doughty

Xirong Li

Cees G. M. Snoek

323

16 Oct 2024

Zero-Shot Skeleton-based Action Recognition with Dual Visual-Text Alignment

Pattern Recognition (Pattern Recogn.), 2024

420

22 Sep 2024

SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval

Min Wang

259

23 Jul 2024

SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders

435

18 Jul 2024

Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood

281

28 Jun 2024

Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition

Computer Vision and Pattern Recognition (CVPR), 2024

274

19 Jun 2024

Symmetric Multi-Similarity Loss for EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2024

Xiaoqi Wang

Yi Wang

Lap-Pui Chau

270

18 Jun 2024

EchoGuide: Active Acoustic Guidance for LLM-Based Eating Event Analysis from Egocentric Videos

François Guimbretière

Cheng Zhang

297

15 Jun 2024

An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition

Haojun Xu

Yanlei Gao

Jie Li

Xinbo Gao

320

02 Jun 2024

SHE-Net: Syntax-Hierarchy-Enhanced Text-Video Retrieval

Ming Yang

434

22 Apr 2024

Fine-Grained Side Information Guided Dual-Prompts for Zero-Shot Skeleton Action Recognition

412

11 Apr 2024

A SOUND APPROACH: Using Large Language Models to generate audio descriptions for egocentric text-audio retrieval

Andreea-Maria Oncescu

João F. Henriques

Andrew Zisserman

Samuel Albanie

A. Sophia Koepke

226

29 Feb 2024

Video Editing for Video Retrieval

Dima Damen

247

04 Feb 2024

Training a Large Video Model on a Single Machine in a Day

Yue Zhao

Philipp Krahenbuhl

VLM

309

28 Sep 2023

Video-adverb retrieval with compositional adverb-action embeddings

British Machine Vision Conference (BMVC), 2023

Thomas Hummel

Otniel-Bogdan Mercea

A. Sophia Koepke

Zeynep Akata

230

26 Sep 2023

Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning

ACM Multimedia (ACM MM), 2023

...

449

20 Sep 2023

Multi-Semantic Fusion Model for Generalized Zero-Shot Skeleton-Based Action Recognition

International Conference on Image and Graphics (ICIG), 2023

Liang Wang

256

18 Sep 2023

Zero-shot Skeleton-based Action Recognition via Mutual Information Estimation and Maximization

ACM Multimedia (ACM MM), 2023

240

07 Aug 2023

Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model

IEEE Transactions on Image Processing (IEEE TIP), 2023

Peng Wu

Jing Liu

Xiangteng He

Yuxin Peng

Peng Wang

Yanning Zhang

472

24 Jul 2023

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone

IEEE International Conference on Computer Vision (ICCV), 2023

427

149

11 Jul 2023

UniUD Submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2023

Alex Falcon

Giuseppe Serra

233

27 Jun 2023

Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training

IEEE Transactions on Image Processing (IEEE TIP), 2023

Liang Wang

278

15 Jun 2023

An Overview of Challenges in Egocentric Text-Video Retrieval

371

07 Jun 2023

Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Hao Li

323

20 May 2023

Verbs in Action: Improving verb understanding in video-language models

IEEE International Conference on Computer Vision (ICCV), 2023

547

13 Apr 2023

Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval

Jae Myung Kim

A. Sophia Koepke

Cordelia Schmid

Zeynep Akata

288

06 Apr 2023

Learning Action Changes by Measuring Verb-Adverb Textual Relationships

Computer Vision and Pattern Recognition (CVPR), 2023

353

27 Mar 2023

Improving Video Retrieval by Adaptive Margin

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021

340

09 Mar 2023

Deep Learning for Video-Text Retrieval: a Review

International Journal of Multimedia Information Retrieval (IJMIR), 2023

254

24 Feb 2023

Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Yi Yang

Fei Wu

267

22 Jan 2023

Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?

Findings (Findings), 2023

Ding Zhao

264

21 Jan 2023

HierVL: Learning Hierarchical Video-Language Embeddings

Computer Vision and Pattern Recognition (CVPR), 2023

574

05 Jan 2023

Learning Video Representations from Large Language Models

Computer Vision and Pattern Recognition (CVPR), 2022

442

246

08 Dec 2022

Normalized Contrastive Learning for Text-Video Retrieval

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

205

30 Nov 2022

Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment

Ding Zhao

150

10 Oct 2022

ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval

Asian Conference on Computer Vision (ACCV), 2022

A. Fragomeni

Michael Wray

Dima Damen

CLIP ViT

177

09 Oct 2022

A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval

ACM Multimedia (ACM MM), 2022

258

03 Aug 2022

Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022

238

29 Jun 2022

RoME: Role-aware Mixture-of-Expert Transformer for Text-to-Video Retrieval

202

26 Jun 2022

UniUD-FBK-UB-UniBZ Submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2022

274

22 Jun 2022

Self-Supervised Learning for Videos: A Survey

ACM Computing Surveys (ACM CSUR), 2022

Madeline Chantry Schiappa

Yogesh S Rawat

M. Shah

SSL

603

178

18 Jun 2022

Egocentric Video-Language Pretraining

Neural Information Processing Systems (NeurIPS), 2022

Rui Yan

...

Hongfa Wang

Dima Damen

Guohao Li

Wei Liu

Mike Zheng Shou

VLM EgoV

317

267

03 Jun 2022