ResearchTrend.AI

DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection

29 August 2023
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
    VGen DiffM

Papers citing "DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection"

31 / 31 papers shown
Title
Weakly-Supervised Temporal Action Localization with Bidirectional Semantic Consistency Constraint
Guozhang Li
De Cheng
Xinpeng Ding
N. Wang
Jie Li
Xinbo Gao
70
7
0
25 Apr 2023
Exploring Diffusion Models for Unsupervised Video Anomaly Detection
Anil Osman Tur
Nicola Dall’Asen
Cigdem Beyan
Elisa Ricci
DiffM VGen
72
37
0
12 Apr 2023
DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Sauradip Nag
Xiatian Zhu
Jiankang Deng
Yi-Zhe Song
Tao Xiang
DiffM VGen
96
24
0
27 Mar 2023
DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization
Nisha Huang
Yuxin Zhang
Fan Tang
Chongyang Ma
Haibin Huang
Yong Zhang
Weiming Dong
Changsheng Xu
DiffM
53
43
0
19 Nov 2022
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
117
465
0
17 Nov 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
87
200
0
13 Jul 2022
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol
Prafulla Dhariwal
Aditya A. Ramesh
Pranav Shyam
Pamela Mishkin
Bob McGrew
Ilya Sutskever
Mark Chen
364
3,605
0
20 Dec 2021
Diffusion Models for Implicit Image Segmentation Ensembles
J. Wolleb
Robin Sandkühler
Florentin Bieder
Philippe Valmaggia
Philippe C. Cattin
DiffM MedIm VLM
194
271
0
06 Dec 2021
Label-Efficient Semantic Segmentation with Diffusion Models
Dmitry Baranchuk
Ivan Rubachev
A. Voynov
Valentin Khrulkov
Artem Babenko
DiffM VLM
277
536
0
06 Dec 2021
Structured Denoising Diffusion Models in Discrete State-Spaces
Jacob Austin
Daniel D. Johnson
Jonathan Ho
Daniel Tarlow
Rianne van den Berg
DiffM
176
945
0
07 Jul 2021
Interventional Video Grounding with Dual Contrastive Learning
Guoshun Nan
Rui Qiao
Yao Xiao
Jun Liu
Sicong Leng
H. Zhang
Wei Lu
83
145
0
21 Jun 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
104
537
0
13 May 2021
Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language
Songyang Zhang
Houwen Peng
Jianlong Fu
Yijuan Lu
Jiebo Luo
71
53
0
04 Dec 2020
Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
Pei Sun
Rufeng Zhang
Yi Jiang
Tao Kong
Chenfeng Xu
...
Masayoshi Tomizuka
Lei Li
Zehuan Yuan
Changhu Wang
Ping Luo
ObjD
93
1,097
0
25 Nov 2020
VLG-Net: Video-Language Graph Matching Network for Video Grounding
Mattia Soldan
Mengmeng Xu
Sisi Qu
Jesper N. Tegnér
Guohao Li
73
70
0
19 Nov 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM DiffM
286
7,454
0
06 Oct 2020
Learning Trailer Moments in Full-Length Movies
Lezi Wang
Dong Liu
R. Puri
Dimitris N. Metaxas
42
43
0
19 Aug 2020
Fine-grained Iterative Attention Network for Temporal Language Localization in Videos
Xiaoye Qu
Peng Tang
Zhikang Zhou
Yu Cheng
Jianfeng Dong
Pan Zhou
77
92
0
06 Aug 2020
MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection
Fa-Ting Hong
Xuanteng Huang
Weihong Li
Weishi Zheng
57
62
0
20 Jul 2020
Span-based Localizing Network for Natural Language Video Localization
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
87
315
0
29 Apr 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
197
286
0
24 Jan 2020
Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video
Jie Wu
Guanbin Li
Si Liu
Liang Lin
OffRL
64
104
0
18 Jan 2020
AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization
Xiaoyu Zhang
Changsheng Li
Haichao Shi
Xiaobin Zhu
Peng Li
Jing Dong
62
37
0
27 Nov 2019
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
Yitian Yuan
Lin Ma
Jingwen Wang
Wei Liu
Wenwu Zhu
82
244
0
31 Oct 2019
Generative Modeling by Estimating Gradients of the Data Distribution
Yang Song
Stefano Ermon
SyDa DiffM
258
3,954
0
12 Jul 2019
Tripping through time: Efficient Localization of Activities in Videos
Meera Hahn
Asim Kadav
James M. Rehg
H. Graf
73
86
0
22 Apr 2019
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
115
949
0
04 Aug 2017
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
123
820
0
05 May 2017
Video Summarization with Long Short-term Memory
Ke Zhang
Wei-Lun Chao
Fei Sha
Kristen Grauman
95
689
0
26 May 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar Sigurdsson
Gül Varol
Xinyu Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
106
1,246
0
06 Apr 2016
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
705
36,997
0
08 Jun 2015