2308.15109
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
29 August 2023
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen
DiffM
Papers citing
"DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection"
31 / 31 papers shown
Title
Weakly-Supervised Temporal Action Localization with Bidirectional Semantic Consistency Constraint
Guozhang Li
De Cheng
Xinpeng Ding
N. Wang
Jie Li
Xinbo Gao
70
7
0
25 Apr 2023
Exploring Diffusion Models for Unsupervised Video Anomaly Detection
Anil Osman Tur
Nicola Dall’Asen
Cigdem Beyan
Elisa Ricci
DiffM
VGen
72
37
0
12 Apr 2023
DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Sauradip Nag
Xiatian Zhu
Jiankang Deng
Yi-Zhe Song
Tao Xiang
DiffM
VGen
96
24
0
27 Mar 2023
DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization
Nisha Huang
Yuxin Zhang
Fan Tang
Chongyang Ma
Haibin Huang
Yong Zhang
Weiming Dong
Changsheng Xu
DiffM
53
43
0
19 Nov 2022
DiffusionDet: Diffusion Model for Object Detection
Shoufa Chen
Pei Sun
Yibing Song
Ping Luo
117
465
0
17 Nov 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
87
200
0
13 Jul 2022
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Alex Nichol
Prafulla Dhariwal
Aditya A. Ramesh
Pranav Shyam
Pamela Mishkin
Bob McGrew
Ilya Sutskever
Mark Chen
364
3,605
0
20 Dec 2021
Diffusion Models for Implicit Image Segmentation Ensembles
J. Wolleb
Robin Sandkühler
Florentin Bieder
Philippe Valmaggia
Philippe C. Cattin
DiffM
MedIm
VLM
192
271
0
06 Dec 2021
Label-Efficient Semantic Segmentation with Diffusion Models
Dmitry Baranchuk
Ivan Rubachev
A. Voynov
Valentin Khrulkov
Artem Babenko
DiffM
VLM
277
536
0
06 Dec 2021
Structured Denoising Diffusion Models in Discrete State-Spaces
Jacob Austin
Daniel D. Johnson
Jonathan Ho
Daniel Tarlow
Rianne van den Berg
DiffM
176
945
0
07 Jul 2021
Interventional Video Grounding with Dual Contrastive Learning
Guoshun Nan
Rui Qiao
Yao Xiao
Jun Liu
Sicong Leng
H. Zhang
Wei Lu
83
145
0
21 Jun 2021
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
Vadim Popov
Ivan Vovk
Vladimir Gogoryan
Tasnima Sadekova
Mikhail Kudinov
DiffM
104
537
0
13 May 2021
Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language
Songyang Zhang
Houwen Peng
Jianlong Fu
Yijuan Lu
Jiebo Luo
71
53
0
04 Dec 2020
Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
Pei Sun
Rufeng Zhang
Yi Jiang
Tao Kong
Chenfeng Xu
...
Masayoshi Tomizuka
Lei Li
Zehuan Yuan
Changhu Wang
Ping Luo
ObjD
93
1,097
0
25 Nov 2020
VLG-Net: Video-Language Graph Matching Network for Video Grounding
Mattia Soldan
Mengmeng Xu
Sisi Qu
Jesper N. Tegnér
Guohao Li
73
70
0
19 Nov 2020
Denoising Diffusion Implicit Models
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
283
7,454
0
06 Oct 2020
Learning Trailer Moments in Full-Length Movies
Lezi Wang
Dong Liu
R. Puri
Dimitris N. Metaxas
42
43
0
19 Aug 2020
Fine-grained Iterative Attention Network for Temporal Language Localization in Videos
Xiaoye Qu
Peng Tang
Zhikang Zhou
Yu Cheng
Jianfeng Dong
Pan Zhou
77
92
0
06 Aug 2020
MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection
Fa-Ting Hong
Xuanteng Huang
Weihong Li
Weishi Zheng
57
62
0
20 Jul 2020
Span-based Localizing Network for Natural Language Video Localization
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
87
315
0
29 Apr 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
197
286
0
24 Jan 2020
Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video
Jie Wu
Guanbin Li
Si Liu
Liang Lin
OffRL
64
104
0
18 Jan 2020
AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization
Xiaoyu Zhang
Changsheng Li
Haichao Shi
Xiaobin Zhu
Peng Li
Jing Dong
62
37
0
27 Nov 2019
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
Yitian Yuan
Lin Ma
Jingwen Wang
Wei Liu
Wenwu Zhu
82
244
0
31 Oct 2019
Generative Modeling by Estimating Gradients of the Data Distribution
Yang Song
Stefano Ermon
SyDa
DiffM
258
3,954
0
12 Jul 2019
Tripping through time: Efficient Localization of Activities in Videos
Meera Hahn
Asim Kadav
James M. Rehg
H. Graf
73
86
0
22 Apr 2019
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
115
946
0
04 Aug 2017
TALL: Temporal Activity Localization via Language Query
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
123
820
0
05 May 2017
Video Summarization with Long Short-term Memory
Ke Zhang
Wei-Lun Chao
Fei Sha
Kristen Grauman
95
689
0
26 May 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Gunnar Sigurdsson
Gül Varol
Xinyu Wang
Ali Farhadi
Ivan Laptev
Abhinav Gupta
VGen
106
1,245
0
06 Apr 2016
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
705
36,997
0
08 Jun 2015