1711.00937
Neural Discrete Representation Learning
2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
Papers citing
"Neural Discrete Representation Learning"
50 / 3,267 papers shown
Title
MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation
Vikram S. Voleti
Alexia Jolicoeur-Martineau
Christopher Pal
DiffM
VGen
159
309
0
19 May 2022
Deterministic training of generative autoencoders using invertible layers
Gianluigi Silvestri
Daan Roos
L. Ambrogioni
TPM
79
2
0
19 May 2022
Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space
Kuan Fang
Patrick Yin
Ashvin Nair
Sergey Levine
OffRL
117
35
0
17 May 2022
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization
Yuhta Takida
Takashi Shibuya
Wei-Hsiang Liao
Chieh-Hsin Lai
Junki Ohmura
Toshimitsu Uesaka
Naoki Murata
Shusuke Takahashi
Toshiyuki Kumakura
Yuki Mitsufuji
BDL
87
67
0
16 May 2022
Clinical outcome prediction under hypothetical interventions -- a representation learning framework for counterfactual reasoning
Yikuan Li
M. Mamouei
Shishir Rao
A. Hassaine
D. Canoy
Thomas Lukasiewicz
K. Rahimi
G. Salimi-Khorshidi
OOD
CML
AI4CE
83
1
0
15 May 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
OODD
VLM
195
34
0
15 May 2022
VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
Yuchao Gu
Xintao Wang
Liangbin Xie
Chao Dong
Gengyan Li
Ying Shan
Ming-Ming Cheng
82
124
0
13 May 2022
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE
Jiachen Lian
Chunlei Zhang
Gopala Krishna Anumanchipalli
Dong Yu
58
23
0
11 May 2022
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
Qiankun Liu
Zhentao Tan
Dongdong Chen
Qi Chu
Xiyang Dai
Yinpeng Chen
Mengchen Liu
Lu Yuan
Nenghai Yu
ViT
87
70
0
10 May 2022
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Xu Tan
Jiawei Chen
Haohe Liu
Jian Cong
Chen Zhang
...
Lei He
Frank Soong
Tao Qin
Sheng Zhao
Tie-Yan Liu
146
221
0
09 May 2022
Variational Sparse Coding with Learned Thresholding
Kion Fallah
Christopher Rozell
DRL
89
7
0
07 May 2022
Generative methods for sampling transition paths in molecular dynamics
T. Lelièvre
Geneviève Robin
Inass Sekkat
G. Stoltz
Gabriel Victorino Cardoso
GAN
48
9
0
05 May 2022
Unsupervised Mismatch Localization in Cross-Modal Sequential Data with Application to Mispronunciations Localization
Wei Wei
Hengguan Huang
Xiangming Gu
Hao Wang
Ye Wang
BDL
73
0
0
05 May 2022
An Analysis of Generative Methods for Multiple Image Inpainting
C. Ballester
Aurélie Bugeau
Samuel Hurault
S. Parisotto
Patricia Vitoria
63
3
0
04 May 2022
End-to-End Visual Editing with a Generatively Pre-Trained Artist
A. Brown
Cheng-Yang Fu
Omkar M. Parkhi
Tamara L. Berg
Andrea Vedaldi
DiffM
89
8
0
03 May 2022
Learning Discrete Structured Variational Auto-Encoder using Natural Evolution Strategies
Alon Berliner
Guy Rotman
Yossi Adi
Roi Reichart
Tamir Hazan
BDL
DRL
77
4
0
03 May 2022
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
177
335
0
28 Apr 2022
Can deep learning match the efficiency of human visual long-term memory in storing object details?
Emin Orhan
VLM
OCL
129
0
0
27 Apr 2022
DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation
Wei Chen
Yeyun Gong
Song Wang
Bolun Yao
Weizhen Qi
...
Bartuer Zhou
Yi Mao
Weizhu Chen
Biao Cheng
Nan Duan
VLM
82
48
0
27 Apr 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
Sanyuan Chen
Yu Wu
Chengyi Wang
Shujie Liu
Zhuo Chen
...
Gang Liu
Jinyu Li
Jian Wu
Xiangzhan Yu
Furu Wei
SSL
102
42
0
27 Apr 2022
Semi-Parametric Neural Image Synthesis
A. Blattmann
Robin Rombach
Kaan Oktay
Jonas Muller
Bjorn Ommer
DiffM
102
31
0
25 Apr 2022
Masked Image Modeling Advances 3D Medical Image Analysis
Zekai Chen
Devansh Agarwal
Kshitij Aggarwal
Wiem Safta
Samit Hirawat
V. Sethuraman
Mariann Micsinai Balan
Kevin Brown
103
75
0
25 Apr 2022
Unified Pretraining Framework for Document Understanding
Jiuxiang Gu
Jason Kuen
Vlad I. Morariu
Handong Zhao
Nikolaos Barmpalios
R. Jain
A. Nenkova
Tong Sun
105
98
0
22 Apr 2022
Learn from Unpaired Data for Image Restoration: A Variational Bayes Approach
Dihan Zheng
Xiaowen Zhang
Kaisheng Ma
Chenglong Bao
DiffM
81
23
0
21 Apr 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
75
113
0
20 Apr 2022
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
Katherine Crowson
Stella Biderman
Daniel Kornis
Dashiell Stander
Eric Hallahan
Louis Castricato
Edward Raff
CLIP
193
381
0
18 Apr 2022
Neural Space-filling Curves
Hanyu Wang
Kamal Gupta
Larry S. Davis
Abhinav Shrivastava
60
2
0
18 Apr 2022
Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion
Evonne Ng
Hanbyul Joo
Liwen Hu
Hao Li
Trevor Darrell
Angjoo Kanazawa
Shiry Ginosar
VGen
78
95
0
18 Apr 2022
Saliency in Augmented Reality
Huiyu Duan
Wei Shen
Xiongkuo Min
Danyang Tu
Jing Li
Guangtao Zhai
55
34
0
18 Apr 2022
Simultaneous Multiple-Prompt Guided Generation Using Differentiable Optimal Transport
Yingtao Tian
Marco Cuturi
David R Ha
DiffM
OT
77
1
0
18 Apr 2022
Unconditional Image-Text Pair Generation with Multimodal Cross Quantizer
Hyungyu Lee
Sungjin Park
Joonseok Lee
Edward Choi
69
2
0
15 Apr 2022
Controllable Video Generation through Global and Local Motion Dynamics
A. Davtyan
Paolo Favaro
46
9
0
13 Apr 2022
What Matters in Language Conditioned Robotic Imitation Learning over Unstructured Data
Oier Mees
Lukás Hermann
Wolfram Burgard
LM&Ro
115
156
0
13 Apr 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
518
6,946
0
13 Apr 2022
Physically Disentangled Representations
Tzofi Klinghoffer
Kushagra Tiwary
Arkadiusz Balata
Vivek Sharma
Ramesh Raskar
3DV
OCL
CoGe
DRL
63
1
0
11 Apr 2022
ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation
Jianan Wang
Guansong Lu
Hang Xu
Zhenguo Li
Chunjing Xu
Yanwei Fu
109
17
0
09 Apr 2022
EfficientFi: Towards Large-Scale Lightweight WiFi Sensing via CSI Compression
Jianfei Yang
Xinyan Chen
Han Zou
Dazhuo Wang
Q. Xu
Lihua Xie
73
91
0
08 Apr 2022
Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech
Jaesung Bae
Jinhyeok Yang
Taejun Bak
Young-Sun Joo
DiffM
136
6
0
08 Apr 2022
Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion
Weida Liang
Lantian Li
Wenqiang Du
Dong Wang
128
0
0
08 Apr 2022
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
Songwei Ge
Thomas Hayes
Harry Yang
Xiaoyue Yin
Guan Pang
David Jacobs
Jia-Bin Huang
Devi Parikh
ViT
176
223
0
07 Apr 2022
KNN-Diffusion: Image Generation via Large-Scale Retrieval
Shelly Sheynin
Oron Ashual
Adam Polyak
Uriel Singer
Oran Gafni
Eliya Nachmani
Yaniv Taigman
VLM
SyDa
DiffM
85
124
0
06 Apr 2022
Autoregressive 3D Shape Generation via Canonical Mapping
A. Cheng
Xueting Li
Sifei Liu
Min Sun
Ming-Hsuan Yang
3DPC
93
41
0
05 Apr 2022
High-Quality Pluralistic Image Completion via Code Shared VQGAN
Chuanxia Zheng
Guoxian Song
Tat-Jen Cham
Jianfei Cai
Dinh Q. Phung
Linjie Luo
VLM
80
10
0
05 Apr 2022
Cancer Subtyping via Embedded Unsupervised Learning on Transcriptomics Data
Ziwei Yang
Lingwei Zhu
Zheng Chen
Ming Huang
N. Ono
M. Altaf-Ul-Amin
Shigehiko Kanaya
13
2
0
02 Apr 2022
PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation
Jing He
Yiyi Zhou
Qi Zhang
Jun Peng
Yunhang Shen
Xiaoshuai Sun
Chao Chen
Rongrong Ji
96
5
0
02 Apr 2022
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification
Samuel Lavoie
Christos Tsirigotis
Max Schwarzer
Ankit Vani
Michael Noukhovitch
Kenji Kawaguchi
Rameswar Panda
SSL
65
18
0
01 Apr 2022
Quantized GAN for Complex Music Generation from Dance Videos
Ye Zhu
Kyle Olszewski
Yuehua Wu
Panos Achlioptas
Menglei Chai
Yan Yan
Sergey Tulyakov
MGen
118
46
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
84
33
0
31 Mar 2022
Generative Spoken Dialogue Language Modeling
Tu Nguyen
Eugene Kharitonov
Jade Copet
Yossi Adi
Wei-Ning Hsu
...
Paden Tomasello
Robin Algayres
Benoît Sagot
Abdel-rahman Mohamed
Emmanuel Dupoux
AuLLM
123
88
0
30 Mar 2022
Autoregressive Co-Training for Learning Discrete Speech Representations
Sung-Lin Yeh
Hao Tang
SSL
87
6
0
29 Mar 2022