ResearchTrend.AI
Neural Discrete Representation Learning

2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
    BDL
    SSL
    OCL

Papers citing "Neural Discrete Representation Learning"

50 / 2,785 papers shown
When Diffusion MRI Meets Diffusion Model: A Novel Deep Generative Model for Diffusion MRI Generation
Xi Zhu
Wei Zhang
Yijie Li
L. O’Donnell
Fan Zhang
DiffM
MedIm
55
3
0
23 Aug 2024
T3M: Text Guided 3D Human Motion Synthesis from Speech
Wenshuo Peng
Kaipeng Zhang
Sai Qian Zhang
38
0
0
23 Aug 2024
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Can Qin
Congying Xia
Krithika Ramakrishnan
Michael S Ryoo
Lifu Tu
...
Silvio Savarese
Juan Carlos Niebles
Zeyuan Chen
Ran Xu
Caiming Xiong
VGen
DiffM
76
2
0
22 Aug 2024
CODE: Confident Ordinary Differential Editing
B. V. Delft
Tommaso Martorella
Alexandre Alahi
DiffM
40
0
0
22 Aug 2024
MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model
Chaoya Jiang
Jia Hongrui
Haiyang Xu
Wei Ye
Mengfan Dong
Ming Yan
Ji Zhang
Fei Huang
Shikun Zhang
VLM
56
1
0
22 Aug 2024
Real-Time Video Generation with Pyramid Attention Broadcast
Xuanlei Zhao
Xiaolong Jin
Kai Wang
Yang You
VGen
DiffM
82
33
0
22 Aug 2024
Scalable Autoregressive Image Generation with Mamba
Haopeng Li
Jinyue Yang
Kexin Wang
Xuerui Qiu
Yuhong Chou
Xin Li
Guoqi Li
Mamba
63
13
0
22 Aug 2024
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Chunting Zhou
Lili Yu
Arun Babu
Kushal Tirumala
Michihiro Yasunaga
Leonid Shamis
Jacob Kahn
Xuezhe Ma
Luke Zettlemoyer
Omer Levy
DiffM
42
153
0
20 Aug 2024
Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models
Cong Wan
Yuhang He
Xiang Song
Yihong Gong
DiffM
AAML
39
7
0
20 Aug 2024
Latent Diffusion for Guided Document Table Generation
Syed Jawwad Haider Hamdani
S. Saifullah
S. Agne
Andreas Dengel
Sheraz Ahmed
26
0
0
19 Aug 2024
Unsupervised Composable Representations for Audio
Giovanni Bindi
P. Esling
DiffM
OCL
CoGe
42
0
0
19 Aug 2024
TraDiffusion: Trajectory-Based Training-Free Image Generation
Mingrui Wu
Oucheng Huang
Jiayi Ji
Jiale Li
Xinyue Cai
Huafeng Kuang
Jianzhuang Liu
Xiaoshuai Sun
Rongrong Ji
40
3
0
19 Aug 2024
Combo: Co-speech holistic 3D human motion generation and efficient customizable adaptation in harmony
Chao Xu
Mingze Sun
Zhi-Qi Cheng
Fei Wang
Yang Liu
Baigui Sun
Ruqi Huang
Alexander G. Hauptmann
VGen
50
3
0
18 Aug 2024
Deep Generative Classification of Blood Cell Morphology
Simon Deltadahl
J. Gilbey
C. V. Laer
Nancy Boeckx
M. Leers
...
Nicholas S. Gleadall
Carola-Bibiane Schönlieb
S. Sivapalaratnam
Michael Roberts
P. Nachev
DiffM
MedIm
44
0
0
16 Aug 2024
SC-Rec: Enhancing Generative Retrieval with Self-Consistent Reranking for Sequential Recommendation
Tongyoung Kim
Soojin Yoon
SeongKu Kang
Jinyoung Yeo
Dongha Lee
RALM
30
2
0
16 Aug 2024
An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation
Peiming Guo
Sinuo Liu
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Hao Fei
DiffM
50
1
0
16 Aug 2024
Music2Latent: Consistency Autoencoders for Latent Audio Compression
Marco Pasini
Stefan Lattner
George Fazekas
26
7
0
12 Aug 2024
DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation
Jisoo Kim
Jungbin Cho
Joonho Park
Soonmin Hwang
Da Eun Kim
Geon Kim
Youngjae Yu
62
1
0
12 Aug 2024
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing
Chunyu Qiang
Wang Geng
Yi Zhao
Ruibo Fu
Tao Wang
...
Chen Zhang
Hao Che
Longbiao Wang
Jianwu Dang
Jianhua Tao
AI4TS
44
0
0
11 Aug 2024
Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE
Yiying Yang
Fukun Yin
Jiayuan Fan
Xin Chen
Wanzhang Li
Gang Yu
VGen
57
1
0
10 Aug 2024
CoBooM: Codebook Guided Bootstrapping for Medical Image Representation Learning
Azad Singh
Deepak Mishra
SSL
50
1
0
08 Aug 2024
D2Styler: Advancing Arbitrary Style Transfer with Discrete Diffusion Methods
Onkar Susladkar
Gayatri S Deshmukh
Sparsh Mittal
Parth Shastri
DiffM
49
3
0
07 Aug 2024
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
Jiazhi Guan
Zhiliang Xu
Hang Zhou
Kaisiyuan Wang
Shengyi He
...
Errui Ding
Jingtuo Liu
Jingdong Wang
Youjian Zhao
Ziwei Liu
VGen
59
2
0
06 Aug 2024
Integrating Controllable Motion Skills from Demonstrations
Honghao Liao
Zhiheng Li
Ziyu Meng
Ran Song
Yibin Li
Wei Zhang
40
0
0
06 Aug 2024
Central Kurdish Text-to-Speech Synthesis with Novel End-to-End Transformer Training
Hawraz A. Ahmad
Tarik A. Rashid
41
0
0
06 Aug 2024
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization
Yiwen Chen
Yikai Wang
Yihao Luo
Ziyi Wang
Zilong Chen
Jun Zhu
Chi Zhang
Guosheng Lin
40
24
0
05 Aug 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Ping Luo
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
82
48
0
05 Aug 2024
PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-view Self-Guidance
Aoming Liu
Zhong Li
Zhang Chen
Nannan Li
Yinghao Xu
Bryan A. Plummer
49
4
0
04 Aug 2024
LEGO: Self-Supervised Representation Learning for Scene Text Images
Yujin Ren
Jiaxin Zhang
Lianwen Jin
SSL
46
0
0
04 Aug 2024
GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer
Yihong Lin
Zhaoxin Fan
Lingyu Xiong
Liang Peng
Xiandong Li
Xiandong Li
Wenxiong Kang
Xiandong Li
Huang Xu
47
3
0
03 Aug 2024
HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction
Xingyu Lou
Yu Yang
Kuiyao Dong
Emmanouil Benetos
Wenyi Yu
George Fazekas
Xiu Li
Dmitry Bogdanov
43
0
0
02 Aug 2024
VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling
Qian Zhang
Xiangzi Dai
Ninghua Yang
Xiang An
Ziyong Feng
Xingyu Ren
VLM
CLIP
43
17
0
02 Aug 2024
UniMoT: Unified Molecule-Text Language Model with Discrete Token Representation
Jiayuan Zhu
Yunli Qi
Yongqiang Chen
Quanming Yao
39
7
0
01 Aug 2024
Text-Guided Video Masked Autoencoder
D. Fan
Jue Wang
Shuai Liao
Zhikang Zhang
Vimal Bhat
Xinyu Li
VGen
38
3
0
01 Aug 2024
Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function
Matias Oscar Volman Stern
Dominic Hohs
Markos Diomataris
Michael J. Black
Gerhard Schneider
DiffM
50
0
0
01 Aug 2024
Fine-gained Zero-shot Video Sampling
Dengsheng Chen
Jie Hu
Javier Segovia-Aguas
Enhua Wu
VGen
DiffM
44
0
0
31 Jul 2024
TrackSorter: A Transformer-based sorting algorithm for track finding in High Energy Physics
Yash Melkani
Xiangyang Ju
50
1
0
31 Jul 2024
Segment Anything for Videos: A Systematic Survey
Chunhui Zhang
Yawen Cui
Weilin Lin
Guanjie Huang
Yan Rong
Li Liu
Shiguang Shan
VLM
52
6
0
31 Jul 2024
Self-supervised Multi-future Occupancy Forecasting for Autonomous Driving
Bernard Lange
Masha Itkina
Jiachen Li
Mykel J. Kochenderfer
44
4
0
30 Jul 2024
Decoding Linguistic Representations of Human Brain
Yu Wang
Heyang Liu
Yuhao Wang
Chuan Xuan
Yixuan Hou
Sheng Feng
Hongcheng Liu
Yusheng Liao
Yanfeng Wang
AI4CE
41
1
0
30 Jul 2024
SuperCodec: A Neural Speech Codec with Selective Back-Projection Network
Youqiang Zheng
Weiping Tu
Li Xiao
Xinmeng Xu
40
3
0
30 Jul 2024
Can I trust my anomaly detection system? A case study based on explainable AI
Muhammad Rashid
E. Amparore
Enrico Ferrari
Damiano Verda
41
0
0
29 Jul 2024
The Interpretability of Codebooks in Model-Based Reinforcement Learning is Limited
Kenneth Eaton
Jonathan C. Balloch
Julia Kim
Mark O. Riedl
FAtt
OffRL
36
0
0
28 Jul 2024
QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning
Mostafa Kotb
C. Weber
Muhammad Burhan Hafez
Stefan Wermter
41
1
0
26 Jul 2024
Revolutionizing Text-to-Image Retrieval as Autoregressive Token-to-Voken Generation
Yongqi Li
Hongru Cai
Wenjie Wang
Leigang Qu
Yinwei Wei
Wenjie Li
Liqiang Nie
Tat-Seng Chua
DiffM
40
1
0
24 Jul 2024
Speech Editing -- a Summary
Tobias Kässmann
Yining Liu
Danni Liu
34
0
0
24 Jul 2024
Occlusion-Aware 3D Motion Interpretation for Abnormal Behavior Detection
Su Li
Wang Liang
Jianye Wang
Ziheng Zhang
Lei Zhang
47
0
0
23 Jul 2024
MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection
Youngmin Oh
Hyung-Il Kim
Seong Tae Kim
Jung Uk Kim
DiffM
44
2
0
23 Jul 2024
On Differentially Private 3D Medical Image Synthesis with Controllable Latent Diffusion Models
Deniz Daum
Richard Osuala
Anneliese Riess
Georgios Kaissis
Julia A. Schnabel
Maxime Di Folco
MedIm
58
0
0
23 Jul 2024
CarFormer: Self-Driving with Learned Object-Centric Representations
Shadi S. Hamdan
Fatma Guney
3DPC
OCL
51
3
0
22 Jul 2024