1711.00937
Neural Discrete Representation Learning
2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
Papers citing "Neural Discrete Representation Learning"
50 / 3,267 papers shown
Title
Projectable Models: One-Shot Generation of Small Specialized Transformers from Large Ones
A. Zhmoginov
Jihwan Lee
Mark Sandler
44
0
0
06 Jun 2025
Improving AI-generated music with user-guided training
Vishwa Mohan Singh
Sai Anirudh Aryasomayajula
Ahan Chatterjee
Beste Aydemir
Rifat Mehreen Amin
102
0
0
05 Jun 2025
UniMate: A Unified Model for Mechanical Metamaterial Generation, Property Prediction, and Condition Confirmation
Wangzhi Zhan
Jianpeng Chen
Dongqi Fu
Dawei Zhou
AI4CE
26
0
0
05 Jun 2025
STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization
Hao Li
Qi Lv
Rui Shao
Xiang Deng
Yinchuan Li
Jianye Hao
Liqiang Nie
163
1
0
04 Jun 2025
InterMamba: Efficient Human-Human Interaction Generation with Adaptive Spatio-Temporal Mamba
Zizhao Wu
Yingying Sun
Yiming Chen
Xiaoling Gu
Ruyu Liu
Jiazhou Chen
Mamba
50
0
0
03 Jun 2025
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding
Junliang Ye
Zhengyi Wang
Ruowen Zhao
Shenghao Xie
Jun Zhu
74
0
0
02 Jun 2025
Self-supervised Latent Space Optimization with Nebula Variational Coding
Yida Wang
D. Tan
Nassir Navab
Federico Tombari
DRL
SSL
85
1
0
02 Jun 2025
Enhancing Interpretable Image Classification Through LLM Agents and Conditional Concept Bottleneck Models
Yiwen Jiang
Deval Mehta
Wei Feng
Zongyuan Ge
66
0
0
02 Jun 2025
Tomographic Foundation Model -- FORCE: Flow-Oriented Reconstruction Conditioning Engine
Wenjun Xia
Chuang Niu
Ge Wang
MedIm
OOD
34
0
0
02 Jun 2025
Generative Next POI Recommendation with Semantic ID
D. Wang
Yuxi Huang
Shen Gao
Yifan Wang
Chengrui Huang
Shuo Shang
66
0
0
02 Jun 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
63
0
0
02 Jun 2025
Automatic Stage Lighting Control: Is it a Rule-Driven Process or Generative Task?
Zijian Zhao
Dian Jin
Zijing Zhou
Xiaoyu Zhang
43
0
0
02 Jun 2025
Unraveling Spatio-Temporal Foundation Models via the Pipeline Lens: A Comprehensive Review
Yuchen Fang
Hao Miao
Yuxuan Liang
Liwei Deng
Yue Cui
...
Yan Zhao
T. Pedersen
Christian S. Jensen
Xiaofang Zhou
Kai Zheng
AI4TS
AI4CE
83
0
0
02 Jun 2025
FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
Yiming Zhong
Yumeng Liu
Chuyang Xiao
Zemin Yang
Youzhuo Wang
Yufei Zhu
Ye-ling Shi
Yujing Sun
X. Zhu
Yuexin Ma
70
0
0
02 Jun 2025
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
Jialong Zuo
Shengpeng Ji
Minghui Fang
Mingze Li
Ziyue Jiang
Xize Cheng
Xiaoda Yang
Chen Feiyang
Xinyu Duan
Zhou Zhao
50
0
0
01 Jun 2025
Humanoid World Models: Open World Foundation Models for Humanoid Robotics
Muhammad Qasim Ali
Aditya Sridhar
Shahbuland Matiana
Alex Wong
Mohammad Al-Sharman
VGen
VLM
56
0
0
01 Jun 2025
Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues
Youngmin Kim
Jiwan Chung
Jisoo Kim
Sunghyun Lee
Sangkyu Lee
Junhyeok Kim
Cheoljong Yang
Youngjae Yu
VGen
37
0
0
01 Jun 2025
Concept-Centric Token Interpretation for Vector-Quantized Generative Models
Tianze Yang
Yucheng Shi
Mengnan Du
Xuansheng Wu
Qiaoyu Tan
Jin Sun
Ninghao Liu
35
0
0
31 May 2025
MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation
Yakun Song
Jiawei Chen
Xiaobin Zhuang
Chenpeng Du
Ziyang Ma
...
Dongya Jia
Zhuo Chen
Yuping Wang
Yuxuan Wang
Xie Chen
43
0
0
31 May 2025
Probabilistic Forecasting for Building Energy Systems using Time-Series Foundation Models
Young-Jin Park
François Germain
Jing Liu
Ye Wang
T. Koike-Akino
Gordon Wichern
Navid Azizan
C. Laughman
Ankush Chakrabarty
AI4TS
AI4CE
42
0
0
31 May 2025
On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning
Magdalena Proszewska
Nikolay Malkin
N. Siddharth
DiffM
43
0
0
30 May 2025
SignBot: Learning Human-to-Humanoid Sign Language Interaction
Guanren Qiao
Sixu Lin
Ronglai Zuo
Zhizheng Wu
Kui Jia
Guiliang Liu
SLR
64
0
0
30 May 2025
DLM-One: Diffusion Language Models for One-Step Sequence Generation
Tianqi Chen
Shujian Zhang
Mingyuan Zhou
42
0
0
30 May 2025
Are Any-to-Any Models More Consistent Across Modality Transfers Than Specialists?
Jiwan Chung
Janghan Yoon
J. S. Park
Sangeyl Lee
Joowon Yang
Sooyeon Park
Youngjae Yu
56
0
0
30 May 2025
Semantics-Aware Human Motion Generation from Audio Instructions
Zi-An Wang
Shihao Zou
Shiyao Yu
Mingyuan Zhang
Chao Dong
VGen
39
0
0
29 May 2025
Normalizing Flows are Capable Models for RL
Raj Ghugare
Benjamin Eysenbach
OffRL
AI4CE
95
0
0
29 May 2025
Are Unified Vision-Language Models Necessary: Generalization Across Understanding and Generation
Jihai Zhang
Tianle Li
Linjie Li
Zhengyuan Yang
Yu Cheng
82
1
0
29 May 2025
EAD: An EEG Adapter for Automated Classification
Pushapdeep Singh
Jyoti Nigam
Medicherla Vamsi Krishna
Arnav V. Bhavsar
A. Nigam
17
0
0
29 May 2025
Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization
Matteo Gallici
Haitz Sáez de Ocáriz Borde
51
0
0
29 May 2025
MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction
Yunkee Chae
Kyogu Lee
66
0
0
29 May 2025
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation
Siyuan Wang
Jiawei Liu
Wei Wang
Yeying Jin
Jinsong Du
Zhi Han
SLR
VGen
84
0
0
29 May 2025
Let's Predict Sentence by Sentence
Hyeonbin Hwang
Byeongguk Jeon
Seungone Kim
Jiyeon Kim
Hoyeon Chang
Sohee Yang
Seungpil Won
Dohaeng Lee
Youbin Ahn
Minjoon Seo
96
0
0
28 May 2025
PacTure: Efficient PBR Texture Generation on Packed Views with Visual Autoregressive Models
Fan Fei
Jiajun Tang
Fei-Peng Tian
Boxin Shi
P. Tan
DiffM
52
0
0
28 May 2025
ACE-Step: A Step Towards Music Generation Foundation Model
Junmin Gong
Sean Zhao
Sen Wang
S. Xu
Joe Guo
44
2
0
28 May 2025
Improving Brain-to-Image Reconstruction via Fine-Grained Text Bridging
Runze Xia
Shuo Feng
Renzhi Wang
Congchi Yin
Xuyun Wen
Piji Li
DiffM
42
0
0
28 May 2025
AudioTurbo: Fast Text-to-Audio Generation with Rectified Diffusion
Junqi Zhao
Jinzheng Zhao
Haohe Liu
Yun Chen
Lu Han
Xubo Liu
Mark D. Plumbley
Wenwu Wang
DiffM
48
0
0
28 May 2025
MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning
Hongjia Liu
Rongzhen Zhao
Haohan Chen
Joni Pajarinen
OCL
VLM
129
0
0
27 May 2025
Sci-Fi: Symmetric Constraint for Frame Inbetweening
Liuhan Chen
Xiaodong Cun
Xiaoyu Li
Xianyi He
Shenghai Yuan
Jie Chen
Ying Shan
Lichao Sun
VGen
84
0
0
27 May 2025
What Do Latent Action Models Actually Learn?
Chuheng Zhang
Tim Pearce
Pushi Zhang
Kaixin Wang
Xiaoyu Chen
Wei Shen
Li Zhao
Jiang Bian
19
0
0
27 May 2025
Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective
Yang Zhang
Xinran Li
Jianing Ye
Delin Qu
Shuang Qiu
Chongjie Zhang
Xiu Li
Chenjia Bai
52
0
0
27 May 2025
LeDiFlow: Learned Distribution-guided Flow Matching to Accelerate Image Generation
Pascal Zwick
Nils Friederich
Maximilian Beichter
Lennart Hilbert
Ralf Mikut
Oliver Bringmann
MedIm
41
0
0
27 May 2025
Spotlight-TTS: Spotlighting the Style via Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech
Nam-Gyu Kim
Deok-Hyeon Cho
Seung-Bin Kim
Seong-Whan Lee
74
0
0
27 May 2025
StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation
Yi Wu
Lingting Zhu
Shengju Qian
Lei Liu
Wandi Qiao
Lequan Yu
Bin Li
77
0
0
26 May 2025
Knowledge-Aligned Counterfactual-Enhancement Diffusion Perception for Unsupervised Cross-Domain Visual Emotion Recognition
Wen Yin
Yong Wang
Guiduo Duan
Dongyang Zhang
Xin Hu
Yuan-Fang Li
Tao He
127
0
0
26 May 2025
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
Kunjun Li
Zigeng Chen
Cheng-Yen Yang
Jenq-Neng Hwang
95
0
0
26 May 2025
Discrete Markov Bridge
Hengli Li
Yuxuan Wang
Song-Chun Zhu
Ying Nian Wu
Zilong Zheng
DiffM
73
0
0
26 May 2025
Advancing Limited-Angle CT Reconstruction Through Diffusion-Based Sinogram Completion
Jiaqi Guo
S. Tapia
Aggelos K. Katsaggelos
DiffM
MedIm
34
0
0
26 May 2025
DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech
Deok-Hyeon Cho
Hyung-Seok Oh
Seung-Bin Kim
Seong-Whan Lee
65
0
0
26 May 2025
Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables
Futoshi Futami
Masahiro Fujisawa
DRL
CML
97
0
0
26 May 2025
AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models
Muyao Niu
Mingdeng Cao
Yifan Zhan
Qingtian Zhu
Mingze Ma
Jiancheng Zhao
Yanhong Zeng
Zhihang Zhong
Xiao Sun
Yinqiang Zheng
DiffM
VGen
66
0
0
26 May 2025