1711.00937
Neural Discrete Representation Learning
2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
Papers citing "Neural Discrete Representation Learning"
50 / 3,267 papers shown
Title
OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad
Luyao Tang
Yuxuan Yuan
Chen Chen
Zeyu Zhang
Yue Huang
Kun Zhang
98
1
0
24 Mar 2025
Learning Beamforming Codebooks for Active Sensing with Reconfigurable Intelligent Surface
Zhongze Zhang
Wei Yu
63
0
0
24 Mar 2025
Panorama Generation From NFoV Image Done Right
Dian Zheng
Cheng Zhang
Xiao-Ming Wu
Cao Li
Chengfei Lv
Jian-Fang Hu
Wei-Shi Zheng
DiffM
137
2
0
24 Mar 2025
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
Wencheng Zhu
Yuexin Wang
Hongxuan Li
Pengfei Zhu
Q. Hu
CLIP
111
0
0
24 Mar 2025
AGIR: Assessing 3D Gait Impairment with Reasoning based on LLMs
Diwei Wang
Cédric Bobenrieth
Hyewon Seo
LRM
67
0
0
23 Mar 2025
SG-Tailor: Inter-Object Commonsense Relationship Reasoning for Scene Graph Manipulation
Haoliang Shang
Hanyu Wu
Guangyao Zhai
Boyang Sun
Fangjinhua Wang
F. Tombari
Marc Pollefeys
105
0
0
23 Mar 2025
Generating Realistic, Diverse, and Fault-Revealing Inputs with Latent Space Interpolation for Testing Deep Neural Networks
Bin Duan
Matthew B. Dwyer
Guowei Yang
AAML
56
0
0
22 Mar 2025
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Zeyu Liu
Zanlin Ni
Yeguo Hua
Xin Deng
Xiao Ma
Cheng Zhong
Gao Huang
87
0
0
22 Mar 2025
Halton Scheduler For Masked Generative Image Transformer
Victor Besnier
Mickael Chen
David Hurych
Eduardo Valle
Matthieu Cord
106
3
0
21 Mar 2025
STFTCodec: High-Fidelity Audio Compression through Time-Frequency Domain Representation
Tao Feng
Zhiyuan Zhao
Yifan Xie
Yuqi Ye
Xiangyang Luo
Xun Guan
Yongqian Li
132
0
0
21 Mar 2025
MerGen: Micro-electrode recording synthesis using a generative data-driven approach
Thibault Martin
Paul Sauleau
Claire Haegelen
Pierre Jannin
John S. H. Baxter
80
0
0
21 Mar 2025
Zero-Shot Styled Text Image Generation, but Make It Autoregressive
Vittorio Pippi
Fabio Quattrini
S. Cascianelli
Alessio Tonioni
Rita Cucchiara
81
1
0
21 Mar 2025
D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens
Panpan Wang
Liqiang Niu
Fandong Meng
Jinan Xu
Yufeng Chen
Jie Zhou
DiffM
115
0
0
21 Mar 2025
ProtoGS: Efficient and High-Quality Rendering with 3D Gaussian Prototypes
Zhengqing Gao
Dongting Hu
Jia-Wang Bian
Huan Fu
Yongqian Li
Tongliang Liu
Mingming Gong
Jianchao Tan
3DGS
137
0
0
21 Mar 2025
NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes
Han-Hung Lee
Qinghong Han
Angel X. Chang
158
0
0
20 Mar 2025
Aligning Text-to-Music Evaluation with Human Preferences
Yichen Huang
Zachary Novack
Koichi Saito
Jiatong Shi
Shinji Watanabe
Yuki Mitsufuji
John Thickstun
Chris Donahue
EGVM
119
1
0
20 Mar 2025
Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction
Ziyao Guo
Jianchao Tan
Michael Qizhe Shieh
64
0
0
20 Mar 2025
Scale-wise Distillation of Diffusion Models
Nikita Starodubcev
Denis Kuznedelev
Artem Babenko
Dmitry Baranchuk
DiffM
106
0
0
20 Mar 2025
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens
Shuqi Lu
Haowei Lin
Lin Yao
Zhifeng Gao
Xiaohong Ji
Weinan E
Linfeng Zhang
Guolin Ke
107
0
0
20 Mar 2025
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Yanjie Wang
Zhijie Lin
Yao Teng
Yuanzhi Zhu
Shuhuai Ren
Jiashi Feng
Xihui Liu
106
5
0
20 Mar 2025
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Yang Sui
Yu-Neng Chuang
Guanchu Wang
Jiamu Zhang
Tianyi Zhang
...
Hongyi Liu
Andrew Wen
Shaochen Zhong
Hanjie Chen
OffRL
ReLM
LRM
209
101
0
20 Mar 2025
Tokenize Image as a Set
Zigang Geng
Mengde Xu
Han Hu
Shuyang Gu
DiffM
87
0
0
20 Mar 2025
Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization
Feifei Li
Mi Zhang
Yiming Sun
Min Yang
DiffM
89
2
0
19 Mar 2025
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
R. Zhao
Junliang Ye
Ziyi Wang
Guangce Liu
Yiwen Chen
Yikai Wang
Jun Zhu
AI4CE
97
4
0
19 Mar 2025
CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation
Masud Ahmed
Zahid Hasan
Syed Arefinul Haque
A. Faridee
S. Purushotham
Suya You
Nirmalya Roy
191
0
0
19 Mar 2025
A Vector-Quantized Foundation Model for Patient Behavior Monitoring
Rodrigo Oliver
Josué Pérez-Sabater
Leire Paz-Arbaizar
Alejandro Lancho
Antonio Artés
Pablo M. Olmos
52
0
0
19 Mar 2025
Shap-MeD
Nicolás Laverde
Melissa Robles
Johan Rodríguez
MedIm
70
0
0
19 Mar 2025
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space
Lixing Xiao
Shunlin Lu
Huaijin Pi
Ke Fan
Liang Pan
Yueer Zhou
Ziyong Feng
Xiaowei Zhou
Sida Peng
Jingbo Wang
DiffM
VGen
121
7
0
19 Mar 2025
Cube: A Roblox View of 3D Intelligence
Foundation AI Team Roblox
Kiran Bhat
Nishchaie Khanna
Karun Channa
Tinghui Zhou
...
Kyle Price
Steve Han
Yiqing Wang
A. Singh
David Baszucki
147
1
0
19 Mar 2025
Behaviour Discovery and Attribution for Explainable Reinforcement Learning
Rishav Rishav
Somjit Nath
Vincent Michalski
Samira Ebrahimi Kahou
FAtt
OffRL
173
1
0
19 Mar 2025
Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling
Yanchen Luo
Zhiyuan Liu
Yi Zhao
Changhao Nai
Kenji Kawaguchi
Tat-Seng Chua
Xiang Wang
Yang Zhang
MedIm
164
0
0
19 Mar 2025
QINCODEC: Neural Audio Compression with Implicit Neural Codebooks
Zineb Lahrichi
Gaëtan Hadjeres
Gaël Richard
Geoffroy Peeters
111
0
0
19 Mar 2025
VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning
Y. Tan
Chen Liu
Jingyuan Gao
Banghao Wu
Mingchen Li
...
Lingrong Zhang
Huiqun Yu
Guisheng Fan
Liang Hong
Bingxin Zhou
86
3
0
19 Mar 2025
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies
Wei Song
Yansen Wang
Zijia Song
Yadong Li
Haoze Sun
Xin Wu
Guosheng Dong
Jianhua Xu
Jiaqi Wang
Kaicheng Yu
131
4
0
18 Mar 2025
Quantization-Free Autoregressive Action Transformer
Ziyad Sheebaelhamd
Michael Tschannen
Michael Muehlebach
Claire Vernade
101
1
0
18 Mar 2025
SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing
Seokhyeon Hong
Chaelin Kim
Serin Yoon
Junghyun Nam
Sihun Cha
Junyong Noh
DiffM
VGen
111
2
0
18 Mar 2025
R3-Avatar: Record and Retrieve Temporal Codebook for Reconstructing Photorealistic Human Avatars
Yifan Zhan
Wangze Xu
Qingtian Zhu
Muyao Niu
Mingze Ma
Yifei Liu
Zhihang Zhong
Xiao-Fu Sun
Yinqiang Zheng
112
0
0
17 Mar 2025
Dense Policy: Bidirectional Autoregressive Learning of Actions
Yue Su
Xinyu Zhan
Hongjie Fang
Han Xue
Hao-Shu Fang
Yongqian Li
Cewu Lu
Lixin Yang
VGen
110
4
0
17 Mar 2025
Progressive Human Motion Generation Based on Text and Few Motion Frames
Ling-an Zeng
Gaojie Wu
Ancong Wu
Jian-Fang Hu
Wei-Shi Zheng
128
1
0
17 Mar 2025
3D Human Interaction Generation: A Survey
Siyuan Fan
Wenke Huang
Xiantao Cai
Di Lin
VGen
116
0
0
17 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
110
1
0
17 Mar 2025
Next-Scale Autoregressive Models are Zero-Shot Single-Image Object View Synthesizers
Shiran Yuan
Hao Zhao
DiffM
124
0
0
17 Mar 2025
Measuring In-Context Computation Complexity via Hidden State Prediction
Vincent Herrmann
Róbert Csordás
Jürgen Schmidhuber
88
0
0
17 Mar 2025
Versatile Physics-based Character Control with Hybrid Latent Representation
Jinseok Bae
Jungdam Won
Donggeun Lim
I. Hwang
Y. Kim
92
0
0
17 Mar 2025
HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding
Jiahe Zhao
Ruibing Hou
Zejie Tian
Hong Chang
Shiguang Shan
88
0
0
17 Mar 2025
BREEN: Bridge Data-Efficient Encoder-Free Multimodal Learning with Learnable Queries
Tianle Li
Yongming Rao
Winston Hu
Yu Cheng
MLLM
100
0
0
16 Mar 2025
Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
Xue Jiang
Xiulian Peng
Yuan Zhang
Yan Lu
SSL
148
1
0
15 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
185
0
0
14 Mar 2025
ACMo: Attribute Controllable Motion Generation
Mingjie Wei
Xuemei Xie
G. Shi
114
0
0
14 Mar 2025
Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction
Haonan Wang
Qixiang Zhang
Lehan Wang
Xuanqi Huang
Xiaomeng Li
VOS
VGen
108
0
0
14 Mar 2025