1711.00937
Cited By
Neural Discrete Representation Learning
2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL
SSL
OCL
Papers citing "Neural Discrete Representation Learning"
50 / 2,747 papers shown
Title
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
62
4
0
24 Mar 2025
HOIGPT: Learning Long Sequence Hand-Object Interaction with Language Models
Mingzhen Huang
Fu-Jen Chu
Bugra Tekin
Kevin J Liang
Haoyu Ma
...
Hongfei Xue
Siwei Lyu
Kris M. Kitani
Matt Feiszli
Hao Tang
VLM
65
0
0
24 Mar 2025
Human Motion Unlearning
Edoardo De Matteis
Matteo Migliarini
Alessio Sampieri
Indro Spinelli
Fabio Galasso
MU
60
0
0
24 Mar 2025
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
Wencheng Zhu
Yuexin Wang
Hongxuan Li
Pengfei Zhu
Q. Hu
CLIP
52
0
0
24 Mar 2025
Learning Beamforming Codebooks for Active Sensing with Reconfigurable Intelligent Surface
Zhongze Zhang
Wei Yu
49
0
0
24 Mar 2025
AGIR: Assessing 3D Gait Impairment with Reasoning based on LLMs
Diwei Wang
Cédric Bobenrieth
Hyewon Seo
LRM
47
0
0
23 Mar 2025
SG-Tailor: Inter-Object Commonsense Relationship Reasoning for Scene Graph Manipulation
Haoliang Shang
Hanyu Wu
Guangyao Zhai
Boyang Sun
Fangjinhua Wang
F. Tombari
Marc Pollefeys
57
0
0
23 Mar 2025
Generating Realistic, Diverse, and Fault-Revealing Inputs with Latent Space Interpolation for Testing Deep Neural Networks
Bin Duan
Matthew B. Dwyer
Guowei Yang
AAML
44
0
0
22 Mar 2025
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Zeyu Liu
Zanlin Ni
Yeguo Hua
Xin Deng
Xiao Ma
Cheng Zhong
Gao Huang
47
0
0
22 Mar 2025
Zero-Shot Styled Text Image Generation, but Make It Autoregressive
Vittorio Pippi
Fabio Quattrini
S. Cascianelli
Alessio Tonioni
Rita Cucchiara
42
0
0
21 Mar 2025
STFTCodec: High-Fidelity Audio Compression through Time-Frequency Domain Representation
Tao Feng
Zhiyuan Zhao
Yifan Xie
Yuqi Ye
Xiangyang Luo
Xun Guan
Yong Li
57
0
0
21 Mar 2025
Halton Scheduler For Masked Generative Image Transformer
Victor Besnier
Mickael Chen
David Hurych
Eduardo Valle
Matthieu Cord
52
1
0
21 Mar 2025
D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens
Panpan Wang
Liqiang Niu
Fandong Meng
Jinan Xu
Yufeng Chen
Jie Zhou
DiffM
50
0
0
21 Mar 2025
ProtoGS: Efficient and High-Quality Rendering with 3D Gaussian Prototypes
Zhengqing Gao
Dongting Hu
Jia-Wang Bian
Huan Fu
Yong Li
Tongliang Liu
Mingming Gong
Kaipeng Zhang
3DGS
42
0
0
21 Mar 2025
MerGen: Micro-electrode recording synthesis using a generative data-driven approach
Thibault Martin
Paul Sauleau
Claire Haegelen
Pierre Jannin
John S. H. Baxter
36
0
0
21 Mar 2025
NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes
Han-Hung Lee
Qinghong Han
Angel X. Chang
86
0
0
20 Mar 2025
Scale-wise Distillation of Diffusion Models
Nikita Starodubcev
Denis Kuznedelev
Artem Babenko
Dmitry Baranchuk
DiffM
53
0
0
20 Mar 2025
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Yang Sui
Yu-Neng Chuang
Guanchu Wang
Jiamu Zhang
Tianyi Zhang
...
Hongyi Liu
Andrew Wen
Shaochen Zhong
Hanjie Chen
OffRL
ReLM
LRM
83
31
0
20 Mar 2025
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens
Shuqi Lu
Haowei Lin
Lin Yao
Zhifeng Gao
Xiaohong Ji
Weinan E
Linfeng Zhang
Guolin Ke
48
0
0
20 Mar 2025
Tokenize Image as a Set
Zigang Geng
Mengde Xu
Han Hu
Shuyang Gu
DiffM
55
0
0
20 Mar 2025
Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction
Ziyao Guo
Kaipeng Zhang
Michael Qizhe Shieh
43
0
0
20 Mar 2025
Aligning Text-to-Music Evaluation with Human Preferences
Yichen Huang
Zachary Novack
Koichi Saito
Jiatong Shi
Shinji Watanabe
Yuki Mitsufuji
John Thickstun
Chris Donahue
EGVM
70
1
0
20 Mar 2025
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Yanjie Wang
Zhijie Lin
Yao Teng
Yuanzhi Zhu
Shuhuai Ren
Jiashi Feng
Xihui Liu
53
0
0
20 Mar 2025
Shap-MeD
Nicolás Laverde
Melissa Robles
Johan Rodríguez
MedIm
56
0
0
19 Mar 2025
A Foundation Model for Patient Behavior Monitoring and Suicide Detection
Rodrigo Oliver
Josué Pérez-Sabater
Leire Paz-Arbaizar
Alejandro Lancho
Antonio Artés
Pablo M. Olmos
36
0
0
19 Mar 2025
CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation
Masud Ahmed
Zahid Hasan
Syed Arefinul Haque
A. Faridee
S. Purushotham
Suya You
Nirmalya Roy
60
0
0
19 Mar 2025
VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning
Y. Tan
Chen Liu
Jingyuan Gao
Banghao Wu
Mingchen Li
...
Lingrong Zhang
Huiqun Yu
Guisheng Fan
Liang Hong
Bingxin Zhou
55
1
0
19 Mar 2025
Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization
Feifei Li
Mi Zhang
Yiming Sun
Min Yang
DiffM
59
1
0
19 Mar 2025
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
R. Zhao
Junliang Ye
Zihan Wang
Guangce Liu
Yiwen Chen
Yikai Wang
Jun Zhu
AI4CE
45
0
0
19 Mar 2025
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space
Lixing Xiao
Shunlin Lu
Huaijin Pi
Ke Fan
Liang Pan
Yueer Zhou
Ziyong Feng
Xiaowei Zhou
Sida Peng
Jingbo Wang
DiffM
VGen
50
4
0
19 Mar 2025
Behaviour Discovery and Attribution for Explainable Reinforcement Learning
Rishav Rishav
Somjit Nath
Vincent Michalski
Samira Ebrahimi Kahou
FAtt
OffRL
70
0
0
19 Mar 2025
Towards Unified Latent Space for 3D Molecular Latent Diffusion Modeling
Yanchen Luo
Zhiyuan Liu
Yi Zhao
Sihang Li
Kenji Kawaguchi
Tat-Seng Chua
Xuben Wang
MedIm
69
0
0
19 Mar 2025
Cube: A Roblox View of 3D Intelligence
Foundation AI Team Roblox
Kiran Bhat
Nishchaie Khanna
Karun Channa
Tinghui Zhou
...
Kyle Price
Steve Han
Yiqing Wang
A. Singh
David Baszucki
66
0
0
19 Mar 2025
QINCODEC: Neural Audio Compression with Implicit Neural Codebooks
Zineb Lahrichi
Gaëtan Hadjeres
Gaël Richard
Geoffroy Peeters
47
0
0
19 Mar 2025
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies
Wei Song
Yansen Wang
Zijia Song
Yadong Li
Haoze Sun
Xin Wu
Zenan Zhou
Jianhua Xu
Jiaqi Wang
Kaicheng Yu
60
2
0
18 Mar 2025
SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing
Seokhyeon Hong
Chaelin Kim
Serin Yoon
Junghyun Nam
Sihun Cha
Junyong Noh
DiffM
VGen
73
1
0
18 Mar 2025
Quantization-Free Autoregressive Action Transformer
Ziyad Sheebaelhamd
Michael Tschannen
Michael Muehlebach
Claire Vernade
49
0
0
18 Mar 2025
HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding
Jiahe Zhao
Ruibing Hou
Zejie Tian
Hong Chang
Shiguang Shan
45
0
0
17 Mar 2025
Progressive Human Motion Generation Based on Text and Few Motion Frames
Ling-an Zeng
Gaojie Wu
Ancong Wu
Jian-Fang Hu
Wei-Shi Zheng
64
1
0
17 Mar 2025
Next-Scale Autoregressive Models are Zero-Shot Single-Image Object View Synthesizers
Shiran Yuan
Hao Zhao
DiffM
54
0
0
17 Mar 2025
Versatile Physics-based Character Control with Hybrid Latent Representation
Jinseok Bae
Jungdam Won
Donggeun Lim
I. Hwang
Y. Kim
44
0
0
17 Mar 2025
Measuring In-Context Computation Complexity via Hidden State Prediction
Vincent Herrmann
Róbert Csordás
Jürgen Schmidhuber
44
0
0
17 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
72
0
0
17 Mar 2025
3D Human Interaction Generation: A Survey
Siyuan Fan
Wenke Huang
Xiantao Cai
Bo Du
VGen
58
0
0
17 Mar 2025
Dense Policy: Bidirectional Autoregressive Learning of Actions
Yue Su
Xinyu Zhan
Hongjie Fang
Han Xue
Hao-Shu Fang
Yong Li
Cewu Lu
Lixin Yang
VGen
57
3
0
17 Mar 2025
R3-Avatar: Record and Retrieve Temporal Codebook for Reconstructing Photorealistic Human Avatars
Yifan Zhan
Wangze Xu
Qingtian Zhu
Muyao Niu
Mingze Ma
Yifei Liu
Zhihang Zhong
Xiao-Fu Sun
Yinqiang Zheng
74
0
0
17 Mar 2025
BREEN: Bridge Data-Efficient Encoder-Free Multimodal Learning with Learnable Queries
Tianle Li
Yongming Rao
Winston Hu
Yu Cheng
MLLM
68
0
0
16 Mar 2025
Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
Xue Jiang
Xiulian Peng
Yuan Zhang
Yan-Heng Lu
SSL
85
0
0
15 Mar 2025
Human-in-the-Loop Local Corrections of 3D Scene Layouts via Infilling
Christopher Xie
A. Avetisyan
Henry Howard-Jenkins
Yawar Siddiqui
Julian Straub
Richard Newcombe
Vasileios Balntas
Jakob Julian Engel
3DH
3DV
70
0
0
14 Mar 2025
TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing
S. Lionar
Jiabin Liang
G. Lee
3DPC
54
0
0
14 Mar 2025