Neural Discrete Representation Learning

2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
    BDL SSL OCL

Papers citing "Neural Discrete Representation Learning"

50 / 3,267 papers shown
Title
EA-3DGS: Efficient and Adaptive 3D Gaussians with Highly Enhanced Quality for outdoor scenes
Jianlin Guo
Haihong Xiao
Wenxiong Kang
3DGS
133
1
0
16 May 2025
Self-supervised perception for tactile skin covered dexterous hands
Akash Sharma
Carolina Higuera
Chaithanya Krishna Bodduluri
Ziqiang Liu
Taosha Fan
...
Byron Boots
Michael Kaess
Tingfan Wu
Francois Robert Hogan
Mustafa Mukadam
SSL
86
2
0
16 May 2025
FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
Jun Guo
Xiaojian Ma
Yikai Wang
Min Yang
Huaping Liu
Qing Li
VGen
81
0
0
15 May 2025
An Introduction to Discrete Variational Autoencoders
Alan Jeffares
Liyuan Liu
DRL BDL CML
67
0
0
15 May 2025
Multi-Token Prediction Needs Registers
Anastasios Gerontopoulos
Spyros Gidaris
N. Komodakis
120
0
0
15 May 2025
Text-driven Motion Generation: Overview, Challenges and Directions
Ali Rida Sahili
Najett Neji
Hedi Tabia
VGen
81
0
0
14 May 2025
Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
Julian Tanke
Takashi Shibuya
Kengo Uchida
Koichi Saito
Yuki Mitsufuji
Mamba
90
0
0
14 May 2025
Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection
Ayush K. Rai
Kyle Min
Tarun Krishna
Feiyan Hu
Alan F. Smeaton
Noel E. O'Connor
VGen
109
0
0
13 May 2025
TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection
Wenkui Yang
Zhida Zhang
Xiaoqiang Zhou
Junxian Duan
Jie Cao
DiffM
158
1
0
13 May 2025
Interest Changes: Considering User Interest Life Cycle in Recommendation System
Yinjiang Cai
Jiangpan Hou
Yangping Zhu
Yuan Nie
123
0
0
13 May 2025
Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data
J. Giroux
C. Fanelli
55
0
0
13 May 2025
AI and Generative AI Transforming Disaster Management: A Survey of Damage Assessment and Response Techniques
Aman Raj
Lakshit Arora
Sanjay Surendranath Girija
Shashank Kapoor
Dipen Pradhan
Ankit Shetgaonkar
248
0
0
13 May 2025
EventDiff: A Unified and Efficient Diffusion Model Framework for Event-based Video Frame Interpolation
Hanle Zheng
Xujie Han
Zegang Peng
Shangbin Zhang
Guangxun Du
Zhuo Zou
Xiang Wang
Jibin Wu
Hao Guo
Lei Deng
DiffM VGen
89
0
0
13 May 2025
M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion Synthesis
Zhizhuo Yin
Yuk Hang Tsui
Pan Hui
SLR VGen
66
0
0
13 May 2025
H$^3$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
Yiyang Lu
Yufeng Tian
Zhecheng Yuan
Xinyu Wang
Pu Hua
Zhengrong Xue
Huazhe Xu
106
1
0
12 May 2025
Metrics that matter: Evaluating image quality metrics for medical image generation
Yash Deo
Yan Jia
T. Lassila
William A. P. Smith
T. Lawton
Siyuan Kang
Alejandro F. Frangi
Ibrahim Habli
104
0
0
12 May 2025
Latent Behavior Diffusion for Sequential Reaction Generation in Dyadic Setting
Minh-Duc Nguyen
Hyung-Jeong Yang
Soo-Hyung Kim
Ji-Eun Shin
Seung-Won Kim
DiffM
86
1
0
12 May 2025
Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding
Dianwen Ng
Kun Zhou
Yi-Wen Chao
Zhiwei Xiong
B. Ma
Eng Siong Chng
93
0
0
12 May 2025
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder
Bowen Zhang
Congchao Guo
Geng Yang
Hang Yu
Haozhe Zhang
...
Yichen Xiao
Yiying Zhou
Yize Zhang
Yuan Lu
Yucen He
70
1
0
12 May 2025
Continuous Visual Autoregressive Generation via Score Maximization
Chenze Shao
Fandong Meng
Jie Zhou
DiffM
66
1
0
12 May 2025
Joint Low-level and High-level Textual Representation Learning with Multiple Masking Strategies
Zhengmi Tang
Yuto Mitsui
Tomo Miyazaki
S. Omachi
96
0
0
11 May 2025
ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images
Xianghao Kong
Qiaosong Qi
Yuanbin Wang
Anyi Rao
Biaolong Chen
Aixi Zhang
Si Liu
Hao Jiang
DiffM VGen
71
1
0
10 May 2025
A Short Overview of Multi-Modal Wi-Fi Sensing
Zijian Zhao
83
0
0
10 May 2025
Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach
Minting Pan
Yitao Zheng
Jiajian Li
Yunbo Wang
Xiaokang Yang
OffRL
132
0
0
10 May 2025
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Qingwen Bu
Yanting Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
429
10
0
09 May 2025
Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation
Kunpeng Qiu
Zhiqiang Gao
Zhiying Zhou
Mingjie Sun
Yongxin Guo
MedIm
147
0
0
09 May 2025
Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering
Yiming Niu
Jinliang Deng
Lulu Zhang
Zimu Zhou
Yongxin Tong
AI4TS
167
0
0
09 May 2025
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Haokun Lin
Teng Wang
Yixiao Ge
Yuying Ge
Zhichao Lu
Ying Wei
Qingfu Zhang
Zhenan Sun
Ying Shan
MLLM VLM
157
5
0
08 May 2025
CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations
Anthony Liang
Pavel Czempin
Matthew Hong
Yutai Zhou
Erdem Biyik
Stephen Tu
154
1
0
08 May 2025
ReactDance: Progressive-Granular Representation for Long-Term Coherent Reactive Dance Generation
Jingzhong Lin
Yuanyuan Qi
Xinru Li
Wenxuan Huang
Xiangfeng Xu
Bangyan Li
Xuejiao Wang
Gaoqi He
70
0
0
08 May 2025
X-Driver: Explainable Autonomous Driving with Vision-Language Models
Wei Liu
Jingyun Zhang
Binxiong Zheng
Yufeng Hu
Yingzhan Lin
Zengfeng Zeng
VLM LRM
135
1
0
08 May 2025
The Moon's Many Faces: A Single Unified Transformer for Multimodal Lunar Reconstruction
Tom Sander
Moritz Tenthoff
Kay Wohlfarth
Christian Wöhler
119
0
0
08 May 2025
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes
Zhuocheng Gong
Jian Guan
Wei Wu
Huishuai Zhang
Dongyan Zhao
106
1
0
08 May 2025
Distillation-Enabled Knowledge Alignment Protocol for Semantic Communication in AI Agent Networks
Jingzhi Hu
Geoffrey Ye Li
60
0
0
07 May 2025
Occupancy World Model for Robots
Zhang Zhang
Qiang Zhang
Wei Cui
Shuai Shi
Yijie Guo
...
Hao-Ran Cheng
Xiaozhu Ju
Zhengping Che
Renjing Xu
Jian-Bo Tang
128
0
0
07 May 2025
AI-Generated Fall Data: Assessing LLMs and Diffusion Model for Wearable Fall Detection
Sana Alamgeer
Yasine Souissi
Anne H. H. Ngu
76
0
0
07 May 2025
ALFEE: Adaptive Large Foundation Model for EEG Representation
Wei Xiong
Junming Lin
Jiangtong Li
Jie Li
Changjun Jiang
124
0
0
07 May 2025
Fixed-Length Dense Fingerprint Representation
Zhiyu Pan
Xiongjun Guan
Yongjie Duan
Jianjiang Feng
Jie Zhou
52
0
0
06 May 2025
Uncertainty-Aware Prototype Semantic Decoupling for Text-Based Person Search in Full Images
Zengli Luo
Canlong Zhang
Xiaochun Lu
Zhixin Li
Zhiwen Wang
146
0
0
06 May 2025
From Pixels to Polygons: A Survey of Deep Learning Approaches for Medical Image-to-Mesh Reconstruction
Fengming Lin
Arezoo Zakeri
Yidan Xue
Michael MacRaild
Haoran Dou
Zherui Zhou
Ziwei Zou
Ali Sarrami-Foroushani
Jinming Duan
Alejandro F Frangi
3DV MedIm
86
0
0
06 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
325
1
0
05 May 2025
Lane-Wise Highway Anomaly Detection
Mei Qiu
William Lorenz Reindl
Yaobin Chen
Stanley Y. P. Chien
Shu Hu
90
0
0
05 May 2025
FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis
Jiangtong Tan
Hu Yu
Jie Huang
Jie Xiao
Feng Zhao
140
1
0
02 May 2025
FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing
Gaoxiang Cong
Liang-Sheng Li
Jiadong Pan
Zhedong Zhang
Amin Beheshti
Anton Van Den Hengel
Yuankai Qi
Qingming Huang
443
0
0
02 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
123
0
0
01 May 2025
VividListener: Expressive and Controllable Listener Dynamics Modeling for Multi-Modal Responsive Interaction
Shiying Li
Xingqun Qi
Bingkun Yang
Chen Weile
Zezhao Tian
Muyi Sun
Qifeng Liu
Man Zhang
Zhenan Sun
129
0
0
30 Apr 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
Jianyu Wu
Yizhou Wang
Xiangyu Yue
Xinzhu Ma
Jinpei Guo
Dongzhan Zhou
Wanli Ouyang
Shixiang Tang
159
0
0
29 Apr 2025
Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion
Zehua Wang
Alexandre Bruckert
P. Le Callet
Guangtao Zhai
VGen
58
0
0
29 Apr 2025
Representation Learning on a Random Lattice
Aryeh Brill
OOD FAtt AI4CE
128
0
0
28 Apr 2025
Towards Robust Multimodal Physiological Foundation Models: Handling Arbitrary Missing Modalities
Xi Fu
Wei-Bang Jiang
Yi Ding
Cuntai Guan
121
0
0
28 Apr 2025