ResearchTrend.AI
Neural Discrete Representation Learning

2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
    BDL
    SSL
    OCL

Papers citing "Neural Discrete Representation Learning"

50 / 2,747 papers shown
MARRS: Masked Autoregressive Unit-based Reaction Synthesis
Y.B. Wang
S Wang
J.N. Zhang
J.F. Wu
Q.D. He
C.C. Fu
C.J. Wang
Y. Liu
12
0
0
16 May 2025
EA-3DGS: Efficient and Adaptive 3D Gaussians with Highly Enhanced Quality for outdoor scenes
Jianlin Guo
Haihong Xiao
Wenxiong Kang
3DGS
29
0
0
16 May 2025
Self-supervised perception for tactile skin covered dexterous hands
Akash Sharma
Carolina Higuera
Chaithanya Krishna Bodduluri
Ziqiang Liu
Taosha Fan
...
Byron Boots
Michael Kaess
Tingfan Wu
Francois Robert Hogan
Mustafa Mukadam
SSL
22
0
0
16 May 2025
TACO: Rethinking Semantic Communications with Task Adaptation and Context Embedding
Achintha Wijesinghe
Weiwei Wang
Suchinthaka Wanninayaka
Songyang Zhang
Zhi Ding
12
0
0
16 May 2025
An Introduction to Discrete Variational Autoencoders
Alan Jeffares
Liyuan Liu
DRL
BDL
CML
41
0
0
15 May 2025
Multi-Token Prediction Needs Registers
Anastasios Gerontopoulos
Spyros Gidaris
N. Komodakis
24
0
0
15 May 2025
FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
Jun Guo
Xiaojian Ma
Yikai Wang
Min Yang
Huaping Liu
Qing Li
VGen
32
0
0
15 May 2025
Text-driven Motion Generation: Overview, Challenges and Directions
Ali Rida Sahili
Najett Neji
Hedi Tabia
VGen
38
0
0
14 May 2025
Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
Julian Tanke
Takashi Shibuya
Kengo Uchida
Koichi Saito
Yuki Mitsufuji
Mamba
47
0
0
14 May 2025
Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection
Ayush K. Rai
Kyle Min
Tarun Krishna
Feiyan Hu
Alan F. Smeaton
Noel E. O'Connor
VGen
31
0
0
13 May 2025
Interest Changes: Considering User Interest Life Cycle in Recommendation System
Yinjiang Cai
Jiangpan Hou
Yangping Zhu
Yuan Nie
19
0
0
13 May 2025
EventDiff: A Unified and Efficient Diffusion Model Framework for Event-based Video Frame Interpolation
Hanle Zheng
Xujie Han
Zegang Peng
Shangbin Zhang
Guangxun Du
Zhuo Zou
Xinbing Wang
Jibin Wu
Hao Guo
Lei Deng
DiffM
VGen
53
0
0
13 May 2025
Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data
J. Giroux
C. Fanelli
26
0
0
13 May 2025
M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion Synthesis
Zhizhuo Yin
Yuk Hang Tsui
Pan Hui
SLR
VGen
21
0
0
13 May 2025
TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection
Wenkui Yang
Zhida Zhang
Xiaoqiang Zhou
Junxian Duan
Jie Cao
DiffM
33
0
0
13 May 2025
AI and Generative AI Transforming Disaster Management: A Survey of Damage Assessment and Response Techniques
Aman Raj
Lakshit Arora
Sanjay Surendranath Girija
Shashank Kapoor
Dipen Pradhan
Ankit Shetgaonkar
24
0
0
13 May 2025
Continuous Visual Autoregressive Generation via Score Maximization
Chenze Shao
Fandong Meng
Jie Zhou
DiffM
31
1
0
12 May 2025
Metrics that matter: Evaluating image quality metrics for medical image generation
Yash Deo
Yan Jia
T. Lassila
William A. P. Smith
T. Lawton
Siyuan Kang
Alejandro F. Frangi
Ibrahim Habli
46
0
0
12 May 2025
Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding
Dianwen Ng
Kun Zhou
Yi-Wen Chao
Zhiwei Xiong
B. Ma
E. Chng
36
0
0
12 May 2025
Latent Behavior Diffusion for Sequential Reaction Generation in Dyadic Setting
Minh-Duc Nguyen
Hyung-Jeong Yang
Soo-Hyung Kim
Ji-Eun Shin
Seung-Won Kim
DiffM
36
0
0
12 May 2025
H$^{\mathbf{3}}$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
Yiyang Lu
Yufeng Tian
Zhecheng Yuan
Xinyu Wang
Pu Hua
Zhengrong Xue
Huazhe Xu
31
0
0
12 May 2025
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder
Bowen Zhang
Congchao Guo
Geng Yang
Hang Yu
Haozhe Zhang
...
Yichen Xiao
Yiying Zhou
Yuyao Zhang
Yuan Lu
Yucen He
26
0
0
12 May 2025
Joint Low-level and High-level Textual Representation Learning with Multiple Masking Strategies
Zhengmi Tang
Yuto Mitsui
Tomo Miyazaki
S. Omachi
34
0
0
11 May 2025
Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach
Minting Pan
Yitao Zheng
Jiajian Li
Yunbo Wang
Xiaokang Yang
OffRL
48
0
0
10 May 2025
A Short Overview of Multi-Modal Wi-Fi Sensing
Zijian Zhao
31
0
0
10 May 2025
ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images
Xianghao Kong
Qiaosong Qi
Yuanbin Wang
Anyi Rao
Biaolong Chen
Aixi Zhang
Si Liu
Hao Jiang
DiffM
VGen
25
0
0
10 May 2025
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Qingwen Bu
Yanting Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
143
1
0
09 May 2025
Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation
Kunpeng Qiu
Zhiqiang Gao
Zhiying Zhou
Mingjie Sun
Yongxin Guo
MedIm
36
0
0
09 May 2025
Accurate and Efficient Multivariate Time Series Forecasting via Offline Clustering
Yiming Niu
Jinliang Deng
L. Zhang
Zimu Zhou
Yongxin Tong
AI4TS
26
0
0
09 May 2025
ReactDance: Progressive-Granular Representation for Long-Term Coherent Reactive Dance Generation
Jingzhong Lin
Yuanyuan Qi
Xinru Li
Wenxuan Huang
Xiangfeng Xu
Bangyan Li
Xuejiao Wang
Gaoqi He
31
0
0
08 May 2025
CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations
Anthony Liang
Pavel Czempin
Matthew Hong
Yutai Zhou
Erdem Biyik
Stephen Tu
47
0
0
08 May 2025
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Haokun Lin
Teng Wang
Yixiao Ge
Yuying Ge
Zhichao Lu
Ying Wei
Qingfu Zhang
Zhenan Sun
Ying Shan
MLLM
VLM
70
0
0
08 May 2025
The Moon's Many Faces: A Single Unified Transformer for Multimodal Lunar Reconstruction
Tom Sander
Moritz Tenthoff
Kay Wohlfarth
Christian Wöhler
31
0
0
08 May 2025
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes
Zhuocheng Gong
Jian Guan
Wei Wu
Huishuai Zhang
Dongyan Zhao
64
1
0
08 May 2025
X-Driver: Explainable Autonomous Driving with Vision-Language Models
Wei Liu
J. A. Zhang
Binxiong Zheng
Yufeng Hu
Yingzhan Lin
Zengfeng Zeng
VLM
LRM
60
0
0
08 May 2025
AI-Generated Fall Data: Assessing LLMs and Diffusion Model for Wearable Fall Detection
Sana Alamgeer
Yasine Souissi
Anne H. H. Ngu
42
0
0
07 May 2025
ALFEE: Adaptive Large Foundation Model for EEG Representation
Wei Xiong
Junming Lin
Jiangtong Li
Jie Li
Changjun Jiang
33
0
0
07 May 2025
Occupancy World Model for Robots
Zhang Zhang
Qiang Zhang
Wei Cui
Shuai Shi
Yijie Guo
...
Hao-Ran Cheng
Xiaozhu Ju
Zhengping Che
Renjing Xu
Jian-Bo Tang
33
0
0
07 May 2025
Uncertainty-Aware Prototype Semantic Decoupling for Text-Based Person Search in Full Images
Zengli Luo
Canlong Zhang
Xiaochun Lu
Zhixin Li
Zhiwen Wang
37
0
0
06 May 2025
From Pixels to Polygons: A Survey of Deep Learning Approaches for Medical Image-to-Mesh Reconstruction
Fengming Lin
Arezoo Zakeri
Yidan Xue
Michael MacRaild
Haoran Dou
Zherui Zhou
Ziwei Zou
Ali Sarrami-Foroushani
Jinming Duan
Alejandro F Frangi
3DV
MedIm
32
0
0
06 May 2025
Fixed-Length Dense Fingerprint Representation
Zhiyu Pan
Xiongjun Guan
Yongjie Duan
Jianjiang Feng
Jie Zhou
37
0
0
06 May 2025
Lane-Wise Highway Anomaly Detection
Mei Qiu
William Lorenz Reindl
Yaobin Chen
Stanley Y. P. Chien
Shu Hu
41
0
0
05 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Xuzhi Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
74
0
0
05 May 2025
FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis
Jiangtong Tan
Hu Yu
Jie Huang
Jie Xiao
Feng Zhao
72
1
0
02 May 2025
FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing
Gaoxiang Cong
Liang-Sheng Li
Jiadong Pan
Zhedong Zhang
Amin Beheshti
Anton Van Den Hengel
Yuankai Qi
Qingming Huang
159
0
0
02 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
44
0
0
01 May 2025
VividListener: Expressive and Controllable Listener Dynamics Modeling for Multi-Modal Responsive Interaction
Shiying Li
Xingqun Qi
Bingkun Yang
Chen Weile
Zezhao Tian
Muyi Sun
Qifeng Liu
Man Zhang
Zhenan Sun
64
0
0
30 Apr 2025
Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion
Zehua Wang
Alexandre Bruckert
P. Le Callet
Guangtao Zhai
VGen
32
0
0
29 Apr 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
Jianyu Wu
Yizhou Wang
Xiangyu Yue
Xinzhu Ma
J. Guo
Dongzhan Zhou
Wanli Ouyang
Shixiang Tang
68
0
0
29 Apr 2025
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
57
0
0
28 Apr 2025