ResearchTrend.AI
Neural Discrete Representation Learning

2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
BDL, SSL, OCL
arXiv: 1711.00937

Papers citing "Neural Discrete Representation Learning"

50 / 3,267 papers shown
Title
Enhancing Spoken Discourse Modeling in Language Models Using Gestural Cues
Varsha Suresh
Muhammad Hamza Mughal
Christian Theobalt
Vera Demberg
95
0
0
05 Mar 2025
Teaching Metric Distance to Autoregressive Multimodal Foundational Models
Jiwan Chung
Saejin Kim
Yongrae Jo
Jinho Park
Dongjun Min
Youngjae Yu
256
0
0
04 Mar 2025
MindSimulator: Exploring Brain Concept Localization via Synthetic FMRI
Guangyin Bao
Qi Zhang
Z. Gong
Zhuojia Wu
Duoqian Miao
106
1
0
04 Mar 2025
Undertrained Image Reconstruction for Realistic Degradation in Blind Image Super-Resolution
Ru Ito
Supatta Viriyavisuthisakul
K. Kawamoto
Hiroshi Kera
114
0
0
04 Mar 2025
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Kaiwen Zheng
Yongxin Chen
Huayu Chen
Guande He
Xuan Li
Jun Zhu
Qinsheng Zhang
DiffM
159
3
0
03 Mar 2025
DLF: Extreme Image Compression with Dual-generative Latent Fusion
Naifu Xue
Zhaoyang Jia
Jiahao Li
Bin Li
Yuan Zhang
Yan Lu
111
2
0
03 Mar 2025
Lossy Neural Compression for Geospatial Analytics: A Review
Carlos Gomes
Isabelle Wittmann
Damien Robert
Johannes Jakubik
Tim Reichelt
...
Romeo Kienzler
Rania Briq
Sabrina Benassou
Michele Lazzarini
C. Albrecht
150
2
0
03 Mar 2025
Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text
Guotao Liang
Baoquan Zhang
Zhiyuan Wen
Junteng Zhao
Yunming Ye
Kola Ye
Yao He
96
0
0
03 Mar 2025
CAPS: Context-Aware Priority Sampling for Enhanced Imitation Learning in Autonomous Driving
Hamidreza Mirkhani
Behzad Khamidehi
Ehsan Ahmadi
Fazel Arasteh
Mohammed Elmahgiubi
Weize Zhang
Umar Rajguru
Kasra Rezaee
151
0
0
03 Mar 2025
Action Tokenizer Matters in In-Context Imitation Learning
An Vuong
M. Vu
Dong An
Ian Reid
121
1
0
03 Mar 2025
Surgical Vision World Model
Saurabh Koju
Saurav Bastola
Prashant Shrestha
Sanskar Amgain
Yash Raj Shrestha
Rudra P. K. Poudel
Binod Bhattarai
VGen, MedIm
118
0
0
03 Mar 2025
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens
Xiang Wang
Mingqi Jiang
Zejun Ma
Ziyu Zhang
Shixuan Liu
...
Zhifei Li
Xie Chen
Lei Xie
Yu Guo
Wei Xue
132
22
0
03 Mar 2025
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Simon Welker
Matthew Le
Ricky T. Q. Chen
Wei-Ning Hsu
Timo Gerkmann
Alexander Richard
Yi-Chiao Wu
103
1
0
03 Mar 2025
Evaluating and Predicting Distorted Human Body Parts for Generated Images
Lu Ma
Kaibo Cao
Hao Liang
Jiaxin Lin
Zhiyu Li
Yuhong Liu
Jihong Zhang
Wentao Zhang
Tengjiao Wang
MedIm
97
0
0
02 Mar 2025
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang
Yang Yu
Yucheng Chen
Xulei Yang
S. Yeo
MedIm
187
2
0
02 Mar 2025
Revisiting CAD Model Generation by Learning Raster Sketch
Pu Li
Wenhao Zhang
Jianwei Guo
Jinglu Chen
Dong-Ming Yan
3DV
70
1
0
02 Mar 2025
Wavelet-Driven Masked Image Modeling: A Path to Efficient Visual Representation
Wenzhao Xiang
Chang Liu
Hongyang Yu
Xilin Chen
84
0
0
02 Mar 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander H. Liu
Sang-gil Lee
Chao-Han Huck Yang
Yuan Gong
Yu-Chun Wang
James Glass
Rafael Valle
Bryan Catanzaro
SSL
103
1
0
02 Mar 2025
LesionDiffusion: Towards Text-controlled General Lesion Synthesis
Henrui Tian
Wenhui Lei
Linrui Dai
Hanyu Chen
Xiaofan Zhang
DiffM, MedIm
86
0
0
02 Mar 2025
Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture
X. Li
Jianyu Wang
Yuhao Cheng
Yikun Zeng
X. Ren
W. Zhu
Weiming Zhao
Yichao Yan
60
0
0
01 Mar 2025
Discrete Codebook World Models for Continuous Control
Aidan Scannell
Mohammadreza Nakhaei
Kalle Kujanpää
Yi Zhao
Kevin Sebastian Luck
Dieter Büchler
Joni Pajarinen
OffRL
98
2
0
01 Mar 2025
Protein Structure Tokenization: Benchmarking and New Recipe
Xinyu Yuan
Zichen Wang
Marcus Collins
Huzefa Rangwala
62
1
0
28 Feb 2025
Spatial Reasoning with Denoising Models
Christopher Wewer
Bart Pogodzinski
Bernt Schiele
J. E. Lenssen
DiffM, LRM
149
1
0
28 Feb 2025
TimesBERT: A BERT-Style Foundation Model for Time Series Understanding
Haoran Zhang
Yong Liu
Yunzhong Qiu
Haixuan Liu
Zhongyi Pei
Jianmin Wang
Mingsheng Long
AI4TS
70
1
0
28 Feb 2025
PaliGemma-CXR: A Multi-task Multimodal Model for TB Chest X-ray Interpretation
Denis Musinguzi
Andrew Katumba
Sudi Murindanyi
70
0
0
28 Feb 2025
SemiSAM+: Rethinking Semi-Supervised Medical Image Segmentation in the Era of Foundation Models
Yichi Zhang
Bohao Lv
Le Xue
Wenbo Zhang
Yuchen Liu
Yu Fu
Yuan Cheng
Yuan Qi
VLM, MedIm
99
0
0
28 Feb 2025
Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction
Baiting Luo
Ava Pettet
Aron Laszka
A. Dubey
Ayan Mukhopadhyay
OffRL
92
1
0
28 Feb 2025
Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA
Ojonugwa Oluwafemi Ejiga Peter
Md Mahmudur Rahman
Fahmi Khalifa
DiffM, MedIm
102
1
0
28 Feb 2025
PI-HMR: Towards Robust In-bed Temporal Human Shape Reconstruction with Contact Pressure Sensing
Ziyu Wu
Yufan Xiong
Mengting Niu
Fangting Xie
Quan Wan
Qijun Ying
Boyan Liu
Xiaohui Cai
3DH
85
0
0
27 Feb 2025
ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model
Xuangeng Chu
Nabarun Goswami
Ziteng Cui
Hanqin Wang
Tatsuya Harada
DiffM
159
0
0
27 Feb 2025
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
Siyu Jiao
Gengwei Zhang
Yinlong Qian
Jiancheng Huang
Yao Zhao
Humphrey Shi
Lin Ma
Y. X. Wei
Zequn Jie
VLM
106
6
0
27 Feb 2025
Vector-Quantized Vision Foundation Models for Object-Centric Learning
Rongzhen Zhao
V. Wang
Arno Solin
Joni Pajarinen
OCL, VLM
572
1
0
27 Feb 2025
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
Heeseung Kim
Che Hyun Lee
Sangkwon Park
Jiheum Yeom
Nohil Park
Sangwon Yu
Sungroh Yoon
139
1
0
27 Feb 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
233
11
0
27 Feb 2025
UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook
Yiheng Jiang
Qian Chen
Shengpeng Ji
Yu Xi
Wen Wang
Chuxu Zhang
Xianghu Yue
Shiliang Zhang
Haoyang Li
103
1
0
27 Feb 2025
3D Nephrographic Image Synthesis in CT Urography with the Diffusion Model and Swin Transformer
Hongkun Yu
Syed Jamal Safdar Gardezi
E. J. Abel
D. D. Shapiro
Meghan G. Lubner
...
Matthew Smith
G. Toia
Lu Mao
Pallavi Tiwari
Andrew L. Wentland
MedIm
89
0
0
26 Feb 2025
Task-Driven Semantic Quantization and Imitation Learning for Goal-Oriented Communications
Yu-Chieh Chao
Yubei Chen
Weiwei Wang
Achintha Wijesinghe
Suchinthaka Wanninayaka
Songyang Zhang
Zhi Ding
DiffM
130
0
0
25 Feb 2025
Deep Learning-Powered Electrical Brain Signals Analysis: Advancing Neurological Diagnostics
Jiahe Li
Xin Chen
Fanqi Shen
Junru Chen
Y. Liu
Daoze Zhang
Zhizhang Yuan
F. Zhao
Meng Li
Yang Yang
157
1
0
24 Feb 2025
DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications
Ibrahim Fayad
Max Zimmer
Martin Schwartz
P. Ciais
Fabian Gieseke
Gabriel Belouze
Sarah Brood
A. D. Truchis
Alexandre d’Aspremont
AI4TS
104
0
0
24 Feb 2025
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Zhengqing Wang
Jiacheng Chen
Yasutaka Furukawa
156
8
0
24 Feb 2025
X-Dancer: Expressive Music to Human Dance Video Generation
Zeyuan Chen
Hongyi Xu
Guoxian Song
You Xie
Chenxu Zhang
Xiusi Chen
Chao Wang
Di Chang
Linjie Luo
VGen
98
1
0
24 Feb 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
187
2
0
24 Feb 2025
A Pragmatic Note on Evaluating Generative Models with Fréchet Inception Distance for Retinal Image Synthesis
Yuli Wu
Fucheng Liu
Rüveyda Yilmaz
Henning Konermann
Peter Walter
Johannes Stegmaier
EGVM, MedIm
139
2
0
24 Feb 2025
Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning
Ruikun Li
Huandong Wang
Qingmin Liao
Yong Li
59
2
0
24 Feb 2025
Single-Channel EEG Tokenization Through Time-Frequency Modeling
Jathurshan Pradeepkumar
Xihao Piao
Zheng Chen
Jimeng Sun
118
2
0
22 Feb 2025
ESANS: Effective and Semantic-Aware Negative Sampling for Large-Scale Retrieval Systems
Haibo Xing
Kanefumi Matsuyama
Hao Deng
Jinxin Hu
Yu Zhang
Xiaoyi Zeng
86
0
0
22 Feb 2025
Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens
Ziwei Shan
Yaoyu He
Chengfeng Zhao
Jiashen Du
Jingyan Zhang
Qixuan Zhang
Jingyi Yu
Lan Xu
117
1
0
22 Feb 2025
CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers
D. She
Mushui Liu
Jingxuan Pang
Jin Wang
Zhen Yang
...
Yi Wang
Qihan Huang
Haobin Tang
YunLong Yu
Siming Fu
VGen
224
5
0
21 Feb 2025
DeepFracture: A Generative Approach for Predicting Brittle Fractures with Neural Discrete Representation Learning
Yuhang Huang
Takashi Kanai
AI4CE
137
1
0
21 Feb 2025
Animate Your Thoughts: Decoupled Reconstruction of Dynamic Natural Vision from Slow Brain Activity
Yizhuo Lu
Changde Du
Chong Wang
Xuanliu Zhu
Liuyun Jiang
Xujin Li
Huiguang He
VGen
236
4
0
20 Feb 2025