ResearchTrend.AI
Neural Discrete Representation Learning

2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
    BDL
    SSL
    OCL

Papers citing "Neural Discrete Representation Learning"

50 / 2,747 papers shown
Extracting Symbolic Sequences from Visual Representations via Self-Supervised Learning
Victor Sebastian Martinez Pozos
Ivan Vladimir Meza Ruiz
44
0
0
06 Mar 2025
VQEL: Enabling Self-Developed Symbolic Language in Agents through Vector Quantization in Emergent Language Games
Mohammad Mahdi Samiei Paqaleh
Mahdieh Soleymani Baghshah
54
0
0
06 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
95
0
0
06 Mar 2025
Boosting Offline Optimizers with Surrogate Sensitivity
Manh Cuong Dao
Phi Le Nguyen
Thao Nguyen Truong
Trong Nghia Hoang
OffRL
62
4
0
06 Mar 2025
Enhancing Spoken Discourse Modeling in Language Models Using Gestural Cues
Varsha Suresh
Muhammad Hamza Mughal
Christian Theobalt
Vera Demberg
56
0
0
05 Mar 2025
Handling Uncertainty in Health Data using Generative Algorithms
Mahdi Arab Loodaricheh
Neh Majmudar
A. Raja
Ansaf Salleb-Aouissi
66
0
0
05 Mar 2025
MindSimulator: Exploring Brain Concept Localization via Synthetic FMRI
Guangyin Bao
Qi Zhang
Z. Gong
Zhuojia Wu
Duoqian Miao
38
0
0
04 Mar 2025
Undertrained Image Reconstruction for Realistic Degradation in Blind Image Super-Resolution
Ru Ito
Supatta Viriyavisuthisakul
K. Kawamoto
Hiroshi Kera
71
0
0
04 Mar 2025
CAPS: Context-Aware Priority Sampling for Enhanced Imitation Learning in Autonomous Driving
Hamidreza Mirkhani
Behzad Khamidehi
Ehsan Ahmadi
Fazel Arasteh
Mohammed Elmahgiubi
Weize Zhang
Umar Rajguru
Kasra Rezaee
57
0
0
03 Mar 2025
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens
Xinbing Wang
Mingqi Jiang
Z. Ma
Ziyu Zhang
Shixuan Liu
...
Zhifei Li
Xie Chen
Lei Xie
Y. Guo
Wei Xue
84
13
0
03 Mar 2025
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Kaiwen Zheng
Yongxin Chen
Huayu Chen
Guande He
Xuan Li
Jun Zhu
Qinsheng Zhang
DiffM
49
0
0
03 Mar 2025
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Simon Welker
Matthew Le
Ricky T. Q. Chen
Wei-Ning Hsu
Timo Gerkmann
Alexander Richard
Yi-Chiao Wu
63
0
0
03 Mar 2025
Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text
Guotao Liang
Baoquan Zhang
Zhiyuan Wen
Junteng Zhao
Yunming Ye
Kola Ye
Yao He
57
0
0
03 Mar 2025
DLF: Extreme Image Compression with Dual-generative Latent Fusion
Naifu Xue
Zhaoyang Jia
Jiahao Li
Bin Li
Yuan Zhang
Yan-Heng Lu
56
1
0
03 Mar 2025
Surgical Vision World Model
Saurabh Koju
Saurav Bastola
Prashant Shrestha
Sanskar Amgain
Yash Raj Shrestha
Rudra P. K. Poudel
Binod Bhattarai
VGen
MedIm
68
0
0
03 Mar 2025
Action Tokenizer Matters in In-Context Imitation Learning
An Vuong
M. Vu
Dong An
Ian Reid
61
1
0
03 Mar 2025
Lossy Neural Compression for Geospatial Analytics: A Review
Carlos Gomes
Isabelle Wittmann
Damien Robert
Johannes Jakubik
Tim Reichelt
...
Romeo Kienzler
Rania Briq
Sabrina Benassou
Michele Lazzarini
C. Albrecht
96
2
0
03 Mar 2025
Wavelet-Driven Masked Image Modeling: A Path to Efficient Visual Representation
Wenzhao Xiang
Chang Liu
Hongyang Yu
Xilin Chen
36
0
0
02 Mar 2025
Revisiting CAD Model Generation by Learning Raster Sketch
Pu Li
Wenhao Zhang
Jianwei Guo
Jinglu Chen
Dong-Ming Yan
3DV
39
0
0
02 Mar 2025
Evaluating and Predicting Distorted Human Body Parts for Generated Images
Lu Ma
Kaibo Cao
Hao Liang
Jiaxin Lin
Zhiyu Li
Yuhong Liu
Jihong Zhang
Wentao Zhang
Bin Cui
MedIm
44
0
0
02 Mar 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander H. Liu
Sang-gil Lee
Chao-Han Huck Yang
Yuan Gong
Yu-Chun Wang
James Glass
Rafael Valle
Bryan Catanzaro
SSL
55
0
0
02 Mar 2025
LesionDiffusion: Towards Text-controlled General Lesion Synthesis
Henrui Tian
Wenhui Lei
Linrui Dai
Hanyu Chen
Xiaofan Zhang
DiffM
MedIm
47
0
0
02 Mar 2025
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang
Yang Yu
Yucheng Chen
Xulei Yang
S. Yeo
MedIm
56
1
0
02 Mar 2025
Discrete Codebook World Models for Continuous Control
Aidan Scannell
Mohammadreza Nakhaei
Kalle Kujanpää
Yi Zhao
Kevin Sebastian Luck
Dieter Büchler
Joni Pajarinen
OffRL
50
1
0
01 Mar 2025
Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture
X. Li
Jianyu Wang
Yuhao Cheng
Yikun Zeng
X. Ren
W. Zhu
Weiming Zhao
Yichao Yan
33
0
0
01 Mar 2025
TimesBERT: A BERT-Style Foundation Model for Time Series Understanding
Haoran Zhang
Yong Liu
Yunzhong Qiu
Haixuan Liu
Zhongyi Pei
Jianmin Wang
Mingsheng Long
AI4TS
45
0
0
28 Feb 2025
Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA
Ojonugwa Oluwafemi Ejiga Peter
Md Mahmudur Rahman
Fahmi Khalifa
DiffM
MedIm
41
1
0
28 Feb 2025
PaliGemma-CXR: A Multi-task Multimodal Model for TB Chest X-ray Interpretation
Denis Musinguzi
Andrew Katumba
Sudi Murindanyi
36
0
0
28 Feb 2025
Protein Structure Tokenization: Benchmarking and New Recipe
Xinyu Yuan
Zichen Wang
Marcus Collins
Huzefa Rangwala
41
0
0
28 Feb 2025
Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction
Baiting Luo
Ava Pettet
Aron Laszka
A. Dubey
Ayan Mukhopadhyay
OffRL
48
1
0
28 Feb 2025
Spatial Reasoning with Denoising Models
Christopher Wewer
Bart Pogodzinski
Bernt Schiele
J. E. Lenssen
DiffM
LRM
43
0
0
28 Feb 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
83
6
0
27 Feb 2025
ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model
Xuangeng Chu
Nabarun Goswami
Ziteng Cui
Hanqin Wang
Tatsuya Harada
DiffM
80
0
0
27 Feb 2025
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
Siyu Jiao
Gengwei Zhang
Yinlong Qian
Jiancheng Huang
Yao Zhao
Humphrey Shi
Lin Ma
Y. X. Wei
Zequn Jie
VLM
49
2
0
27 Feb 2025
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
Heeseung Kim
Che Hyun Lee
S. Park
Jiheum Yeom
Nohil Park
Sangwon Yu
Sungroh Yoon
71
0
0
27 Feb 2025
Vector-Quantized Vision Foundation Models for Object-Centric Learning
Rongzhen Zhao
V. Wang
Arno Solin
Joni Pajarinen
OCL
VLM
233
0
0
27 Feb 2025
PI-HMR: Towards Robust In-bed Temporal Human Shape Reconstruction with Contact Pressure Sensing
Ziyu Wu
Yufan Xiong
Mengting Niu
Fangting Xie
Quan Wan
Qijun Ying
Boyan Liu
Xiaohui Cai
3DH
41
0
0
27 Feb 2025
UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook
Y. Jiang
Qian Chen
Shengpeng Ji
Yu Xi
Wen Wang
C. Zhang
Xianghu Yue
Shiliang Zhang
Yiming Li
70
0
0
27 Feb 2025
3D Nephrographic Image Synthesis in CT Urography with the Diffusion Model and Swin Transformer
Hongkun Yu
Syed Jamal Safdar Gardezi
E. J. Abel
D. D. Shapiro
Meghan G. Lubner
...
Matthew Smith
G. Toia
Lu Mao
Pallavi Tiwari
Andrew L. Wentland
MedIm
57
0
0
26 Feb 2025
Task-Driven Semantic Quantization and Imitation Learning for Goal-Oriented Communications
Yu-Chieh Chao
Yubei Chen
Weiwei Wang
Achintha Wijesinghe
Suchinthaka Wanninayaka
Songyang Zhang
Zhi Ding
DiffM
76
0
0
25 Feb 2025
X-Dancer: Expressive Music to Human Dance Video Generation
Zeyuan Chen
Hongyi Xu
Guoxian Song
You Xie
Chenxu Zhang
Xiusi Chen
Chao Wang
Di Chang
Linjie Luo
VGen
43
0
0
24 Feb 2025
Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning
Ruikun Li
Huandong Wang
Qingmin Liao
Yong Li
43
0
0
24 Feb 2025
Deep Learning-Powered Electrical Brain Signals Analysis: Advancing Neurological Diagnostics
Jiahe Li
Xin Chen
Fanqi Shen
Junru Chen
Y. Liu
Daoze Zhang
Zhizhang Yuan
F. Zhao
Meng Li
Yang Yang
41
0
0
24 Feb 2025
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Zhengqing Wang
Jiacheng Chen
Yasutaka Furukawa
69
5
0
24 Feb 2025
DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications
Ibrahim Fayad
Max Zimmer
Martin Schwartz
P. Ciais
Fabian Gieseke
Gabriel Belouze
Sarah Brood
A. D. Truchis
Alexandre d’Aspremont
AI4TS
43
0
0
24 Feb 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
86
1
0
24 Feb 2025
HumanGif: Single-View Human Diffusion with Generative Prior
Shoukang Hu
Takuya Narihira
Kazumi Fukuda
Ryosuke Sawata
Takashi Shibuya
Yuki Mitsufuji
98
1
0
24 Feb 2025
A Pragmatic Note on Evaluating Generative Models with Fréchet Inception Distance for Retinal Image Synthesis
Yuli Wu
Fucheng Liu
Rüveyda Yilmaz
Henning Konermann
Peter Walter
Johannes Stegmaier
EGVM
MedIm
55
1
0
24 Feb 2025
Single-Channel EEG Tokenization Through Time-Frequency Modeling
Jathurshan Pradeepkumar
Xihao Piao
Zheng Chen
Jimeng Sun
45
1
0
22 Feb 2025
Mojito: LLM-Aided Motion Instructor with Jitter-Reduced Inertial Tokens
Ziwei Shan
Yaoyu He
Chengfeng Zhao
Jiashen Du
Jingyan Zhang
Qixuan Zhang
Jingyi Yu
Lan Xu
61
1
0
22 Feb 2025