ResearchTrend.AI
Neural Discrete Representation Learning


2 November 2017
Aaron van den Oord
Oriol Vinyals
Koray Kavukcuoglu
    BDL
    SSL
    OCL

Papers citing "Neural Discrete Representation Learning"

50 / 2,790 papers shown
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Zhenting Wang
Vikash Sehwag
Chen Chen
Lingjuan Lyu
Dimitris N. Metaxas
Shiqing Ma
WIGM
47
6
0
22 May 2024
SIGGesture: Generalized Co-Speech Gesture Synthesis via Semantic Injection with Large-Scale Pre-Training Diffusion Models
Qingrong Cheng
Xu Li
Xinghui Fu
DiffM
45
2
0
22 May 2024
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
Weiting Tan
Jingyu Zhang
Lingfeng Shen
Daniel Khashabi
Philipp Koehn
37
0
0
22 May 2024
Computational Tradeoffs in Image Synthesis: Diffusion, Masked-Token, and Next-Token Prediction
Maciej Kilian
Varun Jampani
Luke Zettlemoyer
DiffM
42
8
0
21 May 2024
ReALLM: A general framework for LLM compression and fine-tuning
Louis Leconte
Lisa Bedin
Van Minh Nguyen
Eric Moulines
MQ
46
0
0
21 May 2024
The Power of Next-Frame Prediction for Learning Physical Laws
T. Winterbottom
G. Hudson
Daniel Kluvanec
Dean L. Slack
Jamie Sterling
Junjie Shentu
Chenghao Xiao
Zheming Zhou
Noura Al Moubayed
34
1
0
21 May 2024
FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information
Dongseong Hwang
ODL
37
5
0
21 May 2024
Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models
Charles O'Neill
Thang Bui
45
5
0
21 May 2024
Diffusion for World Modeling: Visual Details Matter in Atari
Eloi Alonso
Adam Jelley
Vincent Micheli
Anssi Kanervisto
Amos Storkey
Tim Pearce
François Fleuret
63
46
0
20 May 2024
PT43D: A Probabilistic Transformer for Generating 3D Shapes from Single Highly-Ambiguous RGB Images
Yiheng Xiong
Angela Dai
ViT
32
0
0
20 May 2024
Du-IN: Discrete units-guided mask modeling for decoding speech from Intracranial Neural signals
Hui Zheng
Haiteng Wang
Wei-Bang Jiang
Zhongtao Chen
Li He
Pei-Yang Lin
Peng-Hu Wei
Guo-Guang Zhao
Yun-Zhe Liu
54
1
0
19 May 2024
From Sora What We Can See: A Survey of Text-to-Video Generation
Rui Sun
Yumin Zhang
Tejal Shah
Jiahao Sun
Shuoying Zhang
Wenqi Li
Haoran Duan
Bo Wei
R. Ranjan
EGVM
79
20
0
17 May 2024
Libra: Building Decoupled Vision System on Large Language Models
Yifan Xu
Xiaoshan Yang
Y. Song
Changsheng Xu
MLLM
VLM
43
8
0
16 May 2024
Learning to Predict Mutation Effects of Protein-Protein Interactions by Microenvironment-aware Hierarchical Prompt Learning
Lirong Wu
Yijun Tian
Haitao Lin
Yufei Huang
Siyuan Li
Nitesh Chawla
Stan Z. Li
49
4
0
16 May 2024
Efficient Vision-Language Pre-training by Cluster Masking
Zihao Wei
Zixuan Pan
Andrew Owens
VLM
34
8
0
14 May 2024
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Siyuan Li
Zedong Wang
Zicheng Liu
Di Wu
Cheng Tan
Jiangbin Zheng
Yufei Huang
Stan Z. Li
45
7
0
13 May 2024
Generating Human Motion in 3D Scenes from Text Descriptions
Zhi Cen
Huaijin Pi
Sida Peng
Zehong Shen
Minghui Yang
Shuai Zhu
Hujun Bao
Xiaowei Zhou
55
19
0
13 May 2024
Establishing a Unified Evaluation Framework for Human Motion Generation: A Comparative Analysis of Metrics
Ali Ismail-Fawaz
Maxime Devanne
Stefano Berretti
Jonathan Weber
Germain Forestier
EGVM
50
2
0
13 May 2024
Sign Stitching: A Novel Approach to Sign Language Production
Harry Walsh
Ben Saunders
Richard Bowden
65
3
0
13 May 2024
The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective
Andrew Shin
Yusuke Mori
Kunitake Kaneko
VGen
EGVM
30
2
0
13 May 2024
Bottleneck-Minimal Indexing for Generative Document Retrieval
Xin Du
Lixin Xiu
Kumiko Tanaka-Ishii
51
2
0
12 May 2024
Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
Shengyuan Liu
Bo Wang
Ye Ma
Te Yang
Xipeng Cao
Quan Chen
Han Li
Di Dong
Peng Jiang
EGVM
49
2
0
11 May 2024
Calo-VQ: Vector-Quantized Two-Stage Generative Model in Calorimeter Simulation
Qibin Liu
Chase Shimmin
Xiulong Liu
Eli Shlizerman
Shu Li
Shih-Chieh Hsu
DRL
MQ
45
7
0
10 May 2024
Controllable Image Generation With Composed Parallel Token Prediction
Jamie Stirling
Noura Al-Moubayed
40
0
0
10 May 2024
LatentColorization: Latent Diffusion-Based Speaker Video Colorization
Rory Ward
Dan Bigioi
Shubhajit Basak
John G. Breslin
Peter Corcoran
VGen
DiffM
42
3
0
09 May 2024
The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio
Yuankun Xie
Yi Lu
Ruibo Fu
Zhengqi Wen
Zhiyong Wang
...
Xiaopeng Wang
Yukun Liu
Haonan Cheng
Long Ye
Yi Sun
52
16
0
08 May 2024
HILCodec: High Fidelity and Lightweight Neural Audio Codec
S. Ahn
Beom Jun Woo
Mingrui Han
Chanyeong Moon
Nam Soo Kim
34
6
0
08 May 2024
WISER: Weak supervISion and supErvised Representation learning to improve drug response prediction in cancer
Kumar Shubham
A. Jayagopal
Syed Mohammed Danish
AP Prathosh
Vaibhav Rajan
OOD
32
3
0
07 May 2024
MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View
Emmanuelle Bourigault
Pauline Bourigault
42
2
0
06 May 2024
Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
Aditya A. Ramesh
Kenny Young
Louis Kirsch
Jürgen Schmidhuber
39
1
0
06 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
87
38
0
06 May 2024
Video Diffusion Models: A Survey
Andrew Melnik
Michal Ljubljanac
Cong Lu
Qi Yan
Weiming Ren
Helge J. Ritter
VGen
74
13
0
06 May 2024
DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model
Peijing Jia
Tuopu Wen
Ziang Luo
Mengmeng Yang
Kun Jiang
...
Ziyuan Liu
Le Cui
Kehua Sheng
Bo Zhang
Diange Yang
55
4
0
03 May 2024
Creation of Novel Soft Robot Designs using Generative AI
Wee Kiat Chan
PengWei Wang
R. C. Yeow
AI4CE
DiffM
17
2
0
03 May 2024
SATO: Stable Text-to-Motion Framework
Wenshuo Chen
Hongru Xiao
Erhang Zhang
Lijie Hu
Lei Wang
Mengyuan Liu
Chong Chen
52
6
0
02 May 2024
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance
Kelvin C. K. Chan
Yang Zhao
Xuhui Jia
Ming-Hsuan Yang
Huisheng Wang
22
3
0
02 May 2024
Sparse multi-view hand-object reconstruction for unseen environments
Yik Lung Pang
Changjae Oh
Andrea Cavallaro
51
1
0
02 May 2024
On Mechanistic Knowledge Localization in Text-to-Image Generative Models
Samyadeep Basu
Keivan Rezaei
Priyatham Kattakinda
Ryan Rossi
Cherry Zhao
Vlad I. Morariu
Varun Manjunatha
S. Feizi
35
13
0
02 May 2024
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound
Haohe Liu
Xuenan Xu
Yiitan Yuan
Mengyue Wu
Wenwu Wang
Mark D. Plumbley
48
19
0
30 Apr 2024
Towards Real-world Video Face Restoration: A New Benchmark
Ziyan Chen
Jingwen He
Xinqi Lin
Yu Qiao
Chao Dong
53
4
0
30 Apr 2024
ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
Yuzhe Gu
Enmao Diao
42
4
0
30 Apr 2024
Improved AutoEncoder with LSTM module and KL divergence
Wei Huang
Bingyang Zhang
Kaituo Zhang
Hua Gao
Rongchun Wan
28
1
0
30 Apr 2024
Embedded Representation Learning Network for Animating Styled Video Portrait
Tianyong Wang
Xiangyu Liang
Wangguandong Zheng
Dan Niu
Haifeng Xia
Siyu Xia
3DH
34
0
0
29 Apr 2024
What Foundation Models can Bring for Robot Learning in Manipulation : A Survey
Dingzhe Li
Yixiang Jin
A. Yong
Hongze Yu
Jun Shi
Xiaoshuai Hao
Peng Hao
Huaping Liu
Gang Hua
Bin Fang
AI4CE
LM&Ro
79
13
0
28 Apr 2024
Compressing Latent Space via Least Volume
Qiuyi Chen
M. Fuge
40
1
0
27 Apr 2024
sDAC -- Semantic Digital Analog Converter for Semantic Communications
Zhicheng Bao
Chen Dong
Xiaodong Xu
51
1
0
26 Apr 2024
Self-supervised visual learning in the low-data regime: a comparative evaluation
Sotirios Konstantakos
Despina Ioanna Chalkiadaki
Ioannis Mademlis
Yuki M. Asano
E. Gavves
Georgios Th. Papadopoulos
47
6
0
26 Apr 2024
Synthesizing Audio from Silent Video using Sequence to Sequence Modeling
Hugo Garrido-Lestache Belinchon
Helina Mulugeta
Adam Haile
DiffM
VGen
14
0
0
25 Apr 2024
A Survey of Generative Search and Recommendation in the Era of Large Language Models
Yongqi Li
Xinyu Lin
Wenjie Wang
Fuli Feng
Liang Pang
Wenjie Li
Liqiang Nie
Xiangnan He
Tat-Seng Chua
3DV
LRM
51
7
0
25 Apr 2024
TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Sai Kumar Dwivedi
Yu Sun
Priyanka Patel
Yao Feng
Michael J. Black
3DH
52
28
0
25 Apr 2024