ResearchTrend.AI
Simple and Controllable Music Generation


8 June 2023
Jade Copet
Felix Kreuk
Itai Gat
Tal Remez
David Kant
Gabriel Synnaeve
Yossi Adi
Alexandre Défossez
    MGen

Papers citing "Simple and Controllable Music Generation"

50 / 257 papers shown
Resource-Efficient Generative AI Model Deployment in Mobile Edge Networks
Yuxin Liang
Peng Yang
Yuanyuan He
Feng Lyu
26
2
0
09 Sep 2024
MetaBGM: Dynamic Soundtrack Transformation For Continuous Multi-Scene Experiences With Ambient Awareness And Personalization
Haoxuan Liu
Zihao Wang
HaoRong Hong
Youwei Feng
Jiaxin Yu
Han Diao
Yunfei Xu
Kaipeng Zhang
39
0
0
05 Sep 2024
LAST: Language Model Aware Speech Tokenization
A. Turetzky
Yossi Adi
42
3
0
05 Sep 2024
FireRedTTS: A Foundation Text-To-Speech Framework for Industry-Level Generative Speech Applications
Hao-Han Guo
Kun Liu
Fei-Yu Shen
Yi-Chen Wu
Xu Tang
Kun Xie
Kai-Tuo Xu
45
22
0
05 Sep 2024
SymPAC: Scalable Symbolic Music Generation With Prompts And Constraints
Haonan Chen
Jordan B. L. Smith
Janne Spijkervet
Ju-Chiang Wang
Pei Zou
Bochen Li
Qiuqiang Kong
Xingjian Du
33
1
0
04 Sep 2024
Dynamic Motion Synthesis: Masked Audio-Text Conditioned Spatio-Temporal Transformers
Sohan Anisetty
James Hays
51
0
0
03 Sep 2024
EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance
Jaeyeon Kim
Minjeong Jeon
Jaeyoon Jung
Sang Hoon Woo
Jinjoo Lee
34
2
0
02 Sep 2024
Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning
Jaeyeon Kim
Jaeyoon Jung
Minjeong Jeon
Sang Hoon Woo
Jinjoo Lee
24
1
0
02 Sep 2024
SoCodec: A Semantic-Ordered Multi-Stream Speech Codec for Efficient Language Model Based Text-to-Speech Synthesis
Haohan Guo
Fenglong Xie
Kun Xie
Dongchao Yang
Dake Guo
Xixin Wu
Helen Meng
37
4
0
02 Sep 2024
MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT
Jinlong Zhu
Keigo Sakurai
Ren Togo
Takahiro Ogawa
Miki Haseyama
GAN
49
1
0
02 Sep 2024
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Yuancheng Wang
Haoyue Zhan
Liwei Liu
Ruihong Zeng
Haotian Guo
Jiachen Zheng
Qiang Zhang
Shunsi Zhang
Zhizheng Wu
45
43
0
01 Sep 2024
FLUX that Plays Music
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Junshi Huang
94
7
0
01 Sep 2024
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
Zhifei Xie
Changqiao Wu
AuLLM
VGen
VLM
SyDa
LRM
37
60
0
29 Aug 2024
SSDM: Scalable Speech Dysfluency Modeling
Jiachen Lian
Xuanru Zhou
Z. Ezzes
Jet M J Vonk
Brittany Morin
D. Baquirin
Zachary Miller
M. G. Tempini
Gopala Anumanchipalli
AuLLM
37
1
0
29 Aug 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Shengpeng Ji
Ziyue Jiang
Xize Cheng
Yifu Chen
Minghui Fang
...
Rongjie Huang
Yidi Jiang
Qian Chen
Zhou Zhao
VLM
60
36
0
29 Aug 2024
Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music
N. Shikarpur
Krishna Maneesha Dendukuri
Yusong Wu
Antoine Caillon
Cheng-Zhi Anna Huang
20
1
0
22 Aug 2024
Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?
Yuankun Xie
Chenxu Xiong
Xiaopeng Wang
Zhiyong Wang
Yi Lu
...
Yukun Liu
Zhengqi Wen
Jianhua Tao
Guanjun Li
Long Ye
AuLLM
41
1
0
20 Aug 2024
Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models
Ioannis Romanelis
Vlassios Fotis
Athanasios P. Kalogeras
Christos Alexakos
Konstantinos Moustakas
Adrian Munteanu
41
0
0
12 Aug 2024
TEAdapter: Supply abundant guidance for controllable text-to-music generation
Jialing Zou
Jiahao Mei
Xudong Nan
Jinghua Li
Daoguo Dong
Liang He
36
0
0
09 Aug 2024
PiCoGen2: Piano cover generation with transfer learning approach and weakly aligned data
Chih-Pin Tan
Hsin Ai
Yi-Hsin Chang
Shuen-Huei Guan
Yi-Hsuan Yang
53
2
0
02 Aug 2024
Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation
Michael Kolle
Maximilian Zorn
Jongmin Jung
Dasaem Jeong
44
1
0
02 Aug 2024
Combining audio control and style transfer using latent diffusion
Andreas Maier
Yuliya Burankova
Anne Hartebrodt
David B. Blumenthal
DiffM
47
2
0
31 Jul 2024
Recording First-person Experiences to Build a New Type of Foundation Model
Dionis Barcari
David Gamez
Aliya Grig
ALM
36
0
0
31 Jul 2024
A New Type of Foundation Model Based on Recordings of People's Emotions and Physiology
David Gamez
Dionis Barcari
Aliya Grig
37
0
0
31 Jul 2024
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
Xiaowei Chi
Yatian Wang
Aosong Cheng
Pengjun Fang
Zeyue Tian
...
Wenhan Luo
Qifeng Chen
Shanghang Zhang
Qi-fei Liu
Yi-Ting Guo
75
7
0
30 Jul 2024
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
Junda Wu
Zachary Novack
Amit Namburi
Jiaheng Dai
Hao-Wen Dong
Zhouhang Xie
Carol Chen
Julian McAuley
46
1
0
29 Jul 2024
Discrete Flow Matching
Itai Gat
Tal Remez
Neta Shaul
Felix Kreuk
Ricky T. Q. Chen
Gabriel Synnaeve
Yossi Adi
Y. Lipman
DiffM
52
57
0
22 Jul 2024
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation
Yun-Han Lan
Wen-Yi Hsiao
Hao-Chung Cheng
Yi-Hsuan Yang
56
7
0
21 Jul 2024
Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio
Roser Batlle-Roca
Wei-Hsiang Liao
Xavier Serra
Yuki Mitsufuji
Emilia Gómez
53
0
0
19 Jul 2024
Stable Audio Open
Zach Evans
Julian Parker
CJ Carr
Zack Zukowski
Josiah Taylor
Jordi Pons
86
39
0
19 Jul 2024
Audio Conditioning for Music Generation via Discrete Bottleneck Features
Simon Rouard
Yossi Adi
Jade Copet
Axel Roebel
Alexandre Défossez
MGen
57
1
0
17 Jul 2024
A Language Modeling Approach to Diacritic-Free Hebrew TTS
Amit Roth
A. Turetzky
Yossi Adi
40
2
0
16 Jul 2024
LiteFocus: Accelerated Diffusion Inference for Long Audio Synthesis
Zhenxiong Tan
Xinyin Ma
Gongfan Fang
Xinchao Wang
44
3
0
15 Jul 2024
BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features
Jing Luo
Xinyu Yang
Dorien Herremans
39
3
0
15 Jul 2024
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual
Chunghsin Yeh
Ioannis Tsiamas
Joan Serrà
DiffM
VGen
47
15
0
15 Jul 2024
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models
Zhening Xing
Gereon Fox
Yanhong Zeng
Xingang Pan
Mohamed A. Elgharib
Christian Theobalt
Kai Chen
VGen
35
4
0
11 Jul 2024
PAGURI: a user experience study of creative interaction with text-to-music models
Francesca Ronchini
Luca Comanducci
Gabriele Perego
Fabio Antonacci
40
3
0
05 Jul 2024
MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss
Yangyang Shu
Haiming Xu
Ziqin Zhou
Anton van den Hengel
Lingqiao Liu
43
3
0
05 Jul 2024
MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation
Zihao Wang
Haoxuan Liu
Jiaxing Yu
Tao Zhang
Yan Liu
Kaipeng Zhang
79
1
0
03 Jul 2024
Towards Training Music Taggers on Synthetic Data
N. Kroher
Steven Manangu
A. Pikrakis
52
1
0
02 Jul 2024
Accompanied Singing Voice Synthesis with Fully Text-controlled Melody
Ruiqi Li
Zhiqing Hong
Yongqi Wang
Lichao Zhang
Rongjie Huang
Siqi Zheng
Zhou Zhao
47
6
0
02 Jul 2024
Pictures Of MIDI: Controlled Music Generation via Graphical Prompts for Image-Based Diffusion Inpainting
Scott H. Hawley
48
2
0
01 Jul 2024
Subtractive Training for Music Stem Insertion using Latent Diffusion Models
Ivan Villa-Renteria
Mason L. Wang
Zachary Shah
Zhe Li
Soohyun Kim
Neelesh Ramachandran
Mert Pilanci
47
0
0
27 Jun 2024
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment
Paarth Neekhara
Shehzeen Samarah Hussain
Subhankar Ghosh
Jason Chun Lok Li
Rafael Valle
Rohan Badlani
Boris Ginsburg
58
11
0
25 Jun 2024
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond
Marco Comunità
Zhi-Wei Zhong
Akira Takahashi
Shiqi Yang
Mengjie Zhao
Koichi Saito
Yukara Ikemiya
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
71
2
0
25 Jun 2024
Exploring compressibility of transformer based text-to-music (TTM) models
Vasileios Moschopoulos
Thanasis Kotsiopoulos
Pablo Peso Parada
Konstantinos Nikiforidis
Alexandros Stergiadis
Gerasimos Papakostas
Md. Asif Jalal
Jisi Zhang
Anastasios Drosou
Karthikeyan P. Saravanan
25
0
0
24 Jun 2024
Improving Unsupervised Clean-to-Rendered Guitar Tone Transformation Using GANs and Integrated Unaligned Clean Data
Yu-Hua Chen
Woosung Choi
Wei-Hsiang Liao
Marco A. Martínez-Ramírez
K. Cheuk
Yuki Mitsufuji
J. Jang
Yi-Hsuan Yang
50
5
0
22 Jun 2024
JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning
Boyu Chen
Peike Li
Yao Yao
Alex Wang
DiffM
47
3
0
18 Jun 2024
Improving Text-To-Audio Models with Synthetic Captions
Zhifeng Kong
Sang-gil Lee
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Rafael Valle
Soujanya Poria
Bryan Catanzaro
53
11
0
18 Jun 2024
Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation
Or Tal
Alon Ziv
Itai Gat
Felix Kreuk
Yossi Adi
58
13
0
16 Jun 2024