Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio

19 May 2025

Papers citing "Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio"

Title
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation Sungkyun Chang Emmanouil Benetos Holger Kirchhoff Simon Dixon 67 3 0 05 Jul 2024
Practical End-to-End Optical Music Recognition for Pianoform Music Jiří Mayer Milan Straka Jan Hajič Pavel Pecina 53 2 0 20 Mar 2024
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages Minsu Kim Jee-weon Jung Hyeongseop Rha Soumi Maiti Siddhant Arora Xuankai Chang Shinji Watanabe Y. Ro 95 7 0 25 Feb 2024
Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription Antonio Ríos-Vila Jorge Calvo-Zaragoza Thierry Paquet 91 11 0 12 Feb 2024
High-Fidelity Audio Compression with Improved RVQGAN Rithesh Kumar Prem Seetharaman Alejandro Luebs I. Kumar Kundan Kumar 105 337 0 11 Jun 2023
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation Yusong Wu Kai Chen Tianyu Zhang Yuchen Hui Marianna Nezhurina Taylor Berg-Kirkpatrick Shlomo Dubnov CLIP 131 540 0 12 Nov 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion Curtis Hawthorne Ian Simon Adam Roberts Neil Zeghidour Josh Gardner Ethan Manilow Jesse Engel DiffM 72 51 0 11 Jun 2022
Unaligned Supervision For Automatic Music Transcription in The Wild Ben Maman Amit H. Bermano 80 29 0 28 Apr 2022
Autoregressive Image Generation using Residual Quantization Doyup Lee Chiheon Kim Saehoon Kim Minsu Cho Wook-Shin Han VGen 282 378 0 03 Mar 2022
Deep Performer: Score-to-Audio Music Performance Synthesis Hao-Wen Dong Cong Zhou Taylor Berg-Kirkpatrick Julian McAuley 61 17 0 12 Feb 2022
Sequence-to-Sequence Piano Transcription with Transformers Curtis Hawthorne Ian Simon Rigel Swavely Ethan Manilow Jesse Engel 183 82 0 19 Jul 2021
High-resolution Piano Transcription with Pedals by Regressing Onset and Offset Times Qiuqiang Kong Bochen Li Xuchen Song Yuan Wan Yuxuan Wang 382 112 0 05 Oct 2020
Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity Ethan Manilow Gordon Wichern Prem Seetharaman Jonathan Le Roux 70 127 0 18 Sep 2019
Understanding Optical Music Recognition Jorge Calvo-Zaragoza Jan Hajič Alexander Pacha 51 118 0 07 Aug 2019
Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset Curtis Hawthorne Andriy Stasyuk Adam Roberts Ian Simon Cheng-Zhi Anna Huang Sander Dieleman Erich Elsen Jesse Engel Douglas Eck 442 452 0 29 Oct 2018
Onsets and Frames: Dual-Objective Piano Transcription Curtis Hawthorne Erich Elsen Jialin Song Adam Roberts Ian Simon Colin Raffel Jesse Engel Sageev Oore Douglas Eck 186 280 0 30 Oct 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 803 132,454 0 12 Jun 2017
Learning Features of Music from Scratch John Thickstun Zaïd Harchaoui Sham Kakade 161 202 0 29 Nov 2016
Microsoft COCO: Common Objects in Context Tsung-Yi Lin Michael Maire Serge J. Belongie Lubomir Bourdev Ross B. Girshick James Hays Pietro Perona Deva Ramanan C. L. Zitnick Piotr Dollár ObjD 434 43,832 0 01 May 2014