2505.12863
Cited By
Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio
19 May 2025
Jongmin Jung
Dongmin Kim
Sihun Lee
Seola Cho
Hyungjoon Soh
Irmak Bukey
Chris Donahue
Dasaem Jeong
Papers citing
"Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio"
19 / 19 papers shown
Title
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
Sungkyun Chang
Emmanouil Benetos
Holger Kirchhoff
Simon Dixon
67
3
0
05 Jul 2024
Practical End-to-End Optical Music Recognition for Pianoform Music
Jiří Mayer
Milan Straka
Jan Hajič
Pavel Pecina
53
2
0
20 Mar 2024
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
Minsu Kim
Jee-weon Jung
Hyeongseop Rha
Soumi Maiti
Siddhant Arora
Xuankai Chang
Shinji Watanabe
Y. Ro
95
7
0
25 Feb 2024
Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
Thierry Paquet
91
11
0
12 Feb 2024
High-Fidelity Audio Compression with Improved RVQGAN
Rithesh Kumar
Prem Seetharaman
Alejandro Luebs
I. Kumar
Kundan Kumar
105
337
0
11 Jun 2023
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
131
540
0
12 Nov 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
72
51
0
11 Jun 2022
Unaligned Supervision For Automatic Music Transcription in The Wild
Ben Maman
Amit H. Bermano
80
29
0
28 Apr 2022
Autoregressive Image Generation using Residual Quantization
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
VGen
282
378
0
03 Mar 2022
Deep Performer: Score-to-Audio Music Performance Synthesis
Hao-Wen Dong
Cong Zhou
Taylor Berg-Kirkpatrick
Julian McAuley
61
17
0
12 Feb 2022
Sequence-to-Sequence Piano Transcription with Transformers
Curtis Hawthorne
Ian Simon
Rigel Swavely
Ethan Manilow
Jesse Engel
183
82
0
19 Jul 2021
High-resolution Piano Transcription with Pedals by Regressing Onset and Offset Times
Qiuqiang Kong
Bochen Li
Xuchen Song
Yuan Wan
Yuxuan Wang
382
112
0
05 Oct 2020
Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity
Ethan Manilow
Gordon Wichern
Prem Seetharaman
Jonathan Le Roux
70
127
0
18 Sep 2019
Understanding Optical Music Recognition
Jorge Calvo-Zaragoza
Jan Hajič
Alexander Pacha
51
118
0
07 Aug 2019
Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset
Curtis Hawthorne
Andriy Stasyuk
Adam Roberts
Ian Simon
Cheng-Zhi Anna Huang
Sander Dieleman
Erich Elsen
Jesse Engel
Douglas Eck
442
452
0
29 Oct 2018
Onsets and Frames: Dual-Objective Piano Transcription
Curtis Hawthorne
Erich Elsen
Jialin Song
Adam Roberts
Ian Simon
Colin Raffel
Jesse Engel
Sageev Oore
Douglas Eck
186
280
0
30 Oct 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
803
132,454
0
12 Jun 2017
Learning Features of Music from Scratch
John Thickstun
Zaïd Harchaoui
Sham Kakade
161
202
0
29 Nov 2016
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
434
43,832
0
01 May 2014