ResearchTrend.AI
Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio

19 May 2025
Jongmin Jung
Dongmin Kim
Sihun Lee
Seola Cho
Hyungjoon Soh
Irmak Bukey
Chris Donahue
Dasaem Jeong
arXiv: 2505.12863

Papers cited by "Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio"

YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
Sungkyun Chang
Emmanouil Benetos
Holger Kirchhoff
Simon Dixon
67
3
0
05 Jul 2024
Practical End-to-End Optical Music Recognition for Pianoform Music
Jiří Mayer
Milan Straka
Jan Hajic
Pavel Pecina
53
2
0
20 Mar 2024
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
Minsu Kim
Jee-weon Jung
Hyeongseop Rha
Soumi Maiti
Siddhant Arora
Xuankai Chang
Shinji Watanabe
Y. Ro
95
7
0
25 Feb 2024
Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
Thierry Paquet
91
11
0
12 Feb 2024
High-Fidelity Audio Compression with Improved RVQGAN
Rithesh Kumar
Prem Seetharaman
Alejandro Luebs
I. Kumar
Kundan Kumar
105
337
0
11 Jun 2023
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
131
540
0
12 Nov 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
72
51
0
11 Jun 2022
Unaligned Supervision For Automatic Music Transcription in The Wild
Ben Maman
Amit H. Bermano
80
29
0
28 Apr 2022
Autoregressive Image Generation using Residual Quantization
Doyup Lee
Chiheon Kim
Saehoon Kim
Minsu Cho
Wook-Shin Han
VGen
282
378
0
03 Mar 2022
Deep Performer: Score-to-Audio Music Performance Synthesis
Hao-Wen Dong
Cong Zhou
Taylor Berg-Kirkpatrick
Julian McAuley
61
17
0
12 Feb 2022
Sequence-to-Sequence Piano Transcription with Transformers
Curtis Hawthorne
Ian Simon
Rigel Swavely
Ethan Manilow
Jesse Engel
183
82
0
19 Jul 2021
High-resolution Piano Transcription with Pedals by Regressing Onset and Offset Times
Qiuqiang Kong
Bochen Li
Xuchen Song
Yuan Wan
Yuxuan Wang
382
112
0
05 Oct 2020
Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity
Ethan Manilow
Gordon Wichern
Prem Seetharaman
Jonathan Le Roux
70
127
0
18 Sep 2019
Understanding Optical Music Recognition
Jorge Calvo-Zaragoza
Jan Hajic
Alexander Pacha
51
118
0
07 Aug 2019
Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset
Curtis Hawthorne
Andriy Stasyuk
Adam Roberts
Ian Simon
Cheng-Zhi Anna Huang
Sander Dieleman
Erich Elsen
Jesse Engel
Douglas Eck
442
452
0
29 Oct 2018
Onsets and Frames: Dual-Objective Piano Transcription
Curtis Hawthorne
Erich Elsen
Jialin Song
Adam Roberts
Ian Simon
Colin Raffel
Jesse Engel
Sageev Oore
Douglas Eck
186
280
0
30 Oct 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
803
132,454
0
12 Jun 2017
Learning Features of Music from Scratch
John Thickstun
Zaïd Harchaoui
Sham Kakade
161
202
0
29 Nov 2016
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
434
43,832
0
01 May 2014