Published June 7, 2022 | Version v2
Conference paper | Open Access

Insights into Transfer Learning between Image and Audio Music Transcription

Description

Optical Music Recognition (OMR) and Automatic Music Transcription (AMT) are the research fields that devise methods to transcribe music sources (documents and audio signals, respectively) into a structured digital format. Historically, they have followed different approaches to achieve the same goal. However, their recent formulation as sequence labeling tasks brings them under a common framework. Under this premise, one may wonder whether there are synergies between the two fields that could be exploited to improve the recognition rates in their respective domains. In this work, we explore this question from a Transfer Learning (TL) perspective in the context of neural end-to-end recognition models. More precisely, we take a music transcription system trained on either image or audio data and adapt it to the other, unseen domain during training using different TL schemes. Results show that knowledge transfer slightly boosts model performance when sufficient data are available, but that it is not properly leveraged when data are scarce. This opens up a promising, yet challenging, research path towards building an effective bridge between two solutions to the same problem.
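To make the transfer-learning setup described above concrete, the following minimal sketch (PyTorch, not the authors' code) illustrates how a CTC-based sequence-labeling model pre-trained on one domain (e.g. score images) might be adapted to the other (e.g. audio spectrograms). The CRNN architecture, layer sizes, and the choice of freezing the convolutional encoder are illustrative assumptions rather than details taken from the paper.

# Illustrative sketch of cross-domain transfer for a CTC-based transcription model.
# All class and function names are hypothetical, not taken from the paper.

import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Generic CNN + BiLSTM encoder with a CTC output layer."""
    def __init__(self, n_input_channels: int, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(n_input_channels, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 2)),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),  # pool only the non-time axis to keep the sequence long for CTC
        )
        self.rnn = nn.LSTM(64, hidden, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, vocab_size + 1)  # +1 for the CTC blank symbol

    def forward(self, x):                      # x: (batch, channels, height_or_freq, time)
        f = self.cnn(x)                        # (batch, 64, H', T')
        f = f.mean(dim=2).permute(0, 2, 1)     # collapse the non-time axis -> (batch, T', 64)
        out, _ = self.rnn(f)
        return self.head(out).log_softmax(-1)  # frame-wise log-probs for nn.CTCLoss

def transfer(pretrained: CRNN, target_vocab_size: int, freeze_encoder: bool = True) -> CRNN:
    """One possible TL scheme: reuse the source-domain encoder and re-train the rest."""
    model = pretrained
    if freeze_encoder:
        for p in model.cnn.parameters():
            p.requires_grad = False
    # Re-initialise the output layer for the target-domain vocabulary.
    model.head = nn.Linear(model.head.in_features, target_vocab_size + 1)
    return model

In this sketch, training the returned model on target-domain data with torch.nn.CTCLoss, either with the encoder frozen or with all parameters left trainable, corresponds to two common TL schemes (feature extraction versus full fine-tuning); which variants the paper actually compares is detailed in the full text.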

Files

38.pdf (596.6 kB, md5:f18291d61f1e9acc2a934e449fe073fd)