Abstract
This article presents a multimodal dataset comprising various representations and annotations of Franz Schubert’s song cycle Winterreise. Schubert’s seminal work constitutes an outstanding example of the Romantic song cycle, a central genre within Western classical music. Our dataset unifies several public sources and annotations carefully created by music experts, compiled in a comprehensive and consistent way. The multimodal representations comprise the singer’s lyrics, sheet music in different machine-readable formats, and audio recordings of nine performances, two of which are freely accessible for research purposes. By means of explicit musical measure positions, we establish a temporal alignment between the different representations, thus enabling a detailed comparison across different performances and modalities. Using these alignments, we provide various musicological annotations for the different versions, describing tonal and structural characteristics. These annotations comprise chord labels at different granularities, local and global annotations of musical keys, and segmentations into structural parts. From a technical perspective, the dataset allows for evaluating algorithmic approaches to tasks such as automated music transcription, cross-modal music alignment, or tonal analysis, and for testing these algorithms’ robustness across songs, performances, and modalities. From a musicological perspective, the dataset enables the systematic study of Schubert’s musical language and style in Winterreise and the comparison of annotations across different annotators and granularities. Beyond the research domain, the data may serve further purposes such as the didactic preparation of Schubert’s work and its presentation to a wider public by means of an interactive multimedia experience.
With this article, we provide a detailed description of the dataset, indicate its potential for computational music analysis by means of several studies, and point out possibilities for future research.
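To illustrate how measure positions can act as a shared time axis across performances, the following minimal Python sketch maps an annotation given in measure positions onto the physical time axis of two recordings. It is an illustration under stated assumptions, not the dataset's actual tooling: the function names, the measure start times, and the chord labels are all hypothetical, and piecewise-linear interpolation between measure starts is one plausible choice among several.

```python
import numpy as np

def measures_to_seconds(measure_numbers, measure_times, positions):
    """Map (fractional) measure positions to seconds.

    Assumes tempo is roughly constant within each measure, so that
    piecewise-linear interpolation between measure starts is adequate.
    """
    return np.interp(positions, measure_numbers, measure_times)

def transfer_annotation(annotation, measure_numbers, measure_times):
    """Convert (start_measure, end_measure, label) triples to seconds."""
    result = []
    for start, end, label in annotation:
        s, e = measures_to_seconds(measure_numbers, measure_times, [start, end])
        result.append((float(s), float(e), label))
    return result

# Hypothetical measure start times (seconds) for two performances of one song.
measures = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
performance_a = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
performance_b = np.array([0.0, 2.5, 5.5, 8.0, 11.0])

# Hypothetical chord annotation on the measure axis (labels are made up).
chords = [(1.0, 2.5, "D:min"), (2.5, 4.0, "A:maj")]

in_a = transfer_annotation(chords, measures, performance_a)
in_b = transfer_annotation(chords, measures, performance_b)
print(in_a)
print(in_b)
```

Because both recordings are indexed by the same measure numbers, a single annotation on the measure axis yields time-stamped labels for every performance, which is what makes comparisons across versions and modalities possible.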
Index Terms
- Schubert Winterreise Dataset: A Multimodal Scenario for Music Analysis