Abstract
State-of-the-art audio tracking relies on local alignment algorithms between the target audio and the musical performance. The inductive hypothesis of a local alignment tool is that the alignment is correct up to the current position; an error is therefore dragged along and accumulates into subsequent errors, from which the tool does not recover unless elaborate heuristics are used. Our approach uses a non-local alignment scheme: short audio segments taken from the musical performance are searched for across the entire target audio to obtain the k nearest audio segments (proximity is determined using entropy-based audio signatures). The current audio segment of the performance is paired with the nearest in time among the k previously selected audio segments of the target audio. To our knowledge, this is the first algorithm able to start from an arbitrary point in the target audio, for example when the musical performance had already begun before the tracking system was switched on. We complement the overall strategy with a simple heuristic: candidates are ignored when they are all too far in time from the last position reported by the system. We tested our method on 62 musical pieces, some pop but mostly classical music. For every piece we have two performances; we use one as the target audio and the other as the performance to be tracked. We obtained excellent results.
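The candidate-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `track_step`, the `max_jump` threshold, the reading of "nearest in time" as nearest to the last reported position, and the start-up rule (take the most similar candidate when no position has been reported yet) are all assumptions made for illustration. The k-nearest-neighbour search itself is assumed to have already produced the candidate times.

```python
from typing import Optional, Sequence


def track_step(
    candidate_times: Sequence[float],
    last_position: Optional[float],
    max_jump: float,
) -> Optional[float]:
    """Choose the target-audio time to report for the current segment.

    candidate_times: start times (seconds) in the target audio of the k
        nearest segments, ordered from most to least similar (assumed order).
    last_position: last time reported by the tracker, or None at start-up.
    max_jump: hypothetical threshold (seconds); candidates farther than this
        from the last reported position are ignored.
    """
    if not candidate_times:
        return last_position
    if last_position is None:
        # Start-up from an arbitrary point: assumed rule, trust similarity.
        return candidate_times[0]
    # Pick the candidate nearest in time to the last reported position.
    nearest = min(candidate_times, key=lambda t: abs(t - last_position))
    if abs(nearest - last_position) > max_jump:
        # All candidates are too far in time: ignore them, keep the position.
        return last_position
    return nearest


if __name__ == "__main__":
    # Made-up candidate lists for four consecutive performance segments.
    steps = [
        [42.0, 10.0, 95.0],
        [10.5, 43.0, 96.0],
        [120.0, 5.0, 200.0],  # every candidate is far from the last position
        [44.1, 11.0, 97.0],
    ]
    position = None
    for candidates in steps:
        position = track_step(candidates, position, max_jump=5.0)
        print(position)
```

In this toy run the third segment yields only distant candidates, so the sketch keeps the previous position instead of jumping, which mirrors the heuristic stated in the abstract.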
References
Bilmes JA (1998) A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Technical report TR-97-021, Department of Electrical Engineering and Computer Science U.C. Berkeley
Burkhard WA, Keller RM (1973) Some approaches to best-match file searching. Commun ACM 16(4):230–236. doi:10.1145/362003.362025
Camarena-Ibarrola A, Chavez E (2006) On musical performances identification, entropy and string matching. In: Fifth Mexican international conference on artificial intelligence 2006 (MICAI2006)
Camarena-Ibarrola A, Chavez E, Tellez ES (2009) Robust radio broadcast monitoring using a multi-band spectral entropy signature. In: 14th Iberoamerican congress on pattern recognition. Springer, pp 587–594
Cano P, Loscos A, Bonada J (1999) Score-performance matching using HMMs. In: ICMC99. Audiovisual Institute, Pompeu Fabra University, Spain
Dixon S (2005) Live tracking of musical performances using on-line time warping. In: 8th International conference on digital audio effects (DAFx’05). Austrian Research Institute for Artificial Intelligence, Vienna
Dixon S, Widmer G (2005) MATCH: a music alignment tool chest. In: 6th International conference on music information retrieval (ISMIR). Austrian Research Institute for Artificial Intelligence, Vienna
Camarena-Ibarrola A, Chávez E (2010) Real time tracking of musical performances. In: 9th Mexican international conference on artificial intelligence (MICAI’2010), LNCS. Springer, pp 138–148
Figueroa K, Chávez E, Navarro G (2010) The SISAP metric indexing library. URL http://www.sisap.org/Metric_Space_Library.html
Haitsma J, Kalker T (2002) A highly robust audio fingerprinting system. In: International symposium on music information retrieval ISMIR
Navarro G, Raffinot M (2002) Flexible pattern matching in strings: practical on-line search algorithms for texts and biological sequences, vol 17. Cambridge University Press
Orio N, Déchelle F (2001) Score following using spectral analysis and hidden Markov models. In: Proceedings of the ICMC, pp 151–154
Orio N, Lemouton S, Schwarz D (2003) Score following: state of the art and new developments. In: Proceedings of the 2003 conference on new interfaces for musical expression. National University of Singapore, p 41
Rabiner L, Juang B (2003) An introduction to hidden Markov models. ASSP Mag IEEE 3(1):4–16
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77(2):257–286
Rabiner LR, Rosenberg AE, Levinson SE (1978) Considerations in dynamic time warping for discrete word recognition. In: IEEE Trans on acoustics, speech and signal processing ASSP-26, pp 622–635
Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE transactions on acoustics, speech and signal processing (ASSP), pp 43–49
Sethares W, Morris R, Sethares J (2005) Beat tracking of musical performances using low-level audio features. IEEE Trans Speech Audio Process 13(2):275–285
Acknowledgments
We thank the reviewers for their helpful suggestions and comments, and thank Dr. Grigori Sidorov for his help improving the wording of this article.
Additional information
An earlier version of this work was presented at the 9th Mexican International Conference on Artificial Intelligence (MICAI’2010) [8].
About this article
Cite this article
Camarena-Ibarrola, A., Chávez, E. Online music tracking with global alignment. Int. J. Mach. Learn. & Cyber. 2, 147–156 (2011). https://doi.org/10.1007/s13042-011-0025-0
DOI: https://doi.org/10.1007/s13042-011-0025-0