Abstract
Probabilistic classifiers induce a similarity metric at each location in the space of the data. This metric is measured by the Fisher Information Matrix. Pairwise distances in this Riemannian space, calculated along geodesic paths, can be used to generate a similarity map of the data. The novelty of the paper is twofold: to improve the methodology for visualising data structures in low-dimensional manifolds, and to illustrate the value of inferring that structure from a probabilistic classifier by metric learning, through application to music data. This leads to the discovery of new structures and song similarities beyond the original genre classification labels. These similarities are not directly observable by measuring Euclidean distances between features in the original space; they require a metric that reflects similarity based on genre. The results quantify the extent to which music from bands typically associated with one genre can, in fact, cross over strongly to another.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Appendices
A Derivation of FI matrix as a function of MLP
To obtain \(\mathbf {FI} \left( \mathbf {x} \right)\) as a function of the MLP output estimators, the logarithms of the soft-max outputs and their derivatives are needed:
where \(p_j = p(c_j|\mathbf {x})\), \(\nabla = \nabla _{\mathbf {x}} = \frac{d}{d{\mathbf {x}}}\) and \(a_j = a_j( \mathbf {x} )\) to abbreviate the notation. Now, combining Eqs. 2 and 31 and expanding the product:
Rearranging terms and noting that, because the class probabilities sum to one, any quantity t that does not depend on the summation index satisfies \(\sum _i^J tp_i = t \sum _i^J p_i = t\), we get:
After merging the summations, the final expression of the \(\mathbf {FI} \left( \mathbf {x} \right)\) for the MLP is obtained:
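The steps described above can be written out as a hedged sketch. It assumes the standard soft-max form \(p_j = e^{a_j} / \sum _k^J e^{a_k}\) and takes the Fisher information to be the expected outer product of the score, \(\sum _j^J p_j \nabla \log p_j \left( \nabla \log p_j \right) ^{\top }\). Both are assumptions about Eqs. 2 and 31, which are not shown here, and the final line may be arranged differently from the paper's Eq. 36:

```latex
\begin{align}
\log p_j &= a_j - \log \sum_{k}^{J} e^{a_k},
\qquad
\nabla \log p_j = \nabla a_j - \sum_{k}^{J} p_k \nabla a_k \\
\mathbf{FI}(\mathbf{x})
 &= \sum_{j}^{J} p_j
    \Big( \nabla a_j - \sum_{k}^{J} p_k \nabla a_k \Big)
    \Big( \nabla a_j - \sum_{k}^{J} p_k \nabla a_k \Big)^{\top} \\
 &= \sum_{j}^{J} p_j \nabla a_j \nabla a_j^{\top}
    - 2 \Big( \sum_{j}^{J} p_j \nabla a_j \Big)
        \Big( \sum_{k}^{J} p_k \nabla a_k \Big)^{\top}
    + \Big( \sum_{j}^{J} p_j \Big)
      \Big( \sum_{k}^{J} p_k \nabla a_k \Big)
      \Big( \sum_{i}^{J} p_i \nabla a_i \Big)^{\top} \\
 &= \sum_{j}^{J} p_j \nabla a_j \nabla a_j^{\top}
    - \sum_{j}^{J} \sum_{k}^{J} p_j p_k \nabla a_j \nabla a_k^{\top}
\end{align}
```

The last step uses \(\sum _j^J p_j = 1\), so the third term cancels half of the cross term.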
With Eq. 36 (equivalent to Eq. 5), the metric is estimated locally, computing differential distances as:
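As an illustration of the local estimate, the following minimal sketch evaluates \(\mathbf {FI} \left( \mathbf {x} \right)\) and a differential distance for a toy model. It assumes the usual quadratic form \(ds^2 = d\mathbf {x}^{\top } \mathbf {FI} \left( \mathbf {x} \right) d\mathbf {x}\) and replaces the MLP with a hypothetical linear soft-max model \(a_j(\mathbf {x}) = \mathbf {w}_j^{\top } \mathbf {x} + b_j\), for which \(\nabla a_j = \mathbf {w}_j\). The function names and the weights `W`, `b` are illustrative and do not come from the paper.

```python
import math


def softmax(a):
    """Numerically stable soft-max of a list of activations."""
    m = max(a)
    e = [math.exp(v - m) for v in a]
    total = sum(e)
    return [v / total for v in e]


def fisher_information(x, W, b):
    """FI(x) for a hypothetical linear soft-max model a_j(x) = W[j].x + b[j].

    For this model grad_x a_j = W[j], so the score of class j is
    grad_x log p_j = W[j] - sum_k p_k W[k], and
    FI(x) = sum_j p_j * score_j * score_j^T.
    """
    a = [sum(wi * xi for wi, xi in zip(w, x)) + bj for w, bj in zip(W, b)]
    p = softmax(a)
    d = len(x)
    # Probability-weighted mean of the activation gradients.
    g = [sum(pj * w[i] for pj, w in zip(p, W)) for i in range(d)]
    fi = [[0.0] * d for _ in range(d)]
    for pj, w in zip(p, W):
        score = [w[i] - g[i] for i in range(d)]
        for r in range(d):
            for c in range(d):
                fi[r][c] += pj * score[r] * score[c]
    return fi


def differential_distance(x, dx, W, b):
    """ds = sqrt(dx^T FI(x) dx) for a small displacement dx (assumed form)."""
    fi = fisher_information(x, W, b)
    d = len(x)
    ds2 = sum(dx[r] * fi[r][c] * dx[c] for r in range(d) for c in range(d))
    return math.sqrt(max(ds2, 0.0))


# Toy two-class model in which only the first feature affects the classes.
W = [[1.0, 0.0], [-1.0, 0.0]]
b = [0.0, 0.0]
fi = fisher_information([0.0, 0.0], W, b)
ds_relevant = differential_distance([0.0, 0.0], [0.01, 0.0], W, b)
ds_irrelevant = differential_distance([0.0, 0.0], [0.0, 0.01], W, b)
```

In this toy case a step along the first feature has non-zero length, while a step along the second feature, which does not change the class probabilities, has zero length. This is the sense in which the metric reflects similarity with respect to the class labels rather than Euclidean proximity.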
B Silhouette figures
The Silhouette metric is best suited to clustering methods that minimize pairwise distances between cluster members. When the clustering also takes density similarity into account, non-spherical shapes are expected, and so the Silhouette metric of a PQC solution may well be worse than that of K-means even if the PQC clusters better reflect the data profiles. In any case, the Silhouette metric is computed for all cluster solutions based on the cMDs data (Fig. 12).
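For reference, the per-sample Silhouette value can be sketched from a precomputed distance matrix as below. This is the standard definition, \(s_i = (b_i - a_i) / \max (a_i, b_i)\), not code from the paper. The function name, the toy one-dimensional points, and the labels are hypothetical, and at least two clusters are assumed.

```python
def silhouette_values(D, labels):
    """Per-sample silhouette values from a pairwise distance matrix D.

    a_i: mean distance from sample i to the other members of its cluster.
    b_i: smallest mean distance from sample i to the members of any other cluster.
    Assumes at least two distinct labels.
    """
    n = len(labels)
    clusters = sorted(set(labels))
    values = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            # Singleton cluster: the value is set to 0 by convention.
            values.append(0.0)
            continue
        a = sum(D[i][j] for j in own) / len(own)
        b = min(
            sum(D[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters
            if c != labels[i]
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Hypothetical toy data: two compact, well-separated groups on a line.
points = [0.0, 1.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
D = [[abs(p - q) for q in points] for p in points]
vals = silhouette_values(D, labels)
mean_silhouette = sum(vals) / len(vals)
```

Because the score rewards small within-cluster distances relative to the nearest other cluster, compact and well-separated groups such as these score close to 1, whereas elongated, density-based clusters can score lower even when they describe the data well.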
Cite this article
Casaña-Eslava, R.V., Jarman, I.H., Ortega-Martorell, S. et al. Music genre profiling based on Fisher manifolds and Probabilistic Quantum Clustering. Neural Comput & Applic 33, 7521–7539 (2021). https://doi.org/10.1007/s00521-020-05499-x