Abstract
Sound Event Classification (SEC) aims to understand real-life events using sound information. A major problem in SEC is that it has to deal with uncontrolled environmental conditions, which lead to extremely high levels of noise, reverberation, overlap, attenuation and distortion. As a result, some parts of the captured signals can be masked or missing entirely. In this paper, we propose a novel missing feature classification method that integrates a missing feature kernel into the optimization of the classifier. The proposed method first transforms audio segments into the Subband Power Distribution (SPD), a novel image representation in which the area occupied by the pure signal is separable. A novel masking approach is then proposed to separate the SPD into reliable and unreliable parts. Next, a missing feature kernel (MFK), in the form of probabilistic distances computed on the intersection of the reliable areas of the SPD images, is developed and integrated into the SVM optimization framework. Experimental results show the superiority of the proposed method on challenging SEC tasks in which signals are corrupted by severe noise and distortion.
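The kernel idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which probabilistic distance is used, so the Hellinger distance below is an assumption, as are the function name `missing_feature_kernel`, the width parameter `gamma`, and the representation of SPD images and reliability masks as nested lists. The sketch only shows the general pattern of comparing two images on the cells that are reliable in both.

```python
import math


def missing_feature_kernel(spd_a, mask_a, spd_b, mask_b, gamma=1.0):
    """Kernel value between two SPD images using only cells reliable in both.

    spd_a, spd_b: 2D lists of non-negative values (stand-ins for SPD images).
    mask_a, mask_b: 2D lists of booleans, True where a cell is reliable.
    gamma: hypothetical kernel width, not taken from the paper.
    """
    # Collect the values that lie on the intersection of the reliable areas.
    vals_a, vals_b = [], []
    for row_a, row_ma, row_b, row_mb in zip(spd_a, mask_a, spd_b, mask_b):
        for a, ma, b, mb in zip(row_a, row_ma, row_b, row_mb):
            if ma and mb:
                vals_a.append(a)
                vals_b.append(b)

    sum_a, sum_b = sum(vals_a), sum(vals_b)
    if not vals_a or sum_a == 0 or sum_b == 0:
        return 0.0  # no shared reliable evidence: treat the pair as dissimilar

    # Renormalise each restricted image to sum to one, then compute the
    # Bhattacharyya coefficient and the squared Hellinger distance
    # (assumed choice of probabilistic distance).
    bc = sum(math.sqrt((a / sum_a) * (b / sum_b))
             for a, b in zip(vals_a, vals_b))
    hellinger_sq = max(0.0, 1.0 - bc)
    return math.exp(-gamma * hellinger_sq)


# Example: the corrupted cell in `noisy` is masked out, so it does not
# affect the comparison with `clean`.
clean = [[1.0, 2.0], [3.0, 4.0]]
noisy = [[1.0, 2.0], [3.0, 100.0]]
all_reliable = [[True, True], [True, True]]
noisy_mask = [[True, True], [True, False]]
k_masked = missing_feature_kernel(clean, all_reliable, noisy, noisy_mask)
k_unmasked = missing_feature_kernel(clean, all_reliable, noisy, all_reliable)
```

A Gram matrix built from such pairwise values could be passed to any SVM that accepts a precomputed kernel. Because the intersection changes from pair to pair, this simplified version is not guaranteed to be positive semi-definite.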
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Dat, T.H., Dennis, J.W., Terence, N.W.Z. (2015). Missing Feature Kernel and Nonparametric Window Subband Power Distribution for Robust Sound Event Classification. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_34
DOI: https://doi.org/10.1007/978-3-319-23132-7_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23131-0
Online ISBN: 978-3-319-23132-7
eBook Packages: Computer Science, Computer Science (R0)