Missing Feature Kernel and Nonparametric Window Subband Power Distribution for Robust Sound Event Classification

Dat, Tran Huy; Dennis, Jonathan William; Terence, Ng Wen Zheng

doi:10.1007/978-3-319-23132-7_34

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9319)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

Sound Event Classification (SEC) aims to understand real-life events from sound. A major problem in SEC is that it must deal with uncontrolled environmental conditions, which lead to extremely high levels of noise, reverberation, overlap, attenuation and distortion. As a result, parts of the captured signal may be masked or missing entirely. In this paper, we propose a novel missing-feature classification method that builds a missing feature kernel into the classifier's optimization. The proposed method first transforms audio segments into the Subband Power Distribution (SPD), a novel image representation in which the area occupied by the clean signal is separable. A novel masking approach then separates the SPD into reliable and unreliable parts. Next, a missing feature kernel (MFK), in the form of probabilistic distances computed on the intersection of the reliable areas of the SPD images, is developed and integrated into the SVM optimization framework. Experimental results show the superiority of the proposed method on challenging SEC tasks in which signals are corrupted by severe noise and distortion.
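
The pipeline outlined in the abstract (SPD image, reliability mask, kernel on the shared reliable area) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration rather than the authors' method: the SPD is taken to be a per-subband histogram of log-power values, the mask is a simple noise-floor threshold, and the probabilistic distance is a Hellinger-style distance. The function names, the dB range, `margin_db` and `gamma` are all hypothetical.

```python
import numpy as np


def subband_power_distribution(log_spec, n_bins=20, lo=-80.0, hi=0.0):
    """Histogram the log-power values of each subband (row) over time.

    log_spec: array of shape (n_subbands, n_frames), values in dB (assumed).
    Returns an (n_subbands, n_bins) image whose rows each sum to 1.
    """
    edges = np.linspace(lo, hi, n_bins + 1)
    clipped = np.clip(log_spec, lo, hi)  # keep every frame inside the bin range
    counts = np.stack([np.histogram(row, bins=edges)[0] for row in clipped])
    return counts / log_spec.shape[1]


def reliable_mask(noise_floor_db, n_bins=20, lo=-80.0, hi=0.0, margin_db=6.0):
    """Assumed masking rule: a cell is reliable when its power-bin centre
    lies more than margin_db above the noise floor of its subband."""
    edges = np.linspace(lo, hi, n_bins + 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    floor = np.asarray(noise_floor_db, dtype=float)[:, None]
    return centres[None, :] > floor + margin_db


def missing_feature_kernel(spd_a, mask_a, spd_b, mask_b, gamma=1.0):
    """Similarity computed only where both SPD images are marked reliable."""
    both = mask_a & mask_b
    if not both.any():
        return 0.0  # no shared reliable area: treat the pair as dissimilar
    # Hellinger-style distance restricted to the intersection (an assumption).
    d = 0.5 * np.sum((np.sqrt(spd_a[both]) - np.sqrt(spd_b[both])) ** 2)
    return float(np.exp(-gamma * d))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec_a = rng.uniform(-80.0, 0.0, size=(4, 50))  # synthetic stand-in data
    spec_b = rng.uniform(-80.0, 0.0, size=(4, 50))
    spd_a = subband_power_distribution(spec_a)
    spd_b = subband_power_distribution(spec_b)
    mask_a = reliable_mask(np.full(4, -60.0))
    mask_b = reliable_mask(np.full(4, -50.0))
    print(missing_feature_kernel(spd_a, mask_a, spd_b, mask_b))
```

A Gram matrix filled with such pairwise values could in principle be handed to an SVM that accepts precomputed kernels. Because the intersection of reliable areas changes from pair to pair, a matrix built this way is not guaranteed to be positive semidefinite, which is one reason to treat this only as a sketch.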

Author information

Authors and Affiliations

Institute for Infocomm Research, A*STAR, Singapore, Singapore
Tran Huy Dat, Jonathan William Dennis & Ng Wen Zheng Terence

Corresponding author

Correspondence to Tran Huy Dat.

Editor information

Editors and Affiliations

SPIIRAS, Saint-Petersburg, Russia
Andrey Ronzhin
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova
University of Patras, Patras, Greece
Nikos Fakotakis

About this paper

Cite this paper

Dat, T.H., Dennis, J.W., Terence, N.W.Z. (2015). Missing Feature Kernel and Nonparametric Window Subband Power Distribution for Robust Sound Event Classification. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_34

DOI: https://doi.org/10.1007/978-3-319-23132-7_34
Published: 04 September 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23131-0
Online ISBN: 978-3-319-23132-7
eBook Packages: Computer Science, Computer Science (R0)