
Speech Bandwidth Extension Aided by Magnitude Spectrum Data Hiding

Circuits, Systems, and Signal Processing

Abstract

Public telephone systems transmit speech over a limited frequency range of about 300–3400 Hz, called narrowband (NB), which significantly reduces speech quality and intelligibility. This paper proposes a novel, fully backward-compatible method for bandwidth extension of NB speech. The method uses a magnitude-spectrum data-hiding technique to provide a perceptually better wideband speech signal. Spectral envelope parameters are extracted from a down-sampled, frequency-shifted version of the high-frequency speech components lying above the NB range; these parameters are encoded, spread using spreading sequences, and embedded in the low-amplitude high-frequency regions of the magnitude spectrum of the NB speech signal. At the receiving end, the embedded information is extracted to reconstruct the wideband speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantization and channel noise. Comparison category rating (CCR) listening tests and log spectral distortion (LSD) tests show that the reconstructed wideband signal achieves much better speech quality than conventional speech bandwidth extension methods that employ data hiding.
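
To make the embedding idea in the abstract more concrete, the following Python sketch hides a few bits in the upper bins of one frame's magnitude spectrum and recovers them by despreading. It is only an illustration under stated assumptions, not the authors' implementation: the spreading code `CODE`, the chip-modulated magnitude floor, the frame length of 512 samples at 8 kHz, the starting bin 216, and the function names `embed` and `extract` are all hypothetical choices. The bits stand in for the encoded spectral envelope parameters, which are not computed here.

```python
import numpy as np

# Hypothetical balanced +/-1 spreading code (its chips sum to zero).
# This is an illustrative choice, not the sequence used in the paper.
CODE = np.array([1, -1, 1, 1, -1, 1, -1, -1])

def embed(frame, bits, start_bin, floor, code=CODE):
    """Hide bits in the magnitude spectrum of one frame, keeping the phase."""
    spec = np.fft.rfft(frame)
    mag, phase = np.abs(spec), np.angle(spec)
    symbols = 2 * np.asarray(bits) - 1          # map {0, 1} to {-1, +1}
    chips = np.outer(symbols, code).ravel()     # spread each bit over len(code) bins
    stop_bin = start_bin + chips.size
    if start_bin < 1 or stop_bin > mag.size - 1:
        raise ValueError("payload must fit between the DC and Nyquist bins")
    # Assumption: these bins carry little speech energy, so they are
    # overwritten with a small floor whose level is modulated by the chips.
    mag[start_bin:stop_bin] = floor * (1.0 + 0.5 * chips)
    return np.fft.irfft(mag * np.exp(1j * phase), n=len(frame))

def extract(frame, n_bits, start_bin, code=CODE):
    """Recover bits by correlating the band magnitudes with the code."""
    mag = np.abs(np.fft.rfft(frame))
    band = mag[start_bin:start_bin + n_bits * code.size]
    # Because the code is balanced, the constant floor cancels in the
    # correlation and only the sign of each spread symbol remains.
    corr = band.reshape(n_bits, code.size) @ code
    return (corr > 0).astype(int)

# Usage on a synthetic two-tone frame (a stand-in for NB speech).
rng = np.random.default_rng(0)
n = 512
t = np.arange(n) / 8000.0
frame = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)
bits = [1, 0, 1, 1]
marked = embed(frame, bits, start_bin=216, floor=1.0)
noisy = marked + rng.normal(0.0, 1e-3, n)       # mild additive channel noise
recovered = extract(noisy, len(bits), start_bin=216)
```

Spreading each bit over several bins trades payload for robustness: the correlation sums the chip contributions coherently while noise adds incoherently, which is the general reason spread-spectrum hiding tolerates quantization and channel noise. A balanced code also means the receiver in this sketch does not need to know the floor level.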



Author information


Corresponding author

Correspondence to N. Prasad.


Cite this article

Prasad, N., Kishore Kumar, T. Speech Bandwidth Extension Aided by Magnitude Spectrum Data Hiding. Circuits Syst Signal Process 36, 4512–4540 (2017). https://doi.org/10.1007/s00034-017-0526-5


  • DOI: https://doi.org/10.1007/s00034-017-0526-5
