Speaking Technical Documents: Using Prosody to Convey Textual and Mathematical Material

  • Conference paper
Computers Helping People with Special Needs (ICCHP 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2398)

Abstract

Though Braille is the most common means whereby blind people can access information of any kind, it is rapidly being superseded by spoken versions of the same material. Owing to the bulky nature of Braille, the ability to transport a small portable computer, rather than multiple volumes of a book, has far greater appeal. However, to date the monotonous nature of synthetic speech has meant that both highly technical information and the more visually oriented presentational styles (such as mathematics) have been largely inaccessible to blind people. While the ability to approximate human prosody is apparent in some synthesisers, these features are not utilised by the developers of screen-access software. Consequently, the ability to present anything other than purely textual material is distinctly lacking in this type of software. This lack ensures that blind students and professionals working in the scientific or technical arena are to a great extent prevented from reading large amounts of relevant material.

This paper describes a model of verbalising mathematics using spoken audio. The language of written mathematics can be translated into an English representation based on the grammatical structures inherent in the language. The model discussed here encapsulates the structure of an equation in the most intuitive form of communication available, natural speech, while the content is enhanced by the use of alterations in the prosody (inflection) of the voice. The paper concludes with a discussion of some current areas of investigation. These include the application of certain acoustic effects to the speech signal to convey, auditorily, those visual cues so readily apparent from the spatially oriented layout of mathematical content.
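The general idea of pairing an English rendering of an equation with prosodic changes can be illustrated with a small sketch. This is not the model described in the paper, whose rules are not given in the abstract. It assumes a toy expression tree (the hypothetical `Frac` and `Sum` classes), the lexical cue "the fraction", and an arbitrary mapping from nesting depth to speaking rate and pitch, emitted as SSML-style `<prosody>` and `<break>` markup.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Frac:
    """Hypothetical node for a fraction: num over den."""
    num: "Expr"
    den: "Expr"


@dataclass
class Sum:
    """Hypothetical node for a sum: left plus right."""
    left: "Expr"
    right: "Expr"


Expr = Union[str, Frac, Sum]


def wrap(text: str, depth: int) -> str:
    # Assumption for illustration: each nesting level speeds the voice up
    # by 10% and raises pitch by 5%. These values are not from the paper.
    rate = 100 + 10 * depth
    pitch = 5 * depth
    return f'<prosody rate="{rate}%" pitch="+{pitch}%">{text}</prosody>'


def verbalise(expr: Expr, depth: int = 0) -> str:
    """Return an English rendering of expr with SSML-style prosody markup."""
    if isinstance(expr, str):
        return expr
    if isinstance(expr, Sum):
        return f"{verbalise(expr.left, depth)} plus {verbalise(expr.right, depth)}"
    if isinstance(expr, Frac):
        inner = depth + 1
        num = wrap(verbalise(expr.num, inner), inner)
        den = wrap(verbalise(expr.den, inner), inner)
        # Short pauses mark where the fraction starts and ends.
        pause = '<break time="200ms"/>'
        return f"the fraction {pause}{num} over {den}{pause}"
    raise TypeError(f"unsupported expression: {expr!r}")


if __name__ == "__main__":
    # (a + b) / c
    print(verbalise(Frac(Sum("a", "b"), "c")))
```

In this sketch the words carry the structure ("the fraction ... over ...") while the prosody changes mark how deeply a sub-expression is nested, which is one plausible reading of the division of labour the abstract describes.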

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fitzpatrick, D. (2002). Speaking Technical Documents: Using Prosody to Convey Textual and Mathematical Material. In: Miesenberger, K., Klaus, J., Zagler, W. (eds) Computers Helping People with Special Needs. ICCHP 2002. Lecture Notes in Computer Science, vol 2398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45491-8_93

  • DOI: https://doi.org/10.1007/3-540-45491-8_93

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43904-2

  • Online ISBN: 978-3-540-45491-5

  • eBook Packages: Springer Book Archive
