Abstract
Information overload can be addressed by applying information filtering to the huge amount of available data. Information on radio and television can be filtered by applying speech recognition to the audio track. A prototype system that uses closed captions has been developed on top of the INQUERY information access system. Integrating speech recognition and information retrieval into a working system remains a major challenge. The open problems are the selection of a document representation model, the recognition and selection of indexing features for speech retrieval, and dealing with the erroneous output of recognition processes.
This research was conducted at the Digital Cambridge Research Lab, Massachusetts, United States of America.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Vries, A.P. (1996). Television information filtering through speech recognition. In: Butscher, B., Moeller, E., Pusch, H. (eds) Interactive Distributed Multimedia Systems and Services. IDMS 1996. Lecture Notes in Computer Science, vol 1045. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60938-5_5
DOI: https://doi.org/10.1007/3-540-60938-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60938-4
Online ISBN: 978-3-540-49742-4
eBook Packages: Springer Book Archive