Abstract
This paper presents a method for integrating a probabilistic part-of-speech tagger and a chunker. The integration corrects a number of errors that the tagger makes when used alone. Both the tagger and the chunker are implemented as weighted finite-state machines. Experiments on a French corpus showed a decrease in the word error rate of about 12%.
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nasr, A., Volanschi, A. (2006). Integrating a POS Tagger and a Chunker Implemented as Weighted Finite State Machines. In: Yli-Jyrä, A., Karttunen, L., Karhumäki, J. (eds) Finite-State Methods and Natural Language Processing. FSMNLP 2005. Lecture Notes in Computer Science, vol 4002. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780885_17
DOI: https://doi.org/10.1007/11780885_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35467-3
Online ISBN: 978-3-540-35469-7
eBook Packages: Computer Science (R0)