ABSTRACT
This paper introduces a novel Support Vector Machine (SVM)-based voting algorithm for reranking, which provides an indirect way to solve sequential models. We present a risk formulation for this voting algorithm under the PAC framework. We apply the algorithm to the parse reranking problem and achieve labeled recall and precision of 89.4%/89.8% on section 23 of the Penn Treebank WSJ corpus.
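The core idea of an SVM-based voting reranker can be sketched as pairwise comparison: a trained binary classifier decides, for each ordered pair of candidate parses, which one is better, and candidates are ranked by their vote totals. The sketch below is a minimal, hypothetical illustration assuming a linear decision function whose weight vector `w` stands in for a trained linear-kernel SVM; the feature vectors, weights, and the helper names `dot` and `pairwise_vote` are illustrative, not taken from the paper.

```python
# Hypothetical sketch: pairwise voting over candidate parses with a
# linear decision function standing in for a trained linear-kernel SVM.

def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pairwise_vote(candidates, w):
    """Rank candidate feature vectors by pairwise votes.

    A decision value dot(w, x_i - x_j) > 0 is read as
    "candidate i beats candidate j"; ties simply cast no vote.
    Returns candidate indices sorted best-first.
    """
    votes = [0] * len(candidates)
    for i, xi in enumerate(candidates):
        for j, xj in enumerate(candidates):
            if i == j:
                continue
            diff = [a - b for a, b in zip(xi, xj)]
            if dot(w, diff) > 0:
                votes[i] += 1
    return sorted(range(len(candidates)), key=lambda i: -votes[i])

# Toy example: three candidate parses with two features each.
cands = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
w = [1.0, -1.0]  # toy weights; a real system would learn these from data
print(pairwise_vote(cands, w))  # → [0, 1, 2]
```

In a full system the pairwise decision would come from an SVM trained on feature differences of candidate pairs (better parse minus worse parse), possibly with a tree kernel; the voting step itself is unchanged.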
- E. Black, F. Jelinek, J. Lafferty, D. M. Magerman, R. Mercer, and S. Roukos. 1993. Towards history-based grammars: Using richer models for probabilistic parsing. In Proceedings of ACL 1993.
- Rens Bod. 1998. Beyond Grammar: An Experience-Based Theory of Language. CSLI Publications/Cambridge University Press.
- E. Charniak. 2000. A maximum-entropy-inspired parser. In Proceedings of NAACL 2000.
- Michael Collins and Nigel Duffy. 2001. Convolution kernels for natural language. In Proceedings of Neural Information Processing Systems (NIPS 14).
- Michael Collins and Nigel Duffy. 2002. New ranking algorithms for parsing and tagging: Kernels over discrete structures, and the voted perceptron. In Proceedings of ACL 2002.
- Michael Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania.
- Michael Collins. 2000. Discriminative reranking for natural language parsing. In Proceedings of the 17th International Conference on Machine Learning.
- N. Cristianini and J. Shawe-Taylor. 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press.
- Emma Dijkstra. 2001. Support vector machines for parse selection. Master's thesis, University of Edinburgh.
- Yoav Freund and Robert E. Schapire. 1999. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277--296.
- Claudio Gentile. 2001. A new approximate maximal margin classification algorithm. Journal of Machine Learning Research, 2:213--242.
- Thore Graepel, Ralf Herbrich, and Robert C. Williamson. 2001. From margin to sparsity. In Advances in Neural Information Processing Systems 13.
- Ralf Herbrich, Thore Graepel, and Klaus Obermayer. 2000. Large margin rank boundaries for ordinal regression. In Advances in Large Margin Classifiers, pages 115--132. MIT Press.
- Thorsten Joachims. 1998. Making large-scale support vector machine learning practical. In Advances in Kernel Methods: Support Vector Learning. MIT Press.
- A. Joshi and Y. Schabes. 1997. Tree-adjoining grammars. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 3, pages 69--124. Springer.
- W. Krauth and M. Mézard. 1987. Learning algorithms with optimal stability in neural networks. Journal of Physics A, 20:745--752.
- Taku Kudo and Yuji Matsumoto. 2001. Chunking with support vector machines. In Proceedings of NAACL 2001.
- J. Lafferty, A. McCallum, and F. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML 2001.
- Yaoyong Li, Hugo Zaragoza, Ralf Herbrich, John Shawe-Taylor, and Jaz Kandola. 2002. The perceptron algorithm with uneven margins. In Proceedings of the International Conference on Machine Learning.
- Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1994. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313--330.
- John Platt. 1999. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In Advances in Large Margin Classifiers. MIT Press.
- Jesper Salomon, Simon King, and Miles Osborne. 2002. Framewise phone classification using support vector machines. In Proceedings of ICSLP 2002.
- John Shawe-Taylor, Peter L. Bartlett, Robert C. Williamson, and Martin Anthony. 1998. Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory, 44(5):1926--1940.
- A. J. Smola, P. Bartlett, B. Schölkopf, and C. Schuurmans. 2000. Introduction to large margin classifiers. In A. J. Smola, P. Bartlett, B. Schölkopf, and C. Schuurmans, editors, Advances in Large Margin Classifiers, pages 1--26. MIT Press.
- Vladimir N. Vapnik. 1998. Statistical Learning Theory. John Wiley and Sons, Inc.
- Vladimir N. Vapnik. 1999. The Nature of Statistical Learning Theory. Springer, 2nd edition.