Closing the gap: learning-based information extraction rivaling knowledge-engineering methods

Authors:
Hai Leong Chieu (DSO National Laboratories, Singapore)
Hwee Tou Ng (National University of Singapore, Singapore)
Yoong Keok Lee (DSO National Laboratories, Singapore)
ACL '03: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, July 2003, Pages 216–223. https://doi.org/10.3115/1075096.1075124

Published: 07 July 2003


ABSTRACT

In this paper, we present a learning approach to the scenario template task of information extraction, where information filling one template could come from multiple sentences. When tested on the MUC-4 task, our learning approach achieves accuracy competitive with the best of the MUC-4 systems, which were all built with manually engineered rules. Our analysis reveals that our use of full parsing and state-of-the-art learning algorithms has contributed to the good performance. To our knowledge, this is the first research to demonstrate that a learning approach to the full-scale information extraction task can achieve performance rivaling that of the knowledge engineering approach.


Published in
ACL '03: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
July 2003, 571 pages
Program Chairs: Erhard W. Hinrichs, Dan Roth
Publisher: Association for Computational Linguistics, United States
Overall Acceptance Rate: 85 of 443 submissions, 19%