
LaSIE Jumps the GATE

Part of the book series: Text, Speech and Language Technology (TLTB, volume 7)

Abstract

The effective creation of Information Extraction (IE) over the last ten years, driven by the DARPA TIPSTER programme and the associated MUC evaluations, has brought enormous benefits, but it is now time to ask what research issues face the systems we have built and what we should do next. We suggest that there are two classes of important research issues: those requiring detailed comparative evaluation of alternative approaches to IE subtasks, and those concerning flexible adaptation of IE systems to new users and domains.

Both these classes of issues, we argue, can be profitably addressed within an architecture for language engineering called GATE, the General Architecture for Text Engineering. We describe GATE, which owes a great deal to the TIPSTER architecture, and also the LaSIE IE system, which is set within GATE and with which we have competed in MUC, and bring out the distinctive features that have led to its good performance in certain areas.

Within GATE, we can now reconfigure various Language Engineering modules so as to assemble alternative IE systems and then to compare their performance with LaSIE. In this way the environment provided by GATE will allow us to make significant strides in assessing alternative LE technologies and in rapidly adapting LE prototype systems for new users and domains.

Copyright information

© 1999 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Wilks, Y., Gaizauskas, R. (1999). LaSIE Jumps the GATE. In: Strzalkowski, T. (ed.) Natural Language Information Retrieval. Text, Speech and Language Technology, vol 7. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2388-6_8

  • DOI: https://doi.org/10.1007/978-94-017-2388-6_8

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-5209-4

  • Online ISBN: 978-94-017-2388-6

  • eBook Packages: Springer Book Archive
