ABSTRACT
A text retrieval method called the thematic geographical search method has been developed and applied to a Japanese encyclopedia called the World Encyclopædia. In this method, the user specifies a search theme using free words, then obtains a sorted list of excerpts and hyperlinks to encyclopedia sentences that contain geographical names. Using this list, the user can also open maps that indicate the locations of the names. To generate an index of names for this searching, a method of extracting geographical names has been developed. In this method, geographical names are extracted, matched to names in a geographical name database, and identified. Geographical names, however, often have several types of ambiguities. Ambiguities are resolved by using non-local context analysis, which uses a stack and several other techniques. As a result, the precision of extracted names is more than 96% on average. This method depends on features of the Japanese language, but the strategy and most of the techniques can be applied to texts in English or other languages.
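The abstract names the pipeline (extract names, match them against a geographical name database, resolve ambiguous names from non-local context held on a stack) but does not give the algorithm. The following is only a minimal illustrative sketch of that idea, not the paper's method. The `Place` type, the toy `GAZETTEER`, the `resolve` function, the `max_depth` parameter, and the parent-region matching rule are all assumptions introduced here, and romanized tokens stand in for segmented Japanese text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    """Hypothetical gazetteer entry: a name plus its enclosing region."""
    name: str
    parent: str


# Toy stand-in for a geographical name database (illustrative entries only).
GAZETTEER = {
    "Tokyo": [Place("Tokyo", "Japan")],
    "Hiroshima": [Place("Hiroshima", "Japan")],
    # An ambiguous name: two places share it.
    "Fuchu": [Place("Fuchu", "Tokyo"), Place("Fuchu", "Hiroshima")],
}


def resolve(tokens, max_depth=5):
    """Return the places identified in `tokens`, in order of appearance.

    Unambiguous names are accepted directly. Ambiguous names are resolved by
    walking a stack of previously identified places (most recent first) and
    picking the single candidate whose parent region matches that context.
    Names that stay ambiguous are skipped rather than guessed.
    """
    context = []   # stack of recently identified places, most recent last
    resolved = []
    for token in tokens:
        candidates = GAZETTEER.get(token)
        if not candidates:
            continue  # not a known geographical name
        choice = None
        if len(candidates) == 1:
            choice = candidates[0]
        else:
            for ctx in reversed(context):
                matches = [c for c in candidates
                           if c.parent in (ctx.name, ctx.parent)]
                if len(matches) == 1:
                    choice = matches[0]
                    break
        if choice is None:
            continue
        resolved.append(choice)
        context.append(choice)
        context = context[-max_depth:]  # keep only the most recent context
    return resolved


if __name__ == "__main__":
    for place in resolve(["Hiroshima", "and", "later", "Fuchu"]):
        print(place.name, "in", place.parent)
```

Skipping names that remain ambiguous, instead of picking a default candidate, trades recall for precision in this sketch; whether the paper makes the same choice is not stated in the abstract.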
Index Terms
- A method of geographical name extraction from Japanese text for thematic geographical search