Computational linguistics for metadata building (CLiMB): using text mining for the automatic identification, categorization, and disambiguation of subject terms for image metadata

Multimedia Tools and Applications

Abstract

In this paper, we present a system that uses computational linguistic techniques to extract metadata for image access. We discuss the implementation, functionality, and evaluation of an image catalogers’ toolkit developed in the Computational Linguistics for Metadata Building (CLiMB) research project. We have tested components of the system, including phrase finding for the art and architecture domain, functional semantic labeling using machine learning, and disambiguation of terms in domain-specific text vis-à-vis a rich thesaurus of subject terms, geographic names, and artist names. We present specific results on disambiguation techniques and on the nature of the ambiguity problem given the thesaurus, resources, and domain-specific text resource, with a comparison to domain-general resources and text. Our primary user group for evaluation has been expert catalogers with specific expertise in the fields of painting, sculpture, and vernacular and landscape architecture.

Notes

  1. Some examples include OntoImage’2006: First International “Language Resources for Content-Based Image Retrieval” Workshop, held in conjunction with the Language Resources and Evaluation Conference (LREC) 2006, http://www.lrec-conf.org/lrec2006; OntoImage’2008: Second International “Language Resources for Content-Based Image Retrieval” Workshop, held in conjunction with LREC’2008, http://www.dfki.de/~declerck/ontoimage.html; and workshops on computational linguistics for image access held at the Visual Resources Association annual meetings in 2006, 2007, and 2008, http://www.vraweb.org.

  2. http://vraweb.org/ccoweb/cco/parttwo_chapter6.html.

  3. One such project, T3: Text, Tagging and Trust to Improve Image Access for Museums and Libraries, has just been funded by the Institute of Museum and Library Services, imls.gov.

  4. Some metadata standards mentioned in Baca 2003 were: Categories for the Description of Works of Art (CDWA) from the Getty Research Institute and Cataloging Cultural Objects (CCO) from the Visual Resources Association.

  5. Notable controlled vocabularies noted in Baca 2003 were: Library of Congress Subject Headings; Library of Congress Name Authority File; the Getty Vocabularies; Thesaurus for Graphic Materials I and II.

  6. http://www.vernaculararchitectureforum.org/.

  7. http://www.sah.org/.

  8. http://www.lair.umd.edu/.

  9. http://www.artstor.org.

  10. Both the tagger and parser are available at: http://nlp.stanford.edu/software.

  11. Lucene is a search engine library: http://lucene.apache.org.

  12. Getty resources can be accessed at: http://getty.edu/research/conducting_research/vocabularies/aat.

  13. According to the documentation on the TGN, natural order refers to searching on the most common order of a name, e.g. Al-Hoceima, whereas inverted order would be Hoceima, Al-.

  14. Steve: The Museum Social Tagging Project. http://www.steve.museum.

  15. Luis von Ahn: The ESP Game at Games with a Purpose (GWAP). http://www.gwap.com/gwap/gamesPreview/espgame/.

  16. Jennifer Golbeck: FilmTrust. http://www.mindswap.org.

References

  1. Anderson JD, Perez-Carballo J (2001) The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part I: research, and the nature of human indexing. Inf Process Manag 37:231–254


  2. Anderson JD, Perez-Carballo J (2001) The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part II: machine indexing, and the allocation of human versus machine effort. Inf Process Manag 37:255–277

  3. Baca M (2003) Practical issues in applying metadata schemas and controlled vocabularies to cultural heritage information. Cat Classif Q 36(3/4):47–55


  4. Banerjee S, Pedersen T (2003) Extended gloss overlaps as a measure of semantic relatedness. Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805–810

  5. Barnard K, Forsyth DA (2001) Learning the semantics of words and pictures. Proceedings of International Conference on Computer Vision, pp 408–415

  6. Brill E (1995) Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Comput Linguist 21(4):543–565


  7. Charniak E (1997) Statistical techniques for natural language parsing. AI Mag 18(4):33–44


  8. Chen H (2001) An analysis of image retrieval tasks in the field of art history. Inf Process Manag 37:701–720


  9. Choi Y, Rasmussen E (2003) Searching for images: the analysis of users’ queries for image retrieval in American history. J Am Soc Inf Sci Technol 54:498–511


  10. Church KW (1988) A stochastic parts program and noun phrase parser for unrestricted text. Proceedings of the Second Conference on Applied Natural Language Processing, Austin, Texas, 9–12 February, pp 136–143

  11. Collins K (1998) Providing subject access to images: a study of user queries. Am Arch 61:36–55


  12. Datta R, Joshi D, Li J, Wang JZ (2008) Image retrieval: ideas, influences, and trends of the new age. ACM Comput Surv 40(2):5–60


  13. Demner-Fushman D (2008) Combining medical domain ontological knowledge and low-level image features for multimedia indexing. OntoImage 2008: 2nd International Language Resources for Content-Based Image Retrieval Workshop in conjunction with LREC’2008, pp 18–23

  14. Fellbaum C (ed) (1998) WordNet: an electronic lexical database. MIT, Cambridge, MA

  15. Gale W, Church K, Yarowsky D (1993) A method for disambiguating word senses in a large corpus. Computers and the Humanities 26:415–439

  16. Grishman R, Sundheim B (eds) (1995) Design of the MUC-6 evaluation. Sixth Message Understanding Conference (MUC-6), NIST, Morgan-Kaufmann, Columbia, MD, pp 1–11

  17. Hatzivassiloglou V, Klavans JL, Eskin E (1999) Detecting text similarity over short passages: exploring linguistic feature combinations via machine learning. Proceedings of Empirical Methods in Natural Language Processing (EMNLP) and Very Large Corpora, MD, USA, pp 203–212

  18. Hatzivassiloglou V, Gravano L, Maganti A (2000) An investigation of linguistic features and clustering algorithms for topical document clustering. Proceedings of the Annual Meeting of ACM-SIGIR, pp 224–231

  19. Hearst M (1997) TextTiling: segmenting text into multi-paragraph subtopic passages. Comput Linguist 23(1):33–64


  20. Kan M, Klavans JL, McKeown KR (1998) Linear segmentation and segment relevance. Proceedings of the 6th International Workshop of Very Large Corpora (WVLC-6), Montréal, Québec, Canada, pp 197–205

  21. Keister LH (1994) User types and queries: impact on image access systems. In: Fidel R, Hahn TB, Rasmussen E, Smith PJ (eds) Challenges in indexing electronic text and images. Learned Information for the American Society of Information Science, Medford, pp 7–22


  22. Klavans JL, Chodorow MS, Wacholder N (1990) From dictionary to knowledge base via taxonomy. Proceedings of the sixth conference of the University of Waterloo Centre for the New Oxford English Dictionary and Text Research: Electronic Text Research, University of Waterloo, Waterloo, Canada, pp 110–132

  23. Klavans JL, Tzoukermann E (1996) Dictionaries and corpora: combining corpus and machine-readable dictionary data for building bilingual lexicons. Journal of Machine Translation 10(3–4):185–218


  24. Klein S, Simmons RF (1963) A computational approach to grammatical coding of English words. J Assoc Comput Mach 10(3):334–347


  25. Lesk M (1986) Automatic sense disambiguation: how to tell a pine cone from an ice cream cone. Proceedings of the 1986 ACM SIGDOC Conference, pp 24–26

  26. Lew MS (2000) Next-generation web searches for visual content. IEEE Computer 33:46–53


  27. Maron ME (1961) Automatic indexing: an experimental inquiry. J Assoc Comput Mach 8(3):404–417


  28. Palmer M, Ng HT, Dang HT (2006) Evaluation. In: Edmonds P, Agirre E (eds) Word sense disambiguation: algorithms, applications, and trends. Text, Speech, and Language Technology series. Kluwer, The Netherlands

  29. Panofsky E (1962) Studies in iconology: humanistic themes in the art of the renaissance. Harper & Row, New York


  30. Passonneau R, Yano T, Lippincott T, Klavans J (2008) Functional semantic categories for art history text: human labeling and preliminary machine learning. Proceedings of the 3rd International Conference on Computer Vision Theory and Applications, Workshop on Metadata Mining for Image Understanding, pp 13–22

  31. Pastra K, Saggion H, Wilks Y (2003) Intelligent indexing of crime-scene photographs. IEEE Intell Syst Their Appl 18(1):55–61


  32. Patwardhan S, Banerjee S, Pedersen T (2003) Using measures of semantic relatedness for word sense disambiguation. Proceedings of the Fourth International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, pp 241–257

  33. Rasmussen EM (1997) Indexing images. Annu Rev Inf Sci Technol 32:169–196


  34. Resnik P (1999) Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language. J Artif Intell Res 11:95–130

  35. Rorissa A, Iyer H (2008) Theories of cognition and image categorization: what category labels reveal about basic level theory. J Am Soc Inf Sci Technol 59(9):1383–1392


  36. Shatford S (1986) Analyzing the subject of a picture: a theoretical approach. Cat Classif Q 6(3):39–62


  37. Sidhu T, Klavans JL, Lin J (2007) Concept disambiguation for improved subject access using multiple knowledge sources. Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTech 2007), 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, pp 25–32

  38. Tibbo HR (1994) Indexing for the humanities. J Am Soc Inf Sci 45(8):607–619


  39. Wilks Y, Catizone R (2002) What is lexical tuning? J Semant 19(2):167–190


  40. Yang Y, Liu X (1999) A re-examination of text categorization methods. Proceedings of the 22nd Annual International ACM SIGIR, pp 42–49

  41. Yarowsky D (1994) Decision lists for lexical ambiguity resolution. Proceedings of ACL-94, Las Cruces, NM, pp 88–95

  42. Yarowsky D (1992) Word-sense disambiguation using statistical models of Roget’s categories trained on large corpora. Proceedings of COLING’92 Conference, pp 454–460

Acknowledgements

We acknowledge the Program Office for Scholarly Communications of the Andrew W. Mellon Foundation, especially Don Waters and Suzanne Lodato; Dr. Murtha Baca, director of the Getty Vocabulary Program and Digital Resource Management, Getty Research Institute, for providing us with research access to resources; cataloging and domain expert Angela Giral; collections partners, including Jeff Cohen, Bryn Mawr College and University of Pennsylvania, for the vernacular architecture collection; Jack Sullivan, University of Maryland, for landscape architecture; the Senate Museum and Library; and ARTstor. Finally, Joan Beaudoin (Drexel), Laura Jaeneman (Drexel), and Brooke Rosenblatt (the Phillips Gallery) helped with annotation, collections, and user studies.

Author information

Corresponding author

Correspondence to Carolyn Sheffield.

Additional information

This project, funded by the Andrew W. Mellon Foundation, was initiated at the Center for Research on Information Access at Columbia University and is currently based at the University of Maryland.

About this article

Cite this article

Klavans, J.L., Sheffield, C., Abels, E. et al. Computational linguistics for metadata building (CLiMB): using text mining for the automatic identification, categorization, and disambiguation of subject terms for image metadata. Multimed Tools Appl 42, 115–138 (2009). https://doi.org/10.1007/s11042-008-0253-9

  • DOI: https://doi.org/10.1007/s11042-008-0253-9
