Time-focused analysis of connectivity and popularity of historical persons in Wikipedia

International Journal on Digital Libraries

Abstract

Wikipedia contains a large amount of content related to history and is used extensively for knowledge-intensive tasks in computer science, digital humanities and related fields. In this paper, we examine Wikipedia articles on historical persons and study their link-related temporal features. Our study sheds new light on the characteristics of information about historical persons recorded in the English Wikipedia and quantifies user interest in such data. We propose a novel style of analysis that combines signals derived from the hyperlink structure of Wikipedia with signals from article view logs and overlays them on the temporal dimension to understand the relations between time periods, link structure and article popularity. In the latter part of the paper, we also demonstrate several ways of estimating the importance of persons based on the temporal aspects of the link structure, as well as a method for ranking cities using the computed importance scores of their related persons.
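To make the kind of analysis described above more concrete, the following is a minimal sketch in Python. It is not the method used in the paper: the toy data, the function names `links_by_direction` and `rank_cities`, the century-sized time bins, and the use of in-link counts as a stand-in importance score are all assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: person -> (birth year, associated city).
PEOPLE = {
    "A": (1450, "Florence"),
    "B": (1475, "Florence"),
    "C": (1600, "London"),
    "D": (1640, "London"),
}
# Hypothetical directed links between person articles: (source, target).
LINKS = [("A", "B"), ("B", "A"), ("C", "A"), ("D", "C"), ("D", "B"), ("C", "D")]


def links_by_direction(people, links, bin_size=100):
    """Count outgoing links per time bin of the source person, split by
    whether the target was born earlier ("past") or not ("future")."""
    counts = defaultdict(Counter)
    for src, dst in links:
        src_year, dst_year = people[src][0], people[dst][0]
        period = (src_year // bin_size) * bin_size
        # Assumption: equal birth years are counted as "future".
        direction = "past" if dst_year < src_year else "future"
        counts[period][direction] += 1
    return {period: dict(c) for period, c in counts.items()}


def rank_cities(people, links):
    """Rank cities by the summed in-link counts of their related persons.
    In-degree is only a placeholder for a real importance score."""
    in_degree = Counter(dst for _, dst in links)
    city_score = Counter()
    for person, (_, city) in people.items():
        city_score[city] += in_degree[person]
    return city_score.most_common()


if __name__ == "__main__":
    print(links_by_direction(PEOPLE, LINKS))
    print(rank_cities(PEOPLE, LINKS))
```

On the toy data, the first function groups links by the century of the linking person and the second aggregates per-person scores into a city ranking. Any importance measure computed per person could be substituted for the in-link count without changing the aggregation step.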

Notes

  1. https://books.google.com/ngrams/datasets.

  2. https://dumps.wikimedia.org/enwiki.

  3. https://pypi.python.org/pypi/beautifulsoup4.

  4. Note that the number of possible links from articles on “future people” naturally decreases as we approach the latest decade, because fewer articles exist on people from recent times from which such links could originate. The same applies to links from articles on “past people” when moving away from the present toward the distant past.

  5. Due to their large size, we do not show networks for the most recent centuries.

  6. https://dumps.wikimedia.org/other/pagecounts-raw/.

  7. https://www.wikidata.org/wiki/Wikidata:Main_Page.

Acknowledgements

This research was supported in part by MEXT Grants-in-Aid for Scientific Research (#17H01828, #15K12158) and by MIC/SCOPE (#171507010).

Author information

Corresponding author

Correspondence to Adam Jatowt.

About this article

Cite this article

Jatowt, A., Kawai, D. & Tanaka, K. Time-focused analysis of connectivity and popularity of historical persons in Wikipedia. Int J Digit Libr 20, 287–305 (2019). https://doi.org/10.1007/s00799-018-0231-4

  • DOI: https://doi.org/10.1007/s00799-018-0231-4
