Relative Quality Assessment of Wikipedia Articles in Different Languages Using Synthetic Measure

  • Conference paper

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 303)

Abstract

The online encyclopedia Wikipedia is one of the most popular sources of knowledge, yet it is often criticized for poor information quality. Articles can be created and edited, even by anonymous users, independently in almost 300 languages. As a result, the information quality of articles on the same topic differs across language versions. The Wikipedia community has created a system for assessing article quality, which can help in deciding which language version is more complete and correct. This system has several limitations: each language version of Wikipedia can use its own grading scheme, and there is usually a large number of unevaluated articles. In this paper, we propose a synthetic measure for automatic quality evaluation of articles in different languages, based on important features.

Notes

  1. https://meta.wikimedia.org/wiki/List_of_Wikipedias
  2. http://wikirank.net
  3. https://en.wikipedia.org/wiki/List_of_Wikipedias

Author information

Corresponding author

Correspondence to Włodzimierz Lewoniewski.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Lewoniewski, W., Węcel, K. (2017). Relative Quality Assessment of Wikipedia Articles in Different Languages Using Synthetic Measure. In: Abramowicz, W. (eds) Business Information Systems Workshops. BIS 2017. Lecture Notes in Business Information Processing, vol 303. Springer, Cham. https://doi.org/10.1007/978-3-319-69023-0_24
