research-article

Automatically characterizing resource quality for educational digital libraries

Authors:
Steven Bethard, University of Colorado, Boulder, CO, USA
Philipp Wetzer, University of Colorado, Boulder, CO, USA
Kirsten Butcher, University of Utah, Salt Lake City, UT, USA
James H. Martin, University of Colorado, Boulder, CO, USA
Tamara Sumner, University of Colorado, Boulder, CO, USA

JCDL '09: Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries, June 2009, Pages 221–230
https://doi.org/10.1145/1555400.1555436

Published: 15 June 2009

ABSTRACT

With the rise of community-generated web content, the need for automatic characterization of resource quality has grown, particularly in the realm of educational digital libraries. We demonstrate how identifying concrete factors of quality for web-based educational resources can make machine learning approaches to automating quality characterization tractable. Using data from several previous studies of quality, we gathered a set of key dimensions and indicators of quality that were commonly identified by educators. We then performed a mixed-method study of digital library curation experts, showing that our characterization of quality captured the subjective processes used by the experts when assessing resource quality for classroom use. Using key indicators of quality selected from a statistical analysis of our expert study data, we developed a set of annotation guidelines and annotated a corpus of 1000 digital resources for the presence or absence of these key quality indicators. Agreement among annotators was high, and initial machine learning models trained from this corpus were able to identify some indicators of quality with as much as an 18% improvement over the baseline.
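
The abstract reports two quantities, annotator agreement and improvement over a baseline, without defining them here. As a minimal sketch only, the Python below assumes one binary present/absent label per resource for a single quality indicator, Cohen's kappa as the agreement statistic, and a majority-class baseline. None of these choices is confirmed by the abstract, all function names are hypothetical, and the labels are made-up toy data rather than the paper's corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (assumed statistic)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both annotators labeled independently at their own rates.
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(labels_a) | set(labels_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def accuracy(gold, predicted):
    """Fraction of resources where the prediction matches the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def majority_baseline_accuracy(gold):
    """Accuracy of always predicting the most frequent gold label (assumed baseline)."""
    return Counter(gold).most_common(1)[0][1] / len(gold)


# Toy data: 1 = indicator present, 0 = indicator absent (illustrative only).
annotator_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
annotator_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
model_predictions = [1, 1, 0, 0, 1, 0, 1, 1, 1, 1]

kappa = cohen_kappa(annotator_a, annotator_b)
baseline = majority_baseline_accuracy(annotator_a)
model = accuracy(annotator_a, model_predictions)
improvement = model - baseline  # absolute gain in accuracy over the baseline

print(f"kappa={kappa:.2f} baseline={baseline:.2f} model={model:.2f} gain={improvement:.2f}")
```

Reporting the gain as an absolute difference in accuracy is itself an assumption; a relative gain or a different metric such as F1 would give different numbers for the same predictions.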

Published in
JCDL '09: Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries
June 2009
502 pages
ISBN: 9781605583228
DOI: 10.1145/1555400
General Chairs: Fred Heath (University of Texas Libraries, USA), Mary Lynn Rice-Lively (University of Texas at Austin, USA)
Program Chair: Richard Furuta (Texas A&M University, USA)
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
educational digital library
learning resource
machine learning
quality