Abstract
Syllabi are rich educational resources. However, searching for Computer Science syllabi with a generic search engine does not work well. Towards our goal of building a syllabus collection, we have trained various Decision Tree, Naive Bayes, Support Vector Machine, and Feed-Forward Neural Network classifiers to distinguish Computer Science syllabi from other web pages. We have also trained our classifiers to distinguish between Artificial Intelligence and Software Engineering syllabi. Our best classifiers are 95% accurate at both tasks. We present an analysis of the various feature selection methods and classifiers we used, hoping to help others developing their own collections.
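To make the classification task concrete, the following is a minimal sketch of one of the classifier families named above: a multinomial Naive Bayes text classifier with add-one smoothing, written with the Python standard library only. It is not the authors' implementation. The class name, the tokenizer, and the four toy training documents are hypothetical and chosen only for illustration; the paper's actual features, feature selection methods, and data are not reproduced here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        # Count documents per class and word occurrences per class.
        self.class_doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, not taken from the paper.
train_docs = [
    "course syllabus grading policy textbook lectures office hours",
    "syllabus prerequisites assignments exams schedule instructor",
    "buy cheap laptops online free shipping today",
    "latest football scores and match highlights",
]
train_labels = ["syllabus", "syllabus", "other", "other"]

classifier = NaiveBayesTextClassifier().fit(train_docs, train_labels)
print(classifier.predict("syllabus for algorithms course with exams and grading"))
print(classifier.predict("cheap laptops with free shipping"))
```

The same two-class setup could in principle be reused for the second task described in the abstract by swapping the labels for the two course areas, although a realistic system would need far more training data and some form of feature selection.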
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rathod, N., Cassel, L.N. (2012). Machine Learning in Building a Collection of Computer Science Course Syllabi. In: Zaphiris, P., Buchanan, G., Rasmussen, E., Loizides, F. (eds) Theory and Practice of Digital Libraries. TPDL 2012. Lecture Notes in Computer Science, vol 7489. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33290-6_38
DOI: https://doi.org/10.1007/978-3-642-33290-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33289-0
Online ISBN: 978-3-642-33290-6
eBook Packages: Computer Science, Computer Science (R0)