ICADL 2007 - LNCS 4822
   

Using Automatic Metadata Extraction to Build a Structured Syllabus Repository

Xiaoyan Yu¹, Manas Tungare¹, Weiguo Fan¹, Manuel Pérez-Quiñones¹, Edward A. Fox¹, William Cameron², and Lillian Cassel²

¹Virginia Tech, Blacksburg VA 24061, USA
xiaoyany@vt.edu
manas@vt.edu
wfan@vt.edu
perez@vt.edu
fox@vt.edu
http://syllabus.cs.vt.edu/

²Villanova University, Villanova PA 19085, USA
william.cameron@villanova.edu
lillian.cassel@villanova.edu

Abstract. Syllabi are important documents created by instructors for students. Gathering freely available syllabi and building useful services on top of the collection will yield a digital library of value to the educational community. However, gathering syllabi and building a repository of them is complicated by the unstructured nature of syllabus representation and the lack of a unified vocabulary for syllabus construction. In this paper, we propose an intelligent approach that automatically annotates freely available syllabi from the Web, benefiting the educational community by supporting services such as semantic search. We describe in detail our process for converting unstructured syllabi to structured representations through entity recognition, segmentation, and association. Our evaluation results demonstrate the effectiveness of our extractor and also suggest improvements. We hope our work will benefit not only users of our services but also people who are interested in building other genre-specific repositories.

LNCS 4822, p. 337 ff.


© Springer-Verlag Berlin Heidelberg 2007