ICADL 2007 - LNCS 4822
Personal Name Disambiguation in Web Search Results Based on a Semi-supervised Clustering Approach
Kazunari Sugiyama and Manabu Okumura
Precision and Intelligence Laboratory, Tokyo Institute of Technology, 4259 Nagatsuta, Midori, Yokohama, Kanagawa 226-8503, Japan
sugiyama@lr.pi.titech.ac.jp
oku@pi.titech.ac.jp
Abstract. Most previous work on disambiguating personal names in Web search results employs agglomerative clustering approaches. In contrast, we adopt a semi-supervised clustering approach in order to guide the clustering more appropriately. Our proposed semi-supervised clustering approach is novel in that it controls the fluctuation of the centroid of a cluster. It achieved a purity of 0.72 and an inverse purity of 0.81, with a harmonic mean F of 0.76.
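The three evaluation figures quoted in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used definitions of purity and inverse purity for clustering evaluation; the abstract does not spell out the exact formulas, and the function names and toy data here are hypothetical, not taken from the paper.

```python
from collections import Counter


def purity(clusters, gold):
    """Share of items that belong to the majority gold class of their cluster.

    clusters: list of non-empty lists of item ids (system output).
    gold: dict mapping every item id to its true label.
    """
    total = sum(len(cluster) for cluster in clusters)
    matched = sum(
        max(Counter(gold[item] for item in cluster).values())
        for cluster in clusters
    )
    return matched / total


def inverse_purity(clusters, gold):
    """Purity with the roles of system clusters and gold classes swapped."""
    assigned = {item: idx for idx, cluster in enumerate(clusters) for item in cluster}
    classes = {}
    for item, label in gold.items():
        classes.setdefault(label, []).append(item)
    return purity(list(classes.values()), assigned)


def harmonic_mean(p, ip):
    """Harmonic mean F of purity and inverse purity."""
    return 2 * p * ip / (p + ip) if (p + ip) else 0.0


# Toy data (hypothetical): five Web pages, two true persons "A" and "B".
toy_clusters = [["d1", "d2", "d3"], ["d4", "d5"]]
toy_gold = {"d1": "A", "d2": "A", "d3": "B", "d4": "B", "d5": "B"}

toy_purity = purity(toy_clusters, toy_gold)
toy_inverse_purity = inverse_purity(toy_clusters, toy_gold)

# Harmonic mean of the purity and inverse purity reported in the abstract.
reported_f = harmonic_mean(0.72, 0.81)
```

Under these assumed definitions, the harmonic mean of 0.72 and 0.81 rounds to the reported F of 0.76.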
Keywords: Information retrieval, Semi-supervised clustering, Personal name disambiguation
LNCS 4822, p. 250 ff.
© Springer-Verlag Berlin Heidelberg 2007