ICADL 2007 - LNCS 4822
   

Identification of FRBR Works Within Bibliographic Databases: An Experiment with UNIMARC and Duplicate Detection Techniques

Nuno Freire, José Borbinha, and Pável Calado

INESC-ID, Rua Alves Redol 9, Apartado 13069, 1000-029 Lisboa, Portugal
nuno.freire@ist.utl.pt
jlb@ist.utl.pt
pavel.calado@tagus.ist.utl.pt

Abstract. Many experiments and studies have examined FRBR as an implementation model for bibliographic databases, with the aim of improving resource discovery services and giving users a clearer view of the information spaces represented in catalogues. One such application is the identification of the FRBR work instances shared by several bibliographic records. In this paper we evaluate how well techniques based on string similarity, used in duplicate detection mainly by the database research community, apply to this problem. We describe the particularities of applying these techniques to bibliographic data, and empirically compare their results with those of current techniques, which are based on exact matching. Experiments performed on the Portuguese national union catalogue show a significant improvement over currently used approaches.

Keywords: Functional Requirements for Bibliographic Records, FRBR, Bibliographic databases, string similarity, duplicate detection
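
The contrast drawn in the abstract between exact matching and string-similarity matching can be illustrated with a minimal sketch. It is not the authors' method: the record fields (`id`, `author`, `title`), the sample records, the 0.85 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration, and real UNIMARC records would first need their author and title fields extracted.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(cleaned.split())


def work_key(record):
    """Hypothetical work key: normalized author plus normalized title."""
    return normalize(record["author"]) + " | " + normalize(record["title"])


def group_exact(records):
    """Group records whose work keys are identical (exact matching)."""
    groups = {}
    for rec in records:
        groups.setdefault(work_key(rec), []).append(rec["id"])
    return list(groups.values())


def group_similar(records, threshold=0.85):
    """Greedy grouping: a record joins the first group whose representative
    key is at least `threshold` similar; otherwise it starts a new group."""
    groups = []  # list of (representative_key, [record ids])
    for rec in records:
        key = work_key(rec)
        for rep, ids in groups:
            if SequenceMatcher(None, rep, key).ratio() >= threshold:
                ids.append(rec["id"])
                break
        else:
            groups.append((key, [rec["id"]]))
    return [ids for _, ids in groups]


# Illustrative records only (assumed, not taken from the catalogue).
records = [
    {"id": 1, "author": "Saramago, José", "title": "Ensaio sobre a cegueira"},
    {"id": 2, "author": "Saramago, Jose", "title": "Ensaio sobre a cegueira"},
    {"id": 3, "author": "Saramago, José", "title": "Ensaio sobre a cegeira"},
    {"id": 4, "author": "Pessoa, Fernando", "title": "Mensagem"},
]

print(group_exact(records))    # [[1, 2], [3], [4]]
print(group_similar(records))  # [[1, 2, 3], [4]]
```

In this toy data, exact matching leaves the record with the misspelled title in a group of its own, while the similarity-based grouping attaches it to the same work. The greedy single-pass grouping is chosen for brevity; it compares each record with every existing group and would need a blocking or indexing step to scale to a full catalogue.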

LNCS 4822, p. 267 ff.

© Springer-Verlag Berlin Heidelberg 2007