Publication: Automatic indexing of scientific articles on Library and Information Science with SISA, KEA and MAUI
Authors
Gil-Leiva, Isidoro ; Díaz Ortuño, Pedro ; Fernandes Correa, Renato
Publisher
Consejo Superior de Investigaciones Científicas
DOI
doi.org/10.3989/redc.2022.4.1917
Type
info:eu-repo/semantics/article
Description
© 2022 CSIC. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license: http://creativecommons.org/licenses/by/4.0/
This document is the accepted version of a published work that appeared in final form in Revista Española de Documentación Científica. To access the final edited and published work, see doi.org/10.3989/redc.2022.4.1917
Abstract
This article evaluates three automatic indexing systems, SISA (Automatic Indexing System), KEA (Keyphrase Extraction Algorithm) and MAUI (Multi-Purpose Automatic Topic Indexing), to determine how they perform relative to human indexing. SISA's algorithm is based on rules about the position of terms in the different structural components of the document, while the KEA and MAUI algorithms are based on machine learning and the statistical features of terms. For evaluation purposes, a collection of 230 scientific articles from the Revista Española de Documentación Científica, published by the Consejo Superior de Investigaciones Científicas (CSIC), was used; 30 of these articles were used for training and were not part of the evaluation test set. The articles were written in Spanish and indexed by human indexers using a controlled vocabulary in the InDICES database, which also belongs to the CSIC. The human indexing of these documents constitutes the baseline, or gold-standard indexing, against which the output of the automatic indexing systems is evaluated by comparing term sets using the metrics of precision, recall, F-measure and consistency. The results show that the SISA system performs best, followed by KEA and MAUI.
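The term-set comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `evaluate_indexing`, the case-insensitive exact matching of terms, the balanced F-measure (F1), and the use of Hooper's formula for consistency (shared terms divided by the union of both term sets) are all assumptions, since the abstract does not state which variants were applied.

```python
def evaluate_indexing(automatic_terms, human_terms):
    """Compare automatically assigned terms with human (gold-standard) terms.

    Assumptions (not taken from the article): terms match when they are equal
    after trimming and lowercasing, the F-measure is the balanced F1, and
    consistency follows Hooper's formula, common / (auto + gold - common).
    """
    auto = {t.strip().lower() for t in automatic_terms}
    gold = {t.strip().lower() for t in human_terms}

    common = len(auto & gold)  # terms assigned by both system and indexer
    union = len(auto | gold)   # all distinct terms assigned by either

    precision = common / len(auto) if auto else 0.0
    recall = common / len(gold) if gold else 0.0
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    consistency = common / union if union else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "consistency": consistency,
    }


# Hypothetical term sets for one article, for illustration only.
system_terms = ["indexing", "thesauri", "machine learning", "libraries"]
indexer_terms = ["Indexing", "machine learning", "evaluation"]
scores = evaluate_indexing(system_terms, indexer_terms)
```

With these illustrative sets, two of the four system terms match the indexer's three terms, so precision is 0.5, recall is 2/3, and consistency is 2/5. In a full evaluation, such per-document scores would typically be averaged over the test collection.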
Citation
Revista Española de Documentación Científica, Vol. 45, No. 4, e338
This item is subject to a Creative Commons license. http://creativecommons.org/licenses/by/4.0/