Publication:
Automatic indexing of scientific articles on Library and Information Science with SISA, KEA and MAUI

Authors
Gil-Leiva, Isidoro ; Díaz Ortuño, Pedro ; Fernandes Correa, Renato
Publisher
Consejo Superior de Investigaciones Científicas
DOI
doi.org/10.3989/redc.2022.4.1917
Type
info:eu-repo/semantics/article
Description
© 2022 CSIC. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license (http://creativecommons.org/licenses/by/4.0/). This document is the Accepted version of a Published Work that appeared in final form in Revista Española de Documentación Científica. To access the final edited and published work, see doi.org/10.3989/redc.2022.4.1917
Abstract
This article evaluates the automatic indexing systems SISA (Automatic Indexing System), KEA (Keyphrase Extraction Algorithm) and MAUI (Multi-Purpose Automatic Topic Indexing) to determine how they perform relative to human indexing. SISA's algorithm is based on rules about the position of terms in the different structural components of the document, while the KEA and MAUI algorithms are based on machine learning and the statistical features of terms. For the evaluation, a collection of 230 scientific articles from the Revista Española de Documentación Científica, published by the Consejo Superior de Investigaciones Científicas (CSIC), was used; 30 of these were used for training and were not part of the evaluation test set. The articles were written in Spanish and indexed by human indexers using a controlled vocabulary in the InDICES database, which also belongs to the CSIC. The human indexing of these documents constitutes the baseline, or gold standard, against which the output of the automatic indexing systems is evaluated by comparing term sets using the metrics of precision, recall, F-measure and consistency. The results show that the SISA system performs best, followed by KEA and MAUI.
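As an illustration of the term-set comparison described in the abstract, the following minimal sketch computes the four metrics for one document. It is not the authors' evaluation code: the function name `evaluate_indexing`, the lowercase normalization, and the sample terms are all hypothetical. The abstract does not state which consistency formula was used, so this sketch assumes the common Hooper-style measure (shared terms divided by all distinct terms assigned by either indexer).

```python
def evaluate_indexing(automatic_terms, human_terms):
    """Compare automatically assigned terms with human (gold) terms.

    Hypothetical helper for illustration; terms are matched exactly
    after trimming whitespace and lowercasing (an assumption).
    """
    auto_set = {t.strip().lower() for t in automatic_terms}
    gold_set = {t.strip().lower() for t in human_terms}
    common = auto_set & gold_set

    # Precision: share of automatic terms that the human indexer also chose.
    precision = len(common) / len(auto_set) if auto_set else 0.0
    # Recall: share of human terms that the automatic system recovered.
    recall = len(common) / len(gold_set) if gold_set else 0.0
    # F-measure: harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    # Consistency (assumed Hooper-style): shared terms over all distinct terms.
    union = auto_set | gold_set
    consistency = len(common) / len(union) if union else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "consistency": consistency,
    }


# Made-up example terms, not taken from the study's data.
scores = evaluate_indexing(
    ["indexing", "thesauri", "machine learning", "evaluation"],
    ["Indexing", "evaluation", "controlled vocabulary"],
)
print(scores)
```

In a full evaluation, scores like these would typically be computed per article and then averaged over the test set for each system.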
Citation
Revista Española de Documentación Científica, Vol. 45, No. 4, e338