DigitalUM :: Browsing by Subject "Data mining"

Now showing 1 - 7 of 7

Open Access
Adapting Knowledge Inference Algorithms to Measure Geometry Competencies through a Puzzle Game
(ACM, 2023-09-06) Strukova, Sofia; Gómez Mármol, Félix; Ruipérez Valiente, José Antonio; Ingeniería de la Información y las Comunicaciones
The rapid technological evolution of recent years has motivated students to develop capabilities that will prepare them for an unknown future in the 21st century. In this context, many teachers intend to optimise the learning process, making it more dynamic and exciting through the introduction of gamification. Thus, this article focuses on a data-driven assessment of geometry competencies, which are essential for developing problem-solving and higher-order thinking skills. Our main goal is to adapt, evaluate and compare the Bayesian Knowledge Tracing (BKT), Performance Factor Analysis (PFA), Elo, and Deep Knowledge Tracing (DKT) algorithms applied to data from a geometry game named Shadowspect, in order to predict students’ performance, assessed by means of several classifier metrics. We analysed two algorithmic configurations, with and without prioritisation of Knowledge Components (KCs), that is, the skills needed to complete a puzzle successfully, and found Elo to be the algorithm with the best predictive power and the ability to model students’ real knowledge. However, the best results are achieved without KCs because it is challenging to differentiate between KCs effectively in game environments. Our results show that the above-mentioned algorithms can be applied in formal education to improve teaching, learning, and organisational efficiency.
Open Access
Algoritmos de clasificación y redes neuronales en la observación automatizada de registros
(Murcia: Servicio de Publicaciones de la Universidad de Murcia, 2015) González Ruiz, Sergio Luis; Gómez-Gallego, I.; Pastrana Brincones, José Luis; Hernández-Mendo, Antonio
The aim of this study is to analyse data obtained through an online platform using different classification and learning techniques oriented towards knowledge discovery. Data mining techniques are applied to obtain reliability relationships that indicate how interested users are in completing the online questionnaire rigorously, according to the way in which they complete it. Although there are techniques that allow users' behaviour to be observed while they complete the questionnaire, in this case Artificial Neural Networks are used to predict that behaviour from variables obtained as the questionnaire is completed. The sample consists of 1,636 participants from different geographical areas and age ranges, obtained from anonymous or identified responses to the questionnaire Inventario Psicológico para el Seguimiento de Talentos Deportivos (IPSETA). The results obtained with the different analysis techniques indicate that female participants prefer to register on the platform to complete the questionnaire, reaching a high reliability percentage (70%).
Open Access
An open IoT platform for the management and analysis of energy data
(Elsevier, 2019-03-01) Terroso-Sáenz, Fernando; González Vidal, Aurora; Ramallo González, Alfonso Pablo; Skarmeta Gómez, Antonio; Ingeniería de la Información y las Comunicaciones; Facultades de la UMU::Facultad de Informática
Buildings are key players when looking at end-use energy demand. For this reason, during the last few years the Internet of Things (IoT) has been considered a tool that could bring great opportunities for energy reduction via the accurate monitoring and control of a large variety of energy-related agents in buildings. However, there is a lack of IoT platforms specifically oriented towards the proper processing, management and analysis of such large and diverse data. In this context, we put forward in this paper the IoT Energy Platform (IoTEP), which attempts to provide the first holistic solution for the management of IoT energy data. The platform presented here, which is based on FIWARE, can include several functionalities and features that are key when dealing with energy quality insurance and support for data analytics. As part of this work, we have tested IoTEP with a real use case that includes data and information from three buildings totalling hundreds of sensors. The platform has exceeded expectations, proving robust, plastic and versatile for the application at hand.
Open Access
Computational approaches to detect experts in distributed online communities: a case study on Reddit
(Springer, 2024-04) Strukova, Sofia; Gómez Mármol, Félix; Ruipérez Valiente, José Antonio; Ingeniería de la Información y las Comunicaciones
The irreplaceable key to the success of Question & Answer (Q & A) platforms is their users providing high-quality answers to the challenging questions posted across various topics of interest. For more than a decade, the expert finding problem has attracted much attention in information retrieval research. Based on the gaps encountered in expert identification across several Q & A portals, we inspect the feasibility of identifying data science experts in Reddit. Our method is based on manual coding results in which two data science experts labelled not only expert and non-expert comments, but also out-of-scope comments, which is a novel contribution to the literature, enabling the identification of more groups of comments across web portals. We present a semi-supervised approach which combines 1113 labelled comments with 100,226 unlabelled comments during training. We showed that it is possible to develop models that can identify expert, non-expert and out-of-scope comments, with the AUC score peaking at 0.93, accuracy at 0.83, MAE at 0.15 degrees and R2 score at 0.69. The proposed model uses the activity behaviour of every user, including Natural Language Processing (NLP), crowdsourced and user feature sets. We conclude that the NLP and user feature sets contribute the most to the better identification of these three classes, which means that this method can generalise well within the domain. Finally, we make a novel contribution by presenting different types of users in Reddit, which opens many future research directions.
Open Access
Un modelo para estimar las acciones a aplicar a los alumnos de E.S.O.
(Universidad de Murcia. Servicio de Publicaciones, 2008-07-01) Muñoz Ledesma, Antonio; Cadenas Figueredo, José Manuel
We propose a complement to the information provided by the first grades the pupils obtain in Spanish and Mathematics (logical-mathematical and linguistic intelligences). It can be seen that certain parameters related to these abilities have a big impact when determining the decisions that the Guidance, Spanish and Mathematics Departments should take as regards the pupils who have just arrived at school. These decisions can be classified as follows: (1) to give Spanish and Mathematics support lessons; (2) to give Spanish and Mathematics reinforcement lessons; (3) absence of decision; (4) to question performance according to pupils' ability. Using Knowledge Discovery in Databases techniques, it has been confirmed that those parameters are good predictors of the actions to be taken with each pupil, as they give results of over 80% conformity. Thus, we endeavoured, through the values obtained in these parameters, to predict which actions should be adopted in order to improve performance in Spanish and Mathematics from the beginning.
Open Access
Reviewing ensemble classification methods in breast cancer
(Elsevier, 2019-05-20) Hosni, Mohamed; Abnane, Ibtissam; Idri, Ali; Carrillo de Gea, Juan Manuel; Fernández Alemán, José Luis; Informática y Sistemas
Context: Ensemble methods consist of combining more than one single technique to solve the same task. This approach was designed to overcome the weaknesses of single techniques and consolidate their strengths. Ensemble methods are now widely used to carry out prediction tasks (e.g. classification and regression) in several fields, including that of bioinformatics. Researchers have particularly begun to employ ensemble techniques to improve research into breast cancer, as this is the most frequent type of cancer and accounts for most of the deaths among women. Objective and method: The goal of this study is to analyse the state of the art in ensemble classification methods when applied to breast cancer as regards 9 aspects: publication venues, medical tasks tackled, empirical and research types adopted, types of ensembles proposed, single techniques used to construct the ensembles, validation framework adopted to evaluate the proposed ensembles, tools used to build the ensembles, and optimization methods used for the single techniques. This paper was undertaken as a systematic mapping study. Results: A total of 193 papers that were published from the year 2000 onwards were selected from four online databases: IEEE Xplore, ACM digital library, Scopus and PubMed. This study found that of the six medical tasks that exist, the diagnosis medical task was that most frequently researched, and that the experiment-based empirical type and evaluation-based research type were the most dominant approaches adopted in the selected studies. The homogeneous type was that most widely used to perform the classification task. With regard to single techniques, this mapping study found that decision trees, support vector machines and artificial neural networks were those most frequently adopted to build ensemble classifiers. In the case of the evaluation framework, the Wisconsin Breast Cancer dataset was the most frequently used by researchers to perform their experiments, while the most noticeable validation method was k-fold cross-validation. Several tools are available to perform experiments related to ensemble classification methods, such as Weka and R Software. Few researchers took into account the optimisation of the single technique of which their proposed ensemble was composed, while the grid search method was that most frequently adopted to tune the parameter settings of a single classifier. Conclusion: This paper reports an in-depth study of the application of ensemble methods as regards breast cancer. Our results show that there are several gaps and issues and we, therefore, provide researchers in the field of breast cancer research with recommendations. Moreover, after analysing the papers found in this systematic mapping study, we discovered that the majority report positive results concerning the accuracy of ensemble classifiers when compared to the single classifiers. In order to aggregate the evidence reported in literature, it will, therefore, be necessary to perform a systematic literature review and meta-analysis in which an in-depth analysis could be conducted so as to confirm the superiority of ensemble classifiers over the classical techniques.
Open Access
Técnicas de clasificación de data mining: una aplicación al consumo de tabaco en adolescentes
(Murcia: Universidad de Murcia, Editum, 2014-05) Montaño-Moreno, Juan J.; Gervilla-García, Elena; Cajal-Blasco, Berta; Palmer, Alfonso
This study aims to analyse the predictive power of several psychosocial and personality variables on nicotine use or non-use in the adolescent population, using several classification techniques from the Data Mining methodology. More specifically, it analyses Artificial Neural Networks (ANNs), namely the Multilayer Perceptron (MLP), Radial Basis Functions (RBF) and Probabilistic Neural Networks (PNN), together with decision trees, the logistic regression model and discriminant analysis. To this end, the study worked with a sample of 2666 adolescents, of whom 1378 do not use nicotine and 1288 are nicotine users. The models analysed discriminated correctly between the two types of subject at rates ranging from 77.39% to 78.20%, reaching a sensitivity of 91.29% and a specificity of 74.32%. This study makes available to specialists in addictive behaviours a set of advanced statistical techniques capable of handling a large number of variables and subjects simultaneously, as well as learning complex patterns and relationships automatically, which makes them well suited to the prediction and prevention of addictive behaviour.