Browsing by Subject "Feature engineering"
- Publication, Open Access: Compilation and evaluation of the Spanish SatiCorpus 2021 for satire identification using linguistic features and transformers (Springer, 2021-12-17). García Díaz, José Antonio; Valencia García, Rafael. Informática y Sistemas; Facultades de la UMU::Facultad de Informática
- Publication, Open Access: Evaluating feature combination strategies for hate-speech detection in Spanish using linguistic features and transformers (Springer, 2023). García Díaz, José Antonio; Jiménez Zafra, Salud María; García Cumbreras, Miguel Ángel; Valencia García, Rafael. Informática y Sistemas; Facultades de la UMU::Facultad de Informática
  The rise of social networks has allowed misogynistic, xenophobic, and homophobic people to spread hate-speech to intimidate individuals or groups because of their gender, ethnicity, or sexual orientation. The consequences of hate-speech are devastating, causing severe depression and even leading people to commit suicide. Hate-speech identification is challenging because the large number of daily publications makes it impossible to review every comment by hand. Moreover, hate-speech is also spread through hoaxes, which require language and context understanding to detect. With the aim of reducing the number of comments that must be reviewed by experts, or even of developing autonomous systems, the automatic identification of hate-speech has gained academic relevance. However, the reliability of automatic approaches is still limited, especially in languages other than English, for which some state-of-the-art techniques have not been analyzed in detail. In this work, we examine which features are most effective in identifying hate-speech in Spanish and how these features can be combined to develop more accurate systems. In addition, we characterize the language present in each type of hate-speech by means of explainable linguistic features and compare our results with state-of-the-art approaches. Our research indicates that combining linguistic features and transformers by means of knowledge integration outperforms current solutions for hate-speech identification in Spanish.
- Publication, Open Access: Smart analysis of economics sentiment in Spanish based on linguistic features and transformers (IEEE, 2023-02-10). García Díaz, José Antonio; García-Sánchez, Francisco; Valencia García, Rafael. Informática y Sistemas; Facultad de Informática
  Texts related to economics and finance are characterized by the use of words and expressions whose meaning (and the sentiments they convey) depends substantially on the context. This poses a major challenge to Natural Language Processing tasks in general, and Sentiment Analysis in particular. For low-resource languages such as Spanish, this situation becomes even more acute. Yet the latest advancements in the field, including word embeddings and transformers, have made it possible to boost the performance of Sentiment Analysis solutions. In this work we explore the impact of combining different feature sets on the accuracy of Sentiment Analysis in Spanish financial texts. For this, a corpus of 15,915 tweets has been compiled and manually annotated as positive, negative, or neutral. Then, feature sets based on contextual and non-contextual embeddings, along with linguistic features, were evaluated both individually and combined. The best results, with a weighted F1-score of 73.15880%, were obtained with a combination of feature sets by means of knowledge integration.