COMPARATIVE ANALYSIS OF APPROACHES AND METHODS FOR SENTIMENT ANALYSIS OF TEXT IN THE CONTEXT OF PROCESSING CITY RESIDENTS’ FEEDBACK

Authors

DOI:

https://doi.org/10.35546/kntu2078-4481.2025.2.2.23

Keywords:

sentiment analysis, lexicon, rule-based approach, hybrid methods, transformer models, natural language

Abstract

The article presents a comprehensive comparative analysis of contemporary approaches and methods used to determine the emotional tone of text, with a particular emphasis on Ukrainian-language content. The relevance of this study stems from the growing need for effective tools to process citizen feedback collected via digital platforms, mobile applications, and social media – all of which serve as valuable sources of information for improving urban governance and service delivery. The research aims to address both methodological and linguistic gaps in sentiment analysis for the Ukrainian language, which, unlike English, remains under-resourced in terms of lexicons, corpora, and pretrained models.

The study systematizes sentiment analysis methods into four main paradigms: those relying on predefined lexicons, those governed by linguistic rules, data-driven learning techniques, and integrated models that combine multiple strategies.

For each paradigm, the theoretical foundations, typical algorithms, linguistic tools, and practical application examples are discussed. Lexicon-based methods, such as those utilizing the NRC EmoLex dictionary and corpus tools like Sketch Engine, are noted for their simplicity and adaptability to low-resource environments. Rule-based systems, including VADER and LIWC, stand out for their ability to account for syntactic structure, intensifiers, and negations, offering better interpretability, albeit with limited language generalizability.

The section on machine learning explores both traditional classification algorithms – including models based on decision boundaries, probabilistic inference, and tree-like structures – and modern deep learning architectures, such as multilayer neural networks. The maximum entropy approach is also examined as a representative of statistical modeling that requires minimal assumptions about input features.

Particular attention is given to recent advancements in deep neural networks, namely convolutional neural networks (CNN), recurrent LSTM networks, and transformer-based architectures such as BERT, RoBERTa, and GPT. Empirical results from recent studies confirm the high effectiveness of transformer models in multilingual contexts, especially in sentiment analysis of Ukrainian texts, where models like XLM-RoBERTa and Ukr-RoBERTa have achieved accuracy levels exceeding 91%.

The final part of the article discusses hybrid models that combine the strengths of different paradigms to enhance robustness, accuracy, and adaptability to domain-specific data. The proposed classification framework, developed on the basis of the conducted research, provides a coherent overview of existing methods and serves as a methodological foundation for the development of intelligent decision-support systems in urban governance with active citizen participation.

The conclusions outline directions for further research, including the localization of deep learning models for Ukrainian, integration of sentiment analysis with topic modeling and named entity recognition, and the application of semi-supervised and active learning techniques to improve outcomes in scenarios with limited annotated data. Overall, the proposed taxonomy not only reflects the current state of sentiment analysis technologies but also outlines future development trajectories in multilingual and socially oriented applications.
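
To make the lexicon-based and rule-based paradigms named in the abstract concrete, the following is a minimal sketch of how a polarity lexicon can be combined with negation and intensifier rules. It is not the article's method and does not reproduce VADER, LIWC, or NRC EmoLex: the word lists, weights, the two-token look-back window, and the function names `score` and `label` are all assumptions chosen for illustration, and English words are used only for readability.

```python
# Toy resources: every entry and weight here is a made-up illustration,
# not taken from NRC EmoLex, VADER, LIWC, or the article.
LEXICON = {
    "good": 1.0, "clean": 1.0, "helpful": 1.0,
    "bad": -1.0, "dirty": -1.0, "broken": -1.0,
}
NEGATIONS = {"not", "no", "never"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}


def score(text: str) -> float:
    """Sum the polarity of lexicon words, adjusted by nearby modifiers.

    For each lexicon word, the two preceding tokens are inspected:
    an intensifier scales the polarity, a negation flips its sign.
    """
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        value = LEXICON[token]
        # Hypothetical rule: look back at most two tokens for modifiers.
        for previous in tokens[max(0, i - 2):i]:
            if previous in INTENSIFIERS:
                value *= INTENSIFIERS[previous]
            elif previous in NEGATIONS:
                value = -value
        total += value
    return total


def label(text: str) -> str:
    """Map the numeric score to a three-way sentiment label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for feedback in ("The park is very clean.",
                     "The road is not good.",
                     "The bus arrived."):
        print(feedback, "->", label(feedback))
```

A sketch like this shows why such systems are easy to interpret (each score traces back to specific words and rules) and why they generalize poorly across languages: the lexicon and the modifier rules must be rebuilt for each language, which is the resource gap the abstract describes for Ukrainian.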

References

Cambria, E., Schuller, B., Xia, Y., & Havasi, C. (2013). New avenues in opinion mining and sentiment analysis. IEEE Intelligent Systems, 28(2), 15–21.

Kotsyba, N., Romanyshyn, O., & Shevchuk, O. (2021). Challenges in Sentiment Analysis for the Ukrainian Language. CEUR Workshop Proceedings, 2917, 118–126.

Vargas-Sierra, C., & Orts, M. (2023). Sentiment and emotion in financial journalism: a corpus-based, cross-linguistic analysis of the effects of COVID. Humanities and Social Sciences Communications, 10. https://doi.org/10.1057/s41599-023-01725-8

Abdulla, N. A., Ahmed, N. A., Shehab, M. A., & Al-Ayyoub, M. (2013). Arabic sentiment analysis: lexicon-based and corpus-based. IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT).

Abiola, O., Abayomi-Alli, A., Tale, O. A., et al. (2023). Sentiment analysis of COVID-19 tweets from selected hashtags in Nigeria using VADER and Text Blob analyser. Journal of Electrical Systems and Information Technology, 10, 5. https://doi.org/10.1186/s43067-023-00070-9

Karyawati, A. E., Utomo, P. A., & Wibawa, I. G. A. (2022). Comparison of SVM and LIWC for Sentiment Analysis of SARA. IJCCS (Indonesian Journal of Computing and Cybernetics Systems), 16(1), 45–54.

Birjali, M., Kasri, M., & Beni-Hssane, A. (2021). A comprehensive survey on sentiment analysis: Approaches, challenges and trends. Knowledge-Based Systems, 226, 107134.

Khan, M. J., Abbas, Q., & Hussain, M. (2024). Comparative study of supervised learning techniques for sentiment classification. Computers, Materials & Continua, 78(1), 1–16. https://doi.org/10.32604/cmc.2023.043301

Aliman, N. M., Mustaffa, M., Jambari, H., & Jusoh, M. S. (2022). Performance evaluation of supervised learning algorithms for sentiment analysis: A comparative study. Journal of Engineering and Applied Sciences, 17(5), 143–150.

Lu, W., Wang, J., Wang, Y., Zhou, Y., & Qin, H. (2021). Sentiment analysis of social media texts with deep learning models and attention mechanism. Information, 12(10), 392. https://doi.org/10.3390/info12100392

Wabang, G. S., Ahmad, T., & Wijaya, D. E. (2022). Application of the Naive Bayes Classifier Algorithm to Classify Community Complaints. Journal of Physics: Conference Series, 2180(1), 012045. https://doi.org/10.1088/1742-6596/2180/1/012045

Rhohmawati, A., Sari, R. F., & Puspitasari, R. (2019). Sentiment analysis using maximum entropy for Shopee reviews. In 2019 4th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE) (pp. 237–242). IEEE.

Kim, J., & Jeong, Y. (2019). Sentiment classification using convolutional neural networks. International Journal of Advanced Computer Science and Applications, 10(5), 303–308.

Tholusuri, H., Gadde, P. R., & Sista, S. R. (2019). Sentiment analysis using LSTM with IMDB dataset. In 2019 4th International Conference on Communication and Electronics Systems (ICCES) (pp. 1192–1195). IEEE.

Roccabruna, S., Nesi, P., & Pantaleo, G. (2022). A comparison of BERT-based models for sentiment analysis in social media. Applied Sciences, 12(6), 3016. https://doi.org/10.3390/app12063016

Prytula, Y. (2024). Evaluation of transformer-based models for sentiment analysis of Ukrainian texts. Proceedings of the International Conference on Computational Linguistics and Intelligent Systems (COLINS), 2024.

Kheiri, M., & Karimi, M. (2023). SentimentGPT: Prompt-based sentiment analysis using generative pre-trained transformers. Journal of Big Data, 10, Article 42. https://doi.org/10.1186/s40537-023-00708-4

Riaz, S., Fatima, M., Kamran, M., & Nisar, M. W. (2019). Opinion mining on large scale data using sentiment analysis and k-means clustering. Cluster Computing, 22, 7149–7164.

Andriyani, F., & Puspitarani, Y. (2022). Performance Comparison of K-Means and DBScan Algorithms for Text Clustering Product Reviews. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 6(3), 944–949. https://doi.org/10.33395/sinkron.v7i3.11569

Xue, J., Chen, J., Chen, C., Zheng, C., Li, S., & Zhu, T. (2020). Public discourse and sentiment during the COVID-19 pandemic: Using Latent Dirichlet Allocation for topic modeling on Twitter. PLoS ONE, 15(9), e0239441.

Published

2025-06-05