RESEARCH OF MACHINE LEARNING METHODS FOR SEARCH INFORMATION
DOI:
https://doi.org/10.32782/KNTU2618-0340/2021.4.2.2.7
Keywords:
machine learning, clustering, collaborative filtering, search for associative rules
Abstract
The advantage of using machine learning in search is that the search engine can learn from its users and return more personalized answers instead of generic results. Well-known search engines have used such algorithms for a long time and improve them constantly. This work studies, through examples, the machine learning methods and algorithms used for information search, along with their advantages and disadvantages. Three groups of methods were chosen: collaborative filtering, clustering, and association rule mining (the search for associative rules). The two main approaches to collaborative filtering, correlation models and latent models, are considered. The correlation models are user similarity filtering (user-based filtering) and link similarity filtering (item-based filtering); worked examples show how both algorithms operate. Link similarity filtering predicts a rating for one link from the ratings of another link, using either regression analysis or a simplified predictor called the Slope One algorithm. Three metrics for computing the similarity coefficient between users in user-based filtering are considered: Euclidean distance, the cosine coefficient, and the Pearson correlation coefficient. As latent models, the following clustering algorithms are considered: biclustering, DBSCAN (density-based spatial clustering of applications with noise), and fuzzy c-means. All of these algorithms group data into clusters according to a chosen criterion. Association rule mining is illustrated with the Apriori algorithm, which generates rules from all frequent itemsets found in the database of search queries that meet the specified matching criterion. To apply the algorithm, the data were converted to binary form and placed in a corresponding data structure.
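The three user similarity metrics named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the mapping of Euclidean distance to a similarity score via 1 / (1 + d), and the ratings for `alice` and `bob` are all assumptions made for illustration.

```python
import math

def common_items(a, b):
    """Return the links rated by both users."""
    return [k for k in a if k in b]

def euclidean_similarity(a, b):
    """Turn Euclidean distance over shared links into a score in (0, 1]."""
    shared = common_items(a, b)
    if not shared:
        return 0.0
    dist = math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))
    return 1.0 / (1.0 + dist)  # assumed mapping: identical ratings give 1.0

def cosine_similarity(a, b):
    """Cosine of the angle between the two rating vectors on shared links."""
    shared = common_items(a, b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(a[k] ** 2 for k in shared))
    norm_b = math.sqrt(sum(b[k] ** 2 for k in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def pearson_similarity(a, b):
    """Pearson correlation coefficient of the ratings on shared links."""
    shared = common_items(a, b)
    n = len(shared)
    if n < 2:
        return 0.0
    mean_a = sum(a[k] for k in shared) / n
    mean_b = sum(b[k] for k in shared) / n
    cov = sum((a[k] - mean_a) * (b[k] - mean_b) for k in shared)
    var_a = sum((a[k] - mean_a) ** 2 for k in shared)
    var_b = sum((b[k] - mean_b) ** 2 for k in shared)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

# Hypothetical ratings: bob rates every link exactly one point lower than alice.
alice = {"link1": 5.0, "link2": 3.0, "link3": 4.0}
bob = {"link1": 4.0, "link2": 2.0, "link3": 3.0}

print(euclidean_similarity(alice, bob))
print(cosine_similarity(alice, bob))
print(pearson_similarity(alice, bob))
```

In this made-up case the Pearson coefficient treats the two users as perfectly correlated, because it ignores the constant one-point offset, while the Euclidean-based score penalizes it. That difference is one reason the choice of metric matters in user-based filtering.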
It is concluded that each of these methods has its drawbacks, and that only a combination of them can achieve the desired improvement in search quality, depending on the tasks set by the customer.
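The Slope One predictor mentioned in the abstract can be sketched as follows. This is a minimal weighted Slope One under stated assumptions, not the paper's own code: the function name `slope_one_predict`, the dictionary layout, and the ratings are hypothetical.

```python
def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating of `target` with weighted Slope One.

    `ratings` maps each user to a dict of {link: rating}.
    Returns None when no other link can be compared with `target`.
    """
    numerator = 0.0
    denominator = 0
    for item, rating in ratings[user].items():
        if item == target:
            continue
        # Differences (target - item) over all users who rated both links.
        diffs = [r[target] - r[item]
                 for r in ratings.values()
                 if target in r and item in r]
        if not diffs:
            continue
        avg_diff = sum(diffs) / len(diffs)
        # Weight each per-link prediction by the number of supporting users.
        numerator += (rating + avg_diff) * len(diffs)
        denominator += len(diffs)
    if denominator == 0:
        return None
    return numerator / denominator

# Hypothetical ratings of three links by three users.
ratings = {
    "john": {"link_a": 5.0, "link_b": 3.0, "link_c": 2.0},
    "mark": {"link_a": 3.0, "link_b": 4.0},
    "lucy": {"link_b": 2.0, "link_c": 5.0},
}

print(slope_one_predict(ratings, "lucy", "link_a"))
```

With this data, `link_a` is rated on average 0.5 higher than `link_b` (two users) and 3 higher than `link_c` (one user), so the prediction for lucy is ((2 + 0.5) * 2 + (5 + 3) * 1) / 3, roughly 4.33. The predictor needs only average rating differences between links, which is why it is considered a simplified alternative to regression.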