REGRESSION MODELS FOR EARLY ESTIMATING THE LINES OF CODE COUNT OF WEB APPLICATIONS CREATED USING THE CODEIGNITER FRAMEWORK

Authors

DOI:

https://doi.org/10.32782/mathematical-modelling/2025-8-1-18

Keywords:

regression model, estimation, lines of code, web application, PHP, framework, Codeigniter, class, method, depth of inheritance tree, Box-Cox transformation

Abstract

Early estimation of the lines of code count in software projects is important because it directly influences the prediction of software development effort. This also holds for web applications created using well-known PHP frameworks such as Codeigniter. The object of the study is the process of early estimation of the lines of code count of web applications created using the Codeigniter framework. The subject of the study is the regression models used for this estimation.

The goal of the work is to build regression models with three factors for early estimation of the lines of code count of web applications created using the Codeigniter framework, where the factors can be obtained from the class diagram.

We built two linear regression models for early estimation of the lines of code count of web applications created using the Codeigniter framework depending on three factors: the number of classes, the average number of methods per class, and the DIT (Depth of Inheritance Tree) metric at the application level. These factors were chosen for two reasons: first, their values can be obtained from the class diagram, and second, they do not exhibit multicollinearity. The parameter estimates of the models were found by the method of least squares. The models were constructed by splitting the four-dimensional data set (the actual size in thousands of lines of code, the number of classes, the average number of methods per class, and the DIT metric at the application level) into two clusters. The metric data for 50 open-source web applications created using the Codeigniter framework was obtained using the PhpMetrics tool (https://phpmetrics.org/). These applications are hosted on the GitHub platform (https://github.com/).

The constructed linear regression models are compared with non-linear regression models. In contrast to the existing non-linear models, the two linear models describe all of the four-dimensional data on which they were built.
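
The modelling steps named in the abstract (a multicollinearity check on the three factors, a split of the data into two clusters, and a least-squares linear fit per cluster) can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration: the 50 data points are synthetic and noise-free, the coefficients in `true_beta` are invented, and the cluster rule (a median split on the number of classes) is a placeholder, since the abstract does not state how the clusters were formed or report the fitted coefficients.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares estimates for y = b0 + b1*x1 + b2*x2 + b3*x3."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict(beta, X):
    """Evaluate the linear model on a factor matrix X."""
    return beta[0] + X @ beta[1:]


def rmse(y, y_hat):
    """Root mean squared error between actual and predicted sizes."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def vif(X):
    """Variance inflation factor of each factor (values near 1 mean
    little multicollinearity)."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        resid = X[:, j] - predict(fit_linear(others, X[:, j]), others)
        r2 = 1.0 - resid.var() / X[:, j].var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)


# Synthetic stand-in for the 50 applications (NOT the study's data).
rng = np.random.default_rng(0)
n = 50
classes = rng.integers(5, 200, size=n).astype(float)  # number of classes
methods = rng.uniform(2.0, 12.0, size=n)              # avg methods per class
dit = rng.uniform(1.0, 3.0, size=n)                   # application-level DIT
X = np.column_stack([classes, methods, dit])

# Placeholder cluster rule (assumption): median split on number of classes.
labels = (classes > np.median(classes)).astype(int)

# Invented coefficients (intercept, classes, methods, DIT), size in KLOC.
true_beta = {
    0: np.array([0.5, 0.05, 0.30, 0.20]),
    1: np.array([2.0, 0.08, 0.60, 0.50]),
}
y = np.empty(n)
for k, b in true_beta.items():
    mask = labels == k
    y[mask] = predict(b, X[mask])

# One linear model per cluster, each fitted by least squares.
models = {k: fit_linear(X[labels == k], y[labels == k]) for k in (0, 1)}
y_clustered = np.empty(n)
for k, b in models.items():
    mask = labels == k
    y_clustered[mask] = predict(b, X[mask])

# A single pooled linear model over all points, for comparison.
pooled = fit_linear(X, y)
rmse_pooled = rmse(y, predict(pooled, X))
rmse_clustered = rmse(y, y_clustered)

print("VIF per factor:", vif(X))
print("RMSE, one pooled model:", rmse_pooled)
print("RMSE, two cluster models:", rmse_clustered)
```

On real PhpMetrics output, the three factor columns and the actual size in KLOC would replace the synthetic arrays, and the cluster labels would come from whatever clustering procedure is applied to the four-dimensional data.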

Published

2025-05-27