DEVELOPMENT AND EVALUATION OF THE EFFECTIVENESS OF A HANDWRITTEN TEXT RECOGNITION MODEL BASED ON CONVOLUTIONAL NEURAL NETWORKS
DOI:
https://doi.org/10.35546/kntu2078-4481.2025.2.2.8
Keywords:
handwritten text recognition, convolutional neural networks, LSTM, deep learning, image processing, IAM Sentences dataset, TensorFlow
Abstract
This article presents the development and evaluation of a handwritten text recognition model based on convolutional neural networks and trained on the IAM Sentences dataset. It details the construction of a model that combines modern deep learning methods to solve the complex task of converting handwritten information into digital format. An analysis of prior research on text recognition confirms the relevance of neural network approaches.
The research methodology includes detailed pre-processing of the IAM Sentences dataset, which contains images of handwritten sentences with corresponding text labels. Data preparation comprises reading the metadata, filtering out erroneous entries, normalising the images, and building a dictionary of unique characters. Particular attention is paid to data augmentation through random brightness changes and scaling, which is intended to improve model robustness.
The architecture of the developed model combines convolutional layers, which extract spatial features from images, with LSTM layers, which capture sequential dependencies between characters. The Connectionist Temporal Classification (CTC) loss function allows the model to predict character sequences without explicit alignment between input and output, which is critical for processing variable-length handwritten text.
The experimental results show that the developed system achieves a character error rate (CER) of 11.04 % after training for 67 epochs, a competitive result for handwritten text recognition tasks. Analysis of the training curves in TensorBoard showed a steady improvement in metrics with minor fluctuations, which supports the chosen architecture and training parameters.
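To make the two evaluation-side ideas in the abstract concrete, the following is a minimal sketch in plain Python of (a) collapsing a per-frame CTC best path into a label sequence and (b) computing CER from edit distance. It is not the authors' implementation. The function names, the blank index of 0, and the choice to aggregate CER over the whole corpus (total edits divided by total reference characters) are assumptions made for illustration; the abstract does not state how CER was averaged or which decoder was used.

```python
from typing import List, Sequence


def ctc_greedy_collapse(frame_ids: Sequence[int], blank: int = 0) -> List[int]:
    """Collapse a per-frame best path: merge adjacent repeats, then drop blanks.

    `blank` = 0 is an assumed convention, not taken from the article.
    """
    out: List[int] = []
    prev = None
    for idx in frame_ids:
        # Keep a label only when it differs from the previous frame
        # and is not the blank symbol.
        if idx != prev and idx != blank:
            out.append(idx)
        prev = idx
    return out


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance counting substitutions, insertions, deletions."""
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        row = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            row.append(min(prev_row[j] + 1,          # deletion
                           row[j - 1] + 1,           # insertion
                           prev_row[j - 1] + cost))  # match or substitution
        prev_row = row
    return prev_row[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total character edits / total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

For example, `cer(["hello", "world"], ["hallo", "word"])` counts two character edits against ten reference characters. Under this definition, a reported CER of 11.04 % would mean about 11 character edits per 100 reference characters.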
References
IAM Handwriting Database. URL: https://fki.tic.heia-fr.ch/databases/iam-handwriting-database (accessed: 07.06.2025).
Discover LSTM. NVIDIA Developer. 2024. URL: https://developer.nvidia.com/discover/lstm (accessed: 07.06.2025).
The Role of Softmax in Neural Networks: Detailed Explanation and Applications. GeeksforGeeks. 2024. URL: https://www.geeksforgeeks.org/the-role-of-softmax-in-neural-networks-detailed-explanation-and-applications/ (accessed: 07.06.2025).
Adam Optimizer. Keras. 2024. URL: https://keras.io/api/optimizers/adam/ (accessed: 07.06.2025).
WER, CER, MER Metrics. Kolena. 2024. URL: https://docs.kolena.com/metrics/wer-cer-mer/ (accessed: 07.06.2025).
Keras Callbacks API. TensorFlow. 2024. URL: https://www.tensorflow.org/api_docs/python/tf/keras/callbacks (accessed: 07.06.2025).
TensorBoard. TensorFlow. 2024. URL: https://www.tensorflow.org/tensorboard (accessed: 07.06.2025).
Nebauer C. Evaluation of convolutional neural networks for visual recognition. IEEE Transactions on Neural Networks. 1998. Vol. 9, no. 4. P. 685–696. DOI: 10.1109/72.701181.
Yamashita R., Nishio M., Do R. K. G. et al. Convolutional neural networks: an overview and application in radiology. Insights Imaging. 2018. Vol. 9. P. 611–629. DOI: 10.1007/s13244-018-0639-9.
Mienye I. D., Swart T. G., Obaido G. Recurrent Neural Networks: A Comprehensive Review of Architectures, Variants, and Applications. Information. 2024. Vol. 15, no. 9. P. 517. DOI: 10.3390/info15090517.