COMPARATIVE ANALYSIS OF GEOMETRIC AND VECTOR METRICS OF 3D FACE IDENTIFICATION UNDER DYNAMIC OCCLUSIONS
DOI:
https://doi.org/10.32782/mathematical-modelling/2026-9-1-6
Keywords:
3D face reconstruction, monocular video, dynamic occlusions, biometric authentication, FLAME model, MICA, vector-based metrics, L2 distance, geometric metrics, real-time
Abstract
This study investigates occlusion-robust three-dimensional (3D) face reconstruction from monocular video for real-time authentication systems. Random dynamic occlusions (hand movements, objects entering the field of view, and self-occlusions driven by pose changes) remain a dominant source of failure in modern algorithms because they are temporally non-stationary and unpredictably corrupt the evidence for facial surface points. The objective of the research is a comparative analysis of vector-based and geometric metrics for determining identity under conditions that closely match real-world operational scenarios. The proposed method filters the input stream in two stages, using a sharpness metric (Laplacian operator) and structural similarity (SSIM) to select high-quality, diverse frames. Reconstruction is performed with the MICA neural network, which maps images into the parametric space of the FLAME model. To improve identity stability and mitigate the impact of short-term dynamic interference, median fusion of the shape vectors is applied. The experimental evaluation used MIT OpenCourseWare lecture videos, which feature non-ideal lighting, active gesturing, and specific occlusions caused by virtual transparent boards. The identification task was formulated as binary classification and evaluated with a suite of metrics: L2 distance, cosine similarity, Chamfer distance, and Hausdorff distance. The results demonstrate a significant advantage of vector-based representations over direct comparison of surface geometry. L2 distance in the parametric shape space proved to be the most effective metric, providing the highest Precision, Recall, and F1-score, as well as the largest Area Under the Curve (AUC). Conversely, geometric metrics, particularly the Hausdorff distance, proved to be the least reliable because of their critical sensitivity to outliers and to local reconstruction noise caused by occlusions. The practical value of the work lies in confirming the feasibility of using 3D face reconstruction for reliable biometric authentication in uncontrolled environments while adhering to the latency constraints of modern mobile devices.
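The fusion step and the four metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each frame yields a FLAME shape vector as a plain list of floats, that meshes are lists of 3D vertex tuples, and that Chamfer distance is the sum of the two mean nearest-neighbour distances (one common convention; the abstract does not state which variant is used). All function names and toy values are hypothetical.

```python
import math
from statistics import median


def median_fuse(shape_vectors):
    """Element-wise median across per-frame shape vectors."""
    return [median(column) for column in zip(*shape_vectors)]


def l2_distance(a, b):
    """Euclidean distance between two shape vectors."""
    return math.dist(a, b)


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero shape vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def _nearest(point, cloud):
    """Distance from one vertex to its nearest neighbour in a cloud."""
    return min(math.dist(point, other) for other in cloud)


def chamfer_distance(p, q):
    """Sum of the mean nearest-neighbour distances in both directions (assumed variant)."""
    forward = sum(_nearest(x, q) for x in p) / len(p)
    backward = sum(_nearest(y, p) for y in q) / len(q)
    return forward + backward


def hausdorff_distance(p, q):
    """Largest nearest-neighbour distance in either direction."""
    return max(max(_nearest(x, q) for x in p),
               max(_nearest(y, p) for y in q))


# Toy data (hypothetical): the third frame is corrupted by an occlusion.
frames = [[0.10, -0.20], [0.12, -0.18], [0.90, 0.70]]
fused = median_fuse(frames)

# Two tiny "meshes" that differ by a single displaced vertex.
mesh_a = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
mesh_b = [(0, 0, 0), (1, 0, 0), (0, 1, 3)]
chamfer = chamfer_distance(mesh_a, mesh_b)
hausdorff = hausdorff_distance(mesh_a, mesh_b)
```

In this toy case the median ignores the corrupted third frame and returns [0.12, -0.18]. The single displaced vertex alone sets the Hausdorff distance to 3 and raises the Chamfer distance to about 1.33, which is consistent with the sensitivity to outliers reported for the geometric metrics.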