DEVELOPMENT OF A MODULAR DATA UNIFICATION PIPELINE FOR REAL-TIME ENVIRONMENTAL THREAT ANALYTICS

Authors

DOI:

https://doi.org/10.35546/kntu2078-4481.2025.3.2.26

Keywords:

software system, real-time analytics, data unification, H3, DuckDB, materialized views, environmental monitoring

Abstract

The article presents the results of research and development of a modular software system for real-time environmental risk analytics, built on the concept of unified event ingestion and H3-based hexagonal spatial aggregation. The relevance of this work is driven by the high heterogeneity of primary data sources, including open analytical reports (OSINT), satellite fire detections (FRP), ionizing radiation dose rate measurements, and meteorological fields. These sources rely on different temporal scales, measurement units, and data schemas. Such diversity creates significant integration challenges: timestamp shifts, unit inconsistencies, schema conflicts, and event duplication, all of which complicate the timely production of consistent results and slow down updates of information layers. The objective of the study is to design a compact and reproducible data processing pipeline capable of transforming heterogeneous event streams into daily aggregated layers of risk and demand with sub-second latency. The proposed architecture consists of three main components. The first is the Plugin Source Adapter Interface (PSAI), which maps each data source into a standardized event table; adapters are responsible only for source-specific parsing, while the subsequent unification logic is shared. The second component is the Deterministic Event Harmonization (DEH) module, which converts all timestamps to UTC, normalizes measurement units, verifies coordinates, and guarantees idempotent inserts. This ensures safe reprocessing and proper handling of late-arriving data.
The third component is a query layer oriented towards H3-first spatio-temporal aggregation: FRP points are aggregated into 15-minute intervals at H3 resolution level 10 using the MAX(FRP) operator to avoid overlaps, after which daily summaries at H3 resolution level 7 integrate FRP data with OSINT events, radiation monitoring signals, and meteorological indicators, including a wind alignment proxy, to construct a comprehensive risk index. The system is implemented in the DuckDB environment without the use of dedicated servers or network services, which simplifies infrastructure and reduces operational costs. The adopted approach ensures transparency and testability of integration contracts: the PSAI interface enforces data ingestion rules, the DEH module standardizes time and units, the H3 hierarchy guarantees spatial consistency, and DuckDB provides materialization of results through standard SQL. Experimental evaluations confirmed the high efficiency of the proposed solution: data updates are performed within fractions of a second even on consumer-grade laptop hardware; duplication of satellite FRP detections is reduced by approximately 99%; daily summary exports remain under one megabyte in size, significantly simplifying their transfer and storage. The developed system combines the flexibility of modular architecture, the reproducibility of integration procedures, and the efficiency of the computational model, establishing a foundation for practical applications in environmental monitoring, risk management decision support, and the advancement of ecological analytics services.
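
As an illustration of the harmonization and aggregation steps described in the abstract, the following is a minimal sketch in plain Python, not the authors' DuckDB implementation. All names (harmonize, insert_once, max_frp_by_bucket, the unit table, the event key layout) are hypothetical. It assumes that naive timestamps are already in UTC, that an event is identified by source, time, and rounded coordinates, and that the H3 cell identifier has been computed elsewhere and is passed in as an opaque string.

```python
import hashlib
from datetime import datetime, timezone, timedelta

# Hypothetical unit table: factors converting dose rate to microsieverts per hour.
TO_USV_PER_H = {"uSv/h": 1.0, "nSv/h": 0.001, "mSv/h": 1000.0}


def harmonize(source, ts_iso, lat, lon, value, unit):
    """Return a standardized event dict, or None if the coordinates are invalid."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    ts = datetime.fromisoformat(ts_iso)
    if ts.tzinfo is None:
        # Assumption: timestamps without an offset are already UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    ts = ts.astimezone(timezone.utc)
    if unit in TO_USV_PER_H:
        value, unit = value * TO_USV_PER_H[unit], "uSv/h"
    # Deterministic key (assumed layout): the same input always yields the same id.
    key = f"{source}|{ts.isoformat()}|{lat:.5f}|{lon:.5f}"
    event_id = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return {"event_id": event_id, "source": source, "ts_utc": ts,
            "lat": lat, "lon": lon, "value": value, "unit": unit}


def insert_once(store, event):
    """Idempotent insert: re-ingesting a known event leaves the store unchanged."""
    if event["event_id"] in store:
        return False
    store[event["event_id"]] = event
    return True


def max_frp_by_bucket(points):
    """Keep the maximum FRP per (cell, 15-minute bucket).

    `points` is an iterable of (cell_id, ts_utc, frp) tuples, where cell_id
    stands in for a precomputed H3 cell identifier.
    """
    out = {}
    for cell, ts, frp in points:
        bucket = ts.replace(minute=ts.minute - ts.minute % 15,
                            second=0, microsecond=0)
        k = (cell, bucket)
        out[k] = max(out.get(k, float("-inf")), frp)
    return out
```

Taking the maximum per cell and interval, rather than the sum, means that overlapping detections of the same fire collapse into one value per bucket, which is the effect the abstract attributes to the MAX(FRP) operator. The deterministic event key is what makes reprocessing safe in this sketch: a late or repeated batch produces the same identifiers and is ignored on insert.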

References

Raasveldt M., Mühleisen H. (2019). DuckDB: an in-process analytical database. Proceedings of the 2019 ACM SIGMOD International Conference on Management of Data (Demo). URL: https://duckdb.org (access date: 03.09.2025).

Uber Engineering. (2018). H3: Uber’s Hexagonal Hierarchical Spatial Index. URL: https://www.uber.com/blog/h3/ (access date: 03.09.2025).

NASA LANCE. (2023). Fire Information for Resource Management System (FIRMS). URL: https://firms.modaps.eosdis.nasa.gov (access date: 03.09.2025).

Schroeder W., Oliva P., Giglio L., Csiszar I. (2014). The VIIRS 375 m active fire detection data suite. Remote Sensing of Environment, vol. 143, pp. 85–96. https://doi.org/10.1016/j.rse.2013.12.008 (access date: 03.09.2025).

European Commission. EURDEP – European Radiological Data Exchange Platform. URL: https://rem.jrc.ec.europa.eu (access date: 05.09.2025).

SaveEcoBot. Radiation map & data access. URL: https://www.saveecobot.com/en (access date: 03.09.2025).

OpenWeather. OpenWeatherMap API. URL: https://openweathermap.org/api (access date: 03.09.2025).

Boeing G. (2017). OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks. Computers, Environment and Urban Systems, vol. 65, pp. 126–139. https://doi.org/10.1016/j.compenvurbsys.2017.05.004 (access date: 03.09.2025).

Valhalla. Open Source Routing Engine. URL: https://github.com/valhalla/valhalla (access date: 05.09.2025).

Published

2025-11-28