2026
This paper presents a robust system for real-time object detection and counting in ecological video streams. It builds on the YOLOv8 architecture, integrated within a multi-threaded video processing pipeline. By running preprocessing and object detection in parallel, the system reduces latency and improves throughput, outperforming traditional single-threaded implementations in continuous video analysis. The system also incorporates dynamic thresholding, fine-tuning, and data augmentation to make detection more reliable in dynamic natural environments. These mechanisms improve robustness to changing lighting, occlusions, and background complexity, which are common challenges in outdoor footage. The system is evaluated through performance comparisons between multi-threaded and single-threaded implementations, environmental stress tests, and an ablation study. Results demonstrate improved consistency in object detection and counting in dynamic environments, along with significant gains in processing speed. Designed for deployment on lightweight, low-power devices, the system is suitable for remote or resource-constrained settings. Although developed for biodiversity monitoring, the approach is applicable to other domains that require efficient, real-time video analysis in unstructured environments.
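The parallel preprocessing and detection stages described in the abstract can be illustrated with a minimal producer-consumer sketch. This is not the authors' implementation: the functions `preprocess`, `detect`, and `run_pipeline`, the queue size, and the toy "frames" (lists of pixel values) are all hypothetical stand-ins, and a real system would call a detector such as YOLOv8 where `detect` appears.

```python
import queue
import threading

SENTINEL = None  # marks the end of the frame stream


def preprocess(frame):
    # Hypothetical preprocessing step: scale pixel values to [0, 1].
    return [x / 255.0 for x in frame]


def detect(frame):
    # Hypothetical stand-in for a detector such as YOLOv8:
    # counts values above 0.5 as "objects".
    return sum(1 for x in frame if x > 0.5)


def run_pipeline(frames, maxsize=4):
    """Preprocess in one thread, detect in another; return per-frame counts in order."""
    q = queue.Queue(maxsize=maxsize)  # bounded, so the producer cannot run far ahead
    counts = []

    def producer():
        for frame in frames:
            q.put(preprocess(frame))
        q.put(SENTINEL)

    def consumer():
        while True:
            item = q.get()
            if item is SENTINEL:
                break
            counts.append(detect(item))

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    print(run_pipeline([[0, 255, 200], [10, 20, 30]]))
```

The bounded queue lets the preprocessing thread work on the next frame while the detection thread handles the current one. In CPython, this kind of overlap typically helps when the heavy work is I/O or runs in native code that releases the global interpreter lock.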
This work was supported by the Generalitat de Catalunya through a predoctoral grant awarded to the primary author, funded under the Program contract between the Generalitat of Catalonia and the International Center for Numerical Methods in Engineering (CIMNE) for the period 2020–2023. This research was also developed within the PIKSEL project, "Portal for the integration of knowledge for a sustainable ecosystems and land management" (AG01D/442899500/5710/0000), funded by the Generalitat de Catalunya. The authors also acknowledge financial support through the Severo Ochoa Centers of Excellence Program (CEX 2018-000797-S), funded by MCIN (MCIN/AEI/10.13039/501100011033).
Article
Published version
English
Object detection; Biodiversity; Multithreading; Real-time; Scene understanding; Habitat mapping
Elsevier
Reproduction of the document published at https://doi.org/10.1016/j.cviu.2025.104606
Computer Vision and Image Understanding, 2026, vol. 263, 104606
cc-by (c) Oluwakemi Akinwehinmi et al., 2026
Attribution 4.0 International
http://creativecommons.org/licenses/by/4.0/