Deep learning-based player tracking in sports videos

dc.contributor
Universitat Politècnica de Catalunya. Universitat de Barcelona
dc.contributor
Universitat Rovira i Virgili
dc.contributor
Universitat de Barcelona
dc.contributor
Game On
dc.contributor
Wang, Ling
dc.contributor
Escalera Guerrero, Sergio
dc.contributor.author
Poniatowski, Kacper Krzysztof
dc.date.accessioned
2026-04-18T01:26:28Z
dc.date.available
2026-04-18T01:26:28Z
dc.date.issued
2026-01-28
dc.identifier
https://hdl.handle.net/2117/460744
dc.identifier
203552
dc.identifier.uri
https://hdl.handle.net/2117/460744
dc.description.abstract
Multi-camera player tracking is a fundamental prerequisite for advanced sports analytics, yet it remains a computationally challenging task due to frequent inter-player occlusions, rapid motion, and the visual homogeneity of team uniforms. This thesis presents a robust end-to-end pipeline for the detection and tracking of football players using a calibrated four-camera setup. The proposed system integrates state-of-the-art deep learning techniques with geometric computer vision. We employ a fine-tuned object detector paired with ByteTrack for local perception. To resolve the Multi-Dimensional Assignment (MDA) problem across views, we introduce a Hierarchical Divide-and-Conquer fusion strategy. Unlike naive greedy clustering approaches, this method utilises recursive bipartite matching with a multi-cue cost function incorporating position, velocity, shape, and colour histograms. Furthermore, a Temporal Hinting mechanism is implemented to recover player identities following extended occlusions or spatial discontinuities. Comparative evaluation against a greedy geometric baseline demonstrates substantial improvements in tracking accuracy, with the hierarchical approach achieving 0.844 GS-HOTA compared to 0.416 for the baseline, a 103% relative improvement. Comprehensive evaluation on held-out test sequences across temporal horizons from 5 to 45 minutes confirms exceptional detection stability, with detection accuracy (DetA) maintaining 93.6% and MOTA sustaining 97.0% regardless of sequence length. The system exhibits a consistent identity switch rate of approximately 110 switches per minute, demonstrating temporal stability without compounding drift. These results establish a strong foundation for automated game state reconstruction and tactical analysis in professional sports.
dc.format
application/pdf
dc.language
eng
dc.publisher
Universitat Politècnica de Catalunya
dc.rights
Open Access
dc.subject
Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic
dc.subject
Computer vision
dc.subject
Deep learning
dc.subject
Soccer
dc.subject
Machine learning
dc.subject
AI
dc.subject
Football
dc.subject
Visió per ordinador
dc.subject
Aprenentatge profund
dc.subject
Futbol
dc.title
Deep learning-based player tracking in sports videos
dc.type
Master thesis

