Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Universitat Politècnica de Catalunya. IDEAI-UPC - Intelligent Data sciEnce and Artificial Intelligence Research Group
2024
Web tracking technology is prevalent on the Internet today, with most websites employing user identification systems that can accurately identify the users or devices behind browsers. While numerous works in the literature attempt to build machine learning models for detecting these identification systems, many rely on features susceptible to obfuscation techniques and are only partially effective, identifying just the specific subsets of web tracking algorithms they were trained on. Additionally, classification is typically performed over entire resources, making it difficult to distinguish between web tracking code and legitimate code within the same file. In this work, we propose AST-GNN, a graph neural network model applied to the abstract syntax tree structure of JavaScript files, which can predict the portions of code used for tracking purposes. By focusing on code structure, AST-GNN can detect various web tracking systems and is robust against obfuscation techniques. Our results show that the system achieves an accuracy above 95% in identifying web tracking code snippets, with computation times on the order of milliseconds, fast enough for real-time use.
This work was supported by the CHIST-ERA grant CHIST-ERA-22-SPiDDS-02 corresponding to the GRAPHS4SEC project (reference no. PCI2023-145974-2), funded by the Agencia Estatal de Investigación through the PCI 2023 call. This work is also supported by the Catalan Institution for Research and Advanced Studies (ICREA Academia).
Peer Reviewed
Postprint (author's final draft)
Conference report
English
UPC subject areas::Computer science::Computer security; UPC subject areas::Computer science::Artificial intelligence; Graph neural network; Web tracking; Privacy
Association for Computing Machinery (ACM)
https://dl.acm.org/doi/10.1145/3694811.3697816
info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PCI2023-145974-2/ES/GRAPH NEURAL NETWORKS FOR ROBUST AI%2FML-DRIVEN NETWORK SECURITY APPLICATIONS/
Open Access