Other authors

Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors

Universitat Politècnica de Catalunya. IDEAI-UPC - Intelligent Data sciEnce and Artificial Intelligence Research Group

Publication date

2024



Abstract

Web tracking technology is prevalent on the Internet today, with most websites employing user identification systems that can accurately identify users or devices behind browsers. While numerous works in the literature attempt to create machine learning models for detecting these identification systems, many rely on features susceptible to obfuscation techniques and are only partially capable of identifying the specific subsets of web tracking algorithms they were trained on. Additionally, classification is typically done over entire resources, making it difficult to distinguish between web tracking code and legitimate code within the same file. In this work, we propose AST-GNN, a graph neural network model applied to the abstract syntax tree structure of JavaScript files, which can predict the portions of code used for tracking purposes. By focusing on the code structure, AST-GNN can detect various web tracking systems and is robust against obfuscation techniques. Our results show that the system has an accuracy rate above 95% in identifying web tracking code snippets, with computation times on the order of milliseconds, fast enough to be used in real time.


This work was supported by the CHIST-ERA grant CHIST-ERA-22-SPiDDS-02 corresponding to the GRAPHS4SEC project (reference nº PCI2023-145974-2) funded by the Agencia Estatal de Investigación through the PCI 2023 call. This work is also supported by the Catalan Institution for Research and Advanced Studies (ICREA Academia).


Peer Reviewed


Postprint (author's final draft)

Document Type

Conference report

Language

English

Publisher

Association for Computing Machinery (ACM)

Related items

https://dl.acm.org/doi/10.1145/3694811.3697816

info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PCI2023-145974-2/ES/GRAPH NEURAL NETWORKS FOR ROBUST AI%2FML-DRIVEN NETWORK SECURITY APPLICATIONS/


Rights

Open Access
