Publication date

2025-11-06T16:29:10Z

2025



Abstract

Master's thesis for: Erasmus Mundus Joint Master in Artificial Intelligence (EMAI)


Supervisor: Jude Wells
Co-supervisor: Vicenç Gómez


Recent advances in chemical language models have enabled rapid exploration of chemical space through generative design of novel molecules. However, precise control over key molecular properties, such as size, aqueous solubility, and lipophilicity, remains challenging without retraining or introducing complex optimization steps. This thesis investigates a lightweight approach based on contrastive activation addition, in which differences in model activations between molecules with favorable and unfavorable properties are used to compute steering vectors. These vectors are applied during generation to bias the model towards producing molecules with desired characteristics, without modifying model weights. Using a GPT-style molecular generator conditioned on protein targets, we demonstrate that steering can consistently shift molecular property distributions: reducing median heavy-atom counts, improving predicted solubility by up to 1.4 logS units, and increasing the fraction of molecules within the optimal lipophilicity window for oral drugs. The approach preserves high validity rates, typically above 90%, and requires minimal computation, making it suitable for early-stage drug discovery workflows. Two variants of the method are compared: a global steering vector applied uniformly, and a token-aligned vector field that adapts dynamically to each generation step. While the latter amplifies property shifts, it also increases the risk of generating invalid molecules under certain settings. Overall, this work demonstrates that activation steering offers an interpretable, low-overhead mechanism for fine-tuning molecular properties, providing a practical tool to accelerate the design–make–test cycle in drug development. Future directions include extending this strategy to multi-property optimization and to models that capture three-dimensional molecular structures.
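
The core idea of contrastive activation addition described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `steering_vector` and `steer`, the scaling factor `alpha`, and the toy three-dimensional activations are all assumptions made for illustration. In a real model the activations would be read from a chosen transformer layer rather than written by hand.

```python
def steering_vector(pos_acts, neg_acts):
    """Mean activation of favorable examples minus mean of unfavorable ones."""
    dim = len(pos_acts[0])
    return [
        sum(a[i] for a in pos_acts) / len(pos_acts)
        - sum(a[i] for a in neg_acts) / len(neg_acts)
        for i in range(dim)
    ]


def steer(hidden, vector, alpha=1.0):
    """Add the scaled steering vector to one hidden state (weights are untouched)."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy activations (made-up numbers, for illustration only).
favorable = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
unfavorable = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

v = steering_vector(favorable, unfavorable)
steered = steer([0.5, 0.5, 0.5], v, alpha=0.5)
```

Under this reading, the global variant would add the same vector at every generation step, while a token-aligned variant would use a different vector per step; how the thesis constructs the latter is not specified in this record.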

Document type

Master's thesis

Language

English

Subjects and keywords

Molecules

Recommended citation

This citation was generated automatically.

Rights

CC Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0)

https://creativecommons.org/licenses/by-nc-nd/4.0/

This item appears in the following collection(s)