Self-supervised and in-context learning techniques for automated optical inspection

dc.contributor.author
Figueira, Joaquín
dc.date.accessioned
2025-11-05T20:29:06Z
dc.date.available
2025-11-05T20:29:06Z
dc.date.issued
2025-11-04T18:03:09Z
dc.date.issued
2025
dc.identifier.uri
http://hdl.handle.net/10230/71773
dc.description.abstract
Master's thesis for the Erasmus Mundus joint Master in Artificial Intelligence (EMAI)
dc.description.abstract
Supervisor: Lejla Batina; Co-supervisor: Faysal Boughorbel
dc.description.abstract
Automated Optical Inspection (AOI) is a family of techniques used to find defects and anomalies in electronic devices from high-quality photographs of different regions of an integrated component and its packaging. Current methods use computer vision models and image preprocessing pipelines specific to each chip design and manufacturer. As a result, the current deep learning approach to AOI requires a long retraining process whenever new devices are introduced or significant covariate shifts occur in the input image distribution. In this work, we adapt and evaluate different pre-training techniques (DINO, iBOT, and MAE) for small vision transformers (ViT and FasterViT) to streamline the design of AOI semantic segmentation models and shorten the training time needed to adapt them to new input conditions. We pre-train the models on a custom, relatively small dataset of only 7,000 unlabeled images, showing that these pre-training strategies perform well in small-data regimes. Furthermore, we introduce a set of retrieval-based scene understanding techniques that solve the semantic segmentation of wire-bonded devices with virtually no training on labeled data. Our results demonstrate that our custom pre-trained encoders and retrieval strategies outperform comparable convolutional architectures pre-trained with full supervision on semantic segmentation, in both speed and quality, when training time is constrained. Moreover, we show that our proposed image retrieval strategies generalize to existing ViT models pre-trained on different datasets, and that the techniques can be applied to images of a single device to produce high-quality segmentation masks from a relatively small number of labeled training images. Finally, we show that the retrieval strategies outperform fine-tuned convolutional encoder-decoder models on out-of-distribution, unseen images.
dc.format
application/pdf
dc.language
eng
dc.rights
CC Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0)
dc.rights
https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights
info:eu-repo/semantics/openAccess
dc.subject
Learning
dc.title
Self-supervised and in-context learning techniques for automated optical inspection
dc.type
info:eu-repo/semantics/masterThesis


Files in this item

No files are associated with this item.

This item appears in the following collection(s)