Detecting Outliers with Semi-Supervised Machine Learning : a Fraud Prediction Application

dc.contributor.author
Palacio, Sebastian M.
dc.contributor.author
Xarxa de Referència en Economia Aplicada (XREAP)
dc.date.issued
2018
dc.identifier
https://ddd.uab.cat/record/201398
dc.identifier
urn:oai:ddd.uab.cat:201398
dc.description.abstract
Abnormal pattern prediction has received a great deal of attention from both academia and industry, with applications ranging from fraud, terrorism, and intrusion detection to sensor events, medical diagnoses, and weather patterns. In practice, most abnormal pattern prediction problems are characterized by a small amount of labeled data and a very large amount of unlabeled data. While this points most obviously to a semi-supervised approach, most empirical studies have simplified the task by treating it as a supervised problem, which results in a severe bias toward false negatives. In this paper, we propose an innovative methodology based on semi-supervised techniques and introduce a new metric, the Cluster-Score, for measuring the homogeneity of abnormalities. Specifically, the methodology transforms unsupervised models into supervised models using the Cluster-Score metric, which defines objective boundaries between clusters and evaluates the homogeneity of the abnormalities in the cluster construction. We apply this methodology to fraud detection among property insurance claims. The objectives are to increase the number of fraudulent claims detected and to reduce the proportion of investigated claims that are, in fact, non-fraudulent. Applying our methodology yields considerable improvement on both objectives.
dc.format
application/pdf
dc.language
eng
dc.publisher
Xarxa de Referència en Economia Aplicada (XREAP)
dc.relation
Xarxa de Referència en Economia Aplicada (XREAP). Documents de treball de la Xarxa de Referència en Economia Aplicada (XREAP)
dc.rights
open access
dc.rights
This document is subject to a Creative Commons license. Total or partial reproduction, distribution, and public communication of the work are permitted, provided that they are not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted.
dc.rights
https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject
Machine learning
dc.subject
Forecasting
dc.subject
Fraud
dc.subject
Insurance
dc.subject
Data mining
dc.title
Detecting Outliers with Semi-Supervised Machine Learning : a Fraud Prediction Application
dc.type
Working paper

