2018-01-30T09:22:38Z
2019-01-02T06:10:27Z
2018-01-02
2018-01-30T09:22:38Z
Breath analysis holds the promise of a non-invasive technique for the diagnosis of diverse respiratory conditions, including COPD and lung cancer. Breath contains small metabolites that may be putative biomarkers of these conditions. However, the discovery of reliable biomarkers is a considerable challenge in the presence of both clinical and instrumental confounding factors. Among the latter, instrumental time drifts are highly relevant, since they call into question the short- and long-term validity of predictive models. In this work we present a methodology to counter instrumental drifts using information from interleaved blanks for a case study of GC-MS data from breath samples. The proposed method includes feature filtering, and additive, multiplicative and multivariate drift corrections, the latter being based on Component Correction. Biomarker discovery was based on Genetic Algorithms in a filter configuration using Fisher's ratio computed in the Partial Least Squares - Discriminant Analysis subspace as a figure of merit. Using our protocol, we have been able to find nine peaks that provide a statistically significant Area under the ROC Curve (AUC) of 0.75 for COPD discrimination. The method developed has been successfully validated using blind samples in short-term temporal validation. However, the attempt to use this model for patient screening six months later was not successful. This negative result highlights the importance of increasing validation rigour when reporting biomarker discovery results.
Article
Accepted version
English
Biochemical markers; Respiration; Chemometrics
Institute of Physics (IOP)
Postprint version of the document published at: https://doi.org/10.1088/1752-7163/aaa492
Journal of Breath Research, 2018
https://doi.org/10.1088/1752-7163/aaa492
(c) Institute of Physics (IOP), 2018