Author:
|
Barrera Gómez, José; Agier, Lydiane; Portengen, Lützen; Chadeau-Hyam, Marc; Giorgis-Allemand, Lise; Siroux, Valérie; Robinson, Oliver; Vlaanderen, Jelle; González, Juan Ramón; Nieuwenhuijsen, Mark J.; Vineis, Paolo; Vrijheid, Martine; Vermeulen, Roel; Slama, Rémy; Basagaña, Xavier
|
Abstract:
|
BACKGROUND: There is growing interest in examining the
simultaneous effects of multiple exposures and, more generally,
the effects of mixtures of exposures, as part of the exposome
concept (defined as the totality of human environmental
exposures from conception onwards). Uncovering such combined
effects is challenging owing to the large number of exposures,
several of which are highly correlated. We performed a
simulation study in an exposome context to compare the
performance of several statistical methods that have been
proposed to detect statistical interactions. METHODS:
Simulations were based on an exposome including 237 exposures
with a realistic correlation structure. We considered several
statistical regression-based methods, including a two-step
Environment-Wide Association Study (EWAS2), the
Deletion/Substitution/Addition (DSA) algorithm, the Least
Absolute Shrinkage and Selection Operator (LASSO), Group-Lasso
INTERaction-NET (GLINTERNET), a three-step method based on
regression trees, and finally Boosted Regression Trees (BRT). We
assessed the performance of each method in terms of model size,
predictive ability, sensitivity and false discovery rate.
RESULTS: GLINTERNET and DSA had better overall performance than
the other methods, with GLINTERNET having better properties in
terms of selecting the true predictors (sensitivity) and of
predictive ability, while DSA had a lower number of false
positives. In terms of ability to capture interaction terms,
GLINTERNET and DSA again performed best, with the
same trade-off between sensitivity and false discovery
proportion. When GLINTERNET and DSA failed to select an exposure
truly associated with the outcome, they tended to select a
highly correlated one. When interactions were not present in the
data, using variable selection methods that allowed for
interactions had only slight costs in performance compared to
methods that only searched for main effects. CONCLUSIONS:
GLINTERNET and DSA provided better performance in detecting
two-way interactions, compared to other existing methods. |