Latent representation of H&E images retains biological information in a breast cancer cohort

Abstract

Imaging technologies and staining-based pathology are important components of routine cancer care. In particular, H&E imaging is standard for almost all cancer patients. Traditionally, experienced pathologists use H&E images to infer important biological properties of the samples. Recent work has demonstrated that machine learning and machine vision analysis of H&E images can further expand the scope of this inference. However, H&E images are high-resolution, which makes them difficult to analyze and possibly noisy. In this work, we propose an autoencoder-based pipeline that greatly reduces the dimension of the data representation while maintaining valuable properties. In particular, we investigate how different latent space dimensions affect bulk label predictions from H&E. We use autoencoders applied to image tiles as a tool in this investigation and also examine other information that may be inferred from image tiles. For example, we show classification results for tiles, such as Luminal A versus Luminal B, with an F1 score larger than 0.85. We also show that Ki67 levels can be inferred from H&E tiles, as shown before on other cohorts, and that inference is still possible when working with lower-dimensional latent representations. The two main contributions of this paper are as follows. First, we demonstrate that image tiles can be informative, both at the global classification level and, more importantly, in supporting the assessment of heterogeneity. Second, we show that reasonably accurate inference can be performed with lower-dimensional latent representations of the H&E images.

Document Type

Article


Published version

Language

English

Publisher

Public Library of Science (PLoS)

Related items

Reproduction of the document published at: https://doi.org/10.1371/journal.pone.0329221

PLoS One, 2025, vol. 20, num. 9

https://doi.org/10.1371/journal.pone.0329221

Rights

cc-by (c) Benmussa, Chloé et al., 2025

http://creativecommons.org/licenses/by/4.0/