bigSCale: an analytical framework for big-scale single-cell data

dc.contributor.author
Iacono, Giovanni
dc.contributor.author
Mereu, Elisabetta
dc.contributor.author
Guillaumet-Adkins, Amy
dc.contributor.author
Corominas Castiñeira, Roser
dc.contributor.author
Cuscó, Ivon
dc.contributor.author
Rodríguez Esteban, Gustavo
dc.contributor.author
Gut, Marta
dc.contributor.author
Pérez-Jurado, Luis Alberto
dc.contributor.author
Gut, Ivo G.
dc.contributor.author
Heyn, Holger
dc.date.issued
2021-03-25T18:06:08Z
dc.date.issued
2018-06-28
dc.identifier
1088-9051
dc.identifier
https://hdl.handle.net/2445/175786
dc.identifier
687304
dc.identifier
29724792
dc.description.abstract
Single-cell RNA sequencing (scRNA-seq) has significantly deepened our insights into complex tissues, with the latest techniques capable of processing tens of thousands of cells simultaneously. Analyzing increasing numbers of cells, however, generates extremely large data sets, extending processing time and challenging computing resources. Current scRNA-seq analysis tools are not designed to interrogate large data sets and often lack sensitivity to identify marker genes. With bigSCale, we provide a scalable analytical framework to analyze millions of cells, which addresses the challenges associated with large data sets. To handle the noise and sparsity of scRNA-seq data, bigSCale uses large sample sizes to estimate an accurate numerical model of noise. The framework further includes modules for differential expression analysis, cell clustering, and marker identification. A directed convolution strategy allows processing of extremely large data sets, while preserving transcript information from individual cells. We evaluated the performance of bigSCale using both a biological model of aberrant gene expression in patient-derived neuronal progenitor cells and simulated data sets, which underlines the speed and accuracy in differential expression analysis. To test its applicability for large data sets, we applied bigSCale to assess 1.3 million cells from the mouse developing forebrain. Its directed down-sampling strategy accumulates information from single cells into index cell transcriptomes, thereby defining cellular clusters with improved resolution. Accordingly, index cell clusters identified rare populations, such as reelin (Reln)-positive Cajal-Retzius neurons, for which we report previously unrecognized heterogeneity associated with distinct differentiation stages, spatial organization, and cellular function. Together, bigSCale presents a solution to address future challenges of large single-cell data sets.
dc.format
14 p.
dc.format
application/pdf
dc.language
eng
dc.publisher
Cold Spring Harbor Laboratory Press
dc.relation
Reproduction of the document published at: https://doi.org/10.1101/gr.230771.117
dc.relation
Genome Research, 2018, vol. 28, num. 6, p. 878-890
dc.relation
https://doi.org/10.1101/gr.230771.117
dc.relation
info:eu-repo/grantAgreement/EC/H2020/656359/EU//7DUP
dc.rights
cc-by-nc (c) Iacono, Giovanni et al., 2018
dc.rights
http://creativecommons.org/licenses/by-nc/3.0/es
dc.rights
info:eu-repo/semantics/openAccess
dc.source
Articles published in journals (Genetics, Microbiology and Statistics)
dc.subject
Bioinformatics
dc.subject
Cells
dc.title
bigSCale: an analytical framework for big-scale single-cell data
dc.type
info:eu-repo/semantics/article
dc.type
info:eu-repo/semantics/publishedVersion

