Fast computation and applications of genome mappability

Publication date

2014-06-20T08:18:15Z

2012

Abstract

We present a fast mapping-based algorithm to compute the mappability of each region of a reference genome up to a specified number of mismatches. Knowing the mappability of a genome is crucial for the interpretation of massively parallel sequencing experiments. We investigate the properties of the mappability of eukaryotic DNA/RNA both as a whole and at the level of the gene family, providing for various organisms tracks which allow the mappability information to be visually explored. In addition, we show that mappability varies greatly between species and gene classes. Finally, we suggest several practical applications where mappability can be used to refine the analysis of high-throughput sequencing data (SNP calling, gene expression quantification and paired-end experiments). This work highlights mappability as an important concept which deserves to be taken into full account, in particular when massively parallel sequencing technologies are employed. The GEM mappability program belongs to the GEM (GEnome Multitool) suite of programs, which can be freely downloaded for any use from its website (http://gemlibrary.sourceforge.net).
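To make the quantity in the abstract concrete, here is a minimal brute-force sketch. It is not the GEM algorithm, which the abstract describes as a fast mapping-based method. It assumes the common definition of mappability at a position as the inverse of the number of genome positions whose k-mer matches the k-mer at that position within the allowed number of mismatches. The function names `hamming` and `mappability` are hypothetical, only the forward strand is considered, and the quadratic running time makes it suitable for toy sequences only.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def mappability(genome, k, max_mismatches=0):
    """Naive per-position mappability (assumed definition: 1 / k-mer frequency).

    For each start position, count how many k-mers in the genome
    (including the k-mer itself) lie within `max_mismatches` substitutions,
    then return the inverse of that count.
    Forward strand only; O(n^2 * k) time, for illustration only.
    """
    n = len(genome) - k + 1
    kmers = [genome[i:i + k] for i in range(n)]
    scores = []
    for query in kmers:
        freq = sum(1 for other in kmers if hamming(query, other) <= max_mismatches)
        scores.append(1.0 / freq)
    return scores


if __name__ == "__main__":
    toy = "ACGTACGA"
    print(mappability(toy, k=4, max_mismatches=0))
    print(mappability(toy, k=4, max_mismatches=1))
```

Under this definition a score of 1 marks a k-mer that is unique in the sequence, and lower scores mark repeated or near-repeated k-mers. Allowing more mismatches can only lower a position's score or leave it unchanged.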


This work has been partially supported by grants BIO2006-03380 (to RG) and CONSOLIDER CSD2007-00050 (to RG and PR) from the Spanish Ministerio de Educacion y Ciencia.

Document Type

Article


Published version

Language

English

Publisher

Public Library of Science (PLoS)

Related items

PLoS One. 2012;7(1):e30377

info:eu-repo/grantAgreement/ES/2PN/BIO2006-03380

info:eu-repo/grantAgreement/ES/2PN/CSD2007-00050

Rights

© 2012 Thomas Derrien et al. This is an Open Access article distributed under the terms of a Creative Commons Attribution License

http://creativecommons.org/licenses/by/2.5/
