Genome analysis is the foundation of many scientific and medical discoveries as well as a key pillar of personalized medicine. Any analysis of a genome fundamentally starts with the reconstruction of the genome from its sequenced fragments. This process is called read mapping. One key goal of read mapping is to find the variations and similarities that are present between the sequenced genome and reference genome(s) and to tolerate the errors introduced by the genome sequencing process. Read mapping is currently a major bottleneck in the entire genome analysis pipeline because state-of-the-art genome sequencing technologies are able to sequence a genome much faster than the computational techniques that are employed to reconstruct the genome. New sequencing technologies, like nanopore sequencing, greatly exacerbate this problem while at the same time making genome sequencing much less costly. This talk describes our ongoing journey in greatly improving the performance of genome read mapping as well as broader genome analysis. We first provide a brief background on read mappers that can comprehensively find genomic variations/similarities and tolerate sequencing errors. Then, we describe both algorithmic and hardware-based acceleration approaches. Algorithmic approaches exploit the structure of the genome, the structure of the problem at hand, as well as the structure of the underlying hardware. Hardware-based acceleration approaches exploit specialized microarchitectures or new execution paradigms like processing in memory. We show that significant improvements are possible with both algorithmic and hardware-based approaches and their combination. We conclude with a foreshadowing of future challenges brought about by very low-cost new sequencing technologies and their potential use cases in public health, science, and medicine.
Conference report
English
UPC subject areas::Computer science::Computer architecture; High performance computing; Intensive computing (Computer science)
Barcelona Supercomputing Center
http://creativecommons.org/licenses/by-nc-nd/4.0/
Open Access
Attribution-NonCommercial-NoDerivatives 4.0 International