Publication date

2022-09-29



Abstract

K-mers are used on a daily basis in bioinformatics. They have long been at the core of several popular genome assembly tools, yet until recently they have been woefully underutilized. K-mer counting is simple in principle, but it becomes a real challenge at the data volumes generated by high-throughput sequencing. However, a simple representation of the actual data with few degrees of freedom (i.e. the value of k and, for nucleotide sequences, the 4 letters) provides the perfect opportunity to investigate novel combinations of methods and techniques derived from various fields. In that context, the real challenge is to map each biological question to a corresponding modelling approach. Examples include the application of Gödel numbering as a means of transforming the search space for sequence similarity, the application of pruned trees and entropy for identifying novel features in sequences, and binning methods for metagenomics classification.
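As an illustration of how simple basic k-mer counting is, the following is a minimal Python sketch and not the method used in the work described here. The function name `count_kmers`, the example sequence, and the choice to skip windows containing ambiguous bases are assumptions made for this example.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in a nucleotide sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    valid_bases = set("ACGT")
    counts = Counter()
    # Slide a window of width k across the sequence, one base at a time.
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= valid_bases:
            counts[kmer] += 1
    return counts


# Hypothetical toy sequence; real inputs would be sequencing reads.
counts = count_kmers("ACGTACGTNA", 3)
```

With a 4-letter alphabet there are at most 4^k distinct k-mers, so an in-memory table like this one stops being practical for large k and large read sets, which is where the scaling challenge mentioned above comes from.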

Document type

Conference report

Language

English

Published by

Barcelona Supercomputing Center

Recommended citation

This citation was generated automatically.

Rights

http://creativecommons.org/licenses/by-nc-nd/4.0/

Open Access

Attribution-NonCommercial-NoDerivatives 4.0 International

This item appears in the following collection(s)

Conferences [11159]