Automating installation, testing and development of bcbio-nextgen pipeline

To access the full-text documents, please follow this link: http://hdl.handle.net/2099.1/18704

Title:	Automating installation, testing and development of bcbio-nextgen pipeline; Paral·lelització del pipeline Bcbio-nextgen per al tractament de dades genòmiques
Author:	Carrasco Hernandez, Guillermo
Other authors:	Svensson, Thomas; Jiménez González, Daniel
Abstract:	In recent years, the cost of obtaining biological data has been drastically reduced, which has led to exponential growth in the available data. Analyzing data at this scale often results in software solutions that are highly platform-dependent and difficult to scale. This final project tries to address those problems in a real bioinformatics core facility at the Science For Life Laboratory. Science For Life Laboratory is a center for large-scale biosciences with a focus on health and environmental research, located in Stockholm, Sweden. The laboratory currently has 15 next-generation sequencing instruments, with a combined DNA sequencing capacity equal to several hundred complete human genomes per year. This implies a massive amount of data to be managed and analyzed. The data is analyzed using bcbio-nextgen, an in-house maintained genomics pipeline originally developed by Brad Chapman at Harvard School of Public Health [Rom12]. The first goal of this project is to automate the installation, deployment and testing of this pipeline. The second goal is to modify the alignment step of the analysis to use Seal, a Hadoop-based aligner. This will also allow us to check that all automations are working properly, as the pipeline will have to be installed and tested on several nodes.
Subjects:	-UPC subject areas::Computer science::Computer applications::Bioinformatics -Bioinformatics -Hadoop -automation -parallelization -continuous integration -cluster -dna -sequence alignment -fastq
Rights:
Document type:	Final degree project / thesis
Published by:	Universitat Politècnica de Catalunya