Parallel k nearest neighbor graph construction using tree-based data structures

Rajani, Nazneen; McArdle, Kate; Dhillon, Inderjit S.; Rajani, Nazneen; McArdle, Kate; Dhillon, Inderjit S.

Parallel k nearest neighbor graph construction using tree-based data structures

Autor/a

Rajani, Nazneen

McArdle, Kate

Dhillon, Inderjit S.

Data de publicació

2015

Resum

Construction of a nearest neighbor graph is often a neces- sary step in many machine learning applications. However, constructing such a graph is computationally expensive, es- pecially when the data is high dimensional. Python's open source machine learning library Scikit-learn uses k-d trees and ball trees to implement nearest neighbor graph construc- tion. However, this implementation is ine cient for large datasets. In this work, we focus on exploiting these under- lying tree-based data structures to optimize parallel execu- tion of the nearest neighbor algorithm. We present parallel implementations of nearest neighbor graph construction us- ing such tree structures, with parallelism provided by the OpenMP and the Galois framework. We empirically show that our parallel and exact approach is e cient as well as scalable, compared to the Scikit-learn implementation. We present the rst implementation of k-d trees and ball trees using Galois. Our results show that k-d trees are faster when the number of dimensions is small (2d N); ball trees on the other hand scale well with the number of dimensions. Our implementation of ball trees in Galois has almost linear speedup on a number of datasets irrespective of the size and dimensionality of the data.

Tipus de document

Conference report

Llengua

Anglès

Matèries i paraules clau

Àrees temàtiques de la UPC::Matemàtiques i estadística; Algorithm; Algorismes

Publicat per

Barcelona Supercomputing Center

Documents relacionats

High Performance Graph Mining workshop (1st: 2015: Sydney)

Citació recomanada

Aquesta citació s'ha generat automàticament.

Exportar

DIDL MARC MARC_CCUC METS OAI_DC ORE QDC RDF

Drets

http://creativecommons.org/licenses/by-sa/3.0/es/

Open Access

Attribution-ShareAlike 3.0 Spain

Aquest element apareix en la col·lecció o col·leccions següent(s)

Congressos [11188]

Parallel k nearest neighbor graph construction using tree-based data structures

Autor/a

Data de publicació

Compartir

Resum

Tipus de document

Llengua

Matèries i paraules clau

Publicat per

Documents relacionats

Citació recomanada

Exportar

Drets

Aquest element apareix en la col·lecció o col·leccions següent(s)