Adaptive sampling methods for scaling up knowledge discovery algorithms

To access the full text documents, please follow this link: http://hdl.handle.net/2117/97852

Title:	Adaptive sampling methods for scaling up knowledge discovery algorithms
Author:	Domingo Soriano, Carlos; Gavaldà Mestre, Ricard; Watanabe, Osamu
Other authors:	Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
Abstract:	One of the biggest research challenges in KDD and data mining is to develop methods that scale well to large amounts of data. One approach to achieving scalability is to take a random sample and mine it instead of the full data set. In this paper, we propose an adaptive sampling method for solving a variety of data mining tasks that arise in practice on very large data sets. Our algorithms are adaptive in the sense that they determine from the data whether they have already seen enough examples to reach a reliable conclusion. We prove the correctness of our method, estimate its efficiency theoretically, and demonstrate its efficiency experimentally on a concrete task that requires sampling.
Subject(s):	UPC subject areas::Computer science; Scalability; KDD; Data mining; Knowledge discovery algorithms
Document type:	Article - Draft Report