Title:
|
Finding relevant variables in PAC model with membership queries
|
Author:
|
Guijarro Guillem, David; Tarui, Jun; Tsukiji, Tatsuie
|
Other authors:
|
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació |
Abstract:
|
A new research frontier in AI and data mining seeks methods that
automatically discover the relevant variables among many irrelevant
ones. In this paper, we present four algorithms that output such
crucial variables in the PAC model with membership queries. The first
algorithm performs the task under any unknown distribution by
measuring the distance between virtual and real targets. The second
algorithm exhausts a virtual version space under an arbitrary
distribution. The third algorithm exhausts a universal set under the
uniform distribution. The fourth algorithm measures the influence of
variables under the uniform distribution. Given the number $r$ of
relevant variables, the first algorithm runs in time almost linear in
$r$. The second and third algorithms use fewer membership queries than
the first, but run in time exponential in $r$. The fourth algorithm
enumerates highly influential variables in time quadratic in $r$. |
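As a rough illustration of what measuring the influence of a variable under the uniform distribution can look like, the sketch below uses the standard sampling estimator: draw a uniform point $x$, flip bit $i$, and ask two membership queries to see whether the target's value changes. This is not the paper's fourth algorithm, whose details the abstract does not give. The function names, the sample count, the threshold, and the example target are all assumptions made for illustration.

```python
import random


def estimate_influence(f, n, i, samples=2000, rng=None):
    """Estimate Pr_x[f(x) != f(x with bit i flipped)] for uniform x in {0,1}^n.

    `f` plays the role of a membership-query oracle (hypothetical interface):
    it takes a list of n bits and returns 0 or 1.
    """
    rng = rng or random.Random(0)
    changed = 0
    for _ in range(samples):
        x = [rng.randint(0, 1) for _ in range(n)]  # uniform random point
        y = list(x)
        y[i] ^= 1  # same point with variable i flipped
        if f(x) != f(y):  # two membership queries per sample
            changed += 1
    return changed / samples


def influential_variables(f, n, threshold=0.05, samples=2000, seed=0):
    """Return indices whose estimated influence is at least `threshold`."""
    rng = random.Random(seed)
    return [
        i
        for i in range(n)
        if estimate_influence(f, n, i, samples, rng) >= threshold
    ]


def target(x):
    """Illustrative target on 10 variables; only x[1], x[4], x[7] are relevant."""
    return (x[1] & x[4]) ^ x[7]


if __name__ == "__main__":
    print(influential_variables(target, 10))
```

An irrelevant variable never changes the target's value, so its estimate is exactly zero, while a relevant variable with low influence can fall below the threshold and be missed. That is why an influence-based approach of this kind recovers only the highly influential variables.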
Subject(s):
|
-UPC subject areas::Computer science::Theoretical computer science -AI -Data mining -Artificial intelligence -Membership queries -Unknown distribution -Arbitrary distribution -Uniform distribution |
Rights:
|
|
Document type:
|
Article - Published version Report |