A competitive strategy for function approximation in Q-learning

To access the full-text documents, please follow this link: http://hdl.handle.net/2117/14123

Title:	A competitive strategy for function approximation in Q-learning
Author(s):	Agostini, Alejandro Gabriel; Celaya Llover, Enric
Other authors:	Institut de Robòtica i Informàtica Industrial; Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI; Universitat Politècnica de Catalunya. VIS - Visió Artificial i Sistemes Intel.ligents
Abstract:	In this work we propose an approach to generalization in continuous-domain reinforcement learning that, instead of using a single function approximator, tries many different function approximators in parallel, each defined on a different region of the domain. Each approximator has an associated relevance function that locally quantifies the quality of its approximation, so that, at each input point, the approximator with the highest relevance can be selected. The relevance function is defined using parametric estimates of the variance of the q-values and of the density of samples in the input space, which quantify the accuracy of the approximation and the confidence in it, respectively. These parametric estimates are obtained from a probability density represented as a Gaussian Mixture Model embedded in the input-output space of each approximator. In our experiments, the proposed approach required fewer experiences to learn and produced more stable convergence profiles than a single function approximator.
Abstract:	Peer Reviewed
Subject(s):	-Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial -Q-Learning -Reinforcement learning -generalisation (artificial intelligence) -learning (artificial intelligence) -AUTOR: reinforcement learning -Aprenentatge -- Tècniques
Document type:	Article - Draft; Conference object
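
The selection scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says the variance and density estimates come from a Gaussian Mixture Model, whereas this sketch swaps in a running mean and variance (Welford's method) per region, uses the sample count as a crude stand-in for sample density, and assumes a relevance formula of its own. The names `LocalApproximator`, `train`, and `select` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class LocalApproximator:
    """Hypothetical competitor: a constant q-value estimate on [low, high)."""
    low: float
    high: float
    n: int = 0          # samples seen; crude stand-in for sample density
    mean: float = 0.0   # running mean of the q-value targets
    m2: float = 0.0     # running sum of squared deviations (Welford)

    def covers(self, x: float) -> bool:
        return self.low <= x < self.high

    def update(self, q_target: float) -> None:
        # Welford's online update of the mean and the squared deviations.
        self.n += 1
        delta = q_target - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (q_target - self.mean)

    def variance(self) -> float:
        return self.m2 / (self.n - 1) if self.n > 1 else math.inf

    def relevance(self) -> float:
        # ASSUMED formula, not taken from the paper: inverse of the variance,
        # inflated when few samples are available (low confidence).
        if self.n < 2:
            return 0.0
        return 1.0 / (1e-9 + self.variance() * (1.0 + 1.0 / self.n))


def train(approximators, x, q_target):
    """Update every competitor whose region contains x."""
    for approx in approximators:
        if approx.covers(x):
            approx.update(q_target)


def select(approximators, x):
    """Return the competitor with the highest relevance at x."""
    candidates = [a for a in approximators if a.covers(x)]
    if not candidates:
        raise ValueError(f"no approximator covers x={x}")
    return max(candidates, key=lambda a: a.relevance())
```

With one competitor covering the whole domain and another covering only a sub-region where the targets are nearly constant, `select` would be expected to prefer the narrower one inside that sub-region, since its variance estimate is lower there, and to fall back to the broad one elsewhere.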

Related documents

Other documents by the same author

Online EM with weight-based forgetting

Celaya Llover, Enric; Agostini, Alejandro Gabriel

Action rule induction from cause-effect pairs learned through robot-teacher interaction

Agostini, Alejandro Gabriel; Celaya Llover, Enric; Torras, Carme; Wörgötter, Florentin

Probability density estimation of the Q Function for reinforcement learning

Agostini, Alejandro Gabriel; Celaya Llover, Enric

Learning rules from cause-effects explanations

Agostini, Alejandro Gabriel; Celaya Llover, Enric; Torras, Carme; Wörgötter, Florentin

On-line learning of macro planning operators using probabilistic estimations of cause-effects

Agostini, Alejandro Gabriel; Wörgötter, Florentin; Celaya Llover, Enric; Torras, Carme