Reinforcement learning with a Gaussian mixture model

To access the full-text documents, please follow this link: http://hdl.handle.net/2117/12093

Title:	Reinforcement learning with a Gaussian mixture model
Author:	Agostini, Alejandro Gabriel; Celaya Llover, Enric
Other authors:	Institut de Robòtica i Informàtica Industrial; Universitat Politècnica de Catalunya. VIS - Visió Artificial i Sistemes Intel.ligents
Abstract:	Recent approaches to Reinforcement Learning (RL) with function approximation include Neural Fitted Q Iteration and the use of Gaussian Processes. They belong to the class of fitted value iteration algorithms, which use a set of support points to fit the value function in a batch iterative process. These techniques make efficient use of a reduced number of samples by reusing them as needed, and are appropriate for applications where experiencing a new sample costs more than storing and reusing one. This efficiency comes at the expense of greater computational effort, since these algorithms are not incremental. On the other hand, non-parametric models for function approximation, like Gaussian Processes, are preferred over parametric ones because of their greater flexibility. A further advantage of using Gaussian Processes for function approximation is that they make it possible to quantify the uncertainty of the estimate at each point. In this paper, we propose a new approach to RL in continuous domains based on probability density estimation. Our method combines the best features of the previous methods: it is non-parametric and provides an estimate of the variance of the approximated function at any point of the domain. In addition, our method is simple, incremental, and computationally efficient. All these features make this approach more appealing than Gaussian Processes and fitted value iteration algorithms in general.
Subject(s):	Machine learning; generalisation (artificial intelligence); learning (artificial intelligence); UPC subject areas::Computer science::Artificial intelligence::Machine learning; INSPEC classification::Cybernetics::Artificial intelligence::Learning (artificial intelligence)
Rights:	Attribution-NonCommercial-NoDerivs 3.0 Spain http://creativecommons.org/licenses/by-nc-nd/3.0/es/
Document type:	Article - Published version; Conference object
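
The abstract states that the method, built on a probability density estimate, yields both a value estimate and its variance at any point of the domain. The record does not describe how this is done, so the following is only a minimal sketch of one standard way to get both quantities from a Gaussian mixture: conditioning a joint mixture over (x, q) on the input x. It is not the authors' algorithm. The function names, the joint layout (input dimensions first, scalar target last), and the assumption that the mixture parameters are already fitted are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal evaluated at the point x."""
    d = x.shape[0]
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)


def conditional_mean_var(x, weights, means, covs):
    """Mean and variance of q given x under a joint Gaussian mixture on (x, q).

    Assumed (hypothetical) layout: each component mean has length d + 1 and
    each covariance is (d + 1) x (d + 1), with the scalar target q stored last.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    d = x.shape[0]
    resp, cond_means, cond_vars = [], [], []
    for w, mu, cov in zip(weights, means, covs):
        mu = np.asarray(mu, dtype=float)
        cov = np.asarray(cov, dtype=float)
        mu_x, mu_q = mu[:d], mu[d]
        s_xx, s_xq, s_qq = cov[:d, :d], cov[:d, d], cov[d, d]
        gain = np.linalg.solve(s_xx, s_xq)  # S_xx^{-1} S_xq
        # Unnormalized weight of this component at x (its marginal density in x).
        resp.append(w * gaussian_pdf(x, mu_x, s_xx))
        # Per-component Gaussian conditioning of q on x.
        cond_means.append(mu_q + gain @ (x - mu_x))
        cond_vars.append(s_qq - s_xq @ gain)
    resp = np.array(resp) / np.sum(resp)
    cond_means = np.array(cond_means)
    cond_vars = np.array(cond_vars)
    # Moments of the resulting one-dimensional mixture over q.
    mean = float(resp @ cond_means)
    var = float(resp @ (cond_vars + cond_means ** 2) - mean ** 2)
    return mean, var
```

For a single component with zero mean, unit variances, and covariance 0.5 between x and q, calling `conditional_mean_var(1.0, [1.0], [[0.0, 0.0]], [[[1.0, 0.5], [0.5, 1.0]]])` gives a mean of 0.5 and a variance of 0.75. With several components, the variance also grows where components disagree about q, which is one way a mixture can signal uncertainty in the estimate.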