Instilling moral value alignment by means of multi-objective reinforcement learning

dc.contributor.author
Rodriguez Soto, Manel
dc.contributor.author
Serramià Amorós, Marc
dc.contributor.author
López Sánchez, Maite
dc.contributor.author
Rodríguez-Aguilar, Juan A. (Juan Antonio)
dc.date.issued
2023-02-01T09:10:35Z
dc.date.issued
2022-01-24
dc.identifier
1388-1957
dc.identifier
https://hdl.handle.net/2445/192920
dc.identifier
715848
dc.description.abstract
AI research is being challenged with ensuring that autonomous agents learn to behave ethically, namely in alignment with moral values. Here, we propose a novel way of tackling the value alignment problem as a two-step process. The first step consists in formalising moral values and value-aligned behaviour based on philosophical foundations. Our formalisation is compatible with the framework of (Multi-Objective) Reinforcement Learning, to ease the handling of an agent's individual and ethical objectives. The second step consists in designing an environment wherein an agent learns to behave ethically while pursuing its individual objective. We leverage our theoretical results to introduce an algorithm that automates our two-step approach. In the cases where value-aligned behaviour is possible, our algorithm produces a learning environment for the agent wherein it will learn a value-aligned behaviour.
dc.format
17 p.
dc.format
application/pdf
dc.language
eng
dc.publisher
Springer
dc.relation
Reproduction of the document published at: https://doi.org/10.1007/s10676-022-09635-0
dc.relation
Ethics And Information Technology, 2022, vol. 24
dc.relation
https://doi.org/10.1007/s10676-022-09635-0
dc.rights
cc by (c) Manel Rodríguez Soto et al., 2022
dc.rights
http://creativecommons.org/licenses/by/3.0/es/
dc.rights
info:eu-repo/semantics/openAccess
dc.source
Articles published in journals (Mathematics and Computer Science)
dc.subject
Intel·ligència artificial
dc.subject
Aprenentatge per reforç (Intel·ligència artificial)
dc.subject
Ètica
dc.subject
Aspectes morals
dc.subject
Artificial intelligence
dc.subject
Reinforcement learning
dc.subject
Ethics
dc.subject
Moral aspects
dc.title
Instilling moral value alignment by means of multi-objective reinforcement learning
dc.type
info:eu-repo/semantics/article
dc.type
info:eu-repo/semantics/publishedVersion

