<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-17T16:12:10Z</responseDate><request verb="GetRecord" identifier="oai:www.recercat.cat:2117/355912" metadataPrefix="oai_dc">https://recercat.cat/oai/request</request><GetRecord><record><header><identifier>oai:recercat.cat:2117/355912</identifier><datestamp>2025-07-23T04:28:13Z</datestamp><setSpec>com_2072_1033</setSpec><setSpec>col_2072_452951</setSpec></header><metadata><oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
   <dc:title>Deep Reinforcement Learning in Recommender Systems</dc:title>
   <dc:creator>Izquierdo Enfedaque, Héctor</dc:creator>
   <dc:contributor>Universitat Politècnica de Catalunya. Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial</dc:contributor>
   <dc:contributor>Angulo Bahón, Cecilio</dc:contributor>
   <dc:subject>Àrees temàtiques de la UPC::Informàtica</dc:subject>
   <dc:subject>Recommender systems (Information filtering) -- Mathematical models -- Software</dc:subject>
   <dc:subject>Reinforcement learning -- Mathematical models</dc:subject>
   <dc:subject>Sistemes recomanadors (Filtratge d'informació) -- Models matemàtics -- Programari</dc:subject>
   <dc:subject>Aprenentatge per reforç -- Models matemàtics</dc:subject>
   <dc:description>Recommender Systems aim to help customers find content of their interest by presenting them with the suggestions they are most likely to prefer. Reinforcement Learning, a Machine Learning paradigm in which agents learn by interaction which actions to perform in an environment so as to maximize a reward, can be trained to give good recommendations. One of the problems when working with Reinforcement Learning algorithms is the dimensionality explosion, especially in the observation space; industrial recommender systems, in particular, deal with extremely large observation spaces. New Deep Reinforcement Learning (DRL) algorithms can deal with this problem, but they are mainly focused on images. A technique has recently been developed that converts raw data into images, enabling DRL algorithms to be applied properly. This project pursues this line of investigation. The contributions of the project are: (1) defining a generalization of the Markov Decision Process formulation for Recommender Systems, (2) defining a way to express the observation as an image, and (3) demonstrating the use of both concepts by addressing a particular Recommender System case through Reinforcement Learning. Results show that the trained agents offer better recommendations than arbitrary choice. However, the system does not achieve great performance, mainly owing to the lack of interactions in the dataset.</dc:description>
   <dc:date>2021-10-22</dc:date>
   <dc:type>Master thesis</dc:type>
   <dc:identifier>https://hdl.handle.net/2117/355912</dc:identifier>
   <dc:identifier>ETSEIB-240.161351</dc:identifier>
   <dc:language>eng</dc:language>
   <dc:rights>http://creativecommons.org/licenses/by-nc-nd/3.0/es/</dc:rights>
   <dc:rights>Open Access</dc:rights>
   <dc:format>application/pdf</dc:format>
   <dc:publisher>Universitat Politècnica de Catalunya</dc:publisher>
</oai_dc:dc></metadata></record></GetRecord></OAI-PMH>