<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-13T01:00:48Z</responseDate><request verb="GetRecord" identifier="oai:www.recercat.cat:2072/483462" metadataPrefix="marc">https://recercat.cat/oai/request</request><GetRecord><record><header><identifier>oai:recercat.cat:2072/483462</identifier><datestamp>2025-04-09T19:43:21Z</datestamp><setSpec>com_2072_6</setSpec><setSpec>col_2072_452954</setSpec></header><metadata><record xmlns="http://www.loc.gov/MARC21/slim" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
   <leader>00925njm 22002777a 4500</leader>
   <datafield ind2=" " ind1=" " tag="042">
      <subfield code="a">dc</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Pérez Carrasco, David</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="260">
      <subfield code="c">2025-04-08T13:26:32Z</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="260">
      <subfield code="c">2024-10-16T13:49:37Z</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="260">
      <subfield code="c">2024</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">Tutor: Anders Johnson</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">Bachelor's thesis in Mathematical Engineering in Data Science</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">In recent years, Reinforcement Learning (RL) has emerged as a powerful paradigm for sequential decision making under uncertainty. Within this framework, Markov Decision Processes (MDPs) serve as the fundamental model, defining the dynamics of state transitions and rewards. However, traditional RL algorithms such as Q-Learning often struggle with large or continuous state spaces because of their computational complexity. Linearly-solvable Markov Decision Processes (LMDPs) offer a promising alternative, leveraging linear programming techniques for efficient planning and value function approximation.
This work evaluates and benchmarks state-of-the-art RL models against LMDP-based algorithms for continuous MDPs, such as Z-Learning. The aim is to improve the performance and scalability of these algorithms in larger and more intricate domains. We investigate efficient methods for optimal action selection and value function approximation within the linear framework. To enable a fair comparison with traditional MDP-based RL, we develop methods for embedding MDPs into LMDPs and vice versa.

Furthermore, we examine the factors that influence the learning behavior of algorithms for LMDPs. In particular, we analyze the impact of different exploration strategies to assess their effectiveness across diverse scenarios. These results offer insight into how reinforcement learning algorithms can be optimized and improved.</subfield>
   </datafield>
   <datafield ind1="8" ind2=" " tag="024">
      <subfield code="a">http://hdl.handle.net/2072/483462</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Reinforcement learning</subfield>
   </datafield>
   <datafield ind2="0" ind1="0" tag="245">
      <subfield code="a">Efficient algorithms for linearly solvable Markov decision processes</subfield>
   </datafield>
</record></metadata></record></GetRecord></OAI-PMH>