<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-13T05:56:02Z</responseDate><request verb="GetRecord" identifier="oai:www.recercat.cat:2117/368805" metadataPrefix="marc">https://recercat.cat/oai/request</request><GetRecord><record><header><identifier>oai:recercat.cat:2117/368805</identifier><datestamp>2025-07-22T15:04:53Z</datestamp><setSpec>com_2072_1033</setSpec><setSpec>col_2072_452951</setSpec></header><metadata><record xmlns="http://www.loc.gov/MARC21/slim" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
   <leader>00925njm 22002777a 4500</leader>
   <datafield ind2=" " ind1=" " tag="042">
      <subfield code="a">dc</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Fontana Miyoshi Bianchi, Rafael</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="260">
      <subfield code="c">2022-04-27</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">In this study, the potential of using Reinforcement Learning for Portfolio Optimization is investigated, taking into account constraints imposed by the stock market such as liquidity, slippage, and transaction costs. Five Deep Reinforcement Learning (DRL) agents are trained in two different environments to test their ability to learn asset-allocation strategies that generate higher cumulative returns. All agents are model-free and already optimized for financial problems, drawn from the FinRL library; accordingly, the state space is high-dimensional, as is typical of financial market environments. Both proposed environments use market data from US stocks, and one of them also uses Finsent data, an alternative data source containing news sentiment for all stocks in the Dow Jones Industrial Average (DJIA). A series of backtesting experiments from the beginning of 2019 to the beginning of 2020 compared the two environments and measured how the agents performed against the DJIA. All results were assessed with the pyfolio Python library, which provides standard metrics for evaluating portfolio performance. Some algorithms increased cumulative returns relative to the first dataset, and the best result outperformed the DJIA by a significant margin while also showing a smaller drawdown.</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Reinforcement learning</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Machine learning</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Portfolio Optimization</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Finance</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Markov Decision Processes</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Aprenentatge per reforç</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Aprenentatge automàtic</subfield>
   </datafield>
   <datafield ind2="0" ind1="0" tag="245">
      <subfield code="a">Reinforcement learning for portfolio optimization</subfield>
   </datafield>
</record></metadata></record></GetRecord></OAI-PMH>