<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-14T02:22:02Z</responseDate><request verb="GetRecord" identifier="oai:www.recercat.cat:2117/368805" metadataPrefix="qdc">https://recercat.cat/oai/request</request><GetRecord><record><header><identifier>oai:recercat.cat:2117/368805</identifier><datestamp>2025-07-22T15:04:53Z</datestamp><setSpec>com_2072_1033</setSpec><setSpec>col_2072_452951</setSpec></header><metadata><qdc:qualifieddc xmlns:qdc="http://dspace.org/qualifieddc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://purl.org/dc/elements/1.1/ http://dublincore.org/schemas/xmls/qdc/2006/01/06/dc.xsd http://purl.org/dc/terms/ http://dublincore.org/schemas/xmls/qdc/2006/01/06/dcterms.xsd http://dspace.org/qualifieddc/ http://www.ukoln.ac.uk/metadata/dcmi/xmlschema/qualifieddc.xsd">
   <dc:title>Reinforcement learning for portfolio optimization</dc:title>
   <dc:creator>Fontana Miyoshi Bianchi, Rafael</dc:creator>
   <dc:subject>Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic</dc:subject>
   <dc:subject>Reinforcement learning</dc:subject>
   <dc:subject>Machine learning</dc:subject>
   <dc:subject>Portfolio Optimization</dc:subject>
   <dc:subject>Finance</dc:subject>
   <dc:subject>Markov Decision Processes</dc:subject>
   <dc:subject>Aprenentatge per reforç</dc:subject>
   <dc:subject>Aprenentatge automàtic</dc:subject>
   <dcterms:abstract>In this study, the potential of Reinforcement Learning for portfolio optimization is investigated under the constraints imposed by the stock market, such as liquidity, slippage, and transaction costs. Five Deep Reinforcement Learning (DRL) agents are trained in two different environments to test their ability to learn asset-allocation strategies that generate higher cumulative returns. All agents are model-free and already optimized for financial problems through the FinRL library, which handles the high-dimensional state spaces typical of financial market environments. Both proposed environments use market data from US stocks; one additionally uses Finsent data, an alternative data source containing news sentiment for every stock in the Dow Jones Industrial Average (DJIA). A series of backtesting experiments, covering the beginning of 2019 to the beginning of 2020, compared the two environments and measured the agents' performance against the DJIA. All results were assessed with the pyfolio Python library, which computes the standard portfolio-performance metrics. Several algorithms increased cumulative returns relative to the first dataset, and the best result outperformed the DJIA by a significant margin while incurring a smaller drawdown.</dcterms:abstract>
   <dcterms:issued>2022-04-27</dcterms:issued>
   <dc:type>Master thesis</dc:type>
   <dc:rights>Open Access</dc:rights>
   <dc:publisher>Universitat Politècnica de Catalunya</dc:publisher>
</qdc:qualifieddc></metadata></record></GetRecord></OAI-PMH>