<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-17T03:42:33Z</responseDate><request verb="GetRecord" identifier="oai:www.recercat.cat:2117/445176" metadataPrefix="marc">https://recercat.cat/oai/request</request><GetRecord><record><header><identifier>oai:recercat.cat:2117/445176</identifier><datestamp>2025-11-08T09:09:33Z</datestamp><setSpec>com_2072_1033</setSpec><setSpec>col_2072_452951</setSpec></header><metadata><record xmlns="http://www.loc.gov/MARC21/slim" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
   <leader>00925njm 22002777a 4500</leader>
   <datafield ind2=" " ind1=" " tag="042">
      <subfield code="a">dc</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Kurmangaliyeva, Dinara</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="260">
      <subfield code="c">2025-10-22</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">Accurately forecasting customer lifetime value (CLV) is central to budget allocation and retention strategy in non-contractual e-commerce settings. This thesis presents a comparative study of three approaches on the UCI Online Retail II dataset: (i) tree-based machine learning models (Random Forest and XGBoost) trained on engineered time-window features, (ii) the same models augmented with unsupervised segmentation signals, and (iii) a hybrid deep learning architecture combining Transformer encoders for temporal covariates with an LSTM pathway for purchasing trends and a sequence decoder for multi-month forecasts. To inject segmentation without discarding nuance, I derive distance-to-centroid features from K-Means clusters learned on TSFresh time-series representations, rather than using coarse cluster labels. The evaluation follows a temporally consistent train/validation/test split with group-aware cross-validation by customer and reports MAE, RMSE, and R². Empirically, XGBoost achieves the strongest out-of-sample performance, the deep model is intermediate, and Random Forest trails. Adding distance-to-centroid features yields a modest gain for XGBoost but slightly degrades Random Forest, indicating that boosted trees can extract weak but useful segmentation signals while bagged trees are more sensitive to noise. Feature importance analysis shows that monetary variables (total and average spend) dominate across models, with frequency and tenure contributing second-order signals; distances to specific clusters add incremental lift and provide interpretable behavioral archetypes. Managerially, the results support segment-aware targeting of high-value lookalikes and reinforce the practicality of gradient boosting on structured tabular features for CLV. Methodologically, this thesis contributes an end-to-end, reproducible pipeline that blends sequence-aware engineering with segmentation-as-features.
Limitations include dataset size and a limited hyperparameter search for the deep model; future work should explore uncertainty quantification, per-segment specialist models, and larger-scale tuning to assess the conditions under which deep architectures surpass tree-based ensembles for CLV.</subfield>
   </datafield>
   <datafield ind1="8" ind2=" " tag="024">
      <subfield code="a">http://hdl.handle.net/2117/445176</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Àrees temàtiques de la UPC::Informàtica::Intel·ligència artificial::Aprenentatge automàtic</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Deep learning (Machine learning)</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Customer Lifetime Value (CLV)</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Time series forecasting</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Machine learning</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">XGBoost</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Random forest</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Deep learning</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Transformer</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">LSTM</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">TSFresh</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">K-Means clustering</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Feature engineering</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Segmentation</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Gradient boosting</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">E-commerce</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Predictive modeling</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Aprenentatge profund (Aprenentatge automàtic)</subfield>
   </datafield>
   <datafield ind2="0" ind1="0" tag="245">
      <subfield code="a">Predicting customer lifetime value in e-commerce: an empirical evaluation of tree ensembles, neural sequence models, and time-series clustering</subfield>
   </datafield>
</record></metadata></record></GetRecord></OAI-PMH>