dc.contributor |
Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació |
dc.contributor |
Romero Moral, Óscar |
dc.contributor.author |
Shi, Dai |
dc.date |
2014-07-04 |
dc.identifier.citation |
100624 |
dc.identifier.uri |
http://hdl.handle.net/2099.1/24886 |
dc.language.iso |
eng |
dc.publisher |
Universitat Politècnica de Catalunya |
dc.rights |
info:eu-repo/semantics/openAccess |
dc.subject |
Àrees temàtiques de la UPC::Informàtica::Informàtica teòrica::Algorísmica i teoria de la complexitat |
dc.subject |
Algorithms |
dc.subject |
Bandit |
dc.subject |
Multi-armed |
dc.subject |
recommendations |
dc.subject |
automatic content selection |
dc.subject |
Algorismes |
dc.title |
Exploring Bandit Algorithms for Automatic Content Selection |
dc.type |
info:eu-repo/semantics/masterThesis |
dc.description.abstract |
The multi-armed bandit (MAB) problem takes its name from casino slot machines. It asks how a gambler should pull the arms in order to maximize total reward: the gambler must decide which arms to explore to gain more knowledge and which arms to exploit to secure the total payoff. The problem is also common in the real world, for example in automatic content selection. A website acts like the gambler: it must select suitable content to recommend to its visitors while trying to maximize the click-through rate (CTR). Bandit algorithms are well suited to this kind of problem because they can handle the exploration-exploitation trade-off on high-churn data. When context is considered during content selection, we model the task as a contextual bandit problem. In this thesis, we evaluate several popular bandit algorithms in different bandit settings and propose our own approach to solving a real-world automatic content selection case. Our experiments demonstrate that bandit algorithms are efficient in automatic content selection. |