Title:
|
Multi-modal embedding for main product detection in fashion
|
Author(s):
|
Rubio Romano, Antonio; LongLong, Yu; Simó Serra, Edgar; Moreno-Noguer, Francesc
|
Other authors:
|
Institut de Robòtica i Informàtica Industrial; Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI |
Abstract:
|
© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Abstract:
|
Best Paper Award at the 2017 IEEE International Conference on Computer Vision Workshops |
Abstract:
|
We present an approach to detect the main product in fashion images by exploiting the textual metadata associated with each image. Our approach is based on a Convolutional Neural Network and learns a joint embedding of object proposals and textual metadata to predict the main product in the image. We additionally use several complementary classification and overlap losses in order to improve training stability and performance. Our tests on a large-scale dataset taken from eight e-commerce sites show that our approach outperforms strong baselines and is able to accurately detect the main product in a wide diversity of challenging fashion images. |
Abstract:
|
Peer Reviewed |
Abstract:
|
Award-winning |
Subject(s):
|
-UPC subject areas::Computer science::Automation and control -computer vision -learning (artificial intelligence) -common embedding -multi-modal embedding -deep learning -INSPEC classification::Automation |
Rights:
|
Attribution-NonCommercial-NoDerivs 3.0 Spain
http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
Document type:
|
Article - Submitted version; Conference object |