Multi-modal embedding for main product detection in fashion

To access the full-text documents, please follow this link: http://hdl.handle.net/2117/114315

Title:	Multi-modal embedding for main product detection in fashion
Author(s):	Rubio Romano, Antonio; Yu, Longlong; Simó Serra, Edgar; Moreno-Noguer, Francesc
Other authors:	Institut de Robòtica i Informàtica Industrial; Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI
Note:	© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Note:	Best Paper Award at the 2017 IEEE International Conference on Computer Vision Workshops
Abstract:	We present an approach to detect the main product in fashion images by exploiting the textual metadata associated with each image. Our approach is based on a Convolutional Neural Network and learns a joint embedding of object proposals and textual metadata to predict the main product in the image. We additionally use several complementary classification and overlap losses in order to improve training stability and performance. Our tests on a large-scale dataset taken from eight e-commerce sites show that our approach outperforms strong baselines and is able to accurately detect the main product in a wide diversity of challenging fashion images.
Note:	Peer Reviewed
Note:	Award-winning
Subject(s):	Àrees temàtiques de la UPC::Informàtica::Automàtica i control; computer vision; learning (artificial intelligence); common embedding; multi-modal embedding; deep learning; Classificació INSPEC::Automation
Rights:	Attribution-NonCommercial-NoDerivs 3.0 Spain (http://creativecommons.org/licenses/by-nc-nd/3.0/es/)
Document type:	Article - Submitted version; Conference object
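
The abstract describes a joint embedding of object proposals and textual metadata that is used to predict the main product. The following is a minimal sketch of how such a shared space could be used at inference time, not the authors' implementation: it assumes the proposal and text embeddings have already been computed by some trained model, and it assumes cosine similarity as the ranking score, which the record does not specify. The names `cosine_similarity` and `select_main_product` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_main_product(proposal_embeddings, text_embedding):
    """Return (index of best proposal, list of scores).

    Assumption: each object proposal and the textual metadata have already
    been mapped into the same embedding space; the proposal closest to the
    text embedding is taken as the main product.
    """
    if not proposal_embeddings:
        raise ValueError("at least one proposal embedding is required")
    scores = [cosine_similarity(p, text_embedding) for p in proposal_embeddings]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


# Toy example with made-up 2-D embeddings for three proposals.
proposals = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
text = [0.0, 1.0]
best, scores = select_main_product(proposals, text)
```

In this toy input the second proposal points in the same direction as the text embedding, so it receives the highest score. A real system would obtain the embeddings from the trained network and would likely combine this ranking with the classification and overlap terms the abstract mentions.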
Related documents

Other documents by the same author

Efficient monocular pose estimation for complex 3D models

Rubio Romano, Antonio; Villamizar Vergel, Michael Alejandro; Ferraz Colomina, Luis; Peñate Sánchez, Adrián; Ramisa Ayats, Arnau; Simó Serra, Edgar; Sanfeliu Cortés, Alberto; Moreno-Noguer, Francesc

BASS: boundary-aware superpixel segmentation

Rubio Romano, Antonio; Yu, Longlong; Simó Serra, Edgar; Moreno-Noguer, Francesc

Multi-modal fashion product retrieval

Rubio Romano, Antonio; Yu, Longlong; Simó Serra, Edgar; Moreno-Noguer, Francesc

Multi-modal joint embedding for fashion product retrieval

Rubio Romano, Antonio; Yu, Longlong; Simó Serra, Edgar; Moreno-Noguer, Francesc

Estimación monocular y eficiente de la pose usando modelos 3D complejos

Rubio Romano, Antonio; Villamizar Vergel, Michael Alejandro; Ferraz Colomina, Luis; Peñate Sánchez, Adrián; Sanfeliu Cortés, Alberto; Moreno-Noguer, Francesc
