3D semantic representation of actions from efficient stereo-image-sequence segmentation on GPUs
Abramov, Alexey; Aksoy, Eren Erdal; Dörr, Johannes; Wörgötter, Florentin; Pauwels, Karl; Dellen, Babette
Universitat Politècnica de Catalunya. Institut de Robòtica i Informàtica Industrial
A novel real-time framework for model-free stereo-video segmentation and stereo-segment tracking is presented. It combines real-time optical flow and stereo with image segmentation, running separately on two GPUs. The stereo-segment tracking algorithm achieves a frame rate of 23 Hz for regular videos with a frame size of 256 × 320 pixels and runs in nearly real time for stereo videos. The computed stereo segments are used to construct 3D segment graphs, from which main graphs, each representing a relevant change in the scene, are extracted. This allows a movie of, e.g., 396 original frames to be represented by only 12 graphs, each containing only a small number of nodes, providing a condensed description of the scene while preserving data-intrinsic semantics. Using this method, human activities, e.g., the handling of objects, can be encoded efficiently. The method has potential applications in manipulation-action recognition and learning, and provides a vision front end for applications in cognitive robotics.
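The reduction from per-frame segment graphs to a handful of main graphs can be illustrated with a minimal sketch. It is not the authors' implementation: the graph encoding (segment labels as nodes, an edge between segments assumed to touch in 3D), the change criterion (any difference in nodes or edges counts as a "relevant change"), and the names `make_graph` and `extract_main_graphs` are all assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Assumed encoding: a segment graph is (nodes, edges), where nodes are
# segment labels and an edge joins two segments that touch in 3D.
Graph = Tuple[FrozenSet[int], FrozenSet[FrozenSet[int]]]


def make_graph(nodes: Iterable[int], edges: Iterable[Tuple[int, int]]) -> Graph:
    """Build a hashable, order-independent graph (hypothetical helper)."""
    return frozenset(nodes), frozenset(frozenset(e) for e in edges)


def extract_main_graphs(graphs: List[Graph]) -> List[Tuple[int, Graph]]:
    """Keep (frame_index, graph) only where the graph differs from the
    last kept graph. Assumed criterion: any change in nodes or edges."""
    main: List[Tuple[int, Graph]] = []
    for i, g in enumerate(graphs):
        if not main or g != main[-1][1]:
            main.append((i, g))
    return main


# Toy sequence: table = 1, cup = 2, hand = 3.
frames = [
    make_graph([1, 2, 3], [(1, 2)]),          # cup on table, hand apart
    make_graph([1, 2, 3], [(1, 2)]),          # no change
    make_graph([1, 2, 3], [(1, 2), (2, 3)]),  # hand touches cup
    make_graph([1, 2, 3], [(1, 2), (2, 3)]),  # no change
    make_graph([1, 2, 3], [(2, 3)]),          # cup lifted off table
]
main_graphs = extract_main_graphs(frames)
print([i for i, _ in main_graphs])  # → [0, 2, 4]
```

In this toy sequence, five frames collapse to three main graphs, mirroring on a small scale the compression of 396 frames into 12 graphs reported above. A real system would also need to tolerate segmentation noise before declaring a change, which this sketch ignores.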
UPC subject areas::Telecommunications engineering::Signal processing::Pattern recognition
Pattern recognition systems
Pattern recognition (Computer science)
INSPEC classification::Pattern recognition

