A Multi-view Framework for Human Parsing in Human-Robot Collaboration Scenarios

Main Authors: Terreran, Matteo, Barcellona, Leonardo, Evangelista, Daniele, Menegatti, Emanuele, Ghidoni, Stefano
Format: Conference proceedings
Published: 2021
Subjects:
Online Access: https://zenodo.org/record/6367906
Abstract:
  • Perception plays a major role in human-robot collaboration tasks, enabling the robot to understand the surrounding environment and, in particular, the position of humans inside its working area. This is a key element for safe collaboration, and several human representations have been proposed in the literature (e.g., 3D bounding boxes, skeletal models). In this work, we propose a novel 3D human representation derived from body-part segmentation, which combines high-level semantic information (i.e., human body parts) with volume information. Body-part segmentation is known as human parsing in the literature, which mainly focuses on RGB images. To compute our 3D human representation we propose a multi-view system based on a camera network, where single-view body-part segmentation masks are projected into 3D coordinates and fused together, yielding a 3D representation robust to occlusions. A further 3D data-filtering step also improves robustness to outliers. The proposed multi-view human parsing approach has been evaluated in a real environment in terms of global and class accuracy on a custom dataset, acquired to thoroughly test the system under various conditions. The experimental results demonstrate that the proposed system achieves high performance also in multi-person scenarios where occlusions are widespread.
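The pipeline described in the abstract (back-project per-view segmentation masks to 3D, fuse them, then filter outliers) can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the function names, calibration inputs (`K`, `T_world_cam`), and the brute-force radius filter are all assumptions for illustration.

```python
import numpy as np

def mask_to_world_points(mask, depth, K, T_world_cam):
    """Back-project labelled pixels of one camera view into world coordinates.

    mask        : (H, W) int array of body-part labels (0 = background)
    depth       : (H, W) float array of depth in metres
    K           : (3, 3) camera intrinsics (pinhole model)
    T_world_cam : (4, 4) camera-to-world transform
    Returns (N, 3) world points and their (N,) body-part labels.
    """
    v, u = np.nonzero(mask)                 # pixel coordinates of labelled pixels
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]         # pinhole back-projection
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous
    pts_world = (T_world_cam @ pts_cam.T).T[:, :3]
    return pts_world, mask[v, u]

def fuse_views(views):
    """Fuse per-view labelled point clouds into a single cloud.

    views: list of (points, labels) pairs, one per camera.
    """
    pts = np.concatenate([p for p, _ in views], axis=0)
    lbl = np.concatenate([l for _, l in views], axis=0)
    return pts, lbl

def filter_outliers(pts, lbl, radius=0.05, min_neighbors=3):
    """Drop isolated points: keep those with enough neighbours within `radius`.

    Brute-force O(N^2) distances; a KD-tree would be used in practice.
    """
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    keep = (d < radius).sum(axis=1) - 1 >= min_neighbors  # exclude self
    return pts[keep], lbl[keep]
```

With calibrated cameras, each view contributes points only for pixels its segmentation network labelled, so occluded body parts in one view are filled in by the others after fusion.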