A Simple Method to Calculate Positions in Pose Tracking to Verify Work Procedures

Kazumoto Tanaka

doi:10.56741/esl.v2i02.313

Authors

Kazumoto Tanaka
kazumoto@hiro.kindai.ac.jp
Kindai University https://orcid.org/0000-0002-7324-3201

Keywords:

absolute root coordinates, deep neural network, pose tracking, root-relative coordinates, work procedure

Abstract

Although manufacturing processes are becoming increasingly automated, many factories still rely on manual operations. In such facilities, there is a strong need to detect human errors automatically by checking whether specified procedures are being followed. To this end, studies have applied deep neural network (DNN) based three-dimensional (3D) human pose-tracking methods to the examination of work procedures. However, most of these techniques require a high-end computer equipped with a graphics processing unit (GPU). In this study, we instead adopt MediaPipe Pose, a lightweight pose estimation network provided by Google, to perform pose tracking on a low-end personal computer (PC) so that such systems can be deployed in small factories. MediaPipe Pose, however, cannot track the location of a human body because it estimates poses in a coordinate system with the waist as the origin (that is, in a root-relative coordinate system). We therefore developed a method to obtain the absolute coordinates of the root with a simple calculation. An experimental evaluation shows that the computational load of the proposed approach is negligible and that the repeatability of the estimation is sufficient to evaluate a given operator's work on a predetermined working path. The proposed method thus enables work procedures to be checked using MediaPipe Pose on a low-end PC.
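
The abstract does not state the calculation itself, so the following is only a generic sketch of how a root-relative pose could be placed in absolute camera coordinates. It is not the paper's method. It assumes a pinhole camera with a known focal length in pixels, a body segment of known real length lying roughly parallel to the image plane, and root-relative joint offsets given in metres. All function and parameter names (`estimate_root_depth`, `root_absolute_position`, `to_absolute`, `focal_px`, and so on) are hypothetical.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def estimate_root_depth(focal_px: float, real_length_m: float,
                        pixel_length: float) -> float:
    """Depth from similar triangles in a pinhole camera: Z = f * L / l.

    Assumption: the measured segment is roughly parallel to the image plane.
    """
    if pixel_length <= 0:
        raise ValueError("pixel_length must be positive")
    return focal_px * real_length_m / pixel_length


def root_absolute_position(root_px: Vec2, principal_point: Vec2,
                           focal_px: float, depth_m: float) -> Vec3:
    """Back-project the root's image position to camera coordinates."""
    u, v = root_px
    cx, cy = principal_point
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)


def to_absolute(root_abs: Vec3, joints_rel: List[Vec3]) -> List[Vec3]:
    """Shift root-relative joint offsets by the absolute root position."""
    rx, ry, rz = root_abs
    return [(rx + jx, ry + jy, rz + jz) for jx, jy, jz in joints_rel]


if __name__ == "__main__":
    # Illustrative numbers only (not taken from the paper).
    depth = estimate_root_depth(focal_px=1000.0, real_length_m=0.5,
                                pixel_length=250.0)
    root = root_absolute_position((840.0, 360.0), (640.0, 360.0),
                                  focal_px=1000.0, depth_m=depth)
    print(to_absolute(root, [(0.0, 0.0, 0.0), (0.1, -0.5, 0.0)]))
```

A sketch like this costs only a few arithmetic operations per frame, which is in keeping with the abstract's claim that the added computational load is negligible. Its accuracy depends on the stated assumptions holding.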
