Real-Time Posture Analysis of Construction Workers for Ergonomics Training
Soumitry J. RAY1 and Jochen TEIZER2
1 Ph.D. Candidate, Computational Science and Engineering, School of Civil and Environmental Engineering, Georgia Institute of Technology, 790 Atlantic Dr. N.W., Atlanta, GA 30332-0355, United States of America, E-mail: [email protected]
2 Ph.D., Assistant Professor, School of Civil and Environmental Engineering, Georgia Institute of Technology, 790 Atlantic Dr. N.W., Atlanta, GA 30332-0355, United States of America, Phone: +1-404-894-8269, Fax: +1-404-894-2278, E-mail: [email protected]
ABSTRACT
Work-related fatigue and injuries are critical issues in the construction industry. The repetitive and physically demanding nature of construction activities and awkward work postures are the primary reasons for work-related fatigue and injuries. In view of this, worker training on ergonomics is necessary before the start of any construction activity. Traditional methods of worker monitoring are tedious and inefficient. Recent approaches to understanding worker ergonomics use specialized devices to physically monitor the health of workers. In addition, attempts have been made to use computer vision techniques to understand worker ergonomics; however, they mostly focus on estimating the posture of workers. In this research, we present a framework that integrates posture analysis of workers with a predefined set of rules to categorize work tasks as ergonomic or non-ergonomic.
Key Words: Ergonomics; posture classification; range camera; pose estimation; safety and health; construction worker.
INTRODUCTION
Execution of tasks by construction workers requires twisting body parts such as shoulder joints, neck, back, and knees. If the posture is such that body parts are strained for a long period of time, it may cause fatigue or injuries, and in severe cases it may lead to permanent deformation. Among these injuries, back injuries and Work-related Musculoskeletal Disorders (WMSD) are the most common. WMSD has been defined by CPWR (2007) as “injuries of the muscles, tendons, joints and nerves caused or aggravated by work”. Such injuries occur primarily among workers whose job activities involve carrying heavy loads, kneeling, contact stress, vibration, extreme temperatures, twisting hands or wrists, stretching to work overhead, or other awkward positions.
To address these injuries, we monitor the posture of workers with a Kinect range camera and use a set of predefined rules to categorize activities as ergonomic or non-ergonomic. We focus solely on activities carried out indoors, as range camera measurements are more reliable indoors than outdoors, where ambient lighting conditions introduce a significant amount of noise. To limit the scope of this research, our rules target a few activities: overhead work, load lifting from the ground surface, and ground work in a kneeling or crawling posture. The contributions of this paper are: (a) automatic classification of work activities as ergonomic or non-ergonomic, which can be beneficial for worker training and education, or for a safe and healthy setup and layout of the work environment, (b) rule-based ergonomics evaluation for a few specific activities, and (c) a demonstration using the Kinect range camera along with OpenNI for posture estimation.
BACKGROUND
In this section we discuss the background literature on the following topics: a) statistics that establish the ergonomic issues arising in the construction industry and the necessity for remedial measures to mitigate non-fatal injuries and illness, b) literature review on body motion analysis in the construction industry focusing on ergonomics of workers using methods such as Physiological Status Monitors (PSM) and computer vision, and c) literature on human posture classification using computer vision techniques.
Ergonomics
Injuries and illness not only force workers to stay away from work but also pose the risk of chronic health problems. Statistics from the Bureau of Labor Statistics (BLS) database (2004-2009) show that, among non-fatal injuries, back injuries accounted for 18-21%, knee injuries for 8-9%, and neck/shoulder injuries for 6-8% (see details in Figure 1). These injuries typically occur due to awkward work posture while carrying out tasks such as lifting loads (back and knee injuries) and working overhead (neck and shoulder injuries) (CPWR 2007, NIOSH 2007).
Figure 1. Distribution of non-fatal occupational injuries and illness in the construction industry (2004-2009) involving days away from work by selected injury.
Figure 1 data (share of cases per year, in percent):
Body part        2004   2005   2006   2007   2008   2009
Back             21.0   19.2   18.3   16.7   19.7   18.0
Finger            9.6   10     10.7    9.3    9.9    8.9
Multiple parts    8.2    8.7    8.1    9.6    8.6    8.9
Knee              8.8    8      9.5    8.8    9.4    8.7
Hand/wrist        8.3    7.9    8.4    8.3    8      9.8
Neck/shoulder     6.8    6.6    6.4    7.5    6.2    8
Foot/toe          4.5    5.5    5.2    5.5    5.6    5.1
Eyes              4.3    4.1    4.6    3.8    4.6    3.7
Other            28.5   30     28.8   30.5   28     28.9
A study conducted by the Center for Construction Research and Training (CPWR) shows that trade workers in masonry and roofing miss the most work days. Since masonry work typically requires repeated bending and twisting, the study reports back injuries as the most frequently mentioned reason to miss work. In 2005, 75.4 injury and illness cases per 10,000 full-time workers were recorded among masons. Similarly, back problems occur very frequently among laborers since they are often involved in lifting and carrying heavy loads. Laborers, who might receive little training or education on ergonomics, are thus among the least healthy trades.
Ergonomics Analysis in Construction Industry
Ergonomic analysis in the construction industry has been a multi-pronged study. Various technologies have been used to monitor the safety and ergonomics of workers; however, no single technology is clearly preferred. Some are suitable for online monitoring in an active work zone, while others may be more suitable for training workers in a mock-up environment before the start of the actual work task. These technologies fall into the following main categories: (a) Physiological Status Monitoring (PSM), (b) motion trackers, and (c) vision-based tracking.
PSMs give physiological information such as heart rate (HR), respiration rate (RR), acceleration (Gatti et al. 2011) and skin temperature, which can be used to monitor physiological condition of the human body. Using accelerometers, PSM can estimate the bending angle of the body torso in only one plane but it is susceptible to drift when measured over a period of time. In addition, vital information such as HR and RR obtained from PSMs can be used to measure strain on the worker’s body.
For motion tracking, technologies such as gyroscopes and accelerometers have been used. Motion tracking mostly focuses on observing awkward postures of workers. Recently, Alwasel et al. (2011) proposed an external musculoskeletal joint angle sensor system to monitor the kinematics of the shoulder movement in one plane. The system employs a magneto-resistive angle sensor and measures the change in angle by measuring the change of the magnetic field line.
In another approach, computer vision methods have been used to estimate the full body posture, including all body joint angles. The main advantage of computer vision methods is that they estimate all body joint angles non-invasively. In this regard, there have been a few approaches to determine the motion of workers. Gonsalves & Teizer (2009) use a range camera and determine the body posture of construction workers in a 2D plane using a star skeleton model for a few activities such as lifting a box and work zone flagging. Han et al. (2011) used Kernel Principal Component Analysis (KPCA) to embed high-dimensional intensity images in a low-dimensional manifold to recognize predefined motions of a worker performing masonry activity.
Human posture classification
Human posture classification is the problem of categorizing a person’s posture in an image into discrete target classes such as ‘Standing’, ‘Sitting’, ‘Bending’, and ‘Crawling’, to name a few. Human posture classification methods can be broadly categorized into supervised and unsupervised methods.
In supervised methods, low-level features such as silhouette information (Shahbudin et al. 2010) and Histogram of Oriented Gradients (HOG) (Dalal & Triggs, 2005) are typically extracted from the object of interest. Either these features are then fed into a discriminant function such as a Support Vector Machine (SVM) (Shahbudin et al. 2010) or a manifold learning technique such as KPCA (Cheng et al. 2009), or the object of interest is rescaled to a specific size and then fed into supervised Locality Preserving Projections (LPP) or Linear Discriminant Analysis (LDA) (Wientapper et al. 2009). However, in cases where intensity images are used, extracting silhouette or edge information is difficult and affected by noise. This can be addressed by using range cameras, which make segmentation easier and more accurate.
The principle behind estimating human posture using unsupervised methods is to reduce the dimensionality of high-dimensional image data points and then perform a nearest neighbor search to assign labels. Reduction in dimensionality helps in capturing the important features that are sufficient for predicting data labels and suppresses the effect of noise in the data. Data mapping from the high-dimensional image data space for which a posture is known into a low-dimensional space can be performed by linear mapping techniques such as Principal Component Analysis (PCA). PCA operates on the principle of maximizing the variance of the input data and identifying the principal components that sufficiently express the input data. In cases where the input data contains a non-linear structure, as can be the case with human postures, PCA fails to find the low-dimensional embedding space. Therefore, non-linear embedding techniques are usually employed (Lee & Lee, 2008) and posture is predicted by learning the mapping between the low-dimensional embedded space and the high-dimensional image space. In this study, we draw motivation from Wientapper et al. (2009) and use LDA to classify postures owing to its simplicity and ease of implementation.
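The dimensionality reduction plus nearest neighbor idea described above can be illustrated with a minimal sketch. It is not taken from any of the cited works: the function names, the synthetic 50-dimensional "image vectors", and the choice of two principal components are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the data mean and the top principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions,
    # ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Map high-dimensional samples into the low-dimensional PCA space."""
    return (X - mean) @ components.T

def nearest_neighbor_label(z, Z_train, y_train):
    """Assign the label of the closest training point in the embedded space."""
    dists = np.linalg.norm(Z_train - z, axis=1)
    return y_train[np.argmin(dists)]

# Synthetic stand-in for labeled image vectors (assumption: two posture classes).
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(2, 50))
X_train = np.vstack([c + rng.normal(size=(30, 50)) for c in centers])
y_train = np.repeat([0, 1], 30)

mean, comps = fit_pca(X_train, n_components=2)
Z_train = project(X_train, mean, comps)

# A new sample drawn near the second class center.
query = centers[1] + rng.normal(size=50)
pred = nearest_neighbor_label(project(query[None, :], mean, comps)[0], Z_train, y_train)
```

Because PCA is a linear map, a sketch like this only separates classes that differ along high-variance linear directions, which is the limitation that motivates the non-linear embeddings mentioned in the text.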
METHODOLOGY
The range data used for this study was collected with a Kinect camera. Other range cameras exist and some of the important qualitative criteria of selecting one for construction safety and health applications can be found in (Teizer et al. 2005 & 2007), (Teizer, 2008) and (Teizer & Kahlmann, 2008): (a) resolution (e.g., 480 x 640 pixels), (b) noise due to the presence of mixed pixels at depth discontinuities, (c) frame update rate and (d) color vs. signal amplitude image.
Figure 2. Methodology diagram.
The framework for evaluating safety and ergonomic aspects of a construction worker is shown in Figure 2. In the first step, the camera provides a raw depth image from which we extract the person and form a feature vector. Using these feature vectors we learn a model that classifies posture into four categories: stand, squat, bend, and crawl. Then the pose of the person is estimated to determine the body joint angles and spatial locations. Pose estimation is a difficult problem to solve and is the subject of ongoing research. However, there are open source libraries such as OpenNI that estimate the pose of a person using a range camera. We estimate the pose only for the standing posture. Once the posture has been classified and the pose estimated, we use a set of rules to determine the ergonomics of the task being carried out.
Feature Extraction
In the feature extraction step, the person is first extracted using a standard background subtraction approach. The binary mask obtained after background subtraction is cleaned with standard filtering and morphological operations. The region of interest (ROI) is extracted by selecting the largest bounding box among the clusters of white pixels in the filtered binary mask. The binary ROI is then used to extract the depth values of the person's pixels from the raw depth image, which are then converted into a grayscale image. Depth values are scaled into grayscale values with respect to the distance of the person from the camera, which ensures that the grayscale values of the person do not depend on that distance. Finally, the image is rescaled to a 20 x 25 grayscale image and then reshaped into a row vector of size $d = 500$, that is, $x \in \mathbb{R}^{d}$. We term $x$ the feature vector and use it for posture classification.
Posture Classification and Pose Estimation
Posture Classification
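The input to the classifier is the feature vector built in the feature extraction step. A minimal NumPy sketch of that step is given here under several simplifying assumptions that are not from the paper: the difference threshold `diff_thresh`, the use of a single bounding box around all foreground pixels instead of filtering and selecting the largest cluster, the min/max depth normalization, nearest-neighbor resizing, and the reading of "20 x 25" as 20 columns by 25 rows.

```python
import numpy as np

def extract_feature(depth, background, diff_thresh=0.10, out_shape=(25, 20)):
    """Turn a raw depth frame (meters) into a 1 x 500 row vector (sketch)."""
    # 1. Background subtraction: keep pixels that differ from the empty scene.
    mask = np.abs(depth - background) > diff_thresh
    if not mask.any():
        raise ValueError("no foreground pixels found")

    # 2. Region of interest: bounding box of the foreground mask
    #    (assumption: the mask already contains only the person).
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    r0, r1 = rows[0], rows[-1] + 1
    c0, c1 = cols[0], cols[-1] + 1
    roi_depth = depth[r0:r1, c0:c1]
    roi_mask = mask[r0:r1, c0:c1]

    # 3. Scale depth to gray values relative to the person's own depth range,
    #    so the result does not depend on the distance to the camera.
    person = roi_depth[roi_mask]
    near, far = person.min(), person.max()
    span = max(far - near, 1e-6)
    gray = np.zeros_like(roi_depth, dtype=float)
    gray[roi_mask] = 1.0 - (roi_depth[roi_mask] - near) / span

    # 4. Nearest-neighbor resize to 25 rows x 20 columns, then flatten.
    rr = np.linspace(0, gray.shape[0] - 1, out_shape[0]).round().astype(int)
    cc = np.linspace(0, gray.shape[1] - 1, out_shape[1]).round().astype(int)
    return gray[np.ix_(rr, cc)].reshape(1, -1)

# Synthetic check: the same "person" placed at two distances from the camera.
background = np.full((60, 80), 3.0)
relief = 0.2 * np.linspace(0.0, 1.0, 20)[None, :]
frame_near = background.copy()
frame_near[10:50, 30:50] = 2.0 + relief
frame_far = background.copy()
frame_far[10:50, 30:50] = 2.5 + relief

x_near = extract_feature(frame_near, background)
x_far = extract_feature(frame_far, background)
```

In this synthetic case the two feature vectors coincide, which illustrates the distance-independence that the depth scaling is meant to provide.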
To classify posture, the feature vector $x$ obtained from the feature extraction step is used. LDA, which is a linear classification model of the form $\beta^{T} x + \beta_0 = 0$, is used for posture classification. The advantage of LDA is its simplicity: only a few parameters need to be estimated. LDA assumes that the clusters belonging to all classes obey a multivariate normal distribution, as shown below, where $f_k(x)$ gives the distribution for data point $x$ to be in class $k$. Therefore, to train our model for posture classification, it is necessary to estimate the class means $\mu_k$ ($k$ = 1, 2, 3, or 4) and the pooled covariance matrix $\Sigma$. In the prediction stage we assign the posture class using a “one vs. one” voting mechanism.

$$f_k(x) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}\,[x-\mu_k]^{T}\,\Sigma^{-1}\,[x-\mu_k]\right), \quad x \in \mathbb{R}^{d}$$

where $\Sigma \in \mathbb{R}^{d \times d}$ and $\mu_k \in \mathbb{R}^{d}$. The class assignment $\tilde{c}_k$ is computed according to the rule $\tilde{c}_k = \arg\max_k V$, where $V = [v_1, v_2, v_3, v_4]$ and $v_k$ is the number of votes received by the $k$th class such that $v_k \in [0, 3]$ and $\sum_{k=1}^{4} v_k = 6$. In certain cases, two or three classes receive the same number of votes (say two votes each), which means classification cannot be done. For such instances, we decide the posture class $c_k$ of the current range image frame by weighting the votes of the current frame together with those assigned to the preceding two frames, that is, $c_k = \arg\max_k V^{t}$ and $V^{t} = \beta_1 V_t + \beta_2 V_{t-1} + \beta_3 V_{t-2}$, where $V_t$ is the vector of votes received by all classes for the $t$th frame. For our experiments the following weights were used: $\beta_1 = 0.4$, $\beta_2 = 0.35$, and $\beta_3 = 0.25$.
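The training and voting procedure described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names, the equal class priors in the pairwise boundaries, the small ridge term added to the pooled covariance, and the synthetic six-dimensional data are assumptions.

```python
import numpy as np
from itertools import combinations

def fit_lda(X, y, n_classes=4, ridge=1e-3):
    """Estimate the class means and the inverse of the pooled covariance matrix."""
    d = X.shape[1]
    means = np.stack([X[y == k].mean(axis=0) for k in range(n_classes)])
    pooled = np.zeros((d, d))
    for k in range(n_classes):
        centered = X[y == k] - means[k]
        pooled += centered.T @ centered
    pooled /= len(X) - n_classes
    # Assumption: a small ridge keeps Sigma invertible when d is large.
    pooled += ridge * np.eye(d)
    return means, np.linalg.inv(pooled)

def one_vs_one_votes(x, means, sigma_inv):
    """Count pairwise wins; four classes give six pairs, so the votes sum to 6."""
    votes = np.zeros(len(means))
    for i, j in combinations(range(len(means)), 2):
        # Linear boundary beta^T x + beta_0 = 0 between classes i and j
        # (assumption: equal class priors).
        beta = sigma_inv @ (means[i] - means[j])
        beta0 = -0.5 * (means[i] + means[j]) @ beta
        votes[i if x @ beta + beta0 > 0 else j] += 1
    return votes

def classify(votes_t, votes_t1, votes_t2, weights=(0.4, 0.35, 0.25)):
    """Pick the top-voted class; on a tie, blend in the two preceding frames."""
    top = np.flatnonzero(votes_t == votes_t.max())
    if len(top) == 1:
        return int(top[0])
    blended = weights[0] * votes_t + weights[1] * votes_t1 + weights[2] * votes_t2
    return int(np.argmax(blended))

# Synthetic data: four well-separated classes in R^6 (illustration only).
rng = np.random.default_rng(1)
true_means = 4.0 * np.eye(4, 6)
y = np.repeat(np.arange(4), 50)
X = true_means[y] + rng.normal(size=(200, 6))

means, sigma_inv = fit_lda(X, y)
votes = one_vs_one_votes(true_means[2], means, sigma_inv)

# Tie in the current frame (classes 0 and 1 both have two votes),
# resolved by the weighted votes of the two preceding frames.
tie_result = classify(np.array([2.0, 2.0, 1.0, 1.0]),
                      np.array([3.0, 2.0, 1.0, 0.0]),
                      np.array([3.0, 1.0, 2.0, 0.0]))
```

In the tie example the blended vote vector is [2.6, 1.75, 1.25, 0.4], so class index 0 is selected.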
Posture Estimation
After classification of posture, we estimate the body pose and compute the body joint angles. OpenNI gives the spatial location of body parts and joints. It typically gives significant error and loses track when the body posture is not upright for a period of time or if the arms are held too close to the body. The posture of the hands, for example, cannot be estimated accurately in such situations. It is also not possible to measure body joint angles and spatial locations for other poses (such as bending, squatting and crawling) with high accuracy. Therefore, we address this problem using LDA. We perform posture classification as discussed in the previous section and then estimate the body pose only when LDA predicts a standing posture. One frame from a sequence of sample motions of a person placing a box on a floor in a laboratory environment is shown in Figure 3a. The measured joint angles of the motion, computed over a sequence of 60 frames, are illustrated in Figure 3b.
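(a) RGB image and depth image with posture skeleton. (b) Body joint angles [deg] versus frame number, measured over 60 frames, for the left and right arm, torso, left and right thigh, and left and right knee.
Figure 3. Posture estimation with OpenNI.
Rules for Determining Ergonomics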
Overhead Work
Overhead work involves those activities in which a worker is required to reach up and raise one or both arms above shoulder level. Albers and Estill (2007) explain, “the risk of developing shoulder pain or a shoulder muscle or joint disorder is increased by the combination of frequently working with raised shoulders (60 degrees or more), using repetitive arm or shoulder movements while in this position, and applying force while in this position”. However, the definition is not precise as to what constitutes the arm angle. The ambiguity in the definition is illustrated in Figure 4c, where three possibilities related to overhead work arise. The arm angle is denoted by $\alpha$. We refer to the arm angle as illustrated in case 3 of Figure 4c. The reason behind this choice is that case 3 can account for situations where the work space is limited and the hand is close to the position of the worker’s head. Measuring the angles to the elbows (cases 1 and 2) may or may not indicate that a worker is in a strenuous posture.
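A rule of this kind can be sketched in a few lines. The sketch below is hedged: Figure 4c is not reproduced here, so the exact reference direction for $\alpha$ is an assumption. It takes the angle between the shoulder-to-hand vector and the downward vertical, assumes the Y axis points up, and uses the 60 degree threshold from the quoted definition. The function names and joint coordinates are hypothetical.

```python
import numpy as np

def arm_angle_deg(shoulder, hand, down=(0.0, -1.0, 0.0)):
    """Angle in degrees between the shoulder-to-hand vector and the downward vertical."""
    arm = np.asarray(hand, dtype=float) - np.asarray(shoulder, dtype=float)
    down = np.asarray(down, dtype=float)
    cos_a = arm @ down / (np.linalg.norm(arm) * np.linalg.norm(down))
    # Clip guards against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

def is_overhead_strain(shoulder, hand, limit_deg=60.0):
    """Flag the arm as strained when the angle exceeds the limit."""
    return arm_angle_deg(shoulder, hand) > limit_deg

# Hypothetical joint positions in meters (x, y, z), with y pointing up.
shoulder = (0.0, 1.4, 0.0)
hand_hanging = (0.0, 0.8, 0.0)      # arm straight down
hand_forward = (0.6, 1.4, 0.0)      # arm horizontal
hand_overhead = (0.1, 2.0, 0.0)     # hand above the head

hanging_flag = is_overhead_strain(shoulder, hand_hanging)
forward_angle = arm_angle_deg(shoulder, hand_forward)
overhead_flag = is_overhead_strain(shoulder, hand_overhead)
```

Measuring to the hand rather than the elbow follows the case 3 choice in the text; a different reference direction, such as the torso axis, would change the numerical angles but not the structure of the rule.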
Squat or Sit to Lift Load
Material handling is a very common task on construction sites and it increases the level of stress on the back. A study on material handling (NIOSH 2007) suggests ways to mitigate the risk of injuries when lifting loads by reducing the level of stress exerted on the back. It recommends avoiding bending at the waist and instead suggests a squatting or kneeling position, as is suitable, keeping the load close to the body and lifting it by pushing up with the legs. However, such methods are applicable only for small loads; for bigger loads, team lifting or mechanized equipment such as a forklift is recommended. Furthermore, if a worker is accessing a container located on a raised platform, NIOSH (2007) recommends raising the worker so that the container is accessed 30” – 40” from the surface the worker is standing upon.
Figure 4. (a) and (b) Body posture representation. (c) Ambiguity in arm definition of NIOSH. (d) Torso, knee and thigh angle definition.
Stoop or Bend to Lift Load
Tasks such as rebar-tying and fastening involve frequent stooping postures over an extended period of time and can result in back spasms and sprain. However, such postures can be avoided by improving the work site environment, for example, using rebar tying technology or extensions in all suitable cases. Even tasks such as lifting a load from the ground surface can be accomplished by stooping or bending. Such postures, however, are not recommended for ergonomic reasons (NIOSH 2007).
Crawl
While working close to the floor, workers are required to kneel on the ground. In the absence of padding for the knees, the hard floor surface exerts direct pressure on the knees, which can damage the knees and muscles and put stress on other body parts, such as the back. To mitigate the stress, Albers and Estill (2007) recommend using kneeling creepers.
EXPERIMENTAL RESULTS AND DISCUSSION
Body Posture Classification
To assess the performance of the classifier we used a set of 7,757 range images (Data set-1) collected from three subjects. An additional set of 14,469 range images (Data set-2) was collected from five new subjects to assess the performance of LDA when presented with data from unfamiliar subjects. We used 33% of the frames in Data set-1 to train the classifier, and then asked the classifier to predict the poses for the range images in Data set-2. The total error rate once the classifier was applied to Data set-2 was computed to be 0.19. The primary error contribution arose from the misclassification of squat, bend and crawl, as there was class overlap between these classes, as shown in Table 1. After evaluating the performance of LDA on both data sets, we retrained the classifier on all available data points in Data set-1 and 2 and saved the parameters of the LDA classifier for use in further ergonomic analysis, and for specific construction-like environments.
Table 1. Confusion matrix showing the error rate of the classifier when presented with five subjects (Data set-2) not belonging to the training group (Data set-1).
Output class [Frames]    Target class [Frames]
                         Stand    Squat    Bend     Crawl
Stand                    0.92     0.002    0.030    0
Squat                    0.02     0.719    0.038    0.079
Bend                     0.043    0.030    0.808    0.257
Crawl                    0.02     0.249    0.123    0.666
Error Rate               0.08     0.28     0.19     0.33
Experiments in Construction-like Environment for Evaluating Rule-based Ergonomics
We anticipate that the approach developed in this research will assist the safety inspector and workers by providing more objective and quantitative feedback than is typically available. The case study presented below demonstrates the feasibility of applying such a system.
Overhead Work
For overhead work, we present results for a construction worker shown in Figure 5. The developed rule-based algorithm shows strained arms in red (Figure 5) once the ergonomic rule-set has decided that the arms are in an unsafe position (greater than 60°); otherwise they are represented in green.
Figure 5. Results of a person performing mock-up overhead work in a safe training environment. Estimated body postures (bottom) for the images (top) with strained arms shown in red (unsafe) and otherwise in green (safe).
Discussion
The demonstrated methodology and results present the possibility of developing an online worker ergonomics monitoring system. The computational time for processing is 0.15 – 0.17 seconds (Intel(R) Xeon(R) 3.07 GHz, 14.0 GB), which includes pose classification and estimation, ergonomic analysis and visualization of results. In posture classification, we discussed an overlap between target classes that contributed to a higher misclassification rate. To reduce the misclassification rate, a classifier with a non-linear decision boundary, such as an SVM with a polynomial or radial kernel, can be used. OpenNI, as used in this study, has the limitation that it can only compute body poses accurately for standing postures. Furthermore, the error rate for the spatial position of the body joints was not measured. An external marker system could be used to capture the ground truth and compare it with the results obtained in this study. Such an external marker system, however, was not available to this study, nor was it part of the scope of work.
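The per-class error rates in the last row of Table 1 can be re-derived from the diagonal of the confusion matrix, and the off-diagonal entries show which classes overlap. The short sketch below uses the values as read from Table 1 (rows are output classes, columns are target classes); the variable names are illustrative, and the overall error rate of 0.19 is not recomputed because it depends on per-class frame counts that are not reported.

```python
import numpy as np

classes = ["Stand", "Squat", "Bend", "Crawl"]

# Rows: output (predicted) class, columns: target (true) class.
confusion = np.array([
    [0.92,  0.002, 0.030, 0.0],
    [0.02,  0.719, 0.038, 0.079],
    [0.043, 0.030, 0.808, 0.257],
    [0.02,  0.249, 0.123, 0.666],
])

# Error rate per target class = 1 - fraction of correctly classified frames.
per_class_error = 1.0 - np.diag(confusion)

# For each target class, the output class it is most often mistaken for.
off_diag = confusion.copy()
np.fill_diagonal(off_diag, 0.0)
most_confused = {
    classes[j]: classes[int(np.argmax(off_diag[:, j]))]
    for j in range(len(classes))
}

for name, err in zip(classes, per_class_error):
    print(f"{name}: error {err:.2f}, most often confused with {most_confused[name]}")
```

The largest off-diagonal entries link squat with crawl and crawl with bend, which matches the class overlap described in the text.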
CONCLUSION
A framework for automatic worker ergonomic analysis was presented. The presented system utilizes a 3D range imaging system to monitor worker motion. Ergonomics analysis was performed by integrating posture estimation and classification information with a predefined set of rules. A rule base for ergonomics was developed that can be used in future research studies that apply computer vision techniques in indoor environments. However, this research did not take into account time, repetition, forceful exertion and anthropometric factors, which are critical for ergonomics analysis and remain within the scope of future work. The presented system works in real time and can be used as an interactive training tool for workers. Such technology-assisted worker assessment platforms can also prove useful in engaging the workforce more pro-actively and objectively.
REFERENCES
Alwasel A., Elrayes K., Rahman E.M.A. and Haas C. (2011), “Sensing construction work-related musculoskeletal disorders (WMSDs)”, 26th International Symposium on Automation and Robotics in Construction (ISARC).
Albers J.T., Estill C.F. (2007), “Simple solutions - ergonomics for construction workers”, U.S. Department of Health and Human Services.
Bureau of Labor Statistics, BLS (2011), “Bureau of labor statistics, illness, injuries and fatalities”, <http://www.bls.gov/iif/oshcdnew.htm> (June 6, 2011).
Cheng P., Li W., Ogunbona P. (2009), “Kernel PCA of HOG features for posture detection”, 24th International Conference Image and Vision Computing New Zealand (IVCNZ).
CPWR, The Center for Construction Research and Training (2007), “The Construction Chart Book - The U.S. Construction Industry and its Workers”.
Dalal N., Triggs B. (2005), “Histograms of oriented gradients for human detection”, Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR).
Gatti U.C., Migliaccio G.C., Schneider S. (2011), “Wearable physiological status monitors for measuring and evaluating worker’s physical strain: Preliminary validation, computing in civil engineering”, Proceedings of the 2011 ASCE International Workshop on Computing in Civil Engineering.
Gonsalves R. and Teizer J. (2009), “Human motion analysis using 3D range imaging technology”, 26th International Symposium on Automation and Robotics in Construction (ISARC).
Han S.U., Lee S.H, Mora F.P. (2011), “Application of dimension reduction techniques for motion recognition: construction worker behavior monitoring, computing in civil engineering”, Proceedings of the 2011 ASCE International Workshop on Computing in Civil Engineering.
Lee A. and Lee C. (2008), “The role of manifold learning in human motion analysis”, Human Motion Understanding, Modeling, Capture and Animation, pages 1–29.
NIOSH (2007), “Ergonomic guidelines for material handling”, <http://www.cdc.gov/niosh/docs/2007-131/pdfs/2007-131.pdf> DHS (NIOSH), Publication No. 2007-131(June 10, 2011).
OpenNI (2011), <http://www.openni.org/> (June 12, 2011).
Shahbudin S., Hussain A., Hussain H., Samad S. A., Tahir N. M. (2010), “Analysis of PCA based feature vectors for SVM posture classification”, 6th International Colloquium on Signal Processing & Its Applications (CSPA).
Teizer J., Kim C., Haas C.T., Liapi K.A., and Caldas C.H. (2005), “A framework for real time 3d modeling of infrastructure”, Transportation Research Record: Journal of the Transportation Research Board, No. 1913, pp. 177-186, Washington D.C.
Teizer J., Caldas C.H., and Haas C.T. (2007), “Real-time three-dimensional occupancy grid modeling for the detection and tracking of construction resources”, ASCE Journal of Construction Engineering and Management, 133(11), pp. 880-888, Reston, Virginia.
Teizer J. (2008), “3D range image sensing for active safety in construction”, Journal of Information Technology in Construction, Sensors in Construction and Infrastructure Management, 13 (Special Issue), pp. 103-117.
Teizer J. and Kahlmann T. (2008), “Range imaging as an emerging optical 3d measurement technology”, Transportation Research Record: Journal of the Transportation Research Board, No. 2040, pp. 19-29, Washington D.C.
Wientapper F., Ahrens K., Wuest H., and Bockholt U. (2009), “Linear-projection based classification of human postures in time-of-flight data”, Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics San Antonio.