HAL Id: cea-02313771
https://hal-cea.archives-ouvertes.fr/cea-02313771
Submitted on 11 Oct 2019
HAL
is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire
HAL
, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Indoor Occupancy Estimation via Location-Aware
HMM: An IoT Approach
Masahiro Yoshida, Sofia Kleisarchaki, Levent Gurgen, Hiroaki Nishi
To cite this version:
Masahiro Yoshida, Sofia Kleisarchaki, Levent Gurgen, Hiroaki Nishi. Indoor Occupancy Estimation
via Location-Aware HMM: An IoT Approach. 2018 IEEE 19th International Symposium on ”A World
of Wireless, Mobile and Multimedia Networks” (WoWMoM), Jun 2018, Chania, Greece. pp.14-19,
�10.1109/WoWMoM.2018.8449765�. �cea-02313771�
Indoor Occupancy Estimation via Location-Aware
HMM: An IoT Approach
Masahiro Yoshida
Science & Technology,Keio University [email protected]
Sofia Kleisarchaki
Univ Grenoble Alpes,CEA, Leti Sofi[email protected]
Levent G¨urgen
Univ Grenoble Alpes,CEA, Leti [email protected]
Hiroaki Nishi
Science & Technology,Keio University [email protected]
Abstract—Indoor occupancy estimation is a critical analytical task for several applications (e.g., social isolation of elderlies). The proliferation of Internet of Things (IoT) devices enabled the occupancy estimation, as it provided access to a mass amount of data. Several works have been proposed exploiting the IoT Passive Inference (PIR) or environmental (e.g., CO2) features. These works however are traditionally selecting the feature space at the learning phase and passively using it over time. Hence, they ignore the dynamics of indoor occupancy, such as the location of the occupant or his motion patterns, leading to a decreasing accu-racy over time. In this paper, we study those dynamics and show that motion patterns, along with environmental features favor the occupancy estimation. We design aLocation-Aware Hidden Markov Model (HMM), which dynamically adapts the feature space based on the occupant’s location. Our experiments on real data show that Location-Aware HMM can reach up to 10% better accuracy than Conventional HMM.
Index Terms—Occupancy Estimation, Location-Aware Hidden Markov Model, Internet of Things (IoT)
I. INTRODUCTION
During the last decades, a considerable change in the age distribution of societies has been observed. The life expectancy, in conjunction with the low birth rate, is leading to a world of a higher number of older adults than children. This transformation of the population puts a heavy burden on health and social care systems, as the costs of caring for an elderly are increasing. Thus, the prolongation and support of independent living at home becomes a crucial task.
The rise of Internet of Things (IoT) gave a great boost in the field of assisted living with home automation. Several works attempted to equip houses with smart devices, in order to detect daily life activities [15, 19], such as visiting. Visiting can prevent social isolation and loneliness, which are major concerns for the physical and mental health of an elderly. Moreover, it can improve the scheduling of nurses and carers, to avoid collisions with other socializing activities.
Detecting visitors has been achieved through occupancy estimation. Several works have been proposed and can be
This work was partially supported by the EU-Activage project Grant No 732679 and MEXT/JSPS KAKENHI Grant (B) Number JP16H04455 and JP17H01739 and by Technology Foundation of the R&D project ”Design of Information and Communication Platform for Future Smart Community Services” by the Ministry of Internal Affairs and Communications of Japan.
Fig. 1:CO2Distribution for Various Occupants
categorized based on the features they exploit; Passive Infer-ence (PIR) and environmental. Those two categories are not exhaustive, but are the major, when dealing with non-intrusive sensors. Each of the two categories exhibit limitations.
For the first category, several works have been proposed [3, 11] that utilize PIR sensors to estimate occupancy. Although these works track the indoor motion of occupants, they lack the ability to model their motion patterns. In this paper, we make two interesting observations that favor the occupancy es-timation and successfully model two different motion patterns. First, we show that visitors tend to stay longer in communal than private areas of a house and second we notice that visitors tend to wander less around the rooms than permanent residents. To the best of our knowledge, this is the first work to attempt such a modeling.
For the second category, several works [2, 8, 10, 16] are focusing on environmental features, such asCO2human emis-sion,soundlevel ortemperature. However, these works ignore the fact that those features can be easily affected by factors other than humans, leading to several false negatives. Figure 1 highlights such a challenge. It shows three distributions of
CO2 concentration over two working days. Each distribution corresponds to the presence of zero, one and two occupants respectively. We notice that the distribution of two occupants is relatively close to the distribution of zero occupants, leading to a false negative in the detection of two occupants. This behavior arises by the fact that the apartment is occupied mainly at sleeping hours. Since the CO2 sensor is located
at the living room, while the residents are sleeping in the bedroom, its value drops to the same level of having no occupancy. This problem could be easily avoided, either by providingCO2 sensor in the bedroom, which is expensive, or by dynamically adapting the feature space based on occupants location. In this paper, we design and implement a location aware algorithm, based on Hidden Markov Model (HMM). Our algorithm, namely Location-Aware HMM, is able to dynamically select the feature space, based on the location of the occupants.
In a nutshell this paper makes the following contributions: 1) We suggest a methodology that is based on a combina-tion offilterandwrapperselection methods in order to explore the discriminating ability of IoT environmental and motion features.
2) We model two different motion patterns of occupants, by suggesting two new relevant features. We show that these features favor the occupancy estimation.
3) We propose a Location-Aware HMM algorithm, which is able to dynamically select the feature space based on occupant’s location.
4) We run comprehensive experiments over a real dataset of two different time periods; every day life and vaca-tions. We show that the proposed Location-Aware HMM achieves up to 10% better accuracy than
Conventional HMM.
This paper is organized as follows. Section II discusses the related work. Section III presents the algorithmic solution for dynamically selecting the feature space. Our experimental study is given in Section IV. Section V concludes this paper.
II. RELATEDWORK
The development of ubiquitous networks and IoT technolo-gies enabled the rise of several works on occupancy estimation. Some papers have been proposed for estimating the exact number of occupants, by applying a direct sensing and tracking of people via cameras [5], wearables [2, 9] or RFID based systems [14]. These works however appear to be intrusive, expensive and demanding, as they require a close and active collaboration with the participants. Those three factors make a real-life deployment unrealistic.
Another research track attempted to overcome the aforemen-tioned limitations, by focusing on non-intrusive IoT sensors. Several works [2, 8, 10, 16] have been proposed that explore only non-intrusive, envrironmental features, such asCO2,CO, P M2.5, temperature, sond, etc. Those works end up with a common conclusion that raw environmental features exhibit a high correlation with the occupants number and a good predictive ability. The authors in [4, 6] take a step further and discuss how the environmental features are changing over time. They propose a new set of features, such as first and second order difference, in order to capture their time dynamics. Although these works make a first step on understanding the temporal dependency in occupancy data, they are highly impacted by the number and the position of the sensors. By ignoring the position of the sensor with respect
to the occupant’s location, they produce noisy information for their models and a degraded accuracy.
Beyond the environmental features, some research [3, 11] has been done in estimating occupancy using Passive Inference (PIR) sensors. These works explore the relative position of an occupant and highlight the need to introduce that knowledge in their feature space. The authors in [3] propose a set of features, such as the number of sensor firings and the number of transitions between rooms, which attempt to capture the occupant’s indoor motion patterns. Although this work is able to detect the presence or absence of visitors, it is ignorant of their number. Moreover, it suffers from the common problem of static feature space; it can not dynamically adapt the feature space to the occupied room and thus it lacks the ability of capturing the location dynamics over time.
Last but not least, some research exists oriented in studying the accuracy of different models in occupancy estimation. The work in [4, 6] states thatHMMis a suitable model for estimating occupancy, as it outperforms the state-of-the-art methods of Neural Network (NN) and Support Vector Machine (SVM).
III. OCCUPANCYESTIMATIONALGORITHM
HMM is an ubiquitous tool for modelling sequential time series data. Several works in occupancy estimation [4, 6] base their model on Conventional HMM and demonstrate its superior performance. Conventional HMMcan be seen as an automaton, where the probability of moving into the next state depends exclusively on the previous state.
Figure 2 shows a graphical representation of the different probabilistic states of Conventional HMM, dedicated to the problem of occupancy estimation. In Conventional HMM, the input feature data form the observation states and each predicted class forms a hidden state. Observation and hidden states at time t are represented as xt ∈ X and zt ∈ Z, respectively. In our problem formulation, xt is the feature set extracted by the sensors and zt is the number of occupants at time t. Three parameters should be defined, to formulate HMM; the starting hidden state probability matrix stp = {sti : 1 ≤ i ≤ n}, where sti is the probability of moving from the starting state to state i, the hidden state transition probability matrixA={aij : 1≤i, j ≤n}, where aij is the probability of moving from state i to state j and the emission probability B = {bi(xt) : 1 ≤ i ≤ n}, where bi is the probability of an observation xt being generated from a statei. Emission probabilityB is given in the form of Probability Distribution Function (PDF). Several models exist to approximate PDF, but Gaussian Mixture Model (GMM) is well used to express multidimensional complex feature spaces.
GMMis a linear combination of normal Gaussian Model (GM), which is expressed as follows:
GM(x|,Σ) = 1
(2π)m2 2√Σexp{12(xμ)TΣ−1(xμ)} wherexis the feature vector, μandΣare the average and covariance matrices of x, respectively. Finally, parameter m defines the dimension ofx.GMMis formulated as follows:
Fig. 2: Hidden Markov Model Fig. 3:Location-AwareHidden Markov Model
GMM(xk|μk,Σk) = ΣKk=1kGM(xk|μk,ΣK),
0≤k≤1,ΣKk=1k = 1
wherek is the mixing coefficient andK is the number of
GM used. Note that optimizing the input parameters of GMM,
μk, Σk, k, with sufficient number of GM can appropriately approximate PDF. A common approach for optimizing the parameters is the Expectation-Maximization (EM) algorithm. Once the parameters are obtained andHMMis formulated, the testing data are classified using Viterbialgorithm [7].
Note that theConventional HMMuses the entire feature set x to calculate the emission probability matrix. Hence, it utilizes sensors data derived by all rooms. However, in real conditions, only few rooms are occupied simultaneously and the occupied rooms depend on the number of occupants (see Section IV-B). Thus, the calculation of emission probability over the entire feature set can be noisy.
Figure 3 illustrates our proposed Location-Aware
model, which is able to dynamically adapt the feature set based on the occupants locations, i.e., occupied rooms. We extend the observation states, where each one represents a room (e.g., living room) or a combination of them (e.g., living room and kitchen). Figure 3 omits some states for brevity. The emission probability matrix B is calculated over the new observation states and it takes into consideration only the feature space of the occupied room. This modification will allow the reduction of noise introduced by non-relevant features.
Algorithm 1 outlines our proposed Location-Aware HMM. It takes as input a dataset D, a time duration t and extracts a set Z, which contains the estimation of occupants number for each dataset of durationt,Dt. At first step, it splits
D intoDt. For eachDt, it dynamically selects the feature set by detecting the occupied room (line 2). A room is considered occupied, if its motion sensor si ∈ Smotion is firing for at least l seconds during t. Note that if the value of parameter
l is too small, the system becomes very sensitive in detecting occupied rooms, leading to false alarms. On the contrary, if
l approaches t the system may miss an occupied room. We
Algorithm 1 Location-Aware HMM Require: DatasetD, Time Duration t
Ensure: Occupancy estimation setZ for eachDt∈D
1: forDt∈D do
2: forsi∈Smotiondo
3: if f ire durationt(si)∈xt>13∗t then
4: B← ⎡ ⎣b. . . .00 . . . b0n bi1 . . . bin ⎤ ⎦, where bin=PDF(xt|zt), s.t. PDF(x) = 1 5: end if 6: end for 7: HMM←HMM (stp, A, B) 8: Z ←Z∪Viterbi(HMM) 9: end for
have empirically seen that for l = 13 ∗t (line 3), we get the best accuracy. Given the feature setxtof the occupied rooms, the algorithm calculates the new emission probability (line 4). Finally, theConventional HMMis performed over the new emission probability (line 7) and the number of occupants for
Dt is estimated (line 8).
IV. EXPERIMENTALSTUDY
A. Data collection
Our dataset has been collected from a real setting, deployed in an one-bedroom apartment of two residents in Grenoble, France. A set of non-intrusive sensors has been placed in each room, measuring the environmental conditions and motion activities. Figure 4 illustrates the apartment’s floor plan, along with the location of each sensor. A circle represents a Z-Wave multisensor 1 gathering motion, temperature, humidity and lighting information. Similarly, a square represents a Netatmo2weather station gatheringCO2,pressureandsound level information. The latter information is only available at 1 aeotec.com/z-wave-sensor 2 www.netatmo.com
/LYLQJURRP.LWFKHQ
5RRP
%DWKURRP :& %DOFRQ\
Fig. 4: Floor Plan
the living room. The accuracy of all multisensors varies as little as±0.2 %and the maximum delay in sensing is 10 secs. For the purpose of our experiments, data of two time periods were collected. The first period corresponds to a holiday week (1st-5th Nov), while the second corresponds to working days (16th-18th Nov). Those two periods were selected as they exhibit a high difference in their underlying distributions, due to different habits of daily and vacations life of the residents. To assess our analysis, we use a Leave-One-Out(LOO) cross validation, where data of one day are used as the validation set and the remaining days as the training set. The sampling resolution of data is set to t = 3 minutes, by averaging. We consider this resolution as a reasonable choice, since it provides enough data for analysis without estimation delay.
The collection and storage of data is performed using sensiNact 3 middleware.SensiNact is an open source eclipse platform, enabling the collection, processing and redistribution of IoT data. Moreover, a graphical annotation system is created, allowing the residents to annotate changes in the number of occupants. A ground truth of 56 annotations was created with the number of occupants varying between 0-3. B. Feature Space
Table I summarizes a set of environmental features (first row) that have been shown to be effective in [6, 12, 13]. This set refers toCO2concentration,soundlevel,temperature, etc, which are features provided directly by the sensors.
In this work, we are interested in going beyond the set of environmental features and study the dynamics of occupants location over time. For this purpose, we make two assumptions of daily life habits, regarding the indoor motion patterns of residents. First, we assume that the higher the number of occupants, the more likely a sensor in communal area (resp., private) to be triggered for longer (resp., shorter). Residents are more likely to stay longer (resp., shorter) at the living
TABLE I: FEATURE SPACE
Enviro- CO2,Sound,Temperature,Humidity,Pressure,Lighting
nmental
Motion fire ratiot(si) =
f ire durationt(si) t
co firet(si, sj) = f ire durationt(si)∩tf ire durationt(sj)
3 https://projects.eclipse.org/proposals/eclipse-sensinact
Fig. 5: Firing Duration of Motion Features
room (resp., toilet), when they have guests than when they are alone. Second, we assume that the higher the number of occupants, the less likely is to fire multiple motion sensors simultaneously. Visitors are more likely to stay in a particular room, than wandering around the rooms.
Table I presents a proposed set of motion features (second row), capturing the above assumptions. The first feature,
f ire ratiot(si), indicates the ratio of the firing duration (i.e.,
f ire durationt(si)) of a particular motion sensor si during time intervalt. The second feature,co f iret(si, sj), indicates the co-firing duration of a pair of sensors(si, sj)duringt.
Figure 5 shows the average firing duration (y-axis) for the various motion features (x-axis). We notice that the average firing duration of sensors in communal areas (i.e., LivingRoom, Kitchen) is longer when the house is occupied by guests, than by the permanent residents. Similarly, private areas (i.e., bedroom, toilet) have shorter occupancy duration while guests are in the house. Moreover, we notice that the
co f ire ratio duration of all pair of rooms is lower when there are guests in the house. There is only one exception, theco f ire(LivingRoom, Kitchen)duration, which remains high when there are guests. This behavior is explained by the fact that the living room and the kitchen belong to the same common area, where occupants stay during visiting time. The aforementioned observations confirm our initial assumptions about the motion patterns of occupants.
C. Feature selection
Given the above feature space, we are interested in studying the discriminating ability of each feature in estimating the number of occupants. We explore two characteristics of the features using filtering selection [17] approaches. First, the correlation of each feature with each other. If two features are highly correlated, then we consider that one does not add any additional value, as it is determined by the other. Second the correlation of each feature with the output class (i.e., number
(a) Feature Correlation Heatmap (b) Features Ranking (ANOVA)
Fig. 6: Feature Selection via Filtering Approaches
of occupants). If a feature has a low correlation to the output class, it appears a low discriminating ability in predicting it. To this end, features exhibiting a high correlation between each other or a low correlation with the predicted class are eliminated from our analysis.
Figure 6a illustrates a heatmap, using Pearson correlation co-efficient similarity between all features during the occu-pied hours. Warmer intensities of color stand for stronger correlation between features, while cooler colors represent weaker correlations. We notice three different areas of interest, which form squares. The first square, at the upper left corner, demonstrates the correlation between the firing duration of various motion sensors (i.e., f ire durationt(si)). We notice that every single sensor exhibits a slight correlation with each other (less than 0.5), caused by the fact that people wander around the house using its facilities. However, the sensors in the kitchen and in the living room exhibit a higher correlation (more than 0.6). Since those two rooms are in a single space, they tend to trigger their sensors simultaneously. For this reason, we exclude from our analysis their co-firing duration feature. In the second square, a high correlation between humidity data of different rooms is observed. This high correlation arises from the fact that humidity does not significantly vary between different rooms. The indication of humidity is mainly affected by the weather conditions (e.g., rainy or sunny day), than the indoor location of the sensor. On the contrary, humidity in the bathroom follows a different behavior. Since bathroom’s humidity is increasing during shower time, its correlation with the humidity of other rooms is dramatically decreased. Another point to be added is that pressure (last row of heatmap) seems to be highly correlated and to be able to determine humidity. To this end, we omit from our analysis all humidity features, except humidity in the
bathroom and maintain pressure as an equivalent alternative feature. Similar observations can be made, when exploring data of luminance (third square). Kitchen, living room and bedroom are highly correlated, since all three rooms have natural light during day. On the contrary, bathroom and toilet exhibit a lower correlation, since they are internal rooms of no natural light. For our analysis, we exclude the luminance of the living room and bedroom.
Figure 6b depicts a ranking of features, based on their ability to predict the output class, as given by the p-value of ANOVA [18] statistical test. We notice that environmental features, such as pressure,temperature,soundandCO2 con-centration are highly ranked. Moreover, motion features, such asf ire ratioandco f ireare good estimators of occupancy. This result reaffirms the primary good design of those features capturing the indoor motion patterns of occupants. Figure 6b presents all features used for the rest of our analysis. D. Occupancy Estimation
To validate the effectiveness ofLocation-Aware HMM, we compare it with the state-of-the-art Conventional HMM. To ensure a fair comparison, we first retrieve the best feature set for each algorithm, by applying a wrapper [1] selection method. The method starts with an empty feature set and then sequentially adds a new feature to the set. Each time a new feature is added, the accuracy of the method is calculated. The feature set exhibiting the best accuracy is selected for the comparison. Based on this feature set, we compare the two algorithms using a Leave-One-Out (LOO) cross validation.
Figure 7 depicts the accuracy (y-axis) of the two algo-rithms for sequentially increasing subsets of features (x -axis). We notice thatLocation-Aware HMMsystematically outperforms the Conventional HMM and reaches its best performance when using the top-23 features. Given this feature
Fig. 7: Accuracy of Various Feature Sets Fig. 8: Accuracy of Various Days
space, Location-Aware HMM needs 620 secs to perform the estimation on the entire dataset, instead of 586 secs for
Conventional HMM. The extra computational cost is due to the dynamic adaptation of the feature space.
Figure 8 illustrates the accuracy of each algorithm (y-axis), when estimating the number of occupants for a particular day (x-axis). For the purpose of the evaluation, we trained all algorithms over the data of all other dates (than training) and over their best feature set. We notice thatLocation-Aware HMM reaches an accuracy of 90% and it outperforms the
Conventional HMM, regardless of the date. Even when it appears a lower performance (for date 2), their difference is not statistically significant. This behavior indicates the superiority of our algorithm, which is thanks to its initial design and its ability to dynamically adapt the feature space. The adaptation of the feature space seems to reduce the noise and to favor the occupancy estimation accuracy. Another aspect to be adduced is that our algorithm performs well even for data of different periods, i.e. daily life and vacations, which exhibit highly different underlying distributions.
V. CONCLUSION
To deal with the dynamics of occupant’s location for the task of occupancy estimation, we proposed a modeling of their indoor motion patterns. We noticed that visitors are more likely to move in communal areas than private, while residents are more likely to wander around the house when they are alone. We showed that those two motion patterns, along with environmental features are good estimators of occupancy. We designed and implemented a Location-Aware HMM
algorithm, which enables the dynamic adaptation of the feature space based on the occupant’s location. We showed that this adaptation is positively affecting the probability of selecting the predicted class and reaches up to 10% better accuracy than
Conventional HMM.
REFERENCES
[1] N. E. Aboudi and L. Benhlima. Review on wrapper feature selection approaches. InICEMIS, pages 1–5, Sept 2016. [2] G. Ansanay-Alex. Estimating occupancy using indoor carbon
dioxide concentrations only in an office building: a method and qualitative assessment. 11th REHVA World Congress, 06 2013.
[3] J. Austin, N. Larimer, J. Kaye, M. Pavel, and T. Hayes. Svm to detect the presence of visitors in a smart home environment. 2012:5850–3, 08 2012.
[4] Z. Chen, Q. Zhu, M. K. Masood, and Y. C. Soh. Environmental sensors-based occupancy estimation in buildings via ihmm-mlr. IEEE on Industrial Informatics, 13(5):2184–2193, Oct 2017. [5] Y. Cho, S. O. Lim, and H. S. Yang. Collaborative occupancy
reasoning in visual sensor network for scalable smart video surveillance.IEEE Consumer Electronics’10, 56(3):1997–2003. [6] B. Dong, B. Andrews, K. Lam, M. Hynck, R. Zhang, Y. Chiou, and D. Benitez. An information technology enabled sustainabil-ity test-bed (itest) for occupancy detection through an environ-mental sensing network.Energy & Buildings, 42(7):1038–1046, 2010.
[7] G. D. Forney. The viterbi algorithm.IEEE, 61(3):268–278, ’73. [8] R. Frodl and T. Tille. A high-precision ndir hboxCO2 gas sensor for automotive applications.IEEE, 6(6):1697–1705, ’06. [9] M. Jin, H. Zou, K. Weekly, R. Jia, A. M. Bayen, and C. J. Spanos. Environmental sensing by wearable device for indoor activity and location estimation. InIECON’14, p. 5369-5375. [10] D. Li, T. F. Bissyand´e, J. Klein, and Y. L. Traon. Sensing
by proxy in buildings with agglomerative clustering of indoor temperature movements. SAC ’17, pages 477–484, NY, USA. [11] P. Liu, S. K. Nguang, and A. Partridge. Occupancy inference
using pyroelectric infrared sensors through hidden markov mod-els. IEEE, 16(4):1062–1068, Feb 2016.
[12] M. K. Masood, Y. C. Soh, and V. W.-C. Chang. Real-time oc-cupancy estimation using environmental parameters.IJCNN’15. [13] M. K. Masood, Y. C. Soh, and V. W. C. Chang. Real-time occupancy estimation using environmental parameters. In IJCNN’15, p. 1-8.
[14] B. B.-G. Nan Li, Gulben Calis. Measuring and monitoring occupancy with an rfid based system for demand-driven hvac operations. Automation in Construction, 24:89–99, 2012. [15] Q. Ni, A. Beln, G. Hern, and I. P. D. L. Cruz. Review : The
el-derlys independent living in smart homes: A characterization of activities and sensing infrastructure survey to facilitate services development, 2015.
[16] H. Rahman and H. Han. Bayesian estimation of occupancy distribution in a multi-room office building based on co2 concentrations. Building Simulation, Oct 2017.
[17] N. S´anchez-Maro˜no, A. Alonso-Betanzos, and M. Tombilla-Sanrom´an. Filter methods for feature selection – a comparative study. InIDEAL’07, p. 178-187.
[18] R. H. W. Penny. Chapter 13: Analysis of variance, 2006. [19] H. Zheng, H. Wang, and N. Black. Human activity detection
in smart home environment with self-adaptive neural networks. InICNSC’08, p. 1505-1510.