Knowledge-driven methods - Sensor-based activity recognition

Critical Literature Survey

2.3 Sensor-based activity recognition

2.3.2 Knowledge-driven methods

The knowledge-driven approach leverages domain knowledge for activity modelling and inference. The underlying observation is that human daily activities are rich in common-sense knowledge that interlinks activities and their surroundings. The knowledge-driven approach is more compelling than the data-driven approach for several reasons. First, activities in realistic scenarios (e.g., cooking, preparing a drink) may share many of the same physical actions, and the order in which subjects perform them may not be consistent. These characteristics make it challenging to recognise such activities solely from physical signals from sensors such as accelerometers and gyroscopes. However, those activities can be differentiated by taking the diverse surrounding context into account.

Since most activities take place at different times and locations and involve different object interactions, this additional context can be used to better characterise them. For example, the activity “brush teeth” usually happens in the bathroom in the morning and at night, and the objects involved are usually toothpaste and a toothbrush. Moreover, the knowledge used for activity recognition can be explicitly specified or mined from external information sources, thus avoiding the manual labelling, feature extraction and learning required by data-driven approaches to activity recognition. Previous work also demonstrates that, with carefully defined domain knowledge, it is possible to achieve recognition performance equivalent to that of the HMM [125]. In what follows, we discuss two knowledge-driven approaches: one mines the knowledge from external information sources (e.g. the web), and the other explicitly specifies the knowledge (e.g. as an ontology). Their advantages and disadvantages are also discussed.

Mining-based approach

The basic idea behind this approach is that activity models can be created by mining knowledge from existing external sources such as websites, which provide instructions for performing the activities and list the objects that are required. Using information retrieval methods, activities are modelled by establishing the relationship between the activities and the required objects in a probabilistic manner. Then, given the objects used at a specific time point, which are usually captured through sensors, the probability of each activity class is calculated, and the class with the maximum probability is chosen to label the current activity.
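The labelling step described above can be sketched as follows. This is a minimal illustration, not the method of any cited work: the probability table, the smoothing floor and the function name are hypothetical, and a uniform prior over activities with conditionally independent objects is assumed.

```python
import math

# Hypothetical P(object | activity) values standing in for probabilities
# mined from an external source such as the web.
OBJECT_GIVEN_ACTIVITY = {
    "brush teeth": {"toothbrush": 0.6, "toothpaste": 0.5, "cup": 0.1},
    "make tea": {"kettle": 0.7, "cup": 0.6, "teabag": 0.5},
}
SMOOTHING = 1e-3  # assumed floor for objects never mined for an activity


def label_activity(observed_objects, model=OBJECT_GIVEN_ACTIVITY):
    """Return (most probable activity, normalised scores) for the observed objects.

    Assumes a uniform prior over activities and independent object usage.
    """
    log_scores = {
        activity: sum(math.log(probs.get(obj, SMOOTHING)) for obj in observed_objects)
        for activity, probs in model.items()
    }
    # Normalise in a numerically stable way so the scores sum to one.
    max_log = max(log_scores.values())
    unnormalised = {a: math.exp(s - max_log) for a, s in log_scores.items()}
    total = sum(unnormalised.values())
    posteriors = {a: v / total for a, v in unnormalised.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


best, scores = label_activity(["toothbrush", "cup"])
print(best, scores)
```

The smoothing floor keeps a single unmined object from forcing an activity's score to zero. This matters because, as discussed next, the objects involved in an activity cannot be mined exhaustively.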

Perkowitz et al. [116] propose a method to create an activity model by mining the web. By tagging each word in a sentence with its part of speech, they are able to extract the objects used in the activities. They then automatically calculate the probabilities of object usage in the activities using the Google conditional probability APIs. As the objects involved in an activity cannot be exhaustively mined from the web, Tapia et al. [139] propose a way to deal with unseen objects. They create a hierarchical ontology of synonymous words for functionally similar objects. By performing shrinkage over this ontology of objects, they estimate the probabilities of the unseen objects. Instead of relying on object probabilities for activity recognition, Gu et al. [37] mine activity-descriptive texts from websites. They then use natural language processing to extract the objects used in the activities from the texts, and information retrieval methods to calculate the weight of each object with respect to the different activities. Finally, they construct contrast patterns for each activity based on the object terms and the relevance weights mined from the web, so as to maximise the discriminative power of the fingerprint of each activity class.
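To give an intuition for handling unseen objects through an object hierarchy, the sketch below estimates the probability of an unmined object from its ancestors in a small, hypothetical ontology. It is a simplified stand-in for shrinkage: the fixed geometric weights (`decay`), the hierarchy and the probability values are all assumptions for illustration, and the cited work may estimate its mixture weights differently.

```python
# Hypothetical object hierarchy: child -> parent (None marks the root).
PARENT = {
    "mug": "drinking vessel",
    "cup": "drinking vessel",
    "drinking vessel": "container",
    "container": None,
}

# Hypothetical mined estimates of P(node | activity = "make tea").
# "mug" was never seen in the mined text, so it has no entry.
MINED = {"cup": 0.6, "drinking vessel": 0.4, "container": 0.2}


def shrinkage_estimate(obj, mined=MINED, parent=PARENT, decay=0.5):
    """Blend the estimates found on the path from obj to the root.

    Nodes closer to obj receive larger weights; the fixed geometric
    decay is an assumption made for this sketch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    weight = 1.0
    node = obj
    while node is not None:
        if node in mined:
            weighted_sum += weight * mined[node]
            weight_total += weight
        weight *= decay
        node = parent.get(node)
    return weighted_sum / weight_total if weight_total else 0.0


print(shrinkage_estimate("mug"))  # borrows from "drinking vessel" and "container"
print(shrinkage_estimate("cup"))  # own estimate, smoothed towards its ancestors
```

An object that appears nowhere in the hierarchy receives an estimate of zero here. A practical system would need a fallback for that case.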

Ontology-based approach

Rather than mining object-usage information from external resources, the ontology-based approach explicitly specifies the activity models using a description-ontology-based method.

In [20], the authors model sensors and activities as separate classes in an ontology, with each class described by a number of properties. For example, each sensor class has a state property, indicating the state of the object to which the sensor is attached. The activation of a sensor can be interpreted as an object interaction. Each activity class has properties such as hasLocation and useArtifact, denoting the location in which the activity is performed and the objects involved in the activity. The aggregation of sensor activations at a specific time point can be used to establish a situation, which is then reasoned against the already established activity models. In this light, the sensor ontologies model the situation at a specific time point, interrelating the context information and the sensor observations. The activity ontologies interlink the activities and contextual information through object properties, and activity recognition is equivalent to reasoning on a dynamically constructed situation against the activity ontologies.

The activity ontologies are organised in a hierarchical structure, in which a subclass inherits all the properties of its superclass. As more sensor observations are aggregated at runtime, the recognised activity can be narrowed down within the class hierarchy. The ontology-based approach is thus able to recognise both coarse-grained and fine-grained activities.
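The narrowing behaviour can be illustrated with a small sketch in plain Python. It does not use a description logic reasoner, and the activity classes, their properties and the `recognise` helper are hypothetical. Only the property names hasLocation and useArtifact follow the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActivityClass:
    name: str
    parent: Optional["ActivityClass"] = None
    has_location: Optional[str] = None      # mirrors the hasLocation property
    use_artifact: frozenset = frozenset()   # mirrors the useArtifact property

    def location(self):
        # A subclass inherits the location of its superclass unless it sets one.
        if self.has_location is not None:
            return self.has_location
        return self.parent.location() if self.parent else None

    def artifacts(self):
        # A subclass inherits all artifacts of its superclass and adds its own.
        inherited = self.parent.artifacts() if self.parent else frozenset()
        return inherited | self.use_artifact

    def depth(self):
        return 0 if self.parent is None else 1 + self.parent.depth()


# Hypothetical hierarchy: MakeDrink > MakeHotDrink > {MakeTea, MakeCoffee}.
make_drink = ActivityClass("MakeDrink", has_location="kitchen",
                           use_artifact=frozenset({"cup"}))
make_hot_drink = ActivityClass("MakeHotDrink", parent=make_drink,
                               use_artifact=frozenset({"kettle"}))
make_tea = ActivityClass("MakeTea", parent=make_hot_drink,
                         use_artifact=frozenset({"teabag"}))
make_coffee = ActivityClass("MakeCoffee", parent=make_hot_drink,
                            use_artifact=frozenset({"coffee jar"}))
ONTOLOGY = [make_drink, make_hot_drink, make_tea, make_coffee]


def recognise(location, activated_objects, ontology=ONTOLOGY):
    """Return the most specific class whose properties the situation satisfies."""
    observed = set(activated_objects)
    satisfied = [a for a in ontology
                 if a.location() == location and a.artifacts() <= observed]
    if not satisfied:
        return None
    return max(satisfied, key=lambda a: a.depth()).name


print(recognise("kitchen", {"cup"}))                      # coarse-grained
print(recognise("kitchen", {"cup", "kettle"}))            # narrower
print(recognise("kitchen", {"cup", "kettle", "teabag"}))  # fine-grained
```

Because a class is matched only when every one of its properties is satisfied, a single missed sensor activation prevents the fine-grained class from being recognised in this sketch.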

Riboni et al. [124] go further and propose combining ontological reasoning with statistical reasoning. The environment is modelled using an ontology; then, using domain knowledge, they perform ontological reasoning to infer the possible activities in each location. At runtime, they use statistical reasoning to obtain, for each data sample, a posterior for each activity class. The candidate activities are then filtered using the knowledge previously inferred from the ontology.
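A minimal sketch of this filtering step is given below. The location-to-activity table stands in for the result of ontological reasoning, and the posterior for the output of a statistical classifier. Both are hypothetical values, and renormalising the surviving probabilities is an assumption of this sketch rather than a detail taken from [124].

```python
# Hypothetical result of ontological reasoning: activities feasible per location.
FEASIBLE_BY_LOCATION = {
    "kitchen": {"cooking", "make tea"},
    "bathroom": {"brush teeth"},
}


def filter_posterior(posterior, location, feasible=FEASIBLE_BY_LOCATION):
    """Drop activities that are infeasible in `location` and renormalise.

    `posterior` maps activity name -> probability from a statistical model.
    Returns an empty dict if no feasible activity has non-zero probability.
    """
    allowed = feasible.get(location, set())
    kept = {a: p for a, p in posterior.items() if a in allowed}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {a: p / total for a, p in kept.items()}


# Hypothetical classifier output for one data sample.
sample_posterior = {"brush teeth": 0.5, "cooking": 0.3, "make tea": 0.2}
print(filter_posterior(sample_posterior, "kitchen"))
```

In this example the statistically most likely class, "brush teeth", is discarded because the ontology rules it out for the kitchen, and "cooking" becomes the top candidate.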

The obvious problem with the ontology-based approach is that temporal reasoning is not supported. Moreover, it is vulnerable to information uncertainty, because all the object properties must be satisfied in order to infer a specific activity. Helaoui et al. [45] propose a probabilistic ontological approach to recognise multilevel human activities. In particular, they leverage log-linear description logics (DLs) to integrate DLs with probabilistic log-linear models [108]. They add weighted axioms to the ontology as long as these are consistent with the axioms already in the ontology. There may be several candidate ontologies, each with a different, internally consistent set of axioms; only the one with the maximum a posteriori (MAP) probability is chosen for activity reasoning. In order to take temporal information into account, they strictly define the order of actions in each activity. This method is robust to data uncertainty in that the axioms are assigned different weights: the larger the weight, the more strongly the rule holds.
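The MAP selection over weighted axioms can be illustrated with a brute-force sketch. It assumes the usual log-linear form, in which a consistent set of axioms has probability proportional to the exponential of the sum of its weights, and an inconsistent set has probability zero. The axiom names, weights and the pairwise conflict table are hypothetical, and the consistency check is a toy replacement for a DL reasoner.

```python
from itertools import combinations

# Hypothetical weighted axioms (name -> weight).
WEIGHTED_AXIOMS = {"a1": 1.2, "a2": 0.8, "a3": 2.0, "a4": -0.5}

# Hypothetical pairs of axioms that cannot hold together.
CONFLICTS = {frozenset({"a1", "a2"})}


def is_consistent(axioms, conflicts=CONFLICTS):
    """Toy consistency check: no conflicting pair is fully contained."""
    return not any(pair <= axioms for pair in conflicts)


def map_ontology(weighted_axioms=WEIGHTED_AXIOMS):
    """Return (axiom set, total weight) of the consistent set with maximal weight.

    Maximising the summed weight is equivalent to maximising
    exp(sum of weights), i.e. the unnormalised log-linear probability.
    """
    names = list(weighted_axioms)
    best_set, best_score = frozenset(), 0.0
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            candidate = frozenset(subset)
            if not is_consistent(candidate):
                continue
            score = sum(weighted_axioms[a] for a in subset)
            if score > best_score:
                best_set, best_score = candidate, score
    return best_set, best_score


print(map_ontology())
```

Exhaustive enumeration is exponential in the number of axioms and is used here only to make the objective explicit. It is not how a practical reasoner would search.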

Discussion

Both the mining-based and the ontology-based approaches have their own disadvantages. For example, activity models built from object probabilities mined from external information sources are general models that do not achieve high recognition performance, as people perform activities quite differently. For the ontology-based method, even though data uncertainty and temporal reasoning are addressed by the probabilistic ontological framework, defining the ontology and specifying the weights of the axioms are non-trivial tasks. Although it is possible to learn the weights from data, doing so undermines the knowledge-driven approach's advantage of not requiring manual labelling. Moreover, the explicitly specified order of actions in the ontology disregards the fact that activities can be performed differently by different users; even the same person may perform the same activity in various ways at different times. The goal of our research is not to propose solutions to the aforementioned problems. Instead, we will leverage domain knowledge as a starting point, and incrementally refine the activity recognition model with opportunistic sensors and increasingly available data.

In document A framework for mobile activity recognition (Page 50-53)