
Perception Methods for Systems with Limited Resources

In this section, a selection of perception methods is discussed. The focus is on simple techniques that can be used on devices with very limited processing power and memory. For some methods, the approach is to approximate, as closely as the limited system allows, calculations that are usually carried out on powerful systems with no practical limits on processing power or memory.

3.7.1 Basic Statistical Functions

The average of the data samples provided by a single sensor over a given time window can be calculated at minimal cost (see footnote 3). Calculating the average is meaningful for data from nearly any sensor, e.g. light, acceleration, temperature, and pressure sensors.
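A minimal sketch of this averaging, assuming the window length is chosen as a power of two so that the division becomes a shift (the optimisation described in footnote 3). The function name `window_average` and the sample values are illustrative only; Python is used for readability, and only integer additions and a shift are involved.

```python
def window_average(samples, shift):
    """Average of exactly 2**shift non-negative integer samples, using a shift."""
    if len(samples) != 1 << shift:
        raise ValueError("window length must be 2**shift")
    total = 0
    for s in samples:
        total += s            # running sum; the samples need not be stored
    return total >> shift     # equals total // 2**shift


# 256 byte-sized readings: the sum fits in 16 bits, so the average
# is simply the high byte of the sum (shift by 8).
readings = [200] * 128 + [100] * 128
avg = window_average(readings, 8)
```

With these made-up readings the sum is 38400, and its high byte, 150, is the average.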

The median is a valuable measure for eliminating extreme values and false readings in a signal that is supposed to be stable or slowly changing. However, if the sample size is very small, the median may prove of little value. To avoid the need for a division in the calculation, an odd number of samples can be taken, so that the median is simply the middle value of the sorted samples rather than the mean of the two middle values.
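The effect of an odd window can be illustrated with a short sketch. The function name `median_odd` and the readings are hypothetical; the sort is done on a copy of a small window, which is assumed to fit in memory.

```python
def median_odd(samples):
    """Median of an odd number of samples: the middle element after sorting."""
    if len(samples) % 2 == 0:
        raise ValueError("use an odd number of samples to avoid a division")
    ordered = sorted(samples)
    return ordered[len(ordered) // 2]   # index only, no averaging of two values


# One false reading (250) in an otherwise stable signal
stable_value = median_odd([21, 22, 250, 21, 23])
```

Here the false reading is ignored and the result is 22, whereas the average of the same five values would be pulled far upwards.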

Calculating the standard deviation can give an indication of how stable a signal is, or how much change there is in the signal. The measure is less useful if it is known that the signal sometimes carries wrong values, because even a single wrong value can distort the result.
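The sensitivity to a single wrong value can be seen in a small sketch. It assumes the population standard deviation, computed in one pass from a running sum and a running sum of squares; the function name and the data are illustrative.

```python
import math


def standard_deviation(samples):
    """Population standard deviation from a running sum and sum of squares."""
    count = 0
    total = 0
    total_sq = 0
    for s in samples:          # one pass; the samples need not be stored
        count += 1
        total += s
        total_sq += s * s
    variance = (total_sq - total * total / count) / count
    return math.sqrt(max(variance, 0.0))   # guard against rounding below zero


stable = standard_deviation([2, 4, 4, 4, 5, 5, 7, 9])
spiked = standard_deviation([2, 4, 4, 4, 5, 5, 7, 90])   # one false reading
```

The first window yields 2.0; replacing a single sample by a false reading of 90 raises the result more than tenfold.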

3 By selecting the number of samples over which the average is calculated as 2^n, the division operation can be replaced by a shift. For example, when calculating the average of 256 byte values and storing the sum in a 2-byte variable, the most significant byte (MSB) of the sum can be taken directly as the result instead of dividing.

The range of the collected samples can easily be calculated by finding the minimum and maximum. This can be done on the fly, without the need to save all samples, by updating the minimum and maximum after each reading. The range is, however, very vulnerable to single false readings. These errors are avoided by the use of percentiles; for example, the interquartile range is more robust. Sorting the data and calculating the distance between the values at the first and third quartile is more reliable than using the range, because a few faulty values do not wreck the calculated feature, but it requires all samples to be stored. A compromise between both measures that can be calculated without storing all samples is to hold not just the minimum and maximum, but also the runner-up for minimum and maximum (or even the n smallest and n largest values).
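The runner-up compromise for the range can be sketched as follows. The class name `TrimmedRange` and the readings are hypothetical; the sketch keeps only the n smallest and n largest values seen so far and takes the distance between the n-th smallest and the n-th largest as the robust range.

```python
class TrimmedRange:
    """Tracks the n smallest and n largest samples without storing the rest."""

    def __init__(self, n=2):
        self.n = n
        self.smallest = []   # ascending, at most n values
        self.largest = []    # descending, at most n values

    def update(self, sample):
        self.smallest.append(sample)
        self.smallest.sort()
        del self.smallest[self.n:]
        self.largest.append(sample)
        self.largest.sort(reverse=True)
        del self.largest[self.n:]

    def plain_range(self):
        return self.largest[0] - self.smallest[0]

    def robust_range(self):
        # distance between the n-th largest and the n-th smallest value
        return self.largest[-1] - self.smallest[-1]


tracker = TrimmedRange(n=2)
for reading in [20, 22, 21, 255, 23, 19, 0, 21]:   # 255 and 0 are false readings
    tracker.update(reading)
```

For this sequence the plain range is 255, while the runner-up range is 4, which matches the spread of the plausible readings.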

An indication of the amount of change in a signal can be gained by calculating the sum of the absolute differences between the average and each data sample in a window. Summing up the differences between consecutive samples can be done on the fly and gives information on how rapidly a signal changes.
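Both change measures can be sketched briefly. The names are illustrative, the average is computed with integer division, and the on-the-fly variant assumes that absolute differences between consecutive samples are meant.

```python
def deviation_sum(window):
    """Sum of absolute differences between the window average and each sample."""
    mean = sum(window) // len(window)          # integer average
    return sum(abs(s - mean) for s in window)


class ChangeAccumulator:
    """Sums absolute differences between consecutive samples on the fly."""

    def __init__(self):
        self.previous = None
        self.total = 0

    def update(self, sample):
        if self.previous is not None:
            self.total += abs(sample - self.previous)
        self.previous = sample                 # only one sample is remembered
        return self.total


window = [10, 12, 10, 14, 10, 16]
spread = deviation_sum(window)
acc = ChangeAccumulator()
for s in window:
    acc.update(s)
```

The first measure needs the whole window because the average must be known first; the second needs only the previous sample.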

3.7.2 Time Domain Analysis

To avoid the transformation into the frequency domain, feature extraction procedures in the time domain can be used. This approach has been used in particular on data from accelerometers, light sensors, and audio.

Finding the average value is computationally cheap and can be done on the fly with little need for memory. For audio the average itself has no meaning, but it is useful for calculating further features. Knowing the average makes it possible to count how often the signal crosses the average in a certain time and to calculate the average distance between crossings. It is also possible to calculate the distribution of the distances between crossings. This is an indicator of the base frequency and of the stability of the base frequency in the signal. Counting the direction changes in the signal is also possible on the fly. The ratio between the average crossings and the direction changes gives an indication of the type of signal and allows discrimination between contexts. For example, in the audio signal it is possible to discriminate between music, speech, and noise, and in the acceleration signal it is possible to find characteristic values for certain patterns of movement. More details and an example are given in Appendix A.1: Time Domain Analysis.
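The two counts can be sketched as follows. This is an illustration only, not the procedure of Appendix A.1: the function name is hypothetical, the average is taken from the window itself, and samples lying exactly on the average or repeating the previous value are skipped.

```python
def crossing_features(window):
    """Returns (average crossings, direction changes) for one window."""
    mean = sum(window) / len(window)
    crossings = 0
    direction_changes = 0
    prev_side = 0     # -1 below the average, +1 above, 0 not yet known
    prev_slope = 0    # -1 falling, +1 rising, 0 not yet known
    for i, s in enumerate(window):
        side = (s > mean) - (s < mean)
        if side != 0:
            if prev_side != 0 and side != prev_side:
                crossings += 1
            prev_side = side
        if i > 0:
            diff = s - window[i - 1]
            slope = (diff > 0) - (diff < 0)
            if slope != 0:
                if prev_slope != 0 and slope != prev_slope:
                    direction_changes += 1
                prev_slope = slope
    return crossings, direction_changes


fast = crossing_features([0, 2, 0, 2, 0, 2, 0, 2])
slow = crossing_features([0, 1, 2, 3, 2, 1, 0, -1, -2, -3, -2, -1])
```

The rapidly alternating signal gives 7 crossings and 6 direction changes, the slow oscillation 1 crossing and 2 direction changes, so the two counts and their ratio separate the signals.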

For fast-changing signals such as audio, the peaks or the energy (root mean square) of the signal in small time windows (e.g. one value every 100 ms) provide information about the sampled data. Certain audio events (speaking a word, ringing of the phone, applause, music) result in a characteristic series of values. See Appendix A.1: Time Domain Analysis.
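A minimal sketch of the windowed energy, assuming the samples are already centred on zero and that an incomplete trailing window is discarded. The function name and the window size of four samples are illustrative; a real window would cover about 100 ms of audio.

```python
import math


def rms_per_window(samples, window_size):
    """Root mean square of each complete, non-overlapping window."""
    values = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        mean_square = sum(s * s for s in window) / window_size
        values.append(math.sqrt(mean_square))
    return values


# loud, silent, louder; the trailing single sample is dropped
energy = rms_per_window([3, -3, 3, -3, 0, 0, 0, 0, 4, -4, 4, -4, 1], 4)
```

The resulting series (3.0, 0.0, 4.0) is the kind of value sequence that can then be compared against the characteristic pattern of an audio event.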

3.7.3 Derivatives

For many signals it is of interest to find information about the change rather than about the absolute values. Calculating or estimating the first derivative of the sensor data indicates the direction of change in the signal. This information is especially helpful for finding transitions in the observed conditions, e.g. going into the dark or speeding up. In the simplest case this can be estimated by checking whether the samples are continuously falling or rising. Another indication is the sum of the differences between consecutive samples.
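Both estimates can be sketched in a few lines. The function name, the labels, and the light readings are hypothetical. Note that the signed differences of consecutive samples sum to the last sample minus the first, so this figure can also be obtained from just those two values.

```python
def trend(window):
    """Returns (direction, summed change) for one window of samples."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    if diffs and all(d > 0 for d in diffs):
        direction = "rising"
    elif diffs and all(d < 0 for d in diffs):
        direction = "falling"
    else:
        direction = "mixed"
    return direction, sum(diffs)


going_dark = trend([200, 150, 90, 40, 10])   # e.g. light level on entering a dark room
unsteady = trend([10, 12, 11, 15])
```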

Analysing higher derivatives provides information on how the changes themselves change over time. A simple way of doing this is to hold a history of features (e.g. the changes in light) and to analyse these features in the same way.

3.7.4 Neural Networks

To provide abstract or symbolic information, it is necessary to process the calculated features and cues further. Neural networks can be set up so that they take the calculated cues and features as input and provide the context or the context class as output. Many neural networks are computationally demanding. The following approaches, however, can be implemented on very restricted hardware platforms.

Logical Neural Networks as described in [Aleksander,95] offer a computationally cheap method to learn and recognise patterns. In the learning phase the applied input patterns are transformed into binary vectors, which are then subdivided into shorter parts. For each class of input pattern a logical storage unit is used. The short patterns are then used as memory addresses in the assigned storage unit (e.g. internal or external RAM). For each pattern seen, the value stored at the corresponding address is set to 1. In the recall phase an incoming pattern is also transformed into a binary vector and subdivided into parts. The output is then the class of the storage unit that has the most sub-patterns in common with the incoming vector. For an example see [Schmidt,96,p24ff]. Implementing learning as well as recall is feasible on very simple hardware. Depending on the number of contexts to be recognised and the size of the input vector, additional storage is required.
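The learning and recall phases can be sketched as follows. This is an illustration under assumptions, not the implementation of [Aleksander,95] or [Schmidt,96]: the class and label names are hypothetical, a Python set stands in for the RAM cells that hold a 1, and each sub-pattern position is given its own address space.

```python
class LogicalNetwork:
    """One storage unit per class, addressed by short binary sub-patterns."""

    def __init__(self, tuple_size):
        self.tuple_size = tuple_size
        self.storage = {}   # class label -> set of (position, address) cells holding 1

    def _addresses(self, bits):
        cells = []
        for start in range(0, len(bits), self.tuple_size):
            address = 0
            for b in bits[start:start + self.tuple_size]:
                address = (address << 1) | b      # sub-pattern read as a binary number
            cells.append((start // self.tuple_size, address))
        return cells

    def learn(self, label, bits):
        self.storage.setdefault(label, set()).update(self._addresses(bits))

    def recall(self, bits):
        cells = self._addresses(bits)
        scores = {label: sum(1 for c in cells if c in stored)
                  for label, stored in self.storage.items()}
        return max(scores, key=scores.get)        # class sharing most sub-patterns


net = LogicalNetwork(tuple_size=2)
net.learn("dark", [0, 0, 0, 0, 0, 1, 0, 0])
net.learn("bright", [1, 1, 1, 1, 1, 0, 1, 1])
guess_dark = net.recall([0, 0, 0, 0, 0, 0, 0, 0])
guess_bright = net.recall([1, 1, 1, 1, 1, 1, 1, 0])
```

Neither test vector was seen during learning; each is assigned to the class with which it shares the most sub-patterns.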

It is also possible to implement Backpropagation Neural Networks on restricted hardware. In principle, small backpropagation networks can be implemented directly on these devices; however, due to storage restrictions and to increase training speed, a distributed implementation is often preferable. This is a two-step process. In the first phase, when the contexts are learned, the input vectors consisting of cues, sensor values, or a mixture of both are acquired on the microcontroller device and communicated to a backend system (e.g. over a serial line or RF to a PC). These data samples are annotated with the context in which they were recorded. In the backend, this data is used to train a backpropagation network of appropriate size and structure. In the second phase, once the training process is finished and all weights are calculated, the weights are coded into the recognition software that will run on the microcontroller device. By coding the weights directly into the recognition code, the size of the network is mainly limited by the program memory.
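The device-side half of this process, a forward pass with weights fixed in the code, can be sketched as follows. The weights below are hand-picked placeholders and not the result of any training; the network size, the sigmoid activation, and the function names are likewise assumptions made for illustration.

```python
import math

# Placeholder weights, written as constants as they would be in program memory.
HIDDEN_WEIGHTS = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_BIAS = [-10.0, 30.0]
OUTPUT_WEIGHTS = [[20.0, 20.0]]
OUTPUT_BIAS = [-30.0]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, bias):
    """One fully connected layer: weighted sum plus bias, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]


def recognise(cues):
    """Forward pass only; no learning takes place on the device."""
    hidden = layer(cues, HIDDEN_WEIGHTS, HIDDEN_BIAS)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIAS)


one_cue_active = recognise([1.0, 0.0])[0]
both_cues_active = recognise([1.0, 1.0])[0]
```

With these placeholder weights the single output is high when exactly one of the two cues is active and low otherwise. In the process described above, the constants would instead come from training on the backend system.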

Nearest neighbour matching is a very simple technique for pattern matching. On a microcontroller it can be implemented with varying complexity, depending on the requirements. In a simple implementation, a representative vector is calculated and stored for each class during the learning phase. When the system is in operational mode, an incoming vector is compared to the stored representative vectors and the distance to each is calculated. The class of the nearest neighbour is then selected as the class to which the input belongs. In the TEA project we used this approach in one of the experiments to recognise different motion patterns.
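The simple variant can be sketched as follows. It assumes that the representative vector of a class is the mean of its training vectors and that the squared Euclidean distance is used; the function names, class labels, and data are hypothetical and not taken from the TEA experiment.

```python
def learn_representatives(training):
    """training: dict mapping class label -> list of equally long feature vectors."""
    representatives = {}
    for label, vectors in training.items():
        count = len(vectors)
        representatives[label] = [sum(column) / count for column in zip(*vectors)]
    return representatives


def classify(representatives, vector):
    """Returns the label whose representative vector is nearest to the input."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))   # no square root needed
    return min(representatives,
               key=lambda label: squared_distance(representatives[label], vector))


reps = learn_representatives({
    "walking": [[10, 2], [12, 4]],
    "sitting": [[1, 1], [3, 1]],
})
first = classify(reps, [9, 3])
second = classify(reps, [2, 2])
```

Only one vector per class has to be stored, and comparing squared distances avoids the square root.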

If the clusters are not known in advance, using the Kohonen Self-Organizing Map (SOM) or one of its many variants is another option [Fausett,94,p169ff]. The clustering algorithm is able to learn new clusters at any time and can also handle noisy data. To make the output meaningful, the produced clusters must be labelled with context names [Laerhoven,99]. The topology-preserving property of the SOM makes it very probable that the nearest label will indeed be the right context. Generally, the longer the system is trained, the better the recognition becomes.
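A very small one-dimensional map illustrates the steps of training, labelling, and looking up the nearest label. This is a sketch under assumptions, not the method of [Laerhoven,99]: the function names, the fixed initial node weights, the linearly decaying learning rate, the neighbourhood radius, and the context names are all chosen for illustration.

```python
def best_matching_unit(nodes, sample):
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(nodes)), key=lambda i: squared_distance(nodes[i], sample))


def train_som(nodes, samples, epochs, rate=0.5, radius=1):
    for epoch in range(epochs):
        lr = rate * (1 - epoch / epochs)             # learning rate decays to zero
        for sample in samples:
            winner = best_matching_unit(nodes, sample)
            for i in range(len(nodes)):
                if abs(i - winner) <= radius:        # winner and its map neighbours
                    nodes[i] = [w + lr * (x - w) for w, x in zip(nodes[i], sample)]
    return nodes


def label_nodes(nodes, labelled_samples):
    return {best_matching_unit(nodes, s): name for s, name in labelled_samples}


def classify(nodes, labels, sample):
    winner = best_matching_unit(nodes, sample)
    nearest = min(labels, key=lambda i: abs(i - winner))   # nearest label on the map
    return labels[nearest]


nodes = [[0.0, 0.0], [0.3, 0.3], [0.6, 0.6], [1.0, 1.0]]
data = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.9], [0.85, 0.95]]
train_som(nodes, data, epochs=1)     # a single pass keeps the example easy to trace
labels = label_nodes(nodes, [([0.1, 0.1], "indoors"), ([0.9, 0.9], "outdoors")])
near_first = classify(nodes, labels, [0.2, 0.2])
near_second = classify(nodes, labels, [0.8, 0.8])
```

In this example the two test vectors win on unlabelled nodes, and the label of the nearest labelled node on the map is returned, which is where the topology-preserving property is used.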

3.7.5 Rule Based Systems

The straightforward approach is to integrate rules while programming. This is usually done without much thought when the domain is limited and easy to understand. It is particularly simple when sensors map well to contexts of interest and the number of contexts is small. An example, developed in the TEA project, is a device that can detect the contexts ‘in a pocket’, ‘on the table’, and ‘in the user’s hand’ by sensing acceleration, light, temperature, and touch. For a similar experiment see Appendix A.2: A Simplified Rule Set.

To build rule-based systems, a two-step process is used. In the first phase, data is collected and transferred to a backend system. The data is annotated with the contexts that the samples belong to and then analysed to find appropriate rules. In the second phase, the rules resulting from the data analysis are implemented on the microcontroller device. This can be done by hard-coding the rules into the source code of the software that will run on the microcontroller. Another option is to build an interpreter that runs on the microcontroller, takes rules as input, and interprets them.
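The interpreter option can be sketched as follows, with the rules held as data rather than as code. The rule format, the cue names, and all thresholds are hypothetical; they are not the rule set of Appendix A.2.

```python
import operator

COMPARISONS = {"<": operator.lt, ">": operator.gt}

# Each rule: (context name, list of (cue, comparison, threshold)); first match wins.
RULES = [
    ("in a pocket", [("light", "<", 10), ("temperature", ">", 28)]),
    ("in the user's hand", [("touch", ">", 0)]),
    ("on the table", [("acceleration", "<", 5)]),
]


def interpret(rules, cues, default="unknown"):
    """Returns the context of the first rule whose conditions all hold."""
    for context, conditions in rules:
        if all(COMPARISONS[op](cues[name], limit) for name, op, limit in conditions):
            return context
    return default


pocket = interpret(RULES, {"light": 3, "temperature": 30, "touch": 0, "acceleration": 20})
hand = interpret(RULES, {"light": 200, "temperature": 22, "touch": 1, "acceleration": 20})
table = interpret(RULES, {"light": 200, "temperature": 22, "touch": 0, "acceleration": 1})
```

Because the rules are plain data, a new rule set found on the backend could be loaded without changing the interpreter itself; hard-coding the same rules as nested conditions would save the table lookup at the cost of that flexibility.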

A further option for systems where rules are applicable, but the borders between states are less clear, is the use of Fuzzy Logic [Zadeh,73], [Traeger,94].
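How fuzzy membership softens a hard threshold can be sketched briefly. The linear ramp functions, the cue names, and the breakpoints are assumptions for illustration; the minimum is used as the fuzzy AND, which is the standard choice in Zadeh's formulation.

```python
def ramp_down(x, low, high):
    """Membership 1.0 at or below low, 0.0 at or above high, linear in between."""
    if x <= low:
        return 1.0
    if x >= high:
        return 0.0
    return (high - x) / (high - low)


def ramp_up(x, low, high):
    return 1.0 - ramp_down(x, low, high)


def in_pocket_degree(light, temperature):
    dark = ramp_down(light, 10, 50)          # hypothetical light breakpoints
    warm = ramp_up(temperature, 20, 30)      # hypothetical temperature breakpoints
    return min(dark, warm)                   # fuzzy AND


degree = in_pocket_degree(light=25, temperature=27)
```

Instead of a yes or no answer, the rule yields a degree of 0.625 for these readings, which can then be compared with the degrees of competing contexts.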