Operant Conditioning: the Response-Consequences model

2. Literature Review

2.2 Psychological Perspectives of Consumer Behaviour

2.2.2 Behavioural Theories

2.2.2.2 Operant Conditioning: the Response-Consequences model

Following on from the theory of associative learning, a new stream of behavioural learning emerged which suggested that learning occurred as a result of the consequences following behaviour, rather than merely the stimulus that precedes it. Much of the operant conditioning theory is based around the animal experiments of Edward L. Thorndike and Burrhus Frederic Skinner.

Thorndike’s Puzzle Box & Law of Effect

Associative learning involves the presentation of a pair of stimuli to the subject, and the outcome is not contingent upon the response of the subject. Work by Edward L. Thorndike (1874-1949) took a different approach to understanding learning, one that centred on the actions of the subject. In Thorndike’s experiments on ‘Instrumental Learning’, the response of the subject is instrumental to the outcome (Thorndike 1911). His experiments examined hungry cats placed in puzzle-boxes with food left outside. The cat had to operate a latch to open the door, allowing it to escape and reach the food. The escape was contingent on the correct response by the cat (operating the latch). Upon escape, the cat gained access to a piece of fish, which it would have seen from inside the box. Initially, the cat would act frenetically, meowing and struggling, and have difficulty escaping. The first escape would occur seemingly by accident. Once the fish was consumed, the cat would be immediately placed back in the box and the process began anew. Thorndike recorded the length of time between the cat being placed in the box and its escape, as an index of learning. Eventually, after the cat had been repeatedly returned to the box after its treat, Thorndike observed that the escape time reduced significantly, to around five seconds. Thorndike suggested that the shorter the escape time, the stronger the learning.

Thorndike suggested that learning was not an indication of insight in animals, rather, a matter of trial and error, with a gradual reduction in the number of errors made over time. The association itself between the stimulus situation (puzzle box) and the response was gradually being learned.

This stimulus-response (S-R) association is strengthened gradually over time, as indicated by the gradual reduction in escape time. Thorndike argued that if the cat had been anticipating the consequences of its actions, this would have been indicated by one sudden, significant reduction in escape time, rather than the gradual reduction that was observed. Instead, he asserted that the consequences of a response either weaken or strengthen the stimulus-response association and, as such, the inclination to perform the response again: the association is strengthened (‘stamped in’) when the response leads to reward, and weakened (‘stamped out’) when the consequences are unpleasant, such as the removal of reward or presentation of punishment (Thorndike 1911; Gleitman 1986). This process of strengthening or weakening behavioural tendencies was known as the ‘Law of Effect’. As the experiments progressed, correct responses were ‘stamped in’, and incorrect responses ‘stamped out’. Later, the process of fortifying a stimulus-response association would be called ‘reinforcement’, with the means of this reinforcement called the ‘reinforcer’. This notion of reinforcement became key in future developments of Behaviourism, especially the works of Burrhus Frederic Skinner and the emergence of Operant Conditioning.
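The gradual ‘stamping in’ and ‘stamping out’ described by the Law of Effect can be illustrated with a toy calculation. The sketch below is an illustration only: Thorndike did not express the law as a formula, and the linear update rule, the function names and the `rate` and `start` parameters are all assumptions introduced here.

```python
def update_strength(strength: float, satisfying: bool, rate: float = 0.2) -> float:
    """Return the S-R association strength after one consequence.

    Assumed rule (not Thorndike's own): a satisfying consequence moves the
    strength a fraction `rate` of the way towards 1 ('stamped in'); an
    unpleasant consequence moves it the same fraction towards 0 ('stamped out').
    """
    target = 1.0 if satisfying else 0.0
    return strength + rate * (target - strength)


def strengths_over_trials(outcomes, start: float = 0.1, rate: float = 0.2):
    """Track the association strength across a sequence of consequences.

    `outcomes` is a sequence of booleans, True for a satisfying consequence.
    The returned list holds the starting strength followed by one value per trial.
    """
    history = [start]
    for satisfying in outcomes:
        history.append(update_strength(history[-1], satisfying, rate))
    return history


# Five rewarded escapes in a row: the strength rises a little on every trial,
# mirroring the gradual (rather than sudden) change Thorndike reported.
rewarded_run = strengths_over_trials([True] * 5)
```

Under these assumptions the strength never jumps to its maximum in a single trial, which is the pattern Thorndike took as evidence of trial-and-error learning rather than insight.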

Skinner’s ‘Operant Behaviour’

Compared with associative learning, where behaviour is triggered by antecedents, and instrumental learning, where learning is asserted to relate to the strengthening of stimulus-response associations, operant learning is concerned with how behaviour is affected by its consequences. Skinner suggested that in associative learning the behaviour of the subject is elicited by the external conditioning stimuli, while in instrumental conditioning behaviour is emitted from within the organism. He called these instrumental responses ‘operants’.

Thorndike’s ‘Law of Effect’ suggested that the fortification of stimulus-response associations results in the strengthening of learning (Thorndike 1911), and Skinner agreed that it was the consequences of a behaviour that ‘shaped and maintained’ it (Skinner 1963; Skinner 1971).

Through reinforcement and punishment consequences, subjects learn to behave in different ways (Skinner 1958; Azrin and Holz 1966; Axelrod and Apsche 1983; Patterson, Kosson et al. 1987; Meoli, Feinberg et al. 1991). Skinner’s seminal work with rats proposed that, through learning, behaviour becomes a function of (contingent upon) its consequences. If we behave in a certain way, and are positively rewarded for that behaviour, then we are more likely to repeat that behaviour in the future. On the other hand, if we are punished for our behaviour, we are less likely to repeat that behaviour in the future. Skinner’s rats were placed in boxes, described as operant chambers, which simulated a closed environment. At the beginning of an experiment, a hungry rat would be placed in a box for several days, with food delivered occasionally by an automatic food dispenser to a tray inside the box. The rat soon came to associate the sound of the dispenser with the food and, upon hearing it, would approach the food tray. A lever inside the box, which had previously been locked in its lower position, was then raised and programmed to dispense food whenever the rat touched it. The tests showed that once the rat had discovered that touching the lever would produce food, it would start to press the lever repeatedly, indicating that the reinforcement was modifying behaviour (lever pressing).

Schedules of Reinforcement

Skinner’s studies indicated that the schedule with which the reinforcement takes place is vital to the strength of the behaviour modification. The frequency and regularity (or predictability) of the presentation of reinforcement affect the pattern and frequency of behavioural responses (Ferster and Skinner 1957). When predictable or frequent reinforcements are withdrawn, the frequency of responses begins to decline fairly quickly, while responses take longer to fade if the reinforcements withdrawn were previously unpredictable or infrequent. Table 2.3 below outlines the different types of reinforcement schedule identified by Ferster and Skinner (1957), together with the pattern and rate of response associated with each.

Table 2.3: Schedules of Reinforcement

| Reinforcement Schedule | Description | Pattern & rate of response |
| --- | --- | --- |
| Continuous Reinforcement (CRF) | Each response is reinforced. | Slow and steady response rate. |
| Fixed Interval (FI) | Reinforcement after X seconds, provided response occurs during that time. | Response rate increases as next reinforcement becomes available. |
| Variable Interval (VI) | Reinforcement every X seconds, but at a different interval in subsequent trials. | Stable response rates over time, with moderate response growth. |
| Fixed Ratio (FR) | Reinforcement given after fixed number of responses, e.g. five responses for one reinforcement. | Higher rate of responding as next reinforcement approaches. |
| Variable Ratio (VR) | Reinforcement after X responses, but after a different amount of responses in subsequent trials. | Very steady and very high response rate. |
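The rules that distinguish these schedules can be sketched as a short simulation. This is a minimal illustration, not Ferster and Skinner’s procedure: the function names, the example response times and the ranges used for the variable schedules are assumptions, and the interval rule is read here as reinforcing the first response made once the interval has elapsed.

```python
import random
from typing import Callable, List, Sequence


def ratio_schedule(n_responses: int, next_requirement: Callable[[], int]) -> List[bool]:
    """Reinforce a response once the required number of responses is reached."""
    outcomes = []
    needed, count = next_requirement(), 0
    for _ in range(n_responses):
        count += 1
        reinforced = count >= needed
        outcomes.append(reinforced)
        if reinforced:
            # Start counting again towards the next requirement.
            needed, count = next_requirement(), 0
    return outcomes


def interval_schedule(response_times: Sequence[float],
                      next_interval: Callable[[], float]) -> List[bool]:
    """Reinforce the first response made after the current interval has elapsed."""
    outcomes = []
    wait, last_reinforced = next_interval(), 0.0
    for t in response_times:
        reinforced = (t - last_reinforced) >= wait
        outcomes.append(reinforced)
        if reinforced:
            # The next interval is timed from this reinforcement.
            wait, last_reinforced = next_interval(), t
    return outcomes


rng = random.Random(0)            # seeded so the variable schedules are repeatable
times = [2, 9, 11, 15, 22]        # hypothetical response times in seconds

crf = ratio_schedule(6, lambda: 1)                    # CRF: every response
fr5 = ratio_schedule(10, lambda: 5)                   # FR: every fifth response
vr5 = ratio_schedule(10, lambda: rng.randint(1, 9))   # VR: varies around five
fi10 = interval_schedule(times, lambda: 10)           # FI: 10-second interval
vi10 = interval_schedule(times, lambda: rng.uniform(5, 15))  # VI: varies around 10
```

Each list records, response by response, whether reinforcement was delivered. Only the rule that sets the next requirement differs between the fixed and variable versions, which is the sense in which the variable schedules are less predictable to the subject.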

Reinforcement and Punishment

Reinforcement and punishment are the outcomes contingent on the behaviour that occurs. When a behaviour increases as a consequence of the outcome, it is said to have been subject to reinforcement. When the consequence of the behaviour leads to a decrease in that behaviour, it has been subject to punishment. Marketers have many reasons to change the behaviour of their customers and potential customers and, as with many other disciplines, marketing makes use of the different forms of reinforcement and punishment to maximise desirable behaviour. Figure 2.9 below outlines the four types of reinforcement and punishment, and how the presentation or removal of stimuli affects future behaviour.

Reinforcement always strengthens behaviour, for example by increasing its intensity or frequency. A behaviour will be repeated with greater frequency in the future if its outcome is favourable. The outcome may be favourable through the presentation of a pleasant or ‘appetitive’ stimulus (positive reinforcement) or the removal of a disagreeable or ‘aversive’ stimulus (negative reinforcement). Many aspects of consumer choice, including store choice, can be explained in this way (Foxall 1990).

In Skinner’s experiments, the food (an appetitive stimulus) was the positive reinforcement of the lever pressing behaviour, leading to an increase in that behaviour. The example of Bounty™ chocolate bars provided above includes positive reinforcement, and allowed Mars Inc. to draw upon the uniqueness of the product to reinforce the feelings that the adverts aimed to elicit from customers. Customers eating the product are reinforced by the appetitive stimulus of tasting the coconut centre. They are reminded by the wrapper that they are eating ‘A Taste of Paradise’, and again reminded of the exotic beach on which the Bounty™ advert was set, and the feelings this evoked. This feedback loop is a crucial component in the development of customer loyalty to the brand, increasing its purchasing frequency in the future. In the choice of shopping centre, if a customer has previously had an agreeable shopping trip to a particular shopping centre, he/she may be more inclined to patronise that shopping centre or a similar one in the future.

Another form of positive reinforcement identified in behavioural research is known as the Premack principle (Premack 1959). Researchers discovered that presenting the opportunity to engage in a preferred activity (high-probability behaviour) as a reward for engaging in a less-preferred activity (low-probability behaviour) can increase the low-probability behaviour (Premack 1959; Mitchell and Stoffelmayr 1973). An example of this is when parents allow their child to go out and play once they have completed their homework. Alternatively, a shopper may treat themselves to a visit to a favoured shop (high-probability behaviour) after completing the weekly family grocery shop (low-probability behaviour).

Figure 2.9: Positive and Negative Reinforcement and Punishment

| Outcome | Consequence of the Behaviour: Stimulus is presented | Consequence of the Behaviour: Stimulus is removed |
| --- | --- | --- |
| Behaviour is strengthened (increases in the future) | Positive reinforcement | Negative reinforcement |
| Behaviour is weakened (decreases in the future) | Positive punishment | Negative punishment |

Source: Miltenberger 2004 p120
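The two-by-two logic of Figure 2.9 can also be written as a simple lookup. The sketch below is illustrative only; the function name and its two boolean parameters are assumptions introduced here, not terminology from Miltenberger.

```python
def classify_consequence(stimulus_presented: bool, behaviour_increases: bool) -> str:
    """Name the type of consequence from the two dimensions in Figure 2.9.

    'Positive' or 'negative' depends only on whether a stimulus is presented
    or removed; 'reinforcement' or 'punishment' depends only on whether the
    behaviour increases or decreases in the future.
    """
    valence = "positive" if stimulus_presented else "negative"
    kind = "reinforcement" if behaviour_increases else "punishment"
    return f"{valence} {kind}"


# Food presented after a lever press, and lever pressing increases.
food_for_pressing = classify_consequence(stimulus_presented=True, behaviour_increases=True)

# A loud noise removed after a lever press, and lever pressing increases.
noise_stops = classify_consequence(stimulus_presented=False, behaviour_increases=True)
```

Separating the two dimensions in this way makes explicit that ‘negative’ refers to the removal of a stimulus, not to a decrease in behaviour.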

In some of Skinner’s experiments, loud irritating noises (aversive stimuli) were used as negative reinforcers. They would be played into the box until the rat pressed the lever, at which point the noise would cease. This resulted in an increase in lever pressing behaviour. Certain products can be sold by utilising negative reinforcement in advertisements. Government initiatives have used negative reinforcement successfully to persuade people to purchase and maintain smoke alarms, by highlighting the potentially fatal consequences (presenting aversive stimuli) of failing to do so.

Commercially, similar techniques have been used to sell products such as bathroom air fresheners, where the aversive stimulus is the embarrassment caused by an unpleasant toilet; washing-up liquid, by showing the downside of buying inferior products from competitors; and washing detergent, by comparing the brand on offer with reportedly inferior products that are unable to remove stubborn stains. For store choice, an example of negative reinforcement would be when a customer avoids a particular store in which they have had past experience of abusive sales people, and chooses to shop elsewhere (Foxall 1990). This is similarly applicable to the choice of a particular type of shopping centre. For example, a shopper may choose to avoid city centre shopping areas because of a previous experience of great difficulty parking at one such shopping centre. Because of this experience in one city centre shopping centre, a consumer may expect similar consequences at all such shopping centres (generalisation), and avoid them in favour of alternatives.

A consumer who has an unsatisfactory experience at a particular outlet, whatever the reason, may come to view alternative outlets more favourably, and be inclined to patronise those outlets more in the future. For example, a consumer may visit a shopping centre with the desire to purchase everything they need at that shopping centre. It has been established that consumer shopping trips tend to be for several items and are often multipurpose, and early models of shopping centre choice were criticised for their inability to account for multipurpose trips (Carter 1993). Should the consumer fail to purchase all of the items on their shopping list at the shopping centre visited, they may be more inclined to go to larger, more diverse shopping centres in the future.

While reinforcement, both positive and negative, leads to the strengthening of behaviour, punishment outcomes lead to the suppression of behaviour. Punishment is not to be confused with negative reinforcement, which is associated with increases in response rates (Catania and Harnad 1988). A particular consequence is only deemed punishing if it results in a decrease in the related behaviour in the future (Miltenberger 2004). Certain consequences will act as punishers for some people, resulting in a decrease in behaviour, but not necessarily for others. It is also important to consider whether consequences are truly reducing a behaviour, or merely ensuring an immediate escape from the consequence. A truly punishing consequence will ensure avoidance behaviour in the future, reducing behavioural occurrence over time, rather than merely terminating the behaviour in the short term.

Some consequences may be mislabelled as punishing if they cause a behaviour to cease immediately, but do not decrease the behaviour over time (Miltenberger 2004). Smacking a child is an example of this. While a parent smacking a child may force the child to cease its unwanted activity, a child craving attention may in fact increase the unwanted behaviour in the future to elicit attention from its parents. Behaviour tends to be reduced in the future if it results in an unfavourable outcome for the subject. The outcome may be unfavourable through the presentation of an aversive stimulus, or the removal of an appetitive stimulus. An example of positive punishment can be drawn from some of Skinner’s experiments, where lever pressing by rats would administer an electric shock to the cage floor (an aversive stimulus). As a result of this aversive stimulus, future lever pressing by the rats would reduce. Solomon and Wynne’s experiments also indicated that dogs would cease certain behaviour to avoid aversive consequences such as an electric shock (Solomon and Wynne 1953).

In practice, marketing activity tends to try to increase behaviour rather than decrease it, but there are certain instances where marketers use positive punishment to reduce an undesirable behaviour (Nord and Peter 1980). The government has also made powerful use of positive punishment in drink driving prevention initiatives, showing the graphic and harrowing consequences of drink driving to persuade people to avoid such behaviour in the future (Macpherson and Lewis 1998). Similarly, cigarette companies now have to put warning labels and graphic, stomach-turning pictures of smoking-related diseases on cigarette packets as evidence of the harmful consequences of smoking, to reduce take-up of the habit and in the hope that it might help people stop (Watson 2001).

Should a consumer face an unsatisfactory trip to a shopping centre, they may reduce their patronage frequency to such a place in the future. For example, a shopping centre which has inadequate parking provision may lead to a frustrating experience for a consumer, and reduce their desire to visit that shopping centre, and other similar shopping centres (through generalisation), again. Shopping centres have even employed techniques specifically to reduce certain behaviour and remove undesirable elements from the shopping centre. Many shopping centres all over the world have faced problems caused by teenagers who use them as places to meet and ‘hang out’. This segment of shopping centre patrons is deemed undesirable in that they make use of the amenities, but have little disposable income and so make very few purchases in stores. They can also be intimidating to other paying customers, and shopping centre owners often feel they can be detrimental to the overall appeal of a centre. Recently, several shopping centres have tried to move the teenagers on by piping ‘big band’ and classical music into the favoured spots where teenagers congregate, frequently by mall entrances (anon 2005; anon 2005). These types of music serve as aversive stimuli to teenagers, and as a result they are likely to leave the vicinity, and less likely to return. The ‘mosquito alarm’ is another tool used by retailers to punish undesirable individuals and discourage loitering nearby (BBC_News 2008), as it is pitched at a frequency that only teenagers can hear, and has no impact on most people over the age of 25.

Negative punishment involves the removal of an appetitive stimulus to decrease behaviour. For example, if a rat consistently receives food at regular intervals when not pressing a lever, but food is withheld when it does press the lever, the rat’s lever pressing behaviour will lessen over time. Negative punishment is popular in child rearing, as a means for reducing misbehaviour (Miltenberger 2004). The increasingly popular ‘timeout’ involves the removal of all attention and stimuli from the child, to reduce disruptive or naughty behaviour (Clark, Rowbury et al. 1973). Also, taking a child’s toys away when they misbehave sometimes stops future misbehaviour.

While some advertising aims to sell a product to a wide target market, many advertising campaigns target very narrow segments of the market (Dolich 1969; Park, Jaworski et al. 1986). Certain companies may desire a young, trendy audience for their product, but be fearful of a wider appeal, as this may damage the brand image. Peer pressure may mean that visiting a particular shopping centre will result in loss of approval from peers, and so such places are avoided in the future. Negative punishment is also important in suppressing certain antisocial behaviour in shopping centres. Customers deciding whether to shoplift will be discouraged from doing so for fear of being removed from the shopping centre, which may have further consequences in the future.

Generalisation and Discrimination

When the reinforcement of one response also strengthens other similar responses, then generalisation has occurred (Foxall 1990). When the response to one stimulus transfers to another stimulus, the response has been generalised (Foxall 1990). However, when reinforcement of a response does not strengthen other responses it has been differentially reinforced, i.e. discrimination has taken place (Foxall 1990).

In their study of Little Albert, Watson and Rayner investigated whether Little Albert’s conditioned fear of white rats could be transferred onto other animals or objects (Watson and Rayner 1920). When presented with a rabbit and a dog, it was noted that Little Albert’s fear response seemed to have transferred to both, but that he did not react as violently to the dog as he did to the rabbit. Fear was also transferred to a fur coat (seal), cotton wool, a Santa Claus mask, and Watson’s hair, indicating generalisation of the fear response, although the fear response did not transfer to the hair of other observers. Also, playing blocks had been used to calm Little Albert between experiments, indicating discrimination. Furthermore, when Watson moved the experiments to a very different environment, he observed that emotional transfers between situations can occur.

To illustrate generalisation in a consumption setting, an earlier example shall be revisited. Where a child has learned that throwing a tantrum (R) on a shopping trip (S^D) can result in the purchase of sweets (S^R) he may also realise that the tantrums can be used to persuade a parent to purchase other desired items (S^R) such as toys.

In marketing, generalisation is utilised in branding strategies by companies (Engel, Blackwell et al. 1993). Should one product elicit a favourable response, then this may be generalised to other products under the same brand name. As such, there has been a growing trend towards the launch of new products under an existing parent brand. However, this is not without risk, as unfavourable responses to a new product may generalise to reflect unfavourably on the parent

In document Shopping Centre Choice: A Behavioural Perspective (Page 73-82)