Many Attributes Chapter - Packaging Research

dents to act as measuring instruments. Despite those critics who say that consumers cannot act as measuring instruments, the world of everyday experience shows that the consumer can and does act as a measuring instrument in many everyday situations, whether shopping, driving, cooking, eating, or just about any other activity!

Now back to the problem of measurement. What should we do with ratings of acceptance, such as the sets of ratings for six hypothetical products that we see in Table 4.1? What can we learn from these results? Can the data make the package designer smarter? What new insights emerge to make the job easier? Certainly we can say that in the case of the first column of observations (A), the products all bunch together, that in the case of the second column (B) there are two good products and the rest poor products, and in the third column (C) the opposite occurs, so we see mostly good products, and a few poor products.

Analyzing Data for Shelf-Stable Milks

Let’s look at an example of data from an actual study. The particular project dealt with a shelf-stable chocolate milk drink, where respondents evaluated nine different products, first by visual inspection (just looking at the product) and then, immediately afterward, by lifting and holding the product. For right now, we’ll focus on the acceptance ratings of the nine products, to see how the products perform.

First look at the stimuli in Figure 4.1. Keep in mind that the respondents inspected the packages one at a time, in a randomized order. Randomizing guards against order bias, such as the often-observed tendency of the product evaluated first to score higher in that first position than it would in other positions.
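The randomized presentation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each respondent receives an independent shuffle of the nine packages; the function name, the fixed seed, and the respondent count of 100 are our own assumptions, not details of the study.

```python
import random

# Product names as labeled in Figure 4.1.
PRODUCTS = [
    "Nestlé Nesquik", "Hershey's", "Shamrock Farms 20 oz.",
    "Looney Tunes", "Land O'Lakes Grip 'n Go", "Viva Slam",
    "Shamrock Farms 12 oz.", "Milk Chug", "DariGo",
]

def presentation_orders(products, n_respondents, seed=0):
    """Return one independently shuffled presentation order per respondent."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    orders = []
    for _ in range(n_respondents):
        order = list(products)  # copy, so the master list stays untouched
        rng.shuffle(order)
        orders.append(order)
    return orders

# Hypothetical panel of 100 respondents.
orders = presentation_orders(PRODUCTS, n_respondents=100)

# Count how often each product lands in the first position.
first_counts = {p: 0 for p in PRODUCTS}
for order in orders:
    first_counts[order[0]] += 1
```

With enough respondents, each product appears first about equally often, so any first-position boost is spread across the products rather than credited to one of them.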

research methods to understand Nestlé products. The first study revolved around coffee, specifically what characteristics of coffee from across the spectrum of commercially available coffees seemed to be the ones that consumers liked the most. Like most other companies at that time, Nestlé was wedded to the method of paired comparisons, so that it was unable to uncover these general rules of consumer preference. Through persistence, however, Schmid created a research project to evaluate 11 different coffees, with the goal of discovering a pattern.

We know that the basic relation between sensory attribute and liking is an inverted U-shaped curve. As a stimulus becomes stronger, it is first liked, then liked more, and then liked most. Any continuing increase in the stimulus intensity beyond this maximum degree of liking starts to diminish acceptance. Schmid’s question was simple: What did this particular curve look like for specific coffee attributes? And another question arose: Could Nestlé discover groups of coffee consumers with different sensory-liking curves? That is, did there exist in the population groups of consumers who liked high intensities of sensory attributes (the so-called high impact people) and other groups who liked low intensities of the same sensory attributes (the so-called low impact people)? At the end of the day, the research program uncovered these groups, and made use of them in optimizing formulations. But that is not the essence of our story. Now we want to learn how this approach of competitive analysis and pattern recognition translates into packaging.
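One convenient way to write the inverted U described above is as a quadratic in sensory intensity. We offer this as an assumption for illustration, not as the model used in the Nestlé work; the symbols L (liking), S (sensory intensity), and the coefficients a, b, and c are introduced here only for the sketch.

```latex
\[
  L(S) = a + bS + cS^{2}, \qquad c < 0
\]
\[
  \frac{dL}{dS} = b + 2cS = 0
  \quad\Longrightarrow\quad
  S^{*} = -\frac{b}{2c}
\]
```

With c < 0 and b > 0, liking rises with intensity up to the optimum S* and falls beyond it. Under this reading, a high impact group would simply show a larger S* than a low impact group.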

What Do We Learn from Many Products?

When we test many products, instead of one product, we discover how they perform versus each other. That, in itself, is important, especially when we use a scale of acceptance rather than the method of paired comparisons. If we live only in the world of pairs, then we certainly know that one package is preferred to another, and by how much. The task is harder, of course, when we deal with many products. Nonetheless, with enough data and enough theory, we probably could create some type of general scale so that the preferences that we observe are transformed into these scale values. Leon Louis Thurstone, the famous psychometrician, created just such an indirect method 80 years ago, in 1927 (Thurstone, 1927). We can do a lot better and be far more productive, however, if instead of paired comparisons we ask respon-

Table 4.1 Three possible outcomes (results A–C) from evaluating six packages on liking, using a 0–100-point scale

           Result (A)  Result (B)  Result (C)
Package 1          68          68          68
Package 2          62          66          64
Package 3          56          45          63
Package 4          49          43          62
Package 5          42          49          49
Package 6          35          37          41
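One simple way to make the three patterns in Table 4.1 explicit is to rank the scores and look for the largest drop between neighbors. The sketch below is our own illustration, not a method from the study; the function name `largest_gap` is hypothetical.

```python
# Liking scores from Table 4.1 (0-100 scale), one list per hypothetical result.
RESULTS = {
    "A": [68, 62, 56, 49, 42, 35],
    "B": [68, 66, 45, 43, 49, 37],
    "C": [68, 64, 63, 62, 49, 41],
}

def largest_gap(scores):
    """Return (gap, n_above): the biggest drop between adjacent ranked scores
    and the number of packages that sit above that drop."""
    ranked = sorted(scores, reverse=True)
    gaps = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    gap = max(gaps)
    n_above = gaps.index(gap) + 1  # first occurrence if several gaps tie
    return gap, n_above

for name, scores in RESULTS.items():
    gap, n_above = largest_gap(scores)
    print(f"Result {name}: range={max(scores) - min(scores)}, "
          f"largest gap={gap} after the top {n_above} package(s)")
```

For Result (B) the largest drop is 17 points and comes after the top two packages; for Result (C) it is 13 points and comes after the top four. In Result (A) every gap is 6 or 7 points, so no natural break separates good packages from poor ones.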

Chapter 4 Patterns in Packages: Learning from Many Packages and Many Attributes 37

ucts performed a certain way might work if we could simply take that information and plug the acceptance scores into some type of predictive model. We might predict sales, and if the product were sufficiently acceptable, then we’d stop there, breathe a sigh of relief, and move on.

Life is hardly that simple. Testing is not science. Most tests that we might run are simply passive reports of an aspect of reality. We need the evaluation of what the products are, but we also need to discover specifically what to do once we have the data. Perhaps, in the hands of a skilled developer, these simplistic reports of package acceptance can be magically transformed into understanding why one package does well and another one

We start the analysis with ratings of liking. What’s special about these data is that they represent in-market products, so we know something about their performance. When we look at Table 4.2, we are struck by the fact that we are dealing with a “beauty contest” among the products. Certainly, we know which product wins and which loses. But we don’t know why, and we’re not particularly sure about what makes a good product.

Now What Do We Do?

If beauty contests only bring us part of the way, then we have to do something else. The question, of course, is what is this something else? Just knowing that the prod-

[Figure 4.1 panel labels: Nestlé Nesquik; Hershey’s; Shamrock Farms 20 oz.; Looney Tunes; Land O’Lakes Grip ’n Go; Viva Slam; Shamrock Farms 12 oz.; Milk Chug; DariGo]

Figure 4.1 The nine shelf-stable milk products. Respondents first looked at the product, rated appearance attributes, then held the product and rated the remaining attributes.

Table 4.2 Performance of the nine products on acceptance (liking) and on purchase intent

                                                                                      Shamrock     Shamrock
                                            Nesquik  Hershey’s  Grip ’n Go  Tunes  Farms 20 oz  Farms 12 oz  Chug  Slam  DariGo
Appearance only
  Like the bottle overall                        83         78          75     61           60           57    52    51      50
In hand
  Like the bottle overall                        82         79          78     60           59           61    55    56      53
5-point purchase scale
  Top-1 box (definitely buy)                     53         41          24     14           19           17    13    12      11
  Top-2 box (definitely and probably buy)
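The top-box rows of Table 4.2 are simple percentages of respondents. A minimal sketch of the arithmetic follows, assuming the 5-point purchase scale is coded so that 5 means definitely buy and 4 means probably buy; the coding and the ten example ratings are hypothetical, not taken from the study.

```python
from collections import Counter

def box_scores(ratings):
    """Return (top-1 %, top-2 %) for 5-point purchase-intent ratings,
    where 5 = definitely buy and 4 = probably buy (assumed coding)."""
    counts = Counter(ratings)
    n = len(ratings)
    top1 = 100 * counts[5] / n
    top2 = 100 * (counts[5] + counts[4]) / n
    return top1, top2

# Hypothetical ratings from ten respondents for one package.
example = [5, 5, 4, 4, 4, 3, 3, 2, 5, 1]
top1, top2 = box_scores(example)
print(f"Top-1 box: {top1:.0f}%, Top-2 box: {top2:.0f}%")
```

Because the top-2 box includes the top-1 box, it can never be smaller than the top-1 box for the same product.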

does poorly. Unfortunately, that skilled developer is rare, more a matter of legend than reality. That skilled developer is akin to the wonderful senior doctors in novels such as Sinclair Lewis’ Arrowsmith, who, from years of practice, can see a patient and instantly have a sense of what is causing the patient’s distress (Lewis, 1925). Such expertise comes from years of clinical training, experience that other doctors (and hence, by analogy, package and product designers) simply do not possess.

With these caveats in mind, we move to the logical next step: adding more information to our study. Market researchers and sensory analysts call this information diagnostics, for the simple reason that the information diagnoses the characteristics of the product. Diagnostics come in the form of attribute ratings, usually assigned by the consumer respondent. However, diagnostic information about these products can also be provided by marketers who deconstruct the products into the presence/absence of features, by expert panelists who give specific, structured sensory descriptions of the product, or by instruments that provide measures of package characteristics. Whatever the source of diagnostic information may be, the objective is to provide more data so that the data become useful. Those data allow the researcher to search for underlying relations among the variables in the data set.

The next question is what type of information is best? And, of course, who can provide that information? Consumer researchers are accustomed to having the respondent rate products (concepts, actual products, packages) on a myriad of questions, so there’s no problem with the quantity of information. It’s the quality of that information, however, that is important.

Let’s move to an example of the type of information that one could ask about a product in a package. The focus of the evaluation is the package. Keep in mind that the respondent does not possess a particularly broad vocabulary for package features. Thus, looking at Table 4.3, we see a few descriptive questions that instruct the respondent to profile his response to the physical characteristics. Most of the other questions instruct the respondent to rate acceptance (liking, purchase intent), or image (e.g., appropriate for a specific end use). It is worth looking at Table 4.3 in detail because from it you will get a sense of the depth to which a questionnaire can probe the “package experience.”

We see how large a data set might be, with many questions and many test stimuli. Now, the question is what do we do with these data? For our project on chocolate milk we worked with nine different products in the market, each rated separately on the full set of attributes that we see in Table 4.3. We can now generate a lot of data. The question comes down to “How do we process the data from the nine milk products to give us direction?”

It might seem that we are belaboring the point of “what to do” and, in fact, we are. The point is very important, both for this book and later on. If you are reading this book, then most likely you are interested in how to create better packages, better graphics designs, and better experiences. Most likely you are not particularly interested in the latest and greatest way to represent these data by a two- or three-dimensional picture.

Now, back to the question of what should we do to “make the data sing to us,” or perhaps a bit less hyperbolic, how to extract insights from the data. Much of what you read in this book focuses on the relation between variables, in the spirit of psychophysics. We can do that investigation now, but we need to find the appropriate independent variable (x-axis) and the appropriate dependent variable (y-axis). We should keep in mind the following considerations, adapted from both development and science:

1. The independent variable should represent something that we can legitimately vary. We cannot really vary overall liking or purchase intent. Those are intrinsically dependent variables, results of our perception. We cannot order a package of a given level of acceptance. Nor can we really vary appropriateness for an end use. Like acceptability, the rating of appropriateness is a judgment made after the respondent integrates the sensory information about the product with other criteria, such as price, brand, and the like.

2. With the criterion of actionability in mind (i.e., the developer can push the attribute in a direct and meaningful way, knowing what to do ahead of time), let us look at the different attributes in Table 4.3. For the most part, we see these attributes as dependent variables or results of different physical features.

3. Our observation about the preponderance of evaluative attributes is very important. Many consumer researchers focus on the description of a package (or product) in terms of how well it is accepted, and what the package is “good for.” In the world of research this focus on evaluative and image attributes is under-

Table 4.3 Profiles of three commercially available, shelf-stable chocolate milk products on attributes, after both pure visual and hand evaluations of the packages, respectively

                                                                        Nestlé       Hershey  Land O’Lakes
Brand                                                                  Nesquik     Hershey’s    Grip ’n Go
Bottle size                                                             16 oz.        14 oz.        12 oz.

Overall ratings
  Like bottle overall                                                       83            78            75
  Overall purchase interest, top box %                                      53            41            24
  Top-2 box % purchase interest                                             88            73            62

Appearance attributes (before the bottle is held)
  Like overall appearance of the bottle                                     84            75            68
  Like overall size of the bottle                                           81            77            67
  Like overall shape of the bottle                                          79            75            74
  Like label design on the bottle                                           82            72            68
  Like color scheme used on label                                           79            74            73
  Small versus large size of the bottle                                     67            64            51
  Ease of reading overall text on bottle                                    85            77            74
  Ease of reading flavor name on bottle                                     82            82            76
  Ease of reading brand name on bottle                                      89            90            79
  Overall size of the bottle (too little versus too much)                   11             9             0

Bottle in-hand attributes (after the bottle is held)
  Like overall feel of bottle in hand                                       82            79            78
  Like shape of bottle in hand                                              83            80            79
  Like overall appearance of the bottle cap                                 79            73            65
  Like color of the bottle cap                                              78            75            66
  Like overall size of bottle cap                                           75            72            68
  Comfortable to hold                                                       84            82            82
  Easy to grip                                                              83            84            83
  Ease of reading overall text on bottle                                    84            78            76
  Ease of reading flavor name on bottle                                     82            81            77
  Ease of reading brand name on bottle                                      90            88            78
  Overall color of bottle cap, light versus dark                            62            62            56
  Overall size of bottle cap, small versus large                            63            64            58
  Amount of information on bottle (too little versus too much)              10            12            16

Imagery attributes
  Uniqueness of bottle                                                      77            78            71
  Cool looking                                                              81            74            68
  Fun looking                                                               86            67            76
  Quality of product                                                        87            86            75
  Good tasting                                                              88            87            76

Miscellaneous attributes (percents)
  Consumer group product is appropriate for
    Children                                                                85            62            90
    Teenagers                                                               72            73            55
    Adults                                                                  48            73            29
  Meals product is most appropriate for
    Breakfast                                                               68            64            71
    Lunch                                                                   55            58            51
    Dinner                                                                  25            25            21
    In between meals                                                        75            74            64
    Other                                                                   25            25            17
  Occasions when the respondent expects to drink the milk product
    At home                                                                 65            66            67
    At school                                                               60            58            62
    At work                                                                 46            54            32
    For when you are on the go                                              64            62            51
    When you want something good for you                                    30            32            25
    After you play sports or play hard                                      18            11            14
    When you are thirsty                                                    35            28            27
    When you want something fun to drink                                    56            43            45
    When you are with your friends                                          40            33            29
    When you want a snack                                                   64            62            55
    Other                                                                   15            20            14
    None of the above                                                        0             0             1
  Types of food with which to drink the product
    Desserts such as cookies, cakes, pies, doughnuts                        65            66            64
    Muffins, bagels, or toast                                               57            54            51
    Cereal                                                                  35            31            26
    Sandwiches                                                              43            45            41
    Pizza                                                                   19            17            16
    Dinner entrees                                                          22            17            19
    Ice cream                                                               21            25            21
    Chocolate                                                               15            15            15
    Other                                                                   32            33            26

[Figure 4.2: four panels, each plotting one attribute (LSIZE, PCHILD, MLUNCH, PTEEN) on the y-axis against perceived size of bottle on the x-axis]

Figure 4.2 How the perceived size of the bottle “drives” other attributes (at least covaries with them)

standable. Researchers, especially consumer researchers, do not fancy themselves as product developers. Rather, they think of themselves as measuring the opinion of the consuming public. In turn, this consuming public is presumed to focus on “WIIFM” (what’s in it for me; i.e., what does the product do for ME?).

4. Let us identify some variables that we can change. These will be our independent variables. One of these is size. The package developer can change size in a straightforward way.

5. Now that we have identified size as an independent variable that we feel to be actionable, let us look for relations between size and the subjective response. As we stated above, basic and applied research both instruct us that as a sensory attribute increases, liking first increases, peaks, and then drops down. This relation, often described as an inverted U, appears to have been discovered first for liking versus sensory intensity, but may be a general principle.

6. Although we did not systematically vary the size of the bottle, let us see how size covaries with a few other attributes. To discover what is happening, we create a scatter plot. The x-axis is the perceived size of bottle, and the y-axis is the attribute rating. Typically, the data scatter, but there may be an underlying curve or line. We’re looking for the shape of the relation, even though we recognize it’s not necessarily a good fitting curve. Look at Figure 4.2, showing how the perceived size of the bottle drives some other attributes. We fitted a curve to the scatter plot of the data and drew the best curve. We don’t show the points but show the best fitting curvilinear relation. Remember, we have not done an experiment. Rather, we are trying to figure out what message nature is trying to tell us. There are four patterns that we can deduce from Figure 4.2, each of which emerges from the “fitted relation” between the perceived size of bottle as a consumer rating, and either rating of an attribute or appropriateness.
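The curve fitting described in step 6 can be sketched as an ordinary least-squares quadratic. The code below is our own minimal illustration, not the procedure used to draw Figure 4.2: the function name `fit_quadratic` is hypothetical, and the nine data points are invented for the example because the text does not list the ratings for all nine bottles.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Normal equations (X'X) beta = X'y for the basis 1, x, x^2.
    s = [sum(x ** k for x in xs) for k in range(5)]            # sums of x^0..x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c

# Hypothetical ratings for nine bottles (NOT the study data):
# perceived size on the x-axis, liking of size on the y-axis.
perceived_size = [35, 42, 50, 55, 60, 66, 72, 80, 88]
liking_of_size = [57, 70, 74, 80, 78, 81, 76, 72, 58]

a, b, c = fit_quadratic(perceived_size, liking_of_size)
if c < 0:
    peak = -b / (2 * c)  # vertex of the parabola: the best-liked size
    print(f"Fitted curve peaks at a perceived size of about {peak:.0f}")
else:
    print("No interior optimum: the fitted curve does not bend downward")
```

A negative coefficient on the squared term signals the inverted U, and the vertex locates the intermediate optimum. A coefficient near zero would suggest that a straight line describes the relation about as well.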

So, what can we conclude about these data, and thus what do we learn about the dynamics of our milk package?


We also learned that in order to discover underlying patterns we need to have variables to generate these patterns. That is, we need to have at least two attributes, which we plot against each other. Just working with liking or another evaluative criterion does not tell us what we need to know. It helps to have many attributes with which to “play.”

Third, the consumer respondent need not rate every attribute on a common scale in order to let the research uncover the patterns. The respondent can rate an attribute (e.g., size of the bottle), rate another attribute (e.g., liking of size), but simply select a third attribute (e.g., vote yes or no for appropriate for a specific situation or type of respondent). The analysis does not differentiate between scale values and percentages.

Fourth, in an ideal case we should have many independent variables to explore. To the degree that we can work with many more continuous variables (e.g., heaviness, length, width, etc.), we will be able to create more independent variables, and thus test for more relations.

Fifth and finally, the data need not be perfect. We can live with variability and imperfect fits. Certainly we would like the data points to lie close to the fitted curve, but the fit need not be high; it has to be reasonable. (The definition of reasonable is an entirely different topic, not relevant here.) When we work with many in-market products, our goal should be to discover patterns for future use.
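The third point above, that scale ratings and percentages can go through the same analysis, can be illustrated with the three products profiled in Table 4.3. The sketch below computes a plain Pearson correlation between perceived size and three other measures; the helper `pearson` is our own, and with only three products the coefficients show the mechanics rather than a stable result.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Values from Table 4.3, in the order Nesquik, Hershey's, Grip 'n Go.
perceived_size = [67, 64, 51]  # scale rating: small versus large size
outcomes = {
    "Like overall size of the bottle (rating)": [81, 77, 67],
    "Appropriate for children (%)": [85, 62, 90],
    "Appropriate for teenagers (%)": [72, 73, 55],
}

r = {name: pearson(perceived_size, vals) for name, vals in outcomes.items()}
for name, value in r.items():
    print(f"{name}: r = {value:+.2f}")
```

The same function handles the rating and the two percentages without any rescaling, because correlation depends only on how the values covary across products. With many in-market products, the same loop could run over every actionable attribute.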

References

Lewis, S. (1925) Arrowsmith. Harcourt, Brace & Company, New York.
Thurstone, L.L. (1927) A law of comparative judgment. Psychological Review, 34, 273–286.

1. The data do not precisely fit the curve. The reason for the scatter makes intuitive sense. The bottles varied on many features. We are simply plotting two subjective attributes and looking for a fit between them.

2. We can assume the relation is linear or quadratic. If we assume a quadratic, or nonlinear relation, then there is the possibility that the curve peaks somewhere in the middle sensory range. That intermediate optimum is the case for the attribute “liking of size.” Bottles above the optimum size don’t appear to be liked as much. We might never have uncovered this relation without testing the nine different bottles that we did here. In this first graph, the respondent rated both the perceived size of the bottle and the liking of
