Crown Toothpaste

Kayleigh Marlon is the Chief Buyer at Tar-Mart, a company that operates a chain of superstores selling discount merchandise. Tar-Mart has a huge national presence, and manufacturers compete fiercely to get their products onto Tar-Mart's shelves.

Crown Toothpaste, a new entrant in the toothpaste market, is one of them. Kayleigh agreed to stock Crown for 4 weeks and display it prominently. After that period, she will stop stocking Crown unless 5% of Tar-Mart's customers bought Crown or were considering buying Crown within the next month.

The trial period is now over. Kayleigh has asked you to take a sample of customers to see if Tar-Mart should continue stocking Crown. She would like you to be at least 95% confident in your answer.

The first step is to decide how large a sample size to choose. Kayleigh tells you that, in the past, when Tar-Mart introduced a new product, the percentage of people who expressed interest ranged between 2% and 10%. What sample size should you use?

a. 50

This is not the best answer. This sample size will not satisfy the rule of thumb n(p-bar) ≥ 5 for all proportions falling in the range 2% to 10%. At a proportion of 2%, for example, n(p-bar) = 50*0.02 = 1, which is less than 5.

b. Anything between 50 and 250

This is not the best answer. Sample sizes smaller than 250 will not satisfy the rule of thumb n(p-bar) ≥ 5 for all proportions falling in the range 2% to 10%.

c. 250

This is the best answer. This sample size will satisfy the two rules of thumb (n(p-bar) ≥ 5 and n(1 - (p-bar)) ≥ 5) for all proportions falling in the range 2% to 10%.
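The check behind this answer can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the range of 2% to 10% is sampled in whole-percent steps as an assumption.

```python
from fractions import Fraction
from math import ceil

def satisfies_rules_of_thumb(n, p):
    """Check both rules of thumb: n*p >= 5 and n*(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5

# Plausible proportions from 2% to 10%, held as exact fractions
# so that boundary cases such as 250 * 0.02 = 5 compare exactly.
proportions = [Fraction(k, 100) for k in range(2, 11)]

def works_for_whole_range(n):
    """True when the rules of thumb hold for every plausible proportion."""
    return all(satisfies_rules_of_thumb(n, p) for p in proportions)

# n*p is smallest at the lowest proportion and n*(1 - p) is smallest at
# the highest, so checking the two extremes gives the smallest adequate n.
smallest_n = max(ceil(5 / min(proportions)), ceil(5 / (1 - max(proportions))))

for n in (50, 150, 250):
    print(n, works_for_whole_range(n))
print("smallest adequate n:", smallest_n)
```

The binding constraint is the lowest plausible proportion: at 2%, a sample needs 250 respondents before n(p-bar) reaches 5.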

You choose a sample size of 250. After conducting the survey, you find that 10 out of 250 people surveyed had bought Crown or were considering buying Crown within the next month. What is the 95% confidence interval for the population proportion?

a. From 1.6% to 6.4%

This is the correct answer.

b. From 2.0% to 6.0%

This is not the correct answer. The appropriate z-value for a 95% confidence interval is 1.96.

c. From 3.5% to 4.5%

This is not the best answer. You may have forgotten to take the square root when computing the standard deviation in the formula for the confidence interval.

First, you find the sample proportion: 10 out of 250 is a proportion of 4%. You verify that n(p-bar) = 250*0.04 = 10 ≥ 5 and n(1 - (p-bar)) = 250*0.96 = 240 ≥ 5. Then, using the formula, you find the confidence interval around the sample proportion. The endpoints of that interval are 1.6% and 6.4%.

Challenge: OOPS! Package Deliveries

OO-P-S is a small-package delivery service with worldwide operations. Celine Bedex, VP Marketing, has heard increasing complaints about late deliveries, and wants to know how many of the shipments are late by one day or more.

Celine would like an estimate of the percentage of late deliveries. In a sample of 256 shipments, 2 were delivered late, a proportion of about 0.008, or 0.8%. If Celine wants to be 99% confident in the result of a confidence interval calculation, the interval is:

a. Between -0.6% and 2.2%

This is not the correct answer. One of the rules of thumb for the sample size is not being satisfied.

b. Between 0.7% and 0.9%

c. No valid inferences can be drawn from these data.

This is the best answer. One of the rules of thumb for the sample size is not being satisfied: n(p-bar) = (256) (0.008) = 2 is less than 5.

Celine collects a new sample, this time of 729 shipments. Of these, 8 were late. Celine can be 99% confident that the population proportion of late packages is between:

a. 0.1% and 2.1%

This is the correct answer. The new sample size is sufficiently large to investigate a population proportion of 0.011.

b. 0.3% and 1.9%

This is not the correct answer. The appropriate z-value for a confidence interval of 99% is 2.58.

c. 0.0% and 1.6%

This is not the correct answer. The new sample has a new sample proportion of 0.011. Use this sample proportion in your confidence interval calculation.

d. The sample size is still too small to make a valid inference.

This is not the correct answer. Both n(p-bar) and n(1 - (p-bar)) are greater than 5, so both rules of thumb are satisfied.

First, calculate the sample proportion for the new sample: 8/729 = 0.011. Then, verify that the new sample size satisfies the rules of thumb. Both n(p-bar) and n(1 - (p-bar)) are greater than 5. Using the new sample size and sample proportion, calculate the confidence interval: [0.1%, 2.1%].
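The steps described here (compute the sample proportion, check the rules of thumb, then build the interval) can be sketched in Python. The function name and its return convention are assumptions made for illustration. The z-value is taken from the standard normal distribution rather than read from a table, so it differs slightly from the rounded 1.96 and 2.58.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """Return (low, high) for the population proportion, or None when
    a rule of thumb fails and no valid inference can be drawn."""
    # n*p-bar equals the count of successes and n*(1 - p-bar) equals the
    # count of failures, so the rules of thumb can be checked on the counts.
    if successes < 5 or n - successes < 5:
        return None
    p_bar = successes / n
    # Two-sided z-value, e.g. about 1.96 for 95% and about 2.58 for 99%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin

# First OOPS! sample: 2 late out of 256 fails the n*p-bar >= 5 rule.
print(proportion_confidence_interval(2, 256, 0.99))

# Second OOPS! sample: 8 late out of 729 at 99% confidence.
low, high = proportion_confidence_interval(8, 729, 0.99)
print(f"{low:.1%} to {high:.1%}")

# Crown Toothpaste sample: 10 out of 250 at 95% confidence.
low, high = proportion_confidence_interval(10, 250, 0.95)
print(f"{low:.1%} to {high:.1%}")
```

Returning None for the first sample mirrors the answer "no valid inferences can be drawn", instead of reporting an interval with a negative lower endpoint.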

Hypothesis Testing Introduction

After finishing the sampling assignments, you and Alice decide to take some time off to enjoy the beach. Just as you are gathering your beach gear, Leo gives you another call.

Improving the Kahana

Hi there! Don't let me keep you from enjoying the beach. I just wanted to let you know what I'd like you to help me with next. I've been working on ideas to increase the Kahana's profits.

Is it possible to increase profits by raising the room prices? That would be an easy solution.

I wish it were that easy. Room prices are extremely competitive and are often the first thing potential guests take into consideration. So if we increase room prices, I'm afraid we'll have fewer guests. That might put us back where we started from with profits — or even worse.

What other factors influence your profits?

The two major ones are room occupancy rates and discretionary spending. "Discretionary spending" is the money guests spend on non-room amenities. You know, food, drinks, spa services, sports activities, and so on.

As a manager I can affect a variety of factors that influence discretionary spending: the quality of the restaurant, for example, or the types of amenities offered.

And you'd like us to help you understand your guests' discretionary spending patterns better.

Right. Then I can explore new ways to increase profits on non-room amenities. I can also see if some of my recent efforts to increase guest spending have paid off.

I'm particularly interested in restaurant operations. I've made some changes to the restaurants recently. For example, I hired a new executive chef last year. I'd like to know if restaurant revenues per person have changed since then.

I'd also like to find out if the renovation of our premier cocktail lounge has resulted in higher spending on beverages. Finally, I've been wondering if discretionary spending patterns are different for leisure and business guests. If so, I might change our marketing campaigns to better suit each of those market segments.

What records do you have for us to work with?

We don't have a consolidated report for this year yet, so we'll need to conduct some surveys and analyze the results.

You're really getting into these statistical methods, aren't you, Leo?

Definition

Leo made some important changes to his business and he has some ideas of what the impact of these changes has been. How do you put his ideas to the test?

As managers, we often need to put our claims, ideas, or theories to the test before we make important decisions. Based on whether or not our claim is statistically supported, we may wish to take managerial action.

Hypothesis testing is a statistical method for testing such claims. A hypothesis is simply a claim that we want to substantiate. To begin, we will learn how to test hypotheses about population means.

For instance, suppose we know that the historical average number of defects in a production process is 3 defects per 1,000 units produced. We have a hunch that a certain change to the process (a new machine, say) has changed this number. The hypothesis we wish to substantiate is that the average defect rate has changed: that it is no longer 3 per 1,000.

How do we conduct a hypothesis test? First, we collect a random sample of units produced by the process. Then, we see whether or not what we learn about the sample supports our hypothesis that the defect rate has changed.

Suppose our sample has an average defect rate of 2.7 defects per 1,000. Based on this sample, can we confidently say that the defect rate has changed?

That depends. To find out, we construct a range around the historical defect rate of 3 — the population mean that has been cast in doubt. We construct the range so that if the mean defect rate in the population is still 3, it is very likely for the mean of a sample taken from the population to fall within that range.

The outcome of our test will depend on whether 2.7, the mean of the sample we have taken, falls within the range or not.

If the sample mean of 2.7 falls outside of the range, we feel comfortable rejecting the hypothesis that the defect rate is still 3.

However, if the sample mean falls within the range, we don't have enough evidence to support the claim that the defect rate has changed.

This example captures the essence of hypothesis testing, but we need to formalize our intuition about the example and define our new statistical technique more precisely.

To conduct a hypothesis test, we formulate two hypotheses: the so-called null hypothesis and the alternative hypothesis.

Based on experience or conventional wisdom, we have an initial value of the population mean in mind. The null hypothesis states that the population mean is equal to that initial value: in our example, the null hypothesis states that the current population mean is 3 defects per 1,000. We use the Greek letter mu to represent the population mean, in this case the current average defect rate.

The alternative hypothesis is the claim we are trying to substantiate. Here, the alternative hypothesis is that the average defect rate has changed. Note that the alternative hypothesis states that the null hypothesis does not hold.
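In symbols, the two hypotheses for the defect example can be written as follows. The labels H_0 and H_a are conventional notation assumed here, not taken from the text; mu is the current average defect rate per 1,000 units.

```latex
\begin{aligned}
H_0 &: \mu = 3 \\
H_a &: \mu \neq 3
\end{aligned}
```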

As the example suggests, in a hypothesis test, we test the null hypothesis. Based on evidence we gather from a sample, there are only two possible conclusions we can draw from a hypothesis test: either we reject the null hypothesis or we do not reject it.

Since the alternative hypothesis states the opposite of the null hypothesis, by "rejecting" the null hypothesis we necessarily "accept" the alternative hypothesis.

In our example, the evidence from our sample will help us determine whether or not we should reject the null hypothesis that the defect rate is still 3 in favor of the alternative hypothesis that the defect rate has changed.

Based on our sample evidence, which conclusion should we draw? We reject the null hypothesis if it is highly unlikely that our sample mean would come from a population with the mean stated by the null hypothesis.

For example, if the sample we drew had a defect rate of 14 per 1,000, we would reject the null hypothesis. Drawing a sample with 14 defects from a population with an average defect rate of 3 would be very unlikely.

We cannot reject the null hypothesis if it is reasonably likely that our sample mean would come from a population with the mean stated by the null hypothesis. The null hypothesis may or may not be true: we simply don't have enough evidence to draw a definite conclusion.

For example, if the sample we drew had a defect rate of 3.05 per 1,000, we could not reject the null hypothesis, since it wouldn't be unusual to randomly draw a sample with 3.05 defects from a population with an average defect rate of 3.

Note that having the sample's average defect rate very close to 3 does not "prove" that the mean is 3. Thus we never say that we "accept" the null hypothesis — we simply don't reject it.

It is because we can never "accept" the null hypothesis that we do not pose the claim that we actually want to substantiate as the null hypothesis — such a test would never allow us to "accept" our claim! The only way we can substantiate our claim is to state it as the opposite of the null hypothesis, and then reject the null hypothesis based on the evidence.

It is important that we understand exactly how to interpret the results of a hypothesis test. Let's illustrate the two types of conclusions with an analogy: a US jury trial.

In the US judicial system, the accused is considered innocent until proven guilty. So, the null hypothesis is that the accused is innocent. The alternative hypothesis is that the accused is guilty: this is the claim that the prosecution is trying to prove.

The two possible outcomes of a jury trial are "guilty" or "not guilty." The jury does not convict the accused unless it is certain beyond reasonable doubt that the accused is guilty. With insufficient evidence, the jury cannot conclude that the accused truly is innocent. The jury simply declares that the accused is "not guilty."

Similarly, in a hypothesis test, if our evidence is not strong enough to reject the null hypothesis, then that does not prove that the null hypothesis is true. We simply have failed to show it is false, and thus cannot reject it.

A hypothesis is a claim or assertion that can be tested. On the basis of a hypothesis test we either reject or leave unchallenged a particular statement: the null hypothesis.

Alice promises Leo that the two of you will drop by his office first thing in the morning to test if Leo's survey results support his claims that food and beverage spending patterns have changed.

Summary

We use hypothesis tests to substantiate a claim about a population mean. The null hypothesis states that the population mean is equal to an initial value that is based on our experience or conventional wisdom. We test the null hypothesis to learn if we should reject it in favor of our claim, the alternative hypothesis, which states that the null hypothesis does not hold.

Single Population Means

The next morning, Leo explains the measures he has undertaken to increase customer spending on food and beverages. "I'd like to see if they've had a discernable impact on my guests' restaurant-related spending patterns."

The Restaurant Revenue Problem

Last year, I made two major changes to restaurant operations: I brought in a new executive chef and renovated the main cocktail lounge.

The chef introduced a new menu: a fusion of traditional Hawaiian and French cuisine. She put some elaborate items on the menu, like that mango and brie tart I recommended to you. She also has offerings that cater to simpler tastes. But the question is, have restaurant profits been affected by the new chef? Since we set our food margins as a fixed percentage of food revenue, I know that if revenues have increased, profits have increased too. Based on last year's consolidated reports, the average spending on food per person per day was $55. I'm curious to see if that has changed.

In addition, I renovated the cocktail lounge. The old bar was designed poorly and used space inefficiently. Now more guests can be seated in the lounge, and more seats have good views of the ocean.

I also invested in a large machine that makes a wide variety of frozen drinks. Frozen pina coladas are very, very popular.

I hope my investments in the bar are paying off in terms of higher guest spending on drinks. Beverages have high margins, but I'm not sure if beverage sales have increased enough to cover the investments.

Can we say, for beverages, as for food, that "changes in revenues" are a good proxy for "changes in profits"?

Absolutely. I set my profit margins as a fixed percentage of revenues for beverages as well. Last year, the average spending on beverages per guest per day was $21.

Isn't that high?

Well, we have some very nice wines in our restaurants.

We don't have the consolidated report yet, but I've already had my staff choose a random sample of guests.

We pulled the restaurant and lounge receipts for the guests in the sample and noted three items: total food revenues, total beverage revenues, and number of guests at the table. Using this information, we should be able to estimate the daily spending on food and beverages per guest.

You look at Leo's data and wonder how you can discern whether Leo's changes — the new chef and the bar renovations — have influenced the resort's profits.

Hypothesis Tests for Single Population Means

Leo has prepared data for you. How are you going to put it to use?

Our first type of hypothesis test is used to study population means. Let's walk through an example of this type of test.

Suppose the manager of a movie theater implemented a new strategy at the beginning of the year: he started showing old classics instead of recent releases.

He knows that prior to the change in strategy, average customer satisfaction was 6.7 out of a possible 10 points. He would like to know if average customer satisfaction has changed since he altered his theater's artistic focus.

The manager's null hypothesis states that the current mean satisfaction has not changed; it is still 6.7. We use the Greek letter mu to represent the current mean satisfaction rating of the theater's entire film-going population.

His alternative hypothesis is the opposite of the null hypothesis: it states that average customer satisfaction is now different.

To substantiate his claim that the mean has changed, the manager takes a random sample of 196 moviegoers. He is careful to sample across movies, show times, and dates. The mean satisfaction rating for the sample is 7.3, with a standard deviation of 2.8.

Does the fact that the random sample's mean of 7.3 is higher than the historical mean of 6.7 indicate that this year's moviegoers really are more satisfied?

Or, is the mean still the same, and the manager "just happened" to pick a sample with an unusually high average satisfaction rating? This is equivalent to asking the question: If the null hypothesis is true — the average satisfaction is still 6.7 — would we be likely to randomly draw the sample that we did, with average satisfaction 7.3?

We typically use 95% as our threshold level of likelihood.

We then construct a range around the population mean specified by our null hypothesis. The range should be drawn so that if the null hypothesis is true, 95% of all samples drawn from the population would fall in that range. In other words, we create a range of likely sample means.
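As a rough illustration for the theater example, the sketch below builds such a range around the null-hypothesis mean of 6.7. It assumes the range is constructed like a 95% confidence interval: z = 1.96, with the sample standard deviation of 2.8 standing in for the unknown population standard deviation. The function name and parameters are hypothetical.

```python
from math import sqrt

def likely_sample_mean_range(null_mean, sample_sd, n, z=1.96):
    """Range of sample means expected about 95% of the time (for z = 1.96)
    if the population mean really equals null_mean."""
    standard_error = sample_sd / sqrt(n)  # estimated sd of the sample mean
    margin = z * standard_error
    return null_mean - margin, null_mean + margin

# Theater example: null mean 6.7, sample sd 2.8, sample size 196.
low, high = likely_sample_mean_range(6.7, 2.8, 196)
sample_mean = 7.3

# Reject the null hypothesis only when the sample mean is outside the range.
reject_null = not (low <= sample_mean <= high)
print(f"range of likely sample means: {low:.3f} to {high:.3f}")
print("reject null hypothesis:", reject_null)
```

Under these assumptions the standard error is 2.8/14 = 0.2, so the range runs from about 6.308 to 7.092, and the observed sample mean of 7.3 lies outside it.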

The central limit theorem tells us that the distribution of sample means follows a normal curve, so we can use its familiar properties to find probabilities. Moreover, the distribution of sample means is centered at our assumed population mean, mu, and has standard deviation sigma/sqrt(n). We don't know sigma, the underlying population standard deviation, so we use the sample standard deviation as our best estimate. As we do when constructing 95% confidence intervals, we create a range with width z*s/sqrt(n) = 1.96*s/sqrt(n) on either side of the mean. However, when we conduct a hypothesis test, we center the