Nihar Ranjan Roy
Business Statistics, By Ken Black, Wiley India Edition
Text Book
HYPOTHESIS TESTING AND CONFIDENCE INTERVALS ABOUT
THE DIFFERENCE IN TWO MEANS USING THE z STATISTIC
(POPULATION VARIANCES KNOWN)
In some research designs, the sampling plan calls for selecting two
independent samples, calculating the sample means and using the difference in the two sample means to estimate or test the difference in the two
population means.
The object might be to determine whether the two samples come from the same population or, if they come from different populations, to determine the amount of difference in the populations.
This type of analysis can be used to determine, for example, whether the effectiveness of two brands of toothpaste differs or whether two brands of tires wear differently.
Business research might be conducted to study the difference in the
productivity of men and women on an assembly line under certain conditions.
How does a researcher analyze the difference in two samples by using sample means?
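The central limit theorem states that the difference in two sample
means, $\bar{x}_1 - \bar{x}_2$, is normally distributed for large sample sizes (both $n_1$ and $n_2 \geq 30$) regardless of the shape of the populations. It can also be shown that
$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
These expressions lead to a z formula for the difference in two sample means.
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$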
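As a rough illustration, the z statistic for the difference in two sample means (the observed difference minus the hypothesized difference, divided by the standard error built from the two known population variances) can be sketched in Python. The function name `two_sample_z` and all input values are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
from math import sqrt

def two_sample_z(mean1, mean2, var1, var2, n1, n2, hypothesized_diff=0.0):
    """z statistic for the difference in two sample means
    when both population variances are known."""
    # Standard error of the difference in sample means.
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error

# Hypothetical inputs (not from the slides).
z = two_sample_z(mean1=52.0, mean2=50.0, var1=16.0, var2=9.0, n1=40, n2=36)
print(round(z, 2))
```

The hypothesized difference defaults to zero, which matches the usual null hypothesis that the two population means are equal.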
As a specific example, suppose we want to conduct a hypothesis test to determine whether the average annual wage for an advertising manager is different from the average annual wage of an auditing manager. Because we are testing to determine whether the means are different, it might seem
logical that the null and alternative hypotheses would be
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$
where advertising managers are population 1 and auditing managers are population 2. However, statisticians generally construct these hypotheses as
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$
Hypothesis testing
A random sample of 32 advertising managers from across the United States is taken.
The advertising managers are contacted by telephone and asked what their annual salary is. A
similar random sample is taken of 34 auditing managers. The
resulting salary data are listed in the table on the left, along with the sample means, the population standard deviations, and the population variances.
In this problem, the business analyst is testing whether there is a difference in the average wage of an advertising manager and an auditing manager;
therefore the test is two tailed. If the business analyst had hypothesized that one was paid more than the other, the test would have been one tailed.
Suppose α = .05. Because this test is a two-tailed test, each of the two rejection regions has an area of .025, leaving .475 of the area in the
distribution between each critical value and the mean of the distribution. The associated critical table value for this area is $z_{.025} = \pm 1.96$. The figure on the next slide shows the critical table z value along with the rejection regions.
The observed value of 2.35 is greater than the critical value obtained from the z table, 1.96. The business researcher rejects the null hypothesis and can say that there is a significant difference between the average annual wage of an advertising manager and the average annual wage of an auditing manager.
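The critical value and the decision above can be checked with a short Python sketch that uses only the standard library. The helper names `two_tailed_critical_z` and `reject_null` are hypothetical; the observed value 2.35 and α = .05 come from the example.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Positive critical z value; each tail holds alpha / 2 of the area."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def reject_null(z_observed, alpha):
    """Reject H0 when the observed z falls in either rejection region."""
    return abs(z_observed) > two_tailed_critical_z(alpha)

critical = two_tailed_critical_z(0.05)
print(round(critical, 2))        # 1.96
print(reject_null(2.35, 0.05))   # True
```

Taking the absolute value of the observed z handles both tails with a single comparison.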
The business researcher then examines
Confidence Intervals
Suppose a study is conducted to estimate the difference between middle-income shoppers and low-income shoppers in terms of the average amount saved on grocery bills per week by using coupons. Random samples of 60 middle-income shoppers and 80 low-income shoppers are taken, and their purchases are monitored for one week. The average amounts saved with coupons, as well as the sample sizes and population standard deviations, are in the table on the next page.
Problem
This information can be used to construct a 98% confidence interval to estimate the
difference between the mean amount saved with coupons by middle-income shoppers and the mean amount saved with coupons by low-income shoppers.
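The z value associated with a 98% level of confidence is 2.33. This value, the data shown, and the confidence interval formula
$(\bar{x}_1 - \bar{x}_2) - z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \leq \mu_1 - \mu_2 \leq (\bar{x}_1 - \bar{x}_2) + z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
can be used to determine the confidence interval.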
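A minimal Python sketch of this interval calculation follows. The function name `z_confidence_interval` is hypothetical. The sample sizes (60 and 80) and the 98% level come from the problem statement, but the means and standard deviations are illustrative stand-ins, not the values from the table. The code computes z from the normal distribution (about 2.326) rather than using the rounded table value 2.33, so results may differ slightly in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(mean1, mean2, sd1, sd2, n1, n2, confidence):
    """Confidence interval for mu1 - mu2 with known population
    standard deviations."""
    # Two-sided interval: split the leftover area equally between the tails.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Means and standard deviations below are hypothetical.
low, high = z_confidence_interval(6.00, 3.00, 1.50, 0.60, 60, 80, 0.98)
print(round(low, 2), round(high, 2))
```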
There is a 98% level of confidence that the actual difference in the population mean coupon
HYPOTHESIS TESTING AND CONFIDENCE INTERVALS ABOUT
THE DIFFERENCE IN TWO MEANS: INDEPENDENT SAMPLES
AND POPULATION VARIANCES UNKNOWN
The hypothesis test presented in this section compares the
means of two samples to determine whether there is a difference in the two population means from which the samples come. This technique is used
whenever the population variances are unknown (and hence the sample variances must be used) and the samples are independent (not related in any way). An assumption underlying this technique is that the measurement or
characteristic being studied is normally distributed for both populations.
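Hypothesis testing
If the two population variances can be assumed equal, the sample variances are pooled and the t formula is
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
$df = n_1 + n_2 - 2$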
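For readers who want to see the arithmetic, here is a hedged Python sketch of the pooled-variance t statistic for two independent samples, which assumes the two population variances are equal. The function name `pooled_t` and the input values are hypothetical. The critical t value would still be read from a t table using the returned degrees of freedom.

```python
from math import sqrt

def pooled_t(mean1, mean2, var1, var2, n1, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic and degrees of freedom for two
    independent samples (sample variances var1 and var2)."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = (var1 * (n1 - 1) + var2 * (n2 - 1)) / df
    standard_error = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    t = ((mean1 - mean2) - hypothesized_diff) / standard_error
    return t, df

# Hypothetical inputs (not from the slides).
t, df = pooled_t(mean1=10.0, mean2=8.0, var1=4.0, var2=6.0, n1=10, n2=12)
print(round(t, 3), df)
```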
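If the equal-variance assumption cannot be met, the following formula may be used.
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$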
Because this formula requires a more complex degrees-of-freedom
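The unequal-variance (unpooled) calculation, including its more involved degrees of freedom, can be sketched as follows. The function name `unpooled_t` and the inputs are hypothetical. The degrees of freedom are returned unrounded; rounding down to a whole number before consulting a t table is a common convention and is assumed here rather than taken from the slides.

```python
from math import sqrt

def unpooled_t(mean1, mean2, var1, var2, n1, n2, hypothesized_diff=0.0):
    """t statistic and approximate degrees of freedom for two independent
    samples when equal population variances cannot be assumed."""
    a = var1 / n1  # variance contribution of sample 1
    b = var2 / n2  # variance contribution of sample 2
    t = ((mean1 - mean2) - hypothesized_diff) / sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical inputs (not from the slides).
t, df = unpooled_t(mean1=10.0, mean2=8.0, var1=4.0, var2=6.0, n1=10, n2=12)
print(round(t, 3), round(df, 2))
```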
Is there a difference in the way Chinese cultural values affect the purchasing strategies of industrial buyers in Taiwan and mainland China? A study by researchers at the National Chiao-Tung University in Taiwan attempted to determine whether there is a significant difference in the purchasing strategies of industrial buyers between Taiwan and mainland China based on the cultural dimension labelled
“integration.” Integration is being in harmony with one’s self, family, and associates. For the study, 46
Taiwanese buyers and 26 mainland Chinese buyers were contacted and interviewed. Buyers were asked to respond to 35 items using a 9-point scale with possible answers ranging from no importance (1) to
extreme importance (9). The resulting statistics for the two groups are shown in step 5. Using α = .01, test to determine whether there is a significant difference between buyers in Taiwan and buyers in mainland China on integration. Assume that integration scores are normally distributed in the population.
Problem
Confidence Interval
One group of researchers set out to determine whether there is a difference between “average Americans” and those who are “phone survey respondents”. Their study was based on a well-known personality survey that attempted to assess the personality profile of both average Americans and phone survey respondents. Suppose they
sampled nine phone survey respondents and 10 average Americans in this survey and obtained the results on one personality factor, conscientiousness, which are displayed in the table. Assume that
conscientiousness scores are normally distributed in the population.
Problem
Technique to analyse dependent samples or related samples.
Some researchers refer to this test as the matched-pairs test. Others call it the t test for related measures or the correlated t test.
What are some types of situations in which the two samples being studied are related or dependent? Let’s begin with the before-and-after study.
Sometimes as an experimental control mechanism, the same person or object is measured both before and after a treatment.
Certainly, the after measurement is not independent of the before
measurement because the measurements are taken on the same person or object.
STATISTICAL INFERENCES FOR TWO RELATED POPULATIONS
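The matched-pairs test for related samples requires that the two samples be the same size and that the individual related scores be matched. The following formula is used to test hypotheses about dependent populations.
$t = \frac{\bar{d} - D}{s_d / \sqrt{n}}$
$df = n - 1$
where n is the number of pairs, d is the sample difference in a pair, $\bar{d}$ is the mean sample difference, $s_d$ is the standard deviation of the sample differences, and D is the mean population difference.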
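A small Python sketch of the matched-pairs calculation: take the difference within each pair, then divide the mean difference (less the hypothesized mean difference) by the standard error of the differences. The function name `paired_t` and the before/after values are hypothetical and are not the data from any table in these slides.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after, hypothesized_mean_diff=0.0):
    """Matched-pairs t statistic and degrees of freedom."""
    if len(before) != len(after):
        raise ValueError("matched pairs require samples of equal size")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = mean(diffs)   # mean sample difference
    s_d = stdev(diffs)    # sample standard deviation of the differences
    t = (d_bar - hypothesized_mean_diff) / (s_d / sqrt(n))
    return t, n - 1

# Hypothetical measurements on five matched pairs.
t, df = paired_t([10, 12, 9, 14, 11], [8, 11, 10, 11, 9])
print(round(t, 3), df)
```

The length check mirrors the requirement that both samples be the same size with each score matched to its partner.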
Hypothesis Testing
Suppose a stock market investor is interested in determining whether there is a significant difference in the P/E (price to earnings) ratio for companies from one year to the next. In an effort to study this question, the investor
randomly samples nine companies from the Handbook of Common Stocks and records the P/E ratios for each of these companies at the end of year 1 and at the end of year 2. The data are shown in the table.
Problem
Solution
Sometimes a researcher is interested in estimating the mean difference in two populations for related samples. A confidence interval for D, the mean population difference of two related samples, can be constructed by algebraically rearranging the previous formula, which was used to test hypotheses about D. Again, the assumption is that the differences are normally
distributed in the population.
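Rearranging the matched-pairs t formula to isolate D gives an interval of the form mean difference ± t · (standard deviation of the differences / √n), with n − 1 degrees of freedom. A hedged Python sketch follows. The function name `paired_confidence_interval` and the data are hypothetical, and the critical t value is passed in by the caller as it would be read from a t table (2.776 is the table value for a 95% interval with 4 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

def paired_confidence_interval(before, after, t_critical):
    """Confidence interval for D, the mean population difference,
    from matched pairs; t_critical is read from a t table (df = n - 1)."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = mean(diffs)
    margin = t_critical * stdev(diffs) / sqrt(n)
    return d_bar - margin, d_bar + margin

# Hypothetical measurements on five matched pairs; df = 4.
low, high = paired_confidence_interval(
    [10, 12, 9, 14, 11], [8, 11, 10, 11, 9], t_critical=2.776
)
print(round(low, 3), round(high, 3))
```

If the resulting interval contains zero, the data do not rule out a mean population difference of zero at that confidence level.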