Analysis of Genetic Data: Probability and the Chi-Square Test

(1)

(2)


(3)

PROBABILITY AND GENETIC EVENTS

Probability Theory

Probability of occurrence (P) = number of defined outcome(s) / total number of possible outcomes

For example, the probability of getting a head from a toss of a coin is 1/2.

(4)

Basic Terms: Sample Space

In probability theory, the sample space of an experiment or random trial is the set of all possible outcomes. For example:

• If the experiment is tossing a coin, the sample space is the set {head, tail}.

• For tossing a single six-sided die, the sample space is {1, 2, 3, 4, 5, 6}.

(5)

Event

Again in probability, any subset of the sample space is usually called an event.

Ordered event (e.g. tossing 2 coins once and obtaining a head for the first coin and a tail for the second coin).

Unordered event (e.g. tossing 2 coins once and obtaining exactly one tail).
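The difference between an ordered and an unordered event can be checked by listing the sample space. The following is a minimal sketch, assuming a fair coin; the helper name `probability` is illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space for tossing two coins once: every (first, second) pair.
sample_space = list(product(["head", "tail"], repeat=2))

def probability(event):
    """P(event) = outcomes in the event / total number of possible outcomes."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

# Ordered event: head on the first coin AND tail on the second coin.
p_ordered = probability(lambda o: o == ("head", "tail"))

# Unordered event: exactly one tail, on either coin.
p_unordered = probability(lambda o: o.count("tail") == 1)

print(p_ordered)    # 1/4
print(p_unordered)  # 1/2
```

The unordered event is twice as likely because two ordered outcomes (head-tail and tail-head) both satisfy it.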

(6)

Probability of Multiple Events

The rule of Independent events:

This states that the occurrence of past events has no influence on that of future events.

The Product rule:

This rule states that the probability of independent events occurring together is equal to the product of their individual probabilities.

E.g., if p(A) = 0.7, then p(AA) = 0.7 × 0.7 = 0.49
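As a rough illustration of the product rule, the sketch below multiplies the individual probabilities of independent events. The function name `product_rule` and the coin example are assumptions added for illustration.

```python
from fractions import Fraction
from math import prod

def product_rule(probabilities):
    """Probability that all of the given independent events occur together."""
    return prod(probabilities)

# The example above: p(A) = 0.7, so p(AA) = 0.7 x 0.7.
p_aa = product_rule([0.7, 0.7])
print(round(p_aa, 2))  # 0.49

# Assumed illustration: three independent tosses of a fair coin all give heads.
p_three_heads = product_rule([Fraction(1, 2)] * 3)
print(p_three_heads)  # 1/8
```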

(7)

Questions

What is the probability of a couple having 5 boys in a row?

What is the probability of tossing a coin twice and getting one head and one tail?

(8)

Probability of Multiple Events cont’d.

Sum Rule

It states that the probability of either of 2 or more mutually exclusive events occurring is equal to the sum of their individual probabilities.

Example:

What is the probability of a couple having (i) a boy and a girl? (ii) either a boy or a girl?
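One way to work through this example is sketched below. It assumes P(boy) = P(girl) = 1/2, reads (i) as two children born in either order, and reads (ii) as a single birth; these readings are assumptions, not stated in the slide.

```python
from fractions import Fraction

half = Fraction(1, 2)  # assumed P(boy) = P(girl) = 1/2

# (i) One boy and one girl among two children, in either birth order.
# Each order is a product of independent births (product rule); the two
# orders are mutually exclusive, so their probabilities add (sum rule).
p_boy_then_girl = half * half
p_girl_then_boy = half * half
p_boy_and_girl = p_boy_then_girl + p_girl_then_boy

# (ii) A single child being either a boy or a girl.
p_boy_or_girl = half + half

print(p_boy_and_girl)  # 1/2
print(p_boy_or_girl)   # 1
```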

(9)

Probability of Multiple Events cont’d.

Binomial expansion/distribution

In probability theory, events E₁, E₂, ..., E_n are said to be mutually exclusive if the occurrence of any one of them automatically implies the non-occurrence of the remaining n − 1 events. Therefore, two mutually exclusive events cannot both occur.

The probability of occurrence of some arrangement of two mutually exclusive outcomes over a series of trials, where the final order is not specified, is defined by the binomial theorem:

P = (n!/s!t!)(p^s q^t)

(10)

P = (n!/s!t!)(p^s q^t)

where:

n = number of trials (n = s + t)

p = probability of an event occurring on any given trial

q = probability of the event not occurring (q = 1 − p)

s = number of times an event occurs

t = number of times an opposite event occurs

(11)

Example:

A would-be couple plans to have five children when they marry. Determine the probability of the couple having 3 girls and 2 boys.

Solution:

n = 5, s = 3, t = 2, p = 1/2 and q = 1/2

P = (5!/3!2!)(1/2)³(1/2)² = 10(1/2)³(1/2)² = 10/32 = 5/16
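The same calculation can be sketched in code. The function name `binomial_probability` is an illustrative choice; it takes n = s + t and q = 1 − p, as in the definitions of the binomial formula.

```python
from fractions import Fraction
from math import factorial

def binomial_probability(s, t, p):
    """P = (n!/s!t!)(p^s q^t), with n = s + t and q = 1 - p."""
    n = s + t
    q = 1 - p
    # Number of different orders in which s and t outcomes can be arranged.
    arrangements = factorial(n) // (factorial(s) * factorial(t))
    return arrangements * p**s * q**t

# Three girls and two boys among five children, with p = q = 1/2.
p_3_girls_2_boys = binomial_probability(s=3, t=2, p=Fraction(1, 2))
print(p_3_girls_2_boys)  # 5/16
```

`Fraction` keeps the result exact and reduces 10/32 to 5/16 automatically.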

(12)

Question

If four babies are born at a given hospital on the same day:

(a) What is the chance that two will be boys and two girls?

(b) What is the chance that all four will be girls?

(13)

Questions

A man and his wife who are both heterozygous for albinism plan to have four children. Use the information to answer the following questions.

(a) What is the probability that any given child will be normal?

(b) What is the probability that all of them would be normal?

(c) What is the probability that all of them are normal except the 2nd child?

(d) What is the probability of having an albino child among the four children?

(14)

Evaluating Genetic Data: Chi-Square Analysis

The Chi-Square (χ²) Test

Mendel’s 3:1 monohybrid and 9:3:3:1 dihybrid ratios are hypothetical predictions based on the following assumptions:

(15)

1. Dominance/recessiveness

2. Segregation

3. Independent assortment

4. Random fertilization

Of these factors, segregation, independent assortment and random fertilization can be affected by chance and are thus subject to normal deviation (chance deviation).

(16)

Evaluating Genetic Data, cont’d.

Thus, chance deviations in any one of the above can alter observed Mendelian ratios.

Chance deviation is affected by sample size. The larger the sample, the smaller the relative impact of chance deviation; the smaller the sample, the larger its impact.

Chi-Square (χ²) Distribution

It allows one to determine whether or not a deviation from expected Mendelian ratio can be attributed solely to chance.

(17)

Chi-Square (χ²) Distribution, cont’d

It also compares the observed distribution with the expected distribution (based on a genetic hypothesis) and assesses mathematically whether the deviation between them, summarized by the calculated χ² value, is due to chance or reflects a real difference between the two distributions.

It is dependent upon the sample size.

(18)

CALCULATION OF CHI-SQUARE STATISTIC

χ² = Σ (O − E)² / E

where “O” is the observed value for a given category and “E” is the expected value for that category.

Since (O − E) is the deviation (d) in each case, the equation can be reduced to:

χ² = Σ d²/E

(19)

Problem Solving

Christabel, a would-be food technologist and genetics student, decided to test the 3:1 Mendelian ratio. She obtained 1000 seeds in the following proportions:

Tall: 740

Dwarf: 260

Calculate the p-value and infer whether her results closely fit the 3:1 ratio.
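A possible way to check this problem numerically is sketched below. The function names are illustrative. The p-value helper uses the closed form for one degree of freedom only (the upper tail of χ² with 1 d/f equals erfc(√(χ²/2))), so it should not be reused for the 9:3:3:1 case.

```python
from math import erfc, sqrt

def chi_square(observed, ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(chi2):
    """Upper-tail probability for chi-square with 1 degree of freedom only."""
    return erfc(sqrt(chi2 / 2))

# Christabel's data: 740 tall and 260 dwarf, tested against 3:1.
chi2 = chi_square([740, 260], [3, 1])
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, p = {p:.2f}")
```

With expected values of 750 and 250, the statistic comes to about 0.53, and the resulting p value is well above 0.05.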

(20)

Step-by-step procedure for the χ² calculation for the F₂ results of hypothetical monohybrid and dihybrid crosses

(21)

The final step is the interpretation of the χ² value.

First, we determine the value of the degrees of freedom (d/f), which is equal to n − 1, where n is the number of different categories into which each datum point may fall.

For the 3:1 ratio, n = 2, so d/f = 2 − 1 = 1.

The d/f for the 9:3:3:1 ratio is 3.

(22)

D/f must always be taken into account because the greater the number of categories, the more deviation is expected due to chance.

The next step is to convert the χ² value to the corresponding probability value (p), using a prepared chart or graph.

(23)

INTERPRETATION OF THE VALUE

Compare your calculated χ² value with the χ² value in the table at 5%.

If your calculated χ² is smaller than the χ² from the table at 5% (i.e. p > 0.05), then the difference can be attributed solely to CHANCE, and the observed numbers therefore fit the particular ratio. If it is larger (i.e. p < 0.05), the difference is significant and the observed numbers do not fit the ratio.
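The table comparison can be sketched as a simple lookup. The critical values below are the standard 5% entries of a χ² table for 1 to 3 degrees of freedom; the function name `fits_ratio` and the two sample χ² values are hypothetical.

```python
# Chi-square critical values at p = 0.05, keyed by degrees of freedom.
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815}

def fits_ratio(chi2, categories):
    """True when the deviation can be attributed to chance at the 5% level."""
    df = categories - 1  # degrees of freedom = n - 1
    return chi2 < CRITICAL_5_PERCENT[df]

# Hypothetical chi-square values, for illustration only.
print(fits_ratio(0.53, categories=2))  # True
print(fits_ratio(9.00, categories=4))  # False
```

A calculated χ² below the critical value corresponds to p > 0.05, so the hypothesized ratio is not rejected.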


(24)

In the F₂ generation of a certain tomato experiment, Michael decided to test the 9:3:3:1 Mendelian ratio and obtained 1000 seeds in the following proportions:

583 round yellow

195 round green

166 wrinkled yellow

56 wrinkled green

Are the discrepancies between the observed and expected ratios acceptable?
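The step-by-step χ² calculation for these data might look like the sketch below. The variable names are illustrative, and 7.815 is the standard table value at 5% for 3 degrees of freedom.

```python
observed = {"round yellow": 583, "round green": 195,
            "wrinkled yellow": 166, "wrinkled green": 56}
ratio = [9, 3, 3, 1]

total = sum(observed.values())
chi2 = 0.0
for (phenotype, o), part in zip(observed.items(), ratio):
    e = total * part / sum(ratio)  # expected count for this category
    d = o - e                      # deviation (O - E)
    chi2 += d ** 2 / e             # running sum of d^2 / E
    print(f"{phenotype:16s} O={o:4d}  E={e:6.1f}  d={d:6.1f}  d2/E={d**2 / e:.3f}")

df = len(observed) - 1             # 4 categories, so 3 degrees of freedom
CRITICAL_5_PERCENT_DF3 = 7.815     # chi-square table value at p = 0.05, 3 d/f

print(f"chi-square = {chi2:.2f}, d/f = {df}")  # chi-square = 4.19, d/f = 3
print(chi2 < CRITICAL_5_PERCENT_DF3)           # True
```

Because the calculated χ² is below the critical value, the discrepancies are within what chance alone would be expected to produce.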

(25)

(26)

(27)

Graph of Chi-Square

(28)

Using the dihybrid cross above, where p = 0.26, as an example:

The first interpretation is that the probability is 26%, or about 1 in 4, that a deviation this large or larger would arise by chance alone.

The second interpretation is that, were the same experiment repeated many times, 26% of the trials would be expected to exhibit chance deviation as great as or greater than that observed.

(29)

Interpretation

Is 0.26 an acceptable or unacceptable p value?

The decision is relative and depends on the level of certainty the investigator requires.

By convention, 0.05 (5%) has been chosen as an arbitrary standard.

(30)

All p values between 0.05 and 1.0 are considered acceptable in chi-square analysis.

All values below 0.05 are unacceptable with respect to goodness of fit.

0.26 is well above 0.05 (5%) and is therefore acceptable.

(31)

In other words, our data are consistent with the hypothesis of a 9:3:3:1 ratio of phenotypes, which is indicative of a two-locus genetic model with dominance at each locus.

(32)

Were the p value below this standard, we would have rejected the hypothesis for the experiment. The data would then be interpreted as unacceptable in fitting a 9:3:3:1 ratio.

(33)

NOTE:

When the χ² test shows that there is no significant difference between the observed and expected samples, then “we fail to reject the hypothesis” (i.e. we accept it).

If there is a significant difference between them, “we reject the hypothesis”.

(34)

Homework

A heterozygous genetic condition called “creeper” in chickens produces shortened and deformed legs and wings, giving the bird a squatty appearance. Matings between creepers produced 775 creeper : 388 normal progeny.

(a) Is the hypothesis of a 3:1 ratio acceptable?

(b) Does a 2:1 ratio fit the data better?

(35)

CELLULAR BASIS OF INHERITANCE.

MITOSIS AND MEIOSIS
