
Estimation of recombination rates in a dairy cattle population





A. Hampel, F. Teuscher, D. Wittenburg 01.09.2016


Background: Half-sib family


Background: Population parameters

Population structure influences the population parameters:
• Population LD
• Population recombination rate
• Maternal LD
• Paternal recombination rate


Background: Linkage disequilibrium

$A$, $a$: two alleles at one locus (allele frequencies $f_A$, $f_a$)
$B$, $b$: two alleles at a second locus (allele frequencies $f_B$, $f_b$)

Frequencies of combinations in a population:

        B           b
A       $f_{AB}$    $f_{Ab}$
a       $f_{aB}$    $f_{ab}$


Loci are in linkage equilibrium: $f_{AB} f_{ab} = f_{Ab} f_{aB}$

Loci are in linkage disequilibrium: $f_{AB} f_{ab} \neq f_{Ab} f_{aB}$

$D = f_{AB} f_{ab} - f_{Ab} f_{aB}$, where $D$ is the disequilibrium coefficient
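As a minimal sketch of the definition above, the disequilibrium coefficient can be computed directly from the four haplotype frequencies. The function name `ld_coefficient` and its argument names are illustrative assumptions and do not come from the slides.

```python
def ld_coefficient(f_AB: float, f_Ab: float, f_aB: float, f_ab: float) -> float:
    """Return D = f_AB * f_ab - f_Ab * f_aB for two biallelic loci."""
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    return f_AB * f_ab - f_Ab * f_aB


# Linkage equilibrium: both products of haplotype frequencies are equal.
print(ld_coefficient(0.25, 0.25, 0.25, 0.25))
# Linkage disequilibrium: haplotypes AB and ab are over-represented.
print(ld_coefficient(0.40, 0.10, 0.10, 0.40))
```

A value of zero corresponds to linkage equilibrium; any other value indicates linkage disequilibrium.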


Background: Recombination rate

Paternal diplotype: A B / a b

Haplotype in the daughter generation    Probability
A B                                     $\frac{1}{2}(1 - \theta)$
a b                                     $\frac{1}{2}(1 - \theta)$
A b                                     $\frac{1}{2}\theta$
a B                                     $\frac{1}{2}\theta$

$\theta$: recombination rate
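The transmission probabilities for a sire with diplotype A B / a b can be written as a small lookup, sketched below. The function name `paternal_haplotype_probs` and the restriction of $\theta$ to the interval [0, 0.5] are assumptions made for illustration.

```python
def paternal_haplotype_probs(theta: float) -> dict[str, float]:
    """Probabilities of the four paternal haplotypes for a sire with diplotype AB/ab."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta is expected in [0, 0.5]")
    return {
        "AB": 0.5 * (1.0 - theta),  # non-recombinant
        "ab": 0.5 * (1.0 - theta),  # non-recombinant
        "Ab": 0.5 * theta,          # recombinant
        "aB": 0.5 * theta,          # recombinant
    }


for theta in (0.01, 0.20, 0.50):
    print(theta, paternal_haplotype_probs(theta))
```

At $\theta = 0.5$ all four haplotypes are equally likely, which is the case of unlinked loci.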


Objective: Estimation of LD and recombination rates

EM Method*: maximization approach with high computation time

New Method: minimization approach with less computation time

Both methods were applied to an empirical dataset

Verification of the accuracy was performed in simulated half-sib families

*Gomez-Raya (2012): Maximum likelihood estimation of linkage disequilibrium in half-sib families. Genetics


Parameters (EM Method)

• Maternal haplotype frequencies
• Genotype counts from offspring
• Paternal recombination rate

Solved by applying the EM algorithm


Parameters (New Method)

• Empirical covariance between genotype codes for additive/dominant effects at two SNPs
  Additive, e.g. AA → 1, Aa → 0, aa → -1; dominant, e.g. AA → -1, Aa → 1, aa → -1
• Allele frequency in the maternal population
• LD of dam; LD of sire, with $\theta = \frac{1 - 4 D_{sire}}{2}$
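The sketch below illustrates the two ingredients listed on this slide: the additive and dominant genotype codes with their empirical covariance, and the conversion from sire LD to the recombination rate. Genotypes are represented as the number of copies of the first allele (2, 1, 0), which is an assumption of this example, as are the function names, the divisor n - 1 in the covariance, and the made-up toy genotypes.

```python
# Genotype = number of copies of the first allele (2 = AA, 1 = Aa, 0 = aa).
ADDITIVE = {2: 1, 1: 0, 0: -1}
DOMINANT = {2: -1, 1: 1, 0: -1}


def empirical_cov(x: list[float], y: list[float]) -> float:
    """Sample covariance of two equally long sequences (divisor n - 1, an assumption)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def theta_from_d_sire(d_sire: float) -> float:
    """Paternal recombination rate: theta = (1 - 4 * D_sire) / 2."""
    return (1.0 - 4.0 * d_sire) / 2.0


# Hypothetical genotypes of six half-sibs at two SNPs (toy data, not from the study).
snp1 = [2, 1, 0, 1, 2, 0]
snp2 = [2, 1, 0, 0, 1, 1]

cov_add = empirical_cov([ADDITIVE[g] for g in snp1], [ADDITIVE[g] for g in snp2])
cov_dom = empirical_cov([DOMINANT[g] for g in snp1], [DOMINANT[g] for g in snp2])
print(cov_add, cov_dom)
print(theta_from_d_sire(0.15))
```

The conversion follows from the sire's gamete frequencies $\frac{1}{2}(1-\theta)$ and $\frac{1}{2}\theta$, which give $D_{sire} = \frac{1 - 2\theta}{4}$.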


Empirical data set

Comprised 1295 half-sibs of a dairy cattle population (Fugato-plus “BovIBI” data)

• 40317 SNP genotypes (29 autosomes)

Estimation on BTA1

• Maternal linkage disequilibrium
• Paternal recombination rate


Results: LD on chromosome 1

[Figure: frequency distributions of $\hat{D}_{dam}$ for the EM Method and the New Method]


Results: $\theta$ on chromosome 1

[Figure: frequency distributions of $\hat{\theta}$ for the EM Method and the New Method]


Simulation: Parameters

LD: $D_{sire} = 0.15$, $D_{dam} = 0.05$

Population size: $N \in \{30, 100, 1000\}$

Recombination rate: $\theta \in \{0.01, 0.05, 0.10, 0.20, 0.40, 0.50\}$
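The parameter grid above amounts to 18 combinations of population size and recombination rate. The sketch below enumerates them and, for each, draws the fraction of offspring that received a recombinant paternal haplotype. It is only an illustration under a strong simplifying assumption (paternal haplotypes are observed directly) and is not the simulation design used by the authors; the function name and the seed are hypothetical.

```python
import random
from itertools import product

FAMILY_SIZES = (30, 100, 1000)
THETAS = (0.01, 0.05, 0.10, 0.20, 0.40, 0.50)


def simulate_recombinant_fraction(n_offspring: int, theta: float,
                                  rng: random.Random) -> float:
    """Fraction of offspring that received a recombinant paternal haplotype.

    Simplifying assumption: the paternal haplotype of every offspring is
    observed directly, which is not the case with real genotype data.
    """
    recombinants = sum(rng.random() < theta for _ in range(n_offspring))
    return recombinants / n_offspring


rng = random.Random(2016)  # fixed seed so that a run is reproducible
for n, theta in product(FAMILY_SIZES, THETAS):
    print(n, theta, simulate_recombinant_fraction(n, theta, rng))
```

With N = 30 the observed fraction fluctuates strongly around $\theta$, which gives a feel for why small families are the hard case.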


Selection of results

• Bias of paternal recombination rate
• Mean squared error ($D_{dam}$)
• Computation time
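Bias and mean squared error are computed over simulation replicates in the usual way. A minimal sketch follows; the function names and the four replicate estimates are made up for illustration.

```python
def bias(estimates: list[float], true_value: float) -> float:
    """Mean of the estimates minus the true parameter value."""
    return sum(estimates) / len(estimates) - true_value


def mean_squared_error(estimates: list[float], true_value: float) -> float:
    """Average squared deviation of the estimates from the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


# Hypothetical estimates of theta from four replicates (not results from the study).
theta_hat = [0.18, 0.22, 0.25, 0.19]
print(bias(theta_hat, 0.20))
print(mean_squared_error(theta_hat, 0.20))
```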


Results: Bias of $\theta$

[Figure: distributions of $\hat{\theta}$ for the New Method and the EM Method; panels $\theta = 0.20$, $N = 30$ and $\theta = 0.01$, $N = 30$]


Results: MSE of $D_{dam}$

[Figure: MSE of $\hat{D}_{dam}$ plotted against $\theta$]


Results: Computation time

[Figure: computation time in seconds (axis from 0 to 120) for the FBN-Method and Gomez-Raya at N = 30, N = 100 and N = 1000]


Results: Likelihood function

Two maxima of the likelihood function

Example: $f_1^{dam} = f_2^{dam} = 0.5$
• $D_{sire} = 0.15$, $D_{dam} = 0.05$
• $D_{sire} = 0.05$, $D_{dam} = 0.15$

[Figure: likelihood function over $D_{sire}$ and $D_{dam}$]

Results: Complementary case

              Simulation    Complementary case (recalculation)
$D_{sire}$    0.15          0.05
$D_{dam}$     0.05          0.15
$\theta$      0.20          0.40
$f_{AB}$      0.40          0.30
$f_{aB}$      0.10          0.20
$f_{ab}$      0.40          0.30
$f_{Ab}$      0.10          0.20


Results: Simulation of complementary case

[Figure: distributions of $\hat{\theta}$; panels $\theta = 0.20$, $n = 100$ and $\theta = 0.40$, $n = 100$]


Outlook and summary

Simulation

The New Method had more accurate estimates for higher recombination rates

New Method

Based on a minimization approach


Empirical data set

Both methods yielded a similar distribution of the maternal LD, but the distributions of the recombination rate differed.


Maximization function

Two possible solutions → two maxima

Found in both the New Method and the EM Method

Final solution depends on starting values


Next steps

• Criterion for distinguishing the two maxima (likelihood values)
• Estimation of good starting values (combination of minimization and maximization approach)

Appendix: New Method

Minimization approach

$\widehat{cov}_{add} = D_{sire} + D_{dam}$

$\widehat{cov}_{dom} = 16 D_{sire} D_{dam} + 4 D_{sire} (1 - 2\hat{f}_1^{dam})(1 - 2\hat{f}_1^{dam})$

$g_1 := D_{sire} + D_{dam}$

$g_2 := 16 D_{sire} D_{dam} + 4 D_{sire} (1 - 2\hat{f}_1^{dam})(1 - 2\hat{f}_1^{dam})$

$Q(D_{sire}, D_{dam}) = (\widehat{cov}_{add} - g_1)^2 + (\widehat{cov}_{dom} - g_2)^2$

$\arg\min_{D_{sire}, D_{dam}} Q(D_{sire}, D_{dam})$
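The minimization approach of the New Method can be sketched as a least-squares criterion over $D_{sire}$ and $D_{dam}$. The code below is an illustration only: it uses a coarse grid search instead of whatever optimizer the authors used, and it assumes that the two $(1 - 2\hat{f})$ factors refer to the dam allele frequencies at the two SNPs (named `f1_dam` and `f2_dam` here). All function names are hypothetical.

```python
def g1(d_sire: float, d_dam: float) -> float:
    """Model for the covariance of the additive codes."""
    return d_sire + d_dam


def g2(d_sire: float, d_dam: float, f1_dam: float, f2_dam: float) -> float:
    """Model for the covariance of the dominant codes.

    Assumption: the two (1 - 2f) factors use the dam allele frequencies
    at the two SNPs.
    """
    return 16.0 * d_sire * d_dam + 4.0 * d_sire * (1 - 2 * f1_dam) * (1 - 2 * f2_dam)


def q(d_sire: float, d_dam: float, cov_add: float, cov_dom: float,
      f1_dam: float, f2_dam: float) -> float:
    """Criterion Q: squared distance between empirical and modelled covariances."""
    return ((cov_add - g1(d_sire, d_dam)) ** 2
            + (cov_dom - g2(d_sire, d_dam, f1_dam, f2_dam)) ** 2)


def grid_minima(cov_add: float, cov_dom: float, f1_dam: float, f2_dam: float,
                tol: float = 1e-12) -> list[tuple[float, float]]:
    """All points on a 0.01 grid in [-0.25, 0.25]^2 where Q is (numerically) zero."""
    grid = [i / 100 for i in range(-25, 26)]
    return [(ds, dd) for ds in grid for dd in grid
            if q(ds, dd, cov_add, cov_dom, f1_dam, f2_dam) <= tol]


# Covariances implied by D_sire = 0.15, D_dam = 0.05 with dam allele frequencies 0.5.
cov_add = g1(0.15, 0.05)
cov_dom = g2(0.15, 0.05, 0.5, 0.5)
print(grid_minima(cov_add, cov_dom, 0.5, 0.5))
```

With both dam allele frequencies equal to 0.5, the second term of $g_2$ vanishes and $Q$ becomes symmetric in $D_{sire}$ and $D_{dam}$, so the grid search returns two solutions with the roles of sire and dam exchanged.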


Leibniz-Institut für Nutztierbiologie (FBN)
Wilhelm-Stahl-Allee 2, 18196 Dummerstorf
Contact: Hampel, Alexander
Phone: +49 38208 68 908
E-mail: hampel@fbn-dummerstorf.de
Internet: www.fbn-dummerstorf.de
