(1)

Hierarchical Bayesian Modeling of the HIV Response to Therapy

Shane T. Jensen

Department of Statistics, The Wharton School, University of Pennsylvania

March 23, 2010

(2)

Therapy: Disrupting the HIV infection cycle

Drugs are a popular medical strategy for keeping viral load down by disrupting the infection cycle of HIV

Drug therapies: drugs designed to bind to the surface of HIV and prevent it from attaching to target cells

We will model a promising new type of treatment

Antisense gene therapy: allows HIV to bind to the target cell and release viral RNA, but then attacks the viral RNA directly before it can be integrated into the target cell genome

HIV will try to evolve (change either protein or RNA) to escape this therapy

(3)

Drug Therapy versus Gene Therapy Illustration

[Figure: two panels, "Gene Therapy" and "Drug Therapy". Labels: HIV virion, HIV drug therapy, nucleus, cytosol, HIV viral RNA, HIV gene therapy.]

(4)

Mutation and Recombination

Primary mechanisms for evolution:

Mutation: change of identity of a single RNA nucleotide.

Can also delete nucleotides

Recombination: two viral RNA sequences are spliced to produce a hybrid sequence.

[Figure: example sequences illustrating mutation and recombination.]

HIV has one of the highest rates of mutation/recombination of any organism ever seen
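The two mechanisms can be illustrated with a toy sketch on short strings. The function names, the alphabet constant, and the example sequences are hypothetical and only serve to show what a point mutation and a single-breakpoint recombination do to a sequence.

```python
import random

NUCLEOTIDES = "ACGU"  # RNA alphabet (assumed for this illustration)


def mutate(seq, position, rng):
    """Point mutation: replace the nucleotide at `position` with a different one."""
    alternatives = [n for n in NUCLEOTIDES if n != seq[position]]
    return seq[:position] + rng.choice(alternatives) + seq[position + 1:]


def recombine(seq_a, seq_b, breakpoint):
    """Recombination: splice the start of seq_a onto the end of seq_b."""
    return seq_a[:breakpoint] + seq_b[breakpoint:]


rng = random.Random(0)
parent_a = "AAAAAAAA"
parent_b = "CCCCCCCC"
mutant = mutate(parent_a, 3, rng)            # differs from parent_a at index 3 only
hybrid = recombine(parent_a, parent_b, 5)    # "AAAAACCC"
```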

(5)

Population Genetics

Due to high mutation and recombination rates, individuals infected with HIV can have several distinct HIV strains

Need to model HIV as a population of sequences, not a specific sequence, that evolves in response to therapy

(6)

Issues with Evolutionary Response

Goal is to model the evolutionary response (through mutation and recombination) of HIV to therapy

Three crucial components of problem must be addressed:

Mutation vs. recombination

Rates for both processes of sequence change must be modeled simultaneously

Spatial heterogeneity

Therapies target specific regions of HIV genome and so evolution could also be in specific locations

Two sample comparison

Real interest is differences in mutation and recombination rates between treatment and control sequences

(7)

Overview of our Approach

The coalescent with recombination: a population genetics model we build upon to model mutation and recombination rates for a population of sequences

We expand previous coalescent-based approaches to allow changes at nucleotide level (instead of protein level)

Blocking structure for mutation and recombination rates

Allows for spatial heterogeneity while still sharing information between neighboring sequence regions

Hierarchical prior distribution handles two-sample structure

Allows for differential treatment effect while pooling information between treatment and control sequences

(8)

Notation

Data are aligned nucleotide sequences $H = (H^C, H^T)$

$H^C = (h^C_1, \ldots, h^C_n)$ are control sequences of length $L$

$H^T = (h^T_1, \ldots, h^T_m)$ are treatment sequences of length $L$

Parameters of interest are $\Theta = (\rho^C, \rho^T, \mu^C, \mu^T)$

$\rho$ are recombination rates (treatment and control)

$\mu$ are mutation rates (treatment and control)

All rates also vary spatially along the length of the sequences

Ancestry $G$ relates all sequences to each other

(9)

Coalescent model for sequence history G

Coalescent: sequences coalesce into common lineages back to their most recent common ancestor

[Figure: coalescent tree for five sequences (1 to 5), traced back from generation t (current) through generations t-1, t-2 and t-3.]

Mutation rates $\mu$ are easy to build into the coalescent model $G$

(10)

Estimation with Coalescent

Sequence ancestry $G$ is not of direct interest: the goal is the mutation and recombination rates $\Theta$

Maximum likelihood estimation

$$\sup_{\Theta, G} \; p(H \mid \Theta, G)$$

or integration over all possible ancestries $G$

$$p(H \mid \Theta) = \sum_{G} p(H \mid \Theta, G)\, p(G)$$

are both very difficult tasks given the large space of $G$

Even for a relatively small number of sequences, such as 100, the space of possible $G$ is huge.
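To get a feel for the size of this space, the sketch below counts only the tree-shaped ancestries in which lineages merge one pair at a time (often called labeled histories), ignoring recombination entirely. With $k$ lineages there are $\binom{k}{2}$ ways to pick the next pair to coalesce, so the count for $n$ sequences is $\prod_{k=2}^{n} \binom{k}{2}$. Allowing recombination makes the space larger still. The function name is hypothetical.

```python
import math


def labeled_histories(n):
    """Number of ordered pairwise-coalescence histories for n labeled sequences.

    Counts tree-shaped ancestries only (no recombination events).
    """
    return math.prod(math.comb(k, 2) for k in range(2, n + 1))


small = labeled_histories(4)                  # 6 * 3 * 1 = 18
digits_for_100 = len(str(labeled_histories(100)))  # number of decimal digits for n = 100
```

Summing $p(H \mid \Theta, G)\, p(G)$ term by term over a space of this size is not feasible, which motivates the approximation that follows.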

(11)

Product of Approximate Conditionals

Marginal likelihood $p(H \mid \Theta)$ is intractable over all $G$

PAC (product of approximate conditionals) likelihood

$$p(H \mid \Theta) = p(h_1 \mid \Theta)\, p(h_2 \mid h_1, \Theta)\, p(h_3 \mid h_2, h_1, \Theta) \cdots$$

$$\approx \hat{p}(h_1 \mid \Theta)\, \hat{p}(h_2 \mid h_1, \Theta)\, \hat{p}(h_3 \mid h_2, h_1, \Theta) \cdots$$

Approximate sequence $h_{k+1}$ as a mosaic of sequences $(h_1, \ldots, h_k)$ generated by a hidden Markov model

$\hat{p}(h_{k+1} \mid h_{1:k}, \Theta)$ is calculated using the forward summing algorithm for HMMs. It depends on the ordering of the sequences, so the calculation is averaged over several different orderings
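A minimal sketch of one approximate conditional is given below. The slides do not specify the transition or emission probabilities, so the forms used here are assumptions: the hidden state is which earlier sequence is being copied, the copy source switches with probability `switch_prob` per site (standing in for recombination), and the copied nucleotide is miscopied with probability `mismatch_prob` (standing in for mutation). All names and the uniform model for the first sequence are hypothetical.

```python
import math


def pac_conditional(new_seq, ref_seqs, switch_prob, mismatch_prob):
    """Forward algorithm: probability of new_seq as a mosaic of ref_seqs."""
    k = len(ref_seqs)

    def emit(j, pos):
        # Copy faithfully, or miscopy to one of the 3 other nucleotides.
        if ref_seqs[j][pos] == new_seq[pos]:
            return 1.0 - mismatch_prob
        return mismatch_prob / 3.0

    # Start by copying from a uniformly chosen earlier sequence.
    forward = [emit(j, 0) / k for j in range(k)]
    for pos in range(1, len(new_seq)):
        total = sum(forward)
        forward = [
            emit(j, pos) * ((1.0 - switch_prob) * forward[j] + switch_prob * total / k)
            for j in range(k)
        ]
    return sum(forward)


def pac_log_likelihood(seqs, switch_prob, mismatch_prob):
    """Log of the product of approximate conditionals for one fixed ordering."""
    loglik = -len(seqs[0]) * math.log(4.0)  # first sequence: uniform over nucleotides (assumed)
    for k in range(1, len(seqs)):
        loglik += math.log(pac_conditional(seqs[k], seqs[:k], switch_prob, mismatch_prob))
    return loglik


refs = ["ACGT", "ACGA"]
close = pac_conditional("ACGT", refs, switch_prob=0.1, mismatch_prob=0.01)
far = pac_conditional("TTTT", refs, switch_prob=0.1, mismatch_prob=0.01)
```

Because the result depends on the order of `seqs`, it would be recomputed for several shuffled orderings and averaged, as the slide describes.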

(12)

Structure on Mutation and Recombination

PAC likelihood allows us to more easily integrate out ancestry $G$ so we can focus on modeling of parameters $\Theta$

$\Theta$ includes mutation rates $\mu$ and recombination rates $\rho$

Now need additional structure in our model to address:

Spatial heterogeneity: different rates for mutation and recombination in different sequence regions

Two-sample comparison: want to estimate differential rates between treatment vs. control populations

Should allow us to estimate differential evolution response to therapy

(13)

Hierarchical Blocking Structure

Hierarchical prior on mutation $\mu$ and recombination $\rho$

Rates vary along the sequence in a piece-wise constant way, e.g. $B_\mu$ contiguous blocks $(\mu_1, \ldots, \mu_{B_\mu})$ of mutation rates
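A piece-wise constant rate can be represented by a sorted list of block start positions plus one rate per block. The sketch below is only an illustration of that data structure; the function name, the boundary positions, and the rate values are made up.

```python
from bisect import bisect_right


def rate_at(position, block_starts, block_rates):
    """Return the rate of the block containing `position`.

    block_starts holds the start position of every block after the first,
    in increasing order, so len(block_rates) == len(block_starts) + 1.
    """
    return block_rates[bisect_right(block_starts, position)]


# Three blocks covering positions 0-99, 100-249 and 250 onward (made-up values).
block_starts = [100, 250]
mutation_rates = [0.01, 0.05, 0.02]
profile = [rate_at(pos, block_starts, mutation_rates) for pos in (0, 99, 100, 249, 250)]
```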

(14)

Blocking Structure Example

Grand central mutation and recombination rates (gray)

Central mutation/recombination rate for each block (blue)

Treatment and control rates around central rate (black)
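The three levels can be sketched as a generative draw. The slides do not state which distributions are used, so this sketch assumes normal distributions on a log-rate scale: block central rates are drawn around the grand central rate, and treatment and control rates are drawn around their block's central rate. The function name, parameter names, and numeric values are hypothetical.

```python
import random


def sample_block_rates(num_blocks, grand_log_rate, sd_central, sd_group, rng):
    """Draw log-rates from an assumed three-level normal hierarchy."""
    blocks = []
    for _ in range(num_blocks):
        central = rng.gauss(grand_log_rate, sd_central)   # block-level central rate
        treatment = rng.gauss(central, sd_group)          # treatment rate around central
        control = rng.gauss(central, sd_group)            # control rate around central
        blocks.append({"central": central, "treatment": treatment, "control": control})
    return blocks


rng = random.Random(1)
draws = sample_block_rates(num_blocks=4, grand_log_rate=-4.0,
                           sd_central=0.5, sd_group=0.2, rng=rng)
# Treatment effect per block on the log scale.
effects = [b["treatment"] - b["control"] for b in draws]
```

Under this kind of structure, a small `sd_group` pulls treatment and control rates toward each other, and a small `sd_central` pulls all blocks toward the grand central rate, which is how information is pooled.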

(15)

Model Implementation

PAC likelihood $\hat{p}(H \mid \Theta)$ gives sequences $H$ as a function of mutation and recombination rates $\Theta$

Hierarchical prior distribution $p(\Theta)$ for spatial heterogeneity and two-sample comparison

Focus on posterior distribution for inference:

$$p(\Theta \mid H) \propto \hat{p}(H \mid \Theta)\, p(\Theta)$$

MCMC implementation: Gibbs and Metropolis-Hastings moves for most parameters, as well as reversible jump moves for the blocking structure
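Since the posterior is only known up to a constant, a Metropolis-Hastings move needs nothing more than the unnormalized log posterior, $\log \hat{p}(H \mid \Theta) + \log p(\Theta)$. The sketch below is a generic random-walk sampler for a single scalar parameter, not the sampler used in this work; the function name and the standard-normal stand-in target are assumptions for illustration.

```python
import math
import random


def metropolis_hastings(log_posterior, start, step_sd, num_samples, rng):
    """Random-walk Metropolis-Hastings for one scalar parameter."""
    current = start
    current_lp = log_posterior(current)
    samples = []
    for _ in range(num_samples):
        proposal = current + rng.gauss(0.0, step_sd)   # symmetric proposal
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


def toy_log_posterior(theta):
    # Stand-in target: standard normal, up to an additive constant.
    return -0.5 * theta * theta


chain = metropolis_hastings(toy_log_posterior, start=3.0, step_sd=1.0,
                            num_samples=20000, rng=random.Random(0))
posterior_mean = sum(chain) / len(chain)
```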

(16)

MCMC moves

1 Reversible jump moves for blocking structure:

    1 Choose block uniformly to split or merge with a neighbor

    2 Move block boundary to the left or right

2 Gibbs moves for rate parameters

    1 Sample treatment vs. control rates $(\mu^T_j, \mu^C_j)$ for each block

    2 Sample central mutation rate $\mu_j$ for each block

    3 Sample grand central mutation rate $\mu_0$

    4 Sample variance of treatment and control mutation rates $\sigma^2_\mu$

    5 Sample variance of central mutation rates $\sigma^2_{\mu_0}$

3 Same set of blocking and rate moves for recombination

4 MH move for transition/transversion ratio $\kappa$
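As an illustration of what one Gibbs move in step 2 could look like, the sketch below works out the draw for a block's central rate $\mu_j$. The slides do not give the distributional form, so this assumes a normal hierarchy (possibly on a log scale): $\mu^T_j, \mu^C_j \sim N(\mu_j, \sigma^2_\mu)$ and $\mu_j \sim N(\mu_0, \sigma^2_{\mu_0})$. Under that assumption the full conditional of $\mu_j$ is normal with precision $2/\sigma^2_\mu + 1/\sigma^2_{\mu_0}$. The function names are hypothetical.

```python
import random


def central_rate_conditional(mu_t, mu_c, mu_0, var_group, var_central):
    """Mean and variance of mu_j given its neighbors in an assumed normal hierarchy."""
    precision = 2.0 / var_group + 1.0 / var_central
    mean = ((mu_t + mu_c) / var_group + mu_0 / var_central) / precision
    return mean, 1.0 / precision


def gibbs_draw_central_rate(mu_t, mu_c, mu_0, var_group, var_central, rng):
    """One Gibbs draw of the block central rate from its full conditional."""
    mean, variance = central_rate_conditional(mu_t, mu_c, mu_0, var_group, var_central)
    return rng.gauss(mean, variance ** 0.5)


cond_mean, cond_var = central_rate_conditional(
    mu_t=1.0, mu_c=3.0, mu_0=2.0, var_group=1.0, var_central=1.0)
draw = gibbs_draw_central_rate(1.0, 3.0, 2.0, 1.0, 1.0, random.Random(0))
```

The conditional mean is a precision-weighted average of the treatment rate, the control rate, and the grand central rate, which makes the pooling between levels explicit.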

(17)

Application to Antisense Gene Therapy

VIRxSYS gene therapy: data generated in vitro from a sample of wt-HIV exposed to VIRxSYS gene therapy and from a control sample

Focus on treatment effects: differential mutation rate $\mu^T - \mu^C$ and recombination rate $\rho^T - \rho^C$ for each location along the sequence

(18)

Mutation Treatment Effect of Antisense Gene Therapy

The large increase in mutation overlaps with the antisense target region.

Increases in mutation to the left of antisense target region are consistent with other gene therapy studies.

(19)

Recombination Treatment Effect of Antisense Gene Therapy

The area of decreased recombination corresponds to the area of increased mutation, but it is not significant.

Wide posterior intervals in part because recombination does not seem to have a strong spatial signal

(20)

Simpler Approaches for Mutation

Simplest approach would just be to examine mutation directly through segregating sites

Segregating sites are nucleotide locations where at least one sequence in the sample differs from the others.

[Figure: alignment of five example nucleotide sequences with the segregating sites highlighted.]
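The definition of a segregating site translates directly into a column-wise check on an alignment. The sketch below uses a made-up three-sequence alignment; the function name is hypothetical.

```python
def segregating_sites(alignment):
    """Return 0-based positions where the aligned sequences are not all identical."""
    return [
        pos
        for pos, column in enumerate(zip(*alignment))
        if len(set(column)) > 1
    ]


# Made-up aligned sequences of equal length.
toy_alignment = [
    "GATTACA",
    "GACTACA",
    "GATTAGA",
]
sites = segregating_sites(toy_alignment)  # [2, 5]
```

Each site is treated independently here, so a count like this shares no information between neighboring positions.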

(21)

Comparison to Segregating Sites

Compare segregating sites (blue = treatment, red = control) to posterior differential mutation rate $\mu^T - \mu^C$

Higher density of segregating sites around elevated mutation area, but our model allows sharing of information between closely located sites

(22)

Summary

Our model allows us to measure viral evolutionary change through spatially-varying recombination and mutation at the nucleotide level.

Our model measures pairwise differences in mutation and recombination between treatment and control groups, allowing estimation of spatially varying treatment effects.

Our methodology is able to detect biologically relevant signal in two HIV applications:

Identified drug-resistant mutations in Enfuvirtide drug therapy

Detected elevated mutation rates that overlap with the antisense target in VIRxSYS gene therapy
