
Splicing out introns, part one

In document python for biologists (Page 56-61)

'n this eer!ise, ere being as-e% to pro%&!e a program that %oes the 9ob of a spi!eosome 5 spits a D;4 se<&en!e at to spe!ifie% o!ations to ma-e three pie!es, then 9oin the o&ter to pie!es together1.

+ets start by spitting the se<&en!e &p into three bits. e have to &se the

s&bstring notation from earier in the !hapter, an% e nee% to ta-e !are ith the n&mbers. e -no that if e give a stop position for a s&bstring then it i go on to the en% of the inp&t string, so rather than fig&re o&t the position of the en% of the se<&en!e, e 9&st be ay an% &se a big n&mber. 7eres the !o%e the first ine,  here e store the D;4 se<&en!e in the variabe my_dna, is too ong to fit on one

ine on the page, so it oo-s i-e its sprea% o&t over m&tipe ines=:

BI Chapter 2: #rinting an% manip&ating tet my/dna B *A)&%A)&%A)&%A)&%A&)%A&)A%)&A)A%&)A)%&A)%)A%&)A&)&%A)&%A)&%A)&%A)&%A)&%A)& %A)&%A)&%A)&A)%&)A)&A)&%A)&%A)A)&%A)%&A)&%A&)A&)A)* exon1 B my/dna$1!"3 exon2 B my/dna$J1!10000 print(exon1 C exon2#

The output from this code looks vaguely right:

TCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCATCGATCGATATCGATGCATCGACTACTAT

b&t hen e oo- more !osey e !an see that something is not right. $he printe% !o%ing se<&en!e is s&ppose% to start at the very first !hara!ter of the inp&t

se<&en!e, b&t its starting at the se!on%. e have forgotten to ta-e into a!!o&nt the fa!t that #ython starts !o&nting from ero, so o&r n&mbers are a too high by one. +ets try again:

my/dna B *A)&%A)&%A)&%A)&%A&)%A&)A%)&A)A%&)A)%&A)%)A%&)A&)&%A)&%A)&%A)&%A)&%A)&%A)& %A)&%A)&%A)&A)%&)A)&A)&%A)&%A)A)&%A)%&A)&%A&)A&)A)* exon1 B my/dna$0!"2 exon2 B my/dna$J0!10000 print(exon1 C exon2#

;o the o&tp&t oo-s !orre!t 5 the !o%ing se<&en!e starts at the very beginning of the inp&t se<&en!e:

A)&%A)&%A)&%A)&%A&)%A&)A%)&A)A%&)A)%&A)%)A%&)A&)&%A)&%A)&%A)&%A)&A)&%A)&%A )A)&%A)%&A)&%A&)A&)A)
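The two slicing behaviours this solution relies on (counting from zero, and a stop position past the end of the string) can be checked on a short string where the positions are easy to count by hand. This is only an illustrative sketch: the string seq and the variable names are made up for the example and are not part of the exercise.

```python
# Short made-up string so the positions are easy to count by hand.
seq = "ABCDEFGHIJ"

# Python counts from zero, so position 0 is the first character.
first_three = seq[0:3]        # "ABC"
skips_first = seq[1:3]        # "BC": starting at 1 loses the first character

# A stop position past the end of the string is not an error;
# the slice simply runs to the end of the string.
tail_big_stop = seq[7:10000]  # "HIJ"

# Leaving the stop position out has the same effect.
tail_open = seq[7:]           # "HIJ"

print(first_three, skips_first, tail_big_stop, tail_open)
```

Writing my_dna[90:] instead of my_dna[90:10000] would therefore give the same second exon without relying on a number that is merely "big enough".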


plicing out introns) part to

$his is a straightforar% pie!e of n&mber!r&n!hing. $here are a !o&pe of ays to go abo&t it. e !o&% &se the eon startstop !oor%inates to !a!&ate the ength of the !o%ing portion of the se<&en!e. 7oever, sin!e eve area%y ritten the !o%e to generate the !o%ing se<&en!e, e !an simpy !a!&ate the ength of it, an% then %ivi%e by the ength of the inp&t se<&en!e:

my_dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
exon1 = my_dna[0:62]
exon2 = my_dna[90:10000]
coding_length = len(exon1 + exon2)
total_length = len(my_dna)
print(coding_length / total_length)

$he o&tp&t shos that ere neary right:

0.772377237723G

e have !a!&ate% the !o%ing proportion as a fra!tion, b&t the eer!ise !ae% for a per!entage. e !an easiy fi this by m&tipying by 100. ;oti!e that the symbo for m&tipi!ation is not x, as yo& might thin-, b&t 4. $he fina !o%e:

my_dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
exon1 = my_dna[0:62]
exon2 = my_dna[90:10000]
coding_length = len(exon1 + exon2)
total_length = len(my_dna)
print(100 * coding_length / total_length)


77.2377237723G

atho&gh e probaby %ont reay re<&ire that n&mber of signifi!ant fig&res. 'n !hapter  e i earn ho to format the o&tp&t ni!ey.

Splicing out introns, part three

This sounds quite tricky, but we have already done the hard bit in part one. All we need to do is extract the intron sequence as well as the exons, convert it to lower case, then concatenate the three sequences to recreate the original genomic sequence:

my_dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
exon1 = my_dna[0:62]
intron = my_dna[62:90]
exon2 = my_dna[90:10000]
print(exon1 + intron.lower() + exon2)

Looking at the output, we see an upper case DNA sequence with a lower case section in the middle, as expected:

ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGatcgatcgatcgatcgatcgatcatgctATCATCGATCGATATCGATGCATCGACTACTAT
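Rather than checking the output by eye, one possible sanity check is to confirm that the three pieces account for every character and that converting the result back to upper case gives the original sequence. This is a sketch only; the variable name marked is made up for the example.

```python
my_dna = "ATCGATCGATCGATCGACTGACTAGTCATAGCTATGCATGTAGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTATCATCGATCGATATCGATGCATCGACTACTAT"
exon1 = my_dna[0:62]
intron = my_dna[62:90]
exon2 = my_dna[90:10000]

# Genomic sequence with the intron marked in lower case.
marked = exon1 + intron.lower() + exon2

# The three pieces together should cover the whole input sequence.
print(len(exon1) + len(intron) + len(exon2) == len(my_dna))

# Upper-casing the marked sequence should give back the original.
print(marked.upper() == my_dna)
```

If either line prints False, the slice coordinates overlap or leave a gap.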

hen e are appying severa transformations to tet, as in this eer!ise, there are &s&ay a n&mber of %ifferent ays e !an rite the program. For eampe, e

!o&% store the oer !ase version of the intron, rather than !onverting it to oer !ase hen printing:


?r e !o&% avoi% &sing variabes for the introns an% eons a together, an% %o everything in one big print statement:

print(my/dna$0!"2 C my/dna$"2!J0.loer(# C my/ dna$J0!10000#

$his ast option is very !on!ise, b&t a bit har%er to rea% than the more verbose ay. 4s the eer!ises in this boo- get onger, yo& noti!e that there are more an% more %ifferent ays to rite the !o%e 5 yo& may en% &p ith so&tions that oo- very %ifferent to the eampe so&tions. hen trying to !hoose beteen %ifferent ays to rite a program, aays favo&r the so&tion that is !earest in intent an% easiest to rea%.


3: Reading and writing files
