
Multiple Sequence Alignment

Hot Topic 5/24/06

Kim Walker

(2)

Outline

- Why are Multiple Sequence Alignments useful?
- What Tools are Available?
- Brief Introduction to ClustalX
- Tools to Edit and Add Features to your

(3)

Why Do a Multiple Sequence Alignment?

- Multiple Sequence Alignment can reveal sequence patterns
  - Demonstration of homology between >2 sequences
  - Identification of functionally important sites
  - Protein function prediction
  - Structure prediction
  - Search for weak but significant similarities in databases
  - Design PCR primers for related gene identification

(4)

Types of MSA

- Global Alignment - Attempts to align as many characters as possible from end to end
  - Optimal Global Alignments - Dynamic Programming
  - Global progressive - Align the most similar sequences first, then add the more distant ones
  - Iterative multiple - Repeatedly rebuilds the alignment to find the best one
- Local Alignment - Attempts to align regions of sequences with the highest density of matches, creating one or more islands of sub-alignments
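To make the "Dynamic Programming" bullet concrete, here is a minimal sketch of an optimal global alignment for just two sequences (the Needleman-Wunsch recurrence). The scoring values (match = 1, mismatch = -1, gap = -2) and the function name are assumptions chosen for illustration. They are not ClustalX defaults, and real multiple alignment programs use substitution matrices and more elaborate gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment.

    Illustrative scoring only: a linear gap penalty and a flat
    match/mismatch score rather than a substitution matrix.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two characters, or gap in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The table has (len(a)+1) x (len(b)+1) cells, and the same idea extended to N sequences needs an N-dimensional table. That cost is why the progressive and iterative approaches listed above are used in practice for more than a few sequences.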

(5)

Multiple Sequence Alignment Programs

- Many resources exist either as web tools or free programs you can download and install
- Global Alignment Programs
  - ClustalX - window interface for ClustalW (EMBOSS) MSA tool, program can be downloaded from here: ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalX/
  - MultAlin - web tool using hierarchical clustering: http://prodes.toulouse.inra.fr/multalin/multalin.html
- Local Alignment Programs
  - Block Maker http://bioinfo.weizmann.ac.il/blocks/blockmkr/www/make_blocks.html

(6)

Getting Started with ClustalX

- Input sequences: Input files should be in FASTA format, saved using a text-based editor (not MS Word)

>Mouse_Oct4
AGTCGATTGCA

You can either have all your sequences in one file using File -> Load Sequences
Or add sequences one at a time using File -> Append Sequences

Note: ClustalX doesn’t like
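If you would rather generate the input file from a script than from a text editor, the sketch below writes and reads plain-text FASTA: a ">" header line followed by the sequence on the following line(s). The function names and the 60-character line width are assumptions for illustration, not anything ClustalX requires.

```python
def write_fasta(records, path, width=60):
    """Write (name, sequence) pairs as plain-text FASTA.

    Each record is a '>' header line followed by the sequence,
    wrapped at `width` characters per line.
    """
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        for name, seq in records:
            handle.write(f">{name}\n")
            for start in range(0, len(seq), width):
                handle.write(seq[start:start + width] + "\n")


def read_fasta(path):
    """Read a FASTA file back into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            else:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

For example, `write_fasta([("Mouse_Oct4", "AGTCGATTGCA")], "seqs.fasta")` produces a file that can then be opened with File -> Load Sequences. The file name `seqs.fasta` is a placeholder.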

(7)

Output Options

- Alignment -> Output Format Options
  - What do you plan to do with this alignment afterwards? ClustalX vs. MSF format of output file.
- Color - what colors do you want for your nucleotides (nt) or amino acids (aa)?
  - Defaults: DNA - A orange, C red, T blue, G green
  - Protein - GPST orange, HKR red, FWY blue, ILMV green

By changing the coldna.par or colprot.par files - you can highlight different AA properties or
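The default color groups listed above amount to a simple lookup table. The sketch below expresses them as Python dictionaries. It only illustrates the mapping and does not reproduce the format of the coldna.par or colprot.par files. The names `DNA_COLORS`, `PROTEIN_COLORS` and `colors_for` are made up for this example.

```python
# Default ClustalX colors as listed on this slide.
DNA_COLORS = {"A": "orange", "C": "red", "T": "blue", "G": "green"}

PROTEIN_GROUPS = {"GPST": "orange", "HKR": "red", "FWY": "blue", "ILMV": "green"}

# Expand each residue group into one entry per residue.
PROTEIN_COLORS = {
    residue: color
    for group, color in PROTEIN_GROUPS.items()
    for residue in group
}


def colors_for(sequence, table, default="none"):
    """Return the color name for each character of an aligned sequence.

    Characters missing from the table (gaps, other residues) get `default`.
    """
    return [table.get(ch.upper(), default) for ch in sequence]
```

Regrouping the residues in `PROTEIN_GROUPS` by another property would be the scripted equivalent of editing the color parameter file.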

(8)

Alignment

- Alignment -> Alignment Parameters
  - There are many options here that advanced users can change to alter how ClustalX calculates the alignment
- Alignment -> Do Complete Alignment
  - ClustalX will ask where to save the Guide Tree File (which can be opened using TreeView) and the Alignment file itself
  - Note - the Tree file is a guide tree, not a phylogenetic

(9)

Alignment cont.

- Symbols
  - * indicates positions which have a single, fully conserved residue
  - : indicates that one of the following 'strong' groups is fully conserved: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW
  - . indicates that one of the following 'weaker' groups is fully conserved: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, FVLIM, HFY
  - An amino acid substitution matrix is used to

(10)

Profile Alignment

- Create a new alignment using an existing alignment or a profile as a guide.
- How to?
  - Load sequences of Profile 1
  - Do alignment
  - Switch to Profile Alignment Mode
  - Load sequences of Profile 2.
  - Now go to Alignment -> Align sequences to Profile 1, which will take 1 sequence at a time and align it to Profile 1
  - Alternatively, Align Profile 2 with Profile 1, in which

(11)

Are alignment programs 100% right?

- No alignment program can take the place of an educated human eye.
- Use programs like ClustalX as a guide, but don't be afraid to adjust the alignment as you see fit.
- Jalview is one program you can use to

(12)

What is Jalview?

- Jalview is an alignment editor that can be used for viewing, editing, analysis, annotation and publishing. Jalview can be downloaded for free here: http://www.jalview.org/
- Documentation: http://www.jalview.org/refcard.pdf
- Since Jalview is Java based, users will have trouble installing and using Jalview without the newest version of Java. For instance, for Mac users, the newest Jalview will only work with OS 10.4 (Tiger).

(13)

Jalview cont.

- Jalview can be used with nucleotide and protein sequences, but there are more features for protein sequence analysis

(14)

Jalview - Getting Started

- File -> Input Alignment -> From File
  - Choose which file type your alignment is
  - Jalview will notify you when sequences are successfully loaded
- Within alignment window
  - Color -> choose color scheme according to program

(15)

Basic Editing

- Inserting or deleting gaps
  - Single sequence - Press 'shift' and the left mouse button, then drag either left or right
  - Group of sequences - Press 'control' and the left mouse button, then drag either left or right
- Draw a box, highlighting a sequence area
  - Click and drag your mouse over the sequence area.
  - Right click on the highlighted area and a menu will appear.
  - Groups -> Group color or Border color. You can change the color of the group and of the border of the box.

(16)

Jalview Advanced

- Jalview allows you to run:
  - Jnet - a neural network method for protein secondary structure prediction
  - ClustalW - the general multiple sequence alignment program on which ClustalX is based

(17)

Export

- File -> Export -> HTML, PNG, or EPS
- Export file type depending on what you

(18)

References

- Bioinformatics: Sequence and Genome Analysis. David W. Mount. CSHL Press, 2001 and 2004.
- From Sequence to Multiple Alignment, http://iona.wi.mit.edu/bio/education/how-to-ma.html
- Spring Course Notes, http://jura.wi.mit.edu/bio/education/index.html
- Questions? Need Help?
