2.18 Bioinformatics tools
2.18.1 Sequence Analysis Tools
Basic Local Alignment Search Tool BLAST
(http://blast.ncbi.nlm.nih.gov/Blast.cgi).
This tool was developed to detect local homology between sequences. A
sequence of interest can be searched against protein or nucleotide
sequence databases, and the program calculates the statistical
significance of each alignment (Altschul, Gish et al. 1990).
The BLAST algorithm has many variants: BLASTN, BLASTP, BLASTX,
TBLASTN, MegaBLAST and PSI-BLAST. The variant is chosen according to
the query input (nucleotide, protein or translated sequences), and
searches can be run against sequences from a vast number of organisms.
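The statistical significance mentioned above is usually reported as an
expect value (E-value). A minimal sketch of the Karlin-Altschul
relationship E = K * m * n * exp(-lambda * S) is given below. The values of
lambda and K are illustrative assumptions only (in practice they depend on
the scoring scheme and are reported by BLAST itself), the function names are
hypothetical, and the length corrections applied by the real program are
ignored.

```python
import math

# Assumed, illustrative Karlin-Altschul parameters (not taken from this study).
LAMBDA = 0.318
K = 0.134

def bit_score(raw_score, lam=LAMBDA, k=K):
    """Convert a raw alignment score S into a normalised bit score S'."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance alignments with a score of at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)

# Hypothetical search: a 300-residue query against a 10,000,000-residue database.
for score in (40, 80, 160):
    e = expect_value(score, 300, 10_000_000)
    print(score, round(bit_score(score), 1), f"{e:.2e}")
```

Higher raw scores give exponentially smaller E-values, whereas a longer
query or a larger database increases the E-value for the same score.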
MUMmer (http://www.tigr.org/software/mummer)
MUMmer 3.0 is open-source software for comparing the sequences of
large genomes. It can also align incomplete genomes. Its graphical
viewing tools offer different ways to analyse genome alignments (Kurtz,
Phillippy et al. 2004).
Artemis (http://www.sanger.ac.uk/Software/Artemis/v8/)
Artemis is a DNA sequence viewer and annotation tool that allows
visualisation of sequence features and the results of analyses in the
context of next-generation sequencing data.
CLUSTALW (http://www.ebi.ac.uk/Tools/msa/clustalw2/)
CLUSTALW (1.83) is one of the most powerful programs for producing
multiple sequence alignments. It aligns and displays multiple
nucleotide and protein sequences (Larkin, Blackshields
et al. 2007).
MUSCLE (http://www.drive5.com/muscle/)
MUSCLE (v3.6) is a program widely used in biology to create multiple
sequence alignments of proteins. It combines several algorithms,
including fast distance estimation and progressive alignment. Its
accuracy and speed are better than those of CLUSTALW, and hundreds of
sequences can be aligned in seconds (Edgar 2004).
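The idea behind fast distance estimation can be illustrated by comparing
sequences through the short words (k-mers) they share, without aligning
them first. The sketch below is a simplified illustration of that idea
and not MUSCLE's actual implementation; the function names, the word
length and the example peptides are assumptions.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_distance(seq_a, seq_b, k=3):
    """Return 1 minus the fraction of k-mers shared by the two sequences."""
    possible = min(len(seq_a), len(seq_b)) - k + 1
    if possible <= 0:
        raise ValueError("sequences must be at least k residues long")
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A k-mer present several times is shared only as often as both contain it.
    shared = sum(min(n, counts_b[kmer]) for kmer, n in counts_a.items())
    return 1.0 - shared / possible

# Hypothetical peptides differing by a single substitution.
print(kmer_distance("MKVLAAGIV", "MKVLSAGIV"))
```

Because no alignment is computed, such a distance is cheap enough to
calculate for every pair of input sequences before a guide tree is built.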
FigTree (http://tree.bio.ed.ac.uk/software/figtree/)
FigTree (v1.3.1) is a program for graphical viewing of phylogenetic trees.
The program was designed to show summarized and annotated trees formed
FastTree (http://www.microbesonline.org/fasttree/)
FastTree (v2.1.7) is open-source software that infers approximately
maximum-likelihood phylogenetic trees from alignments of nucleotide or
protein sequences. It can handle alignments with up to a million
sequences in a reasonable amount of time and memory (Price, Dehal et
al. 2010).
Geneious (http://www.geneious.com/)
Geneious (v7.1.3) provides an integrated environment for analysing
protein and DNA sequences, running BLAST searches and accessing public
databases. Its most powerful feature is the management and
visualisation of both pair-wise and multiple sequence alignments. The
alignment results can also be viewed as phylogenetic trees.
OrthoMCL (http://www.orthomcl.org/orthomcl/?rm=orthomcl)
OrthoMCL (v1.4) is one of the most commonly used programs for
identifying orthologous groups. Access to these groups is extremely
important for studying gene/protein evolution, comparative genomics and
genome annotation.
An all-against-all BLASTP comparison between and within species is
combined with the Markov Cluster algorithm to find all orthologous
groups, including any recent paralogs. Ortholog analysis with OrthoMCL
can be applied to two genomes or extended to cluster orthologs from
multiple species in order to construct orthologous groups (Li, Stoeckert
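The Markov clustering step can be sketched on a toy similarity graph. The
code below is a minimal pure-Python illustration of the general Markov
Cluster procedure (alternating expansion and inflation of a
column-stochastic matrix). It is not OrthoMCL's own code: the similarity
weights are hypothetical, and the score normalisation that OrthoMCL
applies to BLASTP results before clustering is omitted.

```python
def normalise_columns(m):
    """Scale every column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[r][c] for r in range(n)) for c in range(n)]
    return [[m[r][c] / sums[c] for c in range(n)] for r in range(n)]

def expand(m):
    """Expansion step: square the matrix (one more step of the random walk)."""
    n = len(m)
    return [[sum(m[r][k] * m[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def inflate(m, power):
    """Inflation step: raise entries to a power, then renormalise the columns."""
    return normalise_columns([[v ** power for v in row] for row in m])

def markov_cluster(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster the nodes of a symmetric similarity matrix."""
    n = len(similarity)
    # Self-loops keep every column non-zero and stabilise the iteration.
    m = normalise_columns([[similarity[r][c] + (1.0 if r == c else 0.0)
                            for c in range(n)] for r in range(n)])
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[r][c] - m[r][c]) for r in range(n) for c in range(n))
        m = nxt
        if delta < tol:
            break
    # Each non-empty row of the converged matrix lists the members of one cluster.
    clusters = {tuple(c for c, v in enumerate(row) if v > 1e-6) for row in m}
    return sorted(list(c) for c in clusters if c)

# Hypothetical similarity weights for five proteins (symmetric, zero diagonal).
scores = [
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
print(markov_cluster(scores))
```

In this toy example proteins 0 to 2 fall into one group and proteins 3
and 4 into another. Larger inflation values generally produce smaller,
tighter clusters.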
Mauve (http://gel.ahabs.wisc.edu/mauve)
Mauve (v2.3.1) is a powerful software package for detecting
rearrangements and horizontal transfer in a genome. It is used for the
identification and alignment of conserved genomic DNA (Darling, Mau et
al. 2004). In this study, Mauve alignments were used to compare whole
genomes and to examine the causes of rearrangements within the genomes
of E. faecium.
BRIG (http://sourceforge.net/projects/brig/)
The BLAST Ring Image Generator BRIG (v1.0) is a desktop application
written in Java 1.6. It was used for genome comparisons and generates
a circular image of the genome. Comparisons rely on the Basic Local
Alignment Search Tool (BLAST), and CGView is used for image rendering.
In this study, DNA or protein files were used to generate genome maps
in BRIG.
MeV (http://www.tm4.org/mev.html)
MultiExperiment Viewer MeV (v10.2) is a microarray data analysis tool
that includes advanced algorithms for statistical analysis,
classification, clustering, visualization and biological theme discovery
(Chu, Gottardo et al. 2008). MeV was used in this study for clustering
Unipro UGENE (http://ugene.unipro.ru)
UGENE (v1.11.5) is open-source, multiplatform software. It offers
visualization of annotated genome sequences, multiple sequence
alignments and phylogenetic trees (Okonechnikov, Golosova et al. 2012).
In this study, UGENE was used to identify and map the repetitive units
in the genomes.
Phenolink (http://bamics2.cmbi.ru.nl/websoftware/phenolink/)
Phenolink is a web tool for identifying genetic links to phenotypes. It
connects phenotypes with high-throughput molecular biology information
generated by ~omics technologies. The aim is to gain insight into the
cellular mechanisms underlying an organism's phenotype (Bayjanov,
Molenaar et al. 2012). Default parameters were used to identify
E. faecium phenotypes.
CRISPRs Finder (http://crispr.u-psud.fr/Server/CRISPRfinder.php)
CRISPRFinder is a freely accessible web service. CRISPR stands for
clustered regularly interspaced short palindromic repeats. Five tools
are available in CRISPRFinder, which can be used for:
1. Detecting very short CRISPRs that consist of one or two motifs.
2. Identifying the highly conserved direct repeats (DRs) and extracting
the similarly sized unique sequences, called spacers, that lie between
them.
3. Obtaining the AT-rich leader sequence, which flanks the CRISPR cluster
on one side.
5. Determining whether the highly conserved direct repeats (DRs) are
present in other sequenced prokaryotic genomes (Grissa, Vergnaud et al.
2007).
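The structure being searched for, identical direct repeats separated by
similarly sized unique spacers, can be illustrated with a short sketch. It
uses exact matching on a synthetic sequence, and the function name, the
size limits and the sequences are hypothetical. The real service is
considerably more sophisticated, so this only illustrates the repeat and
spacer layout.

```python
from collections import defaultdict

def find_repeat_arrays(seq, repeat_len, min_copies=3, min_spacer=5, max_spacer=60):
    """Return (repeat, start positions, spacers) for regularly spaced exact repeats."""
    positions = defaultdict(list)
    for i in range(len(seq) - repeat_len + 1):
        positions[seq[i:i + repeat_len]].append(i)

    arrays = []
    for repeat, starts in positions.items():
        if len(starts) < min_copies:
            continue
        # The spacer is the sequence between the end of one copy and the next copy.
        spacers = [seq[a + repeat_len:b] for a, b in zip(starts, starts[1:])]
        if all(min_spacer <= len(s) <= max_spacer for s in spacers):
            arrays.append((repeat, starts, spacers))
    return arrays

# Synthetic example: three copies of a 10 bp repeat separated by two spacers.
REPEAT = "GTTTCAGACG"
SPACERS = ["ATATCCGGAT", "CCATGGTTAAC"]
example = "TT" + REPEAT + SPACERS[0] + REPEAT + SPACERS[1] + REPEAT + "AA"

for repeat, starts, spacers in find_repeat_arrays(example, repeat_len=10):
    print(repeat, starts, spacers)
```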
Island Viewer (http://www.pathogenomics.sfu.ca/islandviewer)
IslandViewer is a freely accessible web service that detects gene
clusters likely to be of horizontal origin, called genomic islands (GIs).
These clusters often contain virulence, antibiotic resistance or other
important adaptation genes. IslandViewer provides a graphical interface
for easy viewing, and the island data can be downloaded at both the
chromosome and the gene level. The server uses three methods to
identify GI regions. IslandPick is a comparative genomics GI prediction
method that builds stringent data sets of GIs and non-GIs. SIGI-HMM
measures codon usage with a Hidden Markov Model (HMM) to identify
possible GIs. Finally, IslandPath-DIMOB identifies several common
characteristics of GIs, such as abnormal sequence composition or the
presence of genes that are functionally related to mobile elements
(Langille and Brinkman 2009).
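The notion of abnormal sequence composition can be illustrated with a
sliding-window scan. The sketch below flags windows whose GC content
deviates strongly from the genome average. This is a deliberately
simplified stand-in: the methods used by IslandViewer rely on other
statistics, and the function names, window size and threshold are
assumptions chosen for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def atypical_windows(genome, window=5000, step=5000, threshold=0.08):
    """Return (start, GC fraction) for windows that deviate from the genome mean."""
    genome_gc = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        window_gc = gc_fraction(genome[start:start + window])
        if abs(window_gc - genome_gc) > threshold:
            flagged.append((start, window_gc))
    return flagged

# Synthetic genome: an AT-rich backbone carrying a GC-rich insert.
toy_genome = "AT" * 50 + "GC" * 25 + "AT" * 50
print(atypical_windows(toy_genome, window=50, step=50, threshold=0.3))
```

A compositional signal of this kind is only suggestive on its own, which
is why it is combined with other evidence such as mobility-related genes.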
PHAST (http://phast.wishartlab.com)
PHAST is a fast web server used to identify, annotate and graphically
display prophage sequences and prophage features within bacterial genomes
IS Finder (https://www-is.biotoul.fr// )
IS Finder is a database that lists insertion sequence (IS) elements
isolated from Eubacteria and Archaea. Each IS element is described in an
individual file containing its general features, such as name, size and
family, together with its DNA and protein sequences. In addition, an
on-line BLAST search is available for comparison.