Introduction to bioinformatics
• Bioinformatics is the science concerned with the development and application of computer hardware and software to the acquisition, storage, analysis, and visualization of biological information.
• It has the following three components:
- The development of new algorithms and statistics for assessing the relationships among large sets of biological data, e.g. DNA sequence data.
- The application of these tools to the analysis and interpretation of various biological data, e.g. nucleotide sequences and amino acid sequences.
- The development of databases for efficient storage, access, and management of various biological information.
• The term ‘bioinformatics’ is a combination of ‘biology’ and ‘informatics’.
• Informatics, as defined in the health sciences, is the science of how to use data, information, and knowledge to improve human health and the delivery of health care services.
• Bioinformatics derives knowledge from computer analysis of biological data.
• These data can include the information stored in the genetic code, as well as experimental results from various sources, patient statistics, and scientific literature.
• Research in bioinformatics includes method development for storage, retrieval, and analysis of the data.
• Bioinformatics is a rapidly developing branch of biology and is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, chemistry, biochemistry, and physics.
• It has many practical applications in different areas of biology and medicine.
DNA Sequences
• DNA sequence data are represented by a system of symbols.
• The four bases are denoted by the single letters A (adenine), C (cytosine), G (guanine), and T (thymine).
• Sequence data may be ambiguous at a position. For example, if the base at a specific position may be either G or A, it is a purine (IUPAC symbol R).
• Similarly, if a position may have either C or T, it is a pyrimidine (IUPAC symbol Y).
• The base sequences of the two complementary strands of a DNA molecule are both represented by this system of symbols.
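The symbol system above can be sketched in a few lines of Python. This is only an illustration: the names COMPLEMENT, IUPAC, reverse_complement, and matches are made up for this example, and it assumes the standard IUPAC ambiguity symbols R (purine, A or G) and Y (pyrimidine, C or T).

```python
# Complement of each symbol. R (A or G) pairs with Y (C or T),
# because the complements of A and G are T and C.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "R": "Y", "Y": "R"}

# The set of bases each symbol may stand for.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"},  # purine
    "Y": {"C", "T"},  # pyrimidine
}


def reverse_complement(seq):
    """Return the complementary strand, written in its own 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def matches(symbol, base):
    """True if the concrete base is one of those the symbol may represent."""
    return base.upper() in IUPAC[symbol.upper()]


if __name__ == "__main__":
    print(reverse_complement("ATGCR"))
    print(matches("R", "G"), matches("Y", "G"))
```

Only one strand needs to be stored, since the other can always be derived from it in this way.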
Amino Acid Sequences of Proteins
• Amino acids were conventionally represented by three-letter symbols, e.g. Ala for alanine, Val for valine, etc.
• In bioinformatics, they are denoted by single letters, e.g. A for alanine, C for cysteine, D for aspartic acid, etc.
• Some positions in protein sequences are ambiguous; this situation is comparable to that for DNA sequences.
Single-letter code   Amino acid                      Three-letter code
A                    Alanine                         Ala
B                    Aspartic acid or asparagine     Asx
C                    Cysteine                        Cys
D                    Aspartic acid                   Asp
E                    Glutamic acid                   Glu
F                    Phenylalanine                   Phe
G                    Glycine                         Gly
H                    Histidine                       His
I                    Isoleucine                      Ile
K                    Lysine                          Lys
L                    Leucine                         Leu
M                    Methionine                      Met
N                    Asparagine                      Asn
P                    Proline                         Pro
Q                    Glutamine                       Gln
R                    Arginine                        Arg
S                    Serine                          Ser
T                    Threonine                       Thr
V                    Valine                          Val
W                    Tryptophan                      Trp
Y                    Tyrosine                        Tyr
Z                    Glutamic acid or glutamine      Glx
X                    Any amino acid                  Xaa
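As a small illustration of how the two notations relate, the sketch below converts a sequence written in three-letter codes into single-letter codes. The names THREE_TO_ONE and to_one_letter, and the use of a hyphen as separator, are assumptions made for this example. The mapping follows the standard IUPAC codes, including the ambiguity codes Asx (B), Glx (Z), and Xaa (X).

```python
# Standard three-letter to single-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    # Ambiguity codes
    "Asx": "B",  # aspartic acid or asparagine
    "Glx": "Z",  # glutamic acid or glutamine
    "Xaa": "X",  # any amino acid
}


def to_one_letter(three_letter_seq, sep="-"):
    """Convert e.g. 'Met-Ala-Val' to 'MAV'. Raises KeyError on an unknown code."""
    codes = three_letter_seq.split(sep)
    return "".join(THREE_TO_ONE[code.strip().capitalize()] for code in codes)


if __name__ == "__main__":
    print(to_one_letter("Met-Ala-Cys-Asp-Glx"))
```

The single-letter form is preferred in bioinformatics because it is compact and each residue occupies exactly one character, which makes sequences easy to store and compare.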
Branches of Bioinformatics
• A living cell is a system in which cellular components such as the genome, the gene transcripts, and the proteins interact with each other, and these interactions determine the fate of the cell, e.g. whether a stem cell is going to become a liver cell or a cancer cell.
The three branches of bioinformatics are:
1. Genomics
2. Transcriptomics
3. Proteomics
[Figure: The three major branches of bioinformatics. DNA (genomics) makes RNA (transcriptomics), which makes protein (proteomics).]
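The flow from DNA to RNA to protein can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a coding-strand DNA sequence read from its first base. The names BASES, AMINO_ACIDS, CODON_TABLE, transcribe, and translate are chosen for this example only.

```python
from itertools import product

# Standard genetic code in compact form: codons are enumerated with bases in
# the order T, C, A, G (first base varying slowest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """DNA (coding strand) to mRNA: same sequence with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """mRNA to protein in single-letter codes, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper().replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTGA")
    print(mrna, translate(mrna))
```

Real genes are more complicated (introns, alternative start sites, variant genetic codes), so this should be read as a sketch of the principle only.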
Genomics
• Genomics is the study of the whole genomes of organisms and incorporates elements from genetics. It uses a combination of recombinant DNA techniques, DNA sequencing methods, and bioinformatics to sequence, assemble, and analyse the structure and function of genomes.
• Genomics plays a significant role in modern biological research: the nucleotide sequences of all the chromosomes of an organism are mapped, and the locations of the different genes and their sequences are determined.
Transcriptomics
• Transcriptomics is the study of the transcriptome, the whole set of mRNA molecules in one cell or a population of cells.
• It measures the expression levels of genes, often using techniques such as DNA microarrays, which are capable of sampling tens of thousands of different mRNAs at a time.
• Such techniques allow biologists to routinely compare gene expression between control cells and treated cells.
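One common way to compare expression between control and treated cells is the log2 fold change. The sketch below is illustrative only: the gene names and expression values are invented, and the pseudocount and the threshold of 1.0 (a two-fold change) are assumptions, not values taken from these slides.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """Log2 ratio of treated to control expression.

    The pseudocount keeps the ratio defined when a value is zero.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def classify(control, treated, threshold=1.0):
    """Label a gene 'up', 'down', or 'unchanged' relative to the control."""
    lfc = log2_fold_change(control, treated)
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical (control, treated) expression values for three made-up genes.
EXPRESSION = {
    "geneA": (99.0, 399.0),
    "geneB": (399.0, 99.0),
    "geneC": (200.0, 210.0),
}

if __name__ == "__main__":
    for gene, (control, treated) in EXPRESSION.items():
        print(gene, round(log2_fold_change(control, treated), 2),
              classify(control, treated))
```

A real analysis would also normalize the data and test for statistical significance across replicates.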
Proteomics
• The proteome is the entire set of proteins that is, or can be, expressed by a genome, cell, tissue, or organism. More specifically, it is the set of proteins expressed in a given type of cell or organism, at a given time, under defined conditions.
• Proteomics is the study of the proteome.
• Proteomics represents the earliest effort to identify a major subclass of cellular components, the proteins, and their interactions.
• Proteomics involves determining the amino acid sequence of a protein, determining its 3D structure, and relating that structure to the function of the protein.
[Figure: Genomics, Transcriptomics, Proteomics, Metabolomics]
What is Bioinformatics?
A new and fast-growing specialty in the life sciences that integrates biotechnology and computer science.
Computers help to collect, analyze, and interpret biological information at the molecular level.
Bioinformatics encompasses a set of software tools that aid in the molecular sequence analysis, structural analysis, and functional analysis of genes and genomes and their corresponding products.
Goal of Bioinformatics
Understand a living cell and how it functions at the molecular level
Develop databases and computational tools
Use these tools to mine (analyze) databases and generate knowledge to better understand living systems
Biological Databases: Why?
Store all the data (information) related to genomics, transcriptomics, proteomics, and metabolomics in databases.
Make biological data available to scientists.
Make biological data available in computer-readable form.
Types of Databases
Primary Databases: Store raw DNA/RNA and protein data submitted by scientists
GenBank (NCBI, USA): www.ncbi.nlm.nih.gov/genbank/
EMBL (Europe): www.ebi.ac.uk/embl/
DDBJ (Japan): www.ddbj.nig.ac.jp/
PDB (Protein Data Bank): http://www.rcsb.org/pdb/home/home.do
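To give a sense of what "computer-readable form" means in practice, the sketch below parses sequence records in FASTA format, a plain-text format commonly used to exchange sequences from databases such as these. The format is not mentioned on the slides and is used here as an assumption. The function name parse_fasta and the example records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record identifier to sequence.

    Each record starts with a '>' header line; the identifier is taken to be
    the first word of the header. Sequence lines may be wrapped.
    """
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            words = line[1:].split()
            current = words[0] if words else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical records, not real database entries.
EXAMPLE = """>seq1 example DNA record
ATGCGTAC
GGTTAA
>seq2 example protein record
MKVLA
"""

if __name__ == "__main__":
    for name, sequence in parse_fasta(EXAMPLE).items():
        print(name, len(sequence), sequence)
```

For real work, an established library is usually a better choice than a hand-written parser.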
Anticipated Benefits of Genome Research & Bioinformatics
Molecular Medicine: Gene Testing, Pharmacogenomics, Gene Therapy
improve diagnosis of disease
detect genetic predispositions to disease
create drugs based on molecular information
use gene therapy and control systems as drugs
design “custom drugs” (pharmacogenomics) based on individual genetic profiles
Microbial Genomics
rapidly detect and treat pathogens in clinical practice
develop new energy sources (biofuels)
monitor environments to detect pollutants
protect citizenry from biological and chemical warfare
clean up toxic waste safely and efficiently
DNA Identification (Forensics)
identify potential suspects whose DNA may match evidence left at crime scenes
establish paternity and other family relationships
identify endangered and protected species as an aid to wildlife officials (could be
detect bacteria and other organisms that may pollute air, water, soil, and food
match organ donors with recipients in transplant programs
determine pedigree for seed or livestock breeds
Agriculture, Livestock Breeding, and Bioprocessing
grow disease-, insect-, and drought-resistant crops
breed healthier, more productive, disease-resistant farm animals
grow more nutritious produce
develop biopesticides
incorporate edible vaccines into food products
develop new environmental cleanup uses for plants like tobacco