4.1 Synopsis
To address the challenges identified in section 3.5, it is proposed that semantic relationships based on the properties of concepts may solve many of the data integration problems in the bioinformatics domain. This chapter starts by introducing comparative genomics, its importance as a domain, and various types of biological relationships. The proposed Soft Link Model (SLM) approach is then introduced, in which integration is based on relationships between concepts, not just on field values. A feature of the SLM approach is that the user can customize the linkage of data sources by creating his/her own Soft Link Model, which reflects a linkage to be investigated in the research.
4.2 Comparative genomics
Comparative genomics is the study of relationships between genomes of different species and the analysis and comparison of these genomes [100]. It is usually undertaken to discover new properties of genes.
Comparative genomics offers opportunities to draw on information from historically distinct disciplines, to link disparate biological kingdoms, and so bridge basic and applied science. Cross-species comparisons are increasing the understanding of how genes are structured, how gene structure relates to gene function, and how changes in DNA have contributed to the planet's biological diversity [150]. This has led to new computational methods being developed that investigate chromosomal organisation, structure and homology.
By integrating functional and sequence data across species, biologists are able to annotate the genome of a species by using functional data from other species. Furthermore, comparative genomics provides evidence of close evolutionary relationships between gene families. According to Adjaye and his collaborators,
The advantages of cross-species comparison are two-fold. First, cross-species gene-expression comparison is a powerful tool for the discovery of evolutionarily conserved mechanisms and pathways of expression control. The advantage of cDNA microarrays in this context is that broad areas of homology are compared and hybridization probes are sufficiently large so that small inter-species differences in nucleotide sequences would not affect the analytical results. This comparative genomics approach allows a common set of genes within a specific developmental, metabolic, or disease-related gene pathway to be evaluated in experimental models of human diseases. Second, the use of microarrays in studies of mammalian species other than human and rodents may advance our understanding of human health and disease [6].
Currently, 40 to 60% of the genes found in new genomic sequences do not have assigned functions. Some functions can be deduced by computational structure determination and protein folding, but many research problems remain to be solved in this area [107]. Thus, computational methods will continue to play a major role in the functional annotation of genomes in the foreseeable future.
4.3 Biological relationships
Palakal and his collaborators define object and relationships as follows:
The term "object" refers to any biological entity such as a protein, gene, cell cycle, etc. and “relationship” refers to any dynamic action one object has on another, e.g. protein inhibiting another protein or one object belonging to another object such as, the cells composing an organ [161].
A biological relationship can take several forms. In [52], the following classes of relationship are given:
• Evolutionary (for example, homolog, ortholog or paralog),
• Functional Genomic (for example, a biological process, a cellular component, or a molecular function),
• Structural,
• Phylogenetic,
• Mapping Terminology (Markers, Linkage, or Synteny),
• Genetic or Molecular Concept (for example, Genes, Polymorphisms),
• Containment, and
• Nomenclature (for example, gene A in species X = gene B in species Y).
We are concentrating in this thesis on Evolutionary Relationships (homolog, ortholog, paralog) and some of the Functional Genomic relationships (biological process, cellular component, molecular function), as they are used to discover remote evolutionary and functional similarities between gene products. Since evolutionarily related genes are highly likely to share common aspects of function, a measurement of these relationships, which determines how similar they are, can be useful for gene functional annotation.
4.3.1 Homologous sequences
Homology is defined by Hillis as "similarity due to inheritance from a common ancestor" [104]. Two sequences are homologous if they share a common evolutionary history, i.e., there existed an ancestral molecule in the past that was ancestral to both of the sequences. A homolog can be either within the same organism (a paralog), or among different species (an ortholog) (see Figure 4.1).
4.3.1.1 Types of Homology
There are many types of homology [104], for instance:
• Orthology
Orthologous genes are homologs that evolved as a result of a speciation event [104]. In other words, orthology is a homology that reflects the descent of a species [164]. Orthologous genes may or may not have the same function.
• Paralogy
This is a homology reflecting the descent of genes. Paralogous genes are homologs that diverged as a result of a gene duplication event [104]. Paralogy may be distinguished from orthology by checking whether or not two homologs are found in the same individual [164].
• Xenology
Xenologous genes are homologs that diverged as a result of a lateral gene transfer [104]. Antibiotic-resistance genes are a classic example of xenologs.
• Synology
Synologs are genes that end up in an organism through a fusion of lineages [104].
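The definitions above can be illustrated with a minimal sketch. It assumes the divergence event behind a pair of homologs is already known, and simply maps that event to the homology type given in the list. The names `DivergenceEvent`, `classify_by_event` and `classify_by_organism` are hypothetical and introduced only for illustration; they are not part of the SLM approach.

```python
from enum import Enum


class DivergenceEvent(Enum):
    """Event through which two homologous genes came to differ (illustrative names)."""
    SPECIATION = "speciation"
    GENE_DUPLICATION = "gene duplication"
    LATERAL_TRANSFER = "lateral gene transfer"
    LINEAGE_FUSION = "fusion of lineages"


# Mapping taken directly from the definitions listed above.
HOMOLOGY_TYPE = {
    DivergenceEvent.SPECIATION: "ortholog",
    DivergenceEvent.GENE_DUPLICATION: "paralog",
    DivergenceEvent.LATERAL_TRANSFER: "xenolog",
    DivergenceEvent.LINEAGE_FUSION: "synolog",
}


def classify_by_event(event):
    """Return the homology type for a known divergence event."""
    return HOMOLOGY_TYPE[event]


def classify_by_organism(organism_a, organism_b):
    """Simplified check from the text: two homologs found in the same
    organism are treated as paralogs, otherwise as orthologs.
    Assumes the two genes are already known to be homologous."""
    return "paralog" if organism_a == organism_b else "ortholog"


if __name__ == "__main__":
    print(classify_by_event(DivergenceEvent.GENE_DUPLICATION))
    print(classify_by_organism("mouse", "frog"))
```

The organism-based check can only separate paralogs from orthologs; xenologs and synologs cannot be recognised without knowing the underlying event, which is why the two functions are kept separate.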
[Figure 4.1: diagram with labels: homologs; paralogs; frog α, chick α, mouse α; mouse β, chick β, frog β; α-chain gene; β-chain gene; gene duplication]