ASSIGNMENT 2 cont.
Self-study at S-star
i.e. Comparative Genomics
What to compare? 2. Predicted ORFs

How to identify genes in a genome?
- Accurate identification of genes in procaryotes and unicellular eucaryotes can be achieved by
  o homology to known genes in other species: ~80% of genes
  o statistical methods: GeneMark, Glimmer
- Accuracy is much poorer for multicellular eucaryotes, especially human
- An order of magnitude more difficult because of
  o large and complex intron regions
  o alternative splicing
- Statistical methods: GenScan, Genie
- Statistical analysis + homology: PROCRUSTES
- + mRNA sequences and homology with other closely related genomes
- Manual adjustment is often required as the last step

Predicted ORFs
- Total number of predicted open reading frames (ORFs)
- Percentage of the genome (coding)
- Average length
- Predicted genes with homology and assigned function
- Predicted genes with homology but no function
- H. pylori-specific genes
- Strain-specific genes
- Location of strain-specific genes
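To make the first three ORF statistics concrete, here is a minimal sketch that scans a toy sequence for ATG-to-stop ORFs and reports total number, average length, and percentage coding. It is a naive forward-strand scan, not the statistical approach used by GeneMark or Glimmer; the function names `find_orfs` and `orf_summary`, the `min_codons` cutoff, and the toy sequence are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs for ATG...stop ORFs on the forward strand.

    Positions are 0-based, end is exclusive, and the stop codon is included.
    Assumption: only the three forward reading frames are scanned.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


def orf_summary(seq, orfs):
    """Compute total number, average length and percentage coding."""
    lengths = [end - start for start, end in orfs]
    covered = set()
    for start, end in orfs:
        covered.update(range(start, end))  # count overlapping bases once
    return {
        "total": len(orfs),
        "average_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "percent_coding": 100.0 * len(covered) / len(seq) if seq else 0.0,
    }


# Hypothetical toy sequence, not taken from any real genome.
toy = "CCATGAAATTTGGGTAACCATGCCCTGACC"
toy_orfs = find_orfs(toy)
toy_summary = orf_summary(toy, toy_orfs)
print(toy_orfs, toy_summary)
```

A real gene finder would also scan the reverse strand and score candidates statistically or by homology, as the list above describes.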
ORFs                            H. pylori 26695   H. pylori J99
Total                           1590              1495
Percentage of genome (coding)   91.0              90.8
Average length (bp)             954               998
Functionally classified         875               895
Conserved with no function      275               290
H. pylori specific              345               367
Strain-specific genes*          117               89

* Half of the strain-specific genes are clustered in a plasticity zone with a different (G + C) content, suggestive of horizontal DNA transfer.
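The first three rows of the ORF table are linked by simple arithmetic: total ORFs times average length gives the number of coding bases, and dividing by the coding fraction gives an implied genome size. The sketch below works this through under stated assumptions: average length is taken to be in base pairs, ORFs are treated as non-overlapping, and the helper names `coding_bases` and `implied_genome_size` are hypothetical.

```python
# Figures copied from the ORF table above.
STRAINS = {
    "26695": {"total_orfs": 1590, "percent_coding": 91.0, "average_length": 954},
    "J99": {"total_orfs": 1495, "percent_coding": 90.8, "average_length": 998},
}


def coding_bases(total_orfs, average_length):
    """Approximate number of coding bases (assumes non-overlapping ORFs)."""
    return total_orfs * average_length


def implied_genome_size(total_orfs, average_length, percent_coding):
    """Genome size implied by the table values, in base pairs."""
    return coding_bases(total_orfs, average_length) / (percent_coding / 100.0)


implied = {}
for name, s in STRAINS.items():
    implied[name] = implied_genome_size(
        s["total_orfs"], s["average_length"], s["percent_coding"]
    )
    print(f"{name}: implied genome size ~{implied[name] / 1e6:.2f} Mb")
```

Under these assumptions both strains come out at roughly 1.6 to 1.7 Mb. This is only a consistency check on rounded table values, not a measured genome size.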
J99 Genome: Updated Function Assignment

Sequences/Assignment                        Number   Percentage
Sequence with a 3D homolog                  242      16
Function assigned by clear homology         1001     67
Function assigned by tentative homology     41       2
Homolog found but no function assigned      402      26
No homolog found                            45       3
Total                                       1489     100
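The percentage column can be recomputed from the counts as a quick sanity check. The following is a minimal sketch; the variable names and the helper `percent_of_total` are assumptions made for illustration, and the counts are copied from the J99 table above.

```python
# Counts copied from the J99 function-assignment table above.
TOTAL_SEQUENCES = 1489
WITH_3D_HOMOLOG = 242
ASSIGNMENT_COUNTS = {
    "Function assigned by clear homology": 1001,
    "Function assigned by tentative homology": 41,
    "Homolog found but no function assigned": 402,
    "No homolog found": 45,
}


def percent_of_total(count, total=TOTAL_SEQUENCES):
    """Share of all sequences, as a percentage."""
    return 100.0 * count / total


# Sum of the four assignment rows, to compare against the listed total.
categories_sum = sum(ASSIGNMENT_COUNTS.values())

# Recomputed shares, rounded to one decimal place.
shares = {
    label: round(percent_of_total(n), 1) for label, n in ASSIGNMENT_COUNTS.items()
}
share_3d = round(percent_of_total(WITH_3D_HOMOLOG), 1)

print("sum of assignment rows:", categories_sum)
for label, value in shares.items():
    print(f"{label}: {value}%")
print(f"Sequence with a 3D homolog: {share_3d}%")
```

In this sketch the four assignment rows add up to the listed total, which suggests the 3D homolog row overlaps the others rather than forming a separate category. The recomputed shares can differ by about a point from the whole-number percentages in the table.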