Complete Information on Genome Similarity, SNPs and Comparative Genomics



1. Genome Similarity :

While work on the Human Genome Project was in progress, several unknown samples were collected from different regions. At that time two questions were raised: (i) whose genome was to be sequenced, and (ii) what percentage of the genome is similar between two individuals.

All humans are similar, but what is the percentage of genome similarity between them? The genomes of any two humans are about 99.8% similar to each other.

The percentage of similarity refers to the 'consensus human genome'. It means that a 0.2% difference in the genome makes individuals different from each other. In other words, the genomes of two different individuals differ at about two in every thousand nucleotides (0.2%). Only this small percentage of DNA makes each individual unique.
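To make the percentage concrete, here is a minimal sketch that computes percent identity between two sequences of equal length. It assumes the sequences are already aligned position by position; the function name `percent_identity` and the two 1,000-base sequences are hypothetical and for illustration only, not real genomic data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count the positions where both sequences carry the same base.
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical 1,000-base sequences that differ at exactly two positions.
person_a = "ACGT" * 250
person_b = "T" + person_a[1:499] + "A" + person_a[500:]

print(f"{percent_identity(person_a, person_b):.1f}%")  # prints "99.8%"
```

Two mismatches in 1,000 positions give 99.8% identity, which is the same as a 0.2% difference.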

2. Single Nucleotide Polymorphisms (SNPs) :

SNPs are variations in a nucleotide sequence that arise from a change in even a single base (A, G, T or C). At certain sites, therefore, the nucleotide sequences of different individuals differ by a single base.
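A single-base difference of this kind can be illustrated with a short sketch that lists the positions at which two aligned sequences differ. The sequences and the function name `find_snp_sites` are hypothetical. Real SNP calling works on many individuals and accounts for sequencing errors, which this toy comparison does not.

```python
def find_snp_sites(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (1-based position, base in seq_a, base in seq_b) for every mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, base_a, base_b)
        for position, (base_a, base_b) in enumerate(zip(seq_a, seq_b), start=1)
        if base_a != base_b
    ]


# Hypothetical aligned sequences from two individuals.
individual_1 = "ATGCCGTAAGCT"
individual_2 = "ATGCTGTAAGCC"

for position, base_1, base_2 in find_snp_sites(individual_1, individual_2):
    print(f"position {position}: {base_1} -> {base_2}")
```

Here the two sequences differ at positions 5 (C versus T) and 12 (T versus C); every other site is identical.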

In the human genome, SNPs occur at 1.6-3.2 million sites. Depending on which base changes, an SNP may affect gene function.

DNA fingerprinting of individuals is possible because of these genetic variations in the non-coding parts of the genome. The technique is used to identify criminals (for example in rape cases), to resolve parentage disputes, to confirm the identity of individuals, etc.

It should be kept in mind that genetic variations are not always harmful. They can make the body either susceptible or resistant to certain diseases, i.e. some variations protect against particular pathogens (the disease-causing agents).

Genetic variations also govern the severity of illness and the body's response to treatment with medicines. Fig. 4.5 shows the effect of medicine on patients, as decided by a physician on the basis of the SNPs present in the patient's genome.

On average, SNPs occur once in every 500-1,000 nucleotides of human DNA. SNPs can help to (i) associate sequence variation with heritable phenotypes, (ii) facilitate studies in population and evolutionary biology, and (iii) aid in positional cloning and physical mapping.

SNPs lend themselves to highly automated fluidic or DNA chip-based analyses and have quickly become the focus of several large-scale development and mapping projects in humans and other organisms.

3. Comparative Genomics :

When the complete genome sequences of cellular life forms became available, a notable finding was recorded: about one third of the genes encoded in each genome had no predictable or known function.

In E. coli K12 (an all-time favourite model organism of molecular biologists), about 40% of the genes have unknown function. The level of evolutionary conservation of microbial proteins is rather uniform, with about 70% of the gene products from each sequenced genome having homologs in distant genomes.

The function of these genes can be predicted by comparing different genomes and by transferring the functional annotations of proteins from better-studied organisms to their orthologs (direct descendants of a sequence in the common ancestor) in less-studied organisms.
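The annotation-transfer idea can be sketched in a few lines. This minimal example assumes the ortholog pairs have already been established by some other method; all gene names, functions, and the helper `transfer_annotations` are hypothetical placeholders, not entries from any real database.

```python
def transfer_annotations(
    orthologs: dict[str, str],
    reference_annotations: dict[str, str],
) -> dict[str, str]:
    """Copy known functions from reference genes to their orthologs.

    orthologs maps a gene in the less-studied organism to its ortholog
    in the better-studied organism.
    """
    predicted = {}
    for target_gene, reference_gene in orthologs.items():
        function = reference_annotations.get(reference_gene)
        # Only transfer when the reference gene itself has a known function.
        if function is not None:
            predicted[target_gene] = function
    return predicted


# Hypothetical annotations in a better-studied organism.
reference_annotations = {
    "ref_gene_1": "hypothetical DNA repair protein",
    "ref_gene_2": "hypothetical sugar transporter",
}

# Hypothetical ortholog table for a less-studied organism.
orthologs = {
    "new_gene_a": "ref_gene_1",
    "new_gene_b": "ref_gene_2",
    "new_gene_c": "ref_gene_3",  # ortholog exists but is itself unannotated
}

predicted = transfer_annotations(orthologs, reference_annotations)
print(predicted)
```

A gene whose ortholog has no known function (`new_gene_c` here) stays unannotated, which mirrors the situation of the genes with unknown function described above.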

This makes comparative genomics a powerful approach for better understanding genomes and the biology of organisms. Comparative genomics includes several distinct aspects; the analysis of protein sets from completely sequenced genomes is one of them. Several databases (e.g. general-purpose databases and organism-specific databases) are used for comparative genomics.

Genome sequencing projects have made it very clear that the genomes of different organisms (e.g. mouse and man) may be very similar. For example, synteny is found between human chromosome 1 and mouse chromosomes 1, 3, 4 and 8. It has been found that the genomes of human and chimpanzee differ by 1-3%.

About 97.5% of the working DNA is shared between mouse and human. About 12% of the ~18,000 genes of the worm C. elegans show sequence similarity to yeast genes. This figure is based on the proteins encoded by these genes.

About 2,000 genes of yeast (one third of its ~6,000 genes) are functionally similar to genes of C. elegans. Such similarity suggests that, although these organisms diverged in evolution about 100 million years ago, their genomes have not changed much. That is why some percentage of similarity exists between the genes of different organisms.

Examples of Comparative Genomics:

Comparative genomics may be exemplified by the genetic basis of fruit development. Suppose a researcher is interested in identifying the genes involved in the ripening of green mangoes into yellow mangoes.

The biochemical pathway involved in the ripening process is also to be determined. A comparative genomics approach can be used in a purely in silico effort to determine the genes involved in the ripening of mangoes.

In this approach, the genome of the mango is compared with the annotated genomes of similar species to identify its genes and their functions.
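The comparison step can be sketched as a toy "best hit" search. This is only an illustration under loud assumptions: it scores similarity by the overlap of shared 3-base substrings (k-mers), which is a simple stand-in chosen here and not the method of the source, whereas real studies use dedicated sequence alignment tools. All sequences, gene names, functions, and the 0.5 threshold are hypothetical.

```python
def kmers(sequence: str, k: int = 3) -> set[str]:
    """Return the set of all substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def kmer_similarity(seq_a: str, seq_b: str, k: int = 3) -> float:
    """Jaccard similarity of the two k-mer sets (0.0 = nothing shared, 1.0 = identical sets)."""
    kmers_a, kmers_b = kmers(seq_a, k), kmers(seq_b, k)
    if not kmers_a or not kmers_b:
        return 0.0
    return len(kmers_a & kmers_b) / len(kmers_a | kmers_b)


def annotate_by_best_hit(query_genes, reference_genes, min_similarity=0.5):
    """For each query gene, report the most similar annotated reference gene.

    reference_genes maps a gene name to a (sequence, function) pair.
    A query with no hit at or above min_similarity is mapped to None.
    """
    results = {}
    for query_name, query_seq in query_genes.items():
        best_name, (best_seq, best_function) = max(
            reference_genes.items(),
            key=lambda item: kmer_similarity(query_seq, item[1][0]),
        )
        if kmer_similarity(query_seq, best_seq) >= min_similarity:
            results[query_name] = (best_name, best_function)
        else:
            results[query_name] = None
    return results


# Hypothetical annotated genes from a related, well-studied species.
reference_genes = {
    "ref_ripening_gene": ("ATGGCTTCAGGATTCCTGAA", "hypothetical ripening enzyme"),
    "ref_other_gene": ("ATGCCCGGGTTTAAACCCGG", "hypothetical transport protein"),
}

# Hypothetical unannotated mango genes.
mango_genes = {
    "mango_gene_1": "ATGGCTTCAGGATTCCTGAT",
    "mango_gene_2": "TTTTTTTTTTTTTTTTTTTT",
}

for gene, hit in annotate_by_best_hit(mango_genes, reference_genes).items():
    print(gene, "->", hit)
```

In this toy data, `mango_gene_1` differs from the reference ripening gene by one base and inherits its annotation, while `mango_gene_2` resembles neither reference gene closely enough and remains unannotated.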