The strategy of the Human Genome Project was to prepare a series of maps of the human chromosomes at progressively finer resolution. To do this, the chromosomes were divided into small pieces that could be cloned.
The cloned pieces were then arranged in an order corresponding to their locations on the chromosome. Once the map had been prepared, each ordered fragment was sequenced.
Several methods are available for small-scale sequencing, but they do not sequence an entire genome. This section discusses two direct methods of genome sequencing and one indirect method (which uses mRNA rather than DNA).
1. Direct Sequencing of Bacterial Artificial Chromosome (BAC) :
BAC vectors are stable and can carry complex foreign DNA of 80-100 kb into E. coli cells. BACs are therefore used to construct genomic libraries. The library is screened by searching for restriction fragments that clones have in common.
BAC clones are then mapped to determine the arrays of overlapping clones, called contigs (contiguous clones). The large DNA fragments of the mapped contigs are broken into small pieces, and these are sequenced. Direct sequencing thus involves sequencing small pieces of DNA taken from adjacent stretches of a chromosome.
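The mapping step can be illustrated with a minimal sketch in Python. It assumes that each BAC clone is described by the set of its restriction fragment sizes (a "fingerprint") and that two clones overlap when they share at least a chosen number of fragments. The clone names, fragment sizes, the `min_shared` threshold and the greedy ordering are all hypothetical choices made for illustration, not the procedure used by any particular mapping project.

```python
from itertools import combinations


def find_overlaps(fingerprints, min_shared=2):
    """Map each clone to the clones sharing >= min_shared fragments with it."""
    overlaps = {name: set() for name in fingerprints}
    for a, b in combinations(fingerprints, 2):
        if len(fingerprints[a] & fingerprints[b]) >= min_shared:
            overlaps[a].add(b)
            overlaps[b].add(a)
    return overlaps


def order_contig(fingerprints, min_shared=2):
    """Greedily order clones into one contig, starting from an end clone."""
    overlaps = find_overlaps(fingerprints, min_shared)
    # An end clone of a linear contig overlaps exactly one other clone.
    ends = sorted(name for name, nbrs in overlaps.items() if len(nbrs) == 1)
    if not ends:
        return []
    path = [ends[0]]
    placed = {ends[0]}
    while True:
        candidates = sorted(overlaps[path[-1]] - placed)
        if not candidates:
            break
        current = fingerprints[path[-1]]
        # Extend with the unplaced neighbour sharing the most fragments.
        nxt = max(candidates, key=lambda c: len(current & fingerprints[c]))
        path.append(nxt)
        placed.add(nxt)
    return path


# Hypothetical fingerprints: restriction fragment sizes in kb.
example_clones = {
    "bac_C": {15, 4, 8, 20},
    "bac_A": {12, 7, 3, 9},
    "bac_D": {8, 20, 5, 11},
    "bac_B": {3, 9, 15, 4},
}

print(order_contig(example_clones))
```

In this toy data each neighbouring pair shares exactly two fragments, so the clones fall into a single chain. Real fingerprints contain measurement error and chance matches, which this sketch ignores.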
2. Random Shotgun Sequencing :
In this approach, different types of cloning vector are used depending on the genome size of the organism. For example, shotgun libraries of genomic DNA are constructed in small-insert and medium-insert plasmid vectors (inserts of up to about 10 kb).
Most libraries from organisms with larger genomes, however, are constructed in phage λ, cosmid, BAC or YAC vectors, which accept DNA inserts of up to about 23, 45, 350, and 1,000 kb respectively.
The cleaved genomic DNA consists of many small fragments, which are inserted at random into plasmids. Both the small-insert and the medium-insert plasmid libraries are sequenced. Randomly selected clones are sequenced from both ends of the insert, and enough clones are read to cover the genome at least three times.
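The meaning of covering the genome "three times" can be made concrete with a little arithmetic. The sketch below assumes that coverage is the total number of bases read divided by the genome size, and that reads fall uniformly at random on the genome (a simplified Poisson model that the text itself does not state). The genome size and read length are hypothetical values chosen only for illustration.

```python
import math


def reads_needed(genome_size_bp, read_length_bp, coverage):
    """Number of reads whose combined length equals coverage x genome size."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def expected_fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read (Poisson model)."""
    return 1.0 - math.exp(-coverage)


# Hypothetical example: a 1.8 Mb genome read in 500-base stretches.
genome_size = 1_800_000
read_length = 500

n_reads = reads_needed(genome_size, read_length, coverage=3)
n_clones = math.ceil(n_reads / 2)  # each clone is read from both ends

print(n_reads, n_clones)
print(round(expected_fraction_covered(3), 3))
```

Under this idealised model, threefold coverage still leaves a few percent of bases unread, which is one reason random sequencing alone does not finish a genome.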
If you sample a few random plasmids containing genomic DNA, you will find that (i) some plasmids carry inserts entirely different from the others, and (ii) some plasmids carry inserts that share part of their sequence with another insert, i.e. overlapping inserts.
Overlapping inserts come from adjacent, partly shared regions of the genome, each extending to the left or right of the other. Both ends of each insert, whether overlapping or not, are sequenced.
The sequence information is entered into a computer database. A computer programme identifies the overlapping sequences and joins all of them into one contiguous stretch.
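How a programme might join overlapping sequences can be sketched with a greedy merge: repeatedly find the two sequences with the longest suffix-to-prefix overlap and fuse them. This is a simplified illustration under stated assumptions (error-free reads, all from the same strand, no repeats), not the algorithm of any actual assembly programme. The function names, the toy reads and the `min_overlap` value are hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Merge reads pairwise by longest overlap until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap_length(a, b, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # remaining contigs do not overlap: gaps are left
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]
    return contigs


# Toy reads taken from the hypothetical sequence ATGGCGTACGTTAGC.
toy_reads = ["ACGTTAGC", "ATGGCGTA", "CGTACGTT"]
print(greedy_assemble(toy_reads))
```

When the reads do not overlap, the function returns more than one contig, which corresponds to the gaps that have to be closed separately.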
Even so, a given sample of clones is unlikely to include every insert needed for complete sequence information. Where regions are missing, those specific regions of the genome are cleaved, cloned and sequenced separately.
In this way the complete sequence is generated. The approach rapidly reveals about 90% of the desired sequence information, and the few remaining gaps are closed by directed sequencing with oligonucleotide primers. The shotgun strategy relies on enormous computing power to assemble the randomly generated sequences.
3. The Expressed Sequence Tag (EST) Approach :
The EST approach was pioneered by J. Craig Venter and co-workers at the National Institutes of Health (NIH), U.S.A., in the early 1990s. Venter developed a new method of investigating genes by focusing on the active portion of the genome, represented by mRNA.
Venter and co-workers isolated mRNA molecules (instead of fragments of genomic DNA) and used them to construct complementary DNA (cDNA) molecules. They treated this cDNA like a piece of chromosomal DNA and sequenced part of it to create ‘expressed sequence tags’ (ESTs).
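The path from mRNA to an EST can be sketched as string operations. The example assumes that the first cDNA strand is the reverse complement of the mRNA and that an EST is simply a short read from one end of the cDNA copy. The function names, the toy mRNA and the tag length are hypothetical, and real ESTs are several hundred bases long.

```python
# Base pairing used when copying RNA into DNA (U in RNA pairs with A).
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """DNA strand complementary to the mRNA, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))


def est_tags(mrna, tag_length=4):
    """Return short reads from the 5' and 3' ends of the cDNA copy."""
    sense_strand = mrna.upper().replace("U", "T")  # same sense as the mRNA
    return sense_strand[:tag_length], sense_strand[-tag_length:]


toy_mrna = "AUGGCUUACGGA"  # hypothetical 12-base message

print(first_strand_cdna(toy_mrna))
print(est_tags(toy_mrna))
```

Because the tags come only from expressed messages, each one marks a gene that is active in the tissue from which the mRNA was isolated.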
The ESTs were used as handles for isolating the complete genes. The EST strategy generated large databases of nucleotide sequences, which helped to prepare a preliminary transcript map of the human genome. The EST technique demonstrated that sequencing all the genes was feasible, and it boosted the growth of the genomics industry.
Thereafter, Venter turned to sequencing the entire human genome by the whole-genome shotgun strategy.
In this approach, he matched the overlapping ends of DNA fragments, fitted the contiguous pieces together and assembled the genome sequence with a ‘genome assembly programme’. The validity of the method was demonstrated by sequencing the entire genomes of some microorganisms.