ABSTRACT

Sequencing of the human genome has revealed that, contrary to expectation, there are comparatively few protein-coding genes: about 20,000–21,000 according to the most recent estimates. These genes vary widely in size and internal organization, with the coding exons often separated by large introns that frequently contain highly repetitive DNA sequences. The human genome is subdivided into a large nuclear genome with more than 26,000 genes (counting RNA genes as well as protein-coding genes), and a very small circular mitochondrial genome with only 37 genes. To validate gene predictions, supportive evidence was sought, mostly by evolutionary comparisons. DNA methylation has important consequences for gene expression and allows particular gene expression patterns to be stably transmitted to daughter cells. Gene duplication has been a common event in the evolution of the large nuclear genomes found in complex eukaryotes. Genes in simple organisms such as bacteria are comparatively similar in size and are usually very short.