The Origin and Evolution of Metabolic Pathways: Why and How did Primordial Cells Construct Metabolic Routes?
© Springer Science+Business Media, LLC 2012
Published: 15 September 2012
The emergence and evolution of metabolic pathways represented a crucial step in molecular and cellular evolution. In fact, the exhaustion of the prebiotic supply of amino acids and other compounds that were likely present on the primordial Earth imposed an important selective pressure, favoring those primordial heterotrophic cells that became able to synthesize those molecules. Thus, the emergence of metabolic pathways allowed primitive organisms to become increasingly less dependent on exogenous sources of organic compounds. Comparative analyses of genes and genomes from organisms belonging to Archaea, Bacteria, and Eukarya reveal that, during evolution, different forces and molecular mechanisms might have driven the shaping of genomes and the emergence of new metabolic abilities. Among these, gene elongation and gene and operon duplications played a crucial role, since they can lead to the (immediate) appearance of new genetic material that, in turn, might undergo evolutionary divergence, giving rise to new genes coding for new metabolic abilities. Concerning the mechanisms of pathway assembly, both the analysis of completely sequenced genomes and directed-evolution experiments strongly support the patchwork hypothesis, according to which metabolic pathways have been assembled through the recruitment of primitive enzymes that could react with a wide range of chemically related substrates. However, the analysis of the structure and organization of genes belonging to ancient metabolic pathways, such as histidine biosynthesis, suggests that a different hypothesis, i.e., the retrograde hypothesis, may account for the evolution of some steps within metabolic pathways.
The Primordial Cells and Metabolism
But how did the expansion of genomes occur? The following section will focus on the molecular mechanisms that guided this transition, i.e., the expansion and the refinement of ancestral metabolic routes, leading to the structure of the extant metabolic pathways.
The Role of Duplication and Fusion of DNA Sequences in the Evolution of Metabolic Pathways in Early Cells
The Starter Types and Explosive Expansion of Metabolism in the Early Cells
Different molecular mechanisms may have been responsible for the expansion of early genomes and metabolic abilities. Data obtained in the last decade clearly indicate that a very large proportion of the gene set of different (micro)organisms is the outcome of more or less ancient gene duplication events predating or following the appearance of the LUCA and involving ancestral genes, referred to as the starter types, a term first coined by Lazcano and Miller (1994), that underwent (many) duplications. These findings strongly suggest that the duplication and divergence of DNA sequences of different size represents one of the most important forces driving the evolution of genes and genomes during the early evolution of life. Indeed, this process may allow the formation of new genes from pre-existing ones. However, there are a number of additional mechanisms that could have increased the rate of metabolic evolution, including the modular assembly of new proteins by gene fusion events and horizontal gene transfer, the latter permitting the transfer of entire metabolic routes or part thereof.
Fate of duplicated genes
The structural and/or functional fate of duplicated genes is an intriguing issue that has led to the proposal of several classes of evolutionary models accounting for the possible scenarios emerging after the appearance of a paralogous gene pair.
Duplication events can generate genes arranged in tandem or scattered at different loci within the genome (Fani 2004; Li and Graur 1991). If an in-tandem duplication occurs, at least two different scenarios for the structural evolution of the two copies can be depicted: (1) the two genes undergo an evolutionary divergence, becoming paralogs; or (2) the two genes fuse, doubling the original size and forming an elongated gene (see below). If, instead, the two copies are not arranged in tandem, either (1) they may become paralogous genes; or (2) one copy may fuse to an adjacent gene with a different function, giving rise to a mosaic or chimeric gene that potentially may evolve to perform other metabolic role(s). Tandem duplications of DNA stretches are often the result of an unequal crossing-over between two DNA molecules, but other processes, such as replication slippage, may be invoked to explain the existence of tandemly arranged paralogous genes. The presence of paralogous genes at different sites within a microbial genome might be the result of ancient activity of transposable elements and/or duplication of genome fragments as well as whole-genome duplications (Fani 2004).
The functional fate of the two (initially) identical gene copies originating from a duplication event depends on the further modifications (evolutionary divergence) that one (or both) of the two redundant copies accumulates during evolution. It can be surmised, in fact, that after a gene duplicates, one of the two copies becomes dispensable and can undergo several types of mutational events, mainly substitutions, that, in turn, can lead to the appearance of a new gene, harboring a different function with respect to the ancestral coding sequence (Fig. 6). On the other hand, duplicated genes can also maintain the same function in the course of evolution, thereby enabling the production of a larger quantity of RNAs or proteins (gene dosage effect).
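The duplication-and-divergence scenario described above can be illustrated with a small toy simulation. This is only a sketch under simplifying assumptions that are not part of the original text: the ancestral sequence is random, only one copy accumulates substitutions, each substitution hits a distinct site, and selection is ignored. The function names and the sequence length are hypothetical.

```python
import random

BASES = "ACGT"

def duplicate(gene):
    """Return two initially identical copies of a gene (a paralogous pair)."""
    return gene, gene

def diverge(seq, n_substitutions, rng):
    """Apply point substitutions at distinct random sites of one copy."""
    seq = list(seq)
    for pos in rng.sample(range(len(seq)), n_substitutions):
        # Always pick a base different from the current one.
        seq[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return "".join(seq)

def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
ancestor = "".join(rng.choice(BASES) for _ in range(60))  # hypothetical gene

copy_a, copy_b = duplicate(ancestor)
copy_b = diverge(copy_b, 12, rng)  # only the redundant copy accumulates changes

print(f"identity between paralogs: {identity(copy_a, copy_b):.2f}")
```

Because the 12 substitutions fall on distinct sites of a 60-base sequence, the two paralogs end up 80% identical, while the untouched copy still matches the ancestor and could keep performing the original function.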
DNA duplications may also concern entire clusters of genes involved in the same metabolic pathway and transcribed from a single promoter into a polycistronic mRNA, i.e., entire operons or part thereof. Thus, we can imagine that if an entire operon A, responsible for the biosynthesis of amino acid A, duplicates, giving rise to a pair of paralogous operons, one of the copies (B) may diverge from the other and evolve in such a way that the encoded enzymes catalyze reactions leading to a different amino acid, B. If this event actually occurs, it might provoke a (rapid) expansion of the metabolic abilities of the cell and an increase of its genome size (Fani and Fondi 2009).
Once acquired, metabolic innovations might have been spread rapidly between microorganisms through horizontal gene transfer mechanisms.
In addition to gene duplication, another route of gene evolution is the fusion of independent cistrons leading to bi- or multifunctional proteins (Brilli and Fani 2004b; Xie et al. 2003). Gene fusions, which have been found in genes of many metabolic pathways, provide a mechanism for the physical association of different catalytic domains or of catalytic and regulatory structures (Jensen 1976). Fusions frequently involve genes coding for proteins that function in a concerted manner, such as enzymes catalyzing sequential steps within a metabolic pathway (Yanai et al. 2002). Fusion of such catalytic centers likely promotes the channeling of intermediates that may be unstable and/or present in low concentration. The high fitness of gene fusions can also rely on the tight regulation of the expression of the fused domains. Even though gene fusion events have been described in many prokaryotes, they may have a special significance among nucleated cells, where the very limited number, if not the complete absence, of operons does not allow the coordinate synthesis of proteins by polycistronic mRNAs.
Gene Duplication and Fusion Acting Together: Gene Elongation
Hypotheses on the Origin and Evolution of Metabolic Pathways
As discussed in the previous sections, the emergence and refinement of basic biosynthetic pathways allowed primitive organisms to become increasingly less dependent on exogenous sources of chemical compounds accumulated in the primitive environment as a result of prebiotic syntheses. But how did these metabolic pathways originate and evolve? And what is the role that the molecular mechanisms described above (gene elongation, duplication, and/or fusion) played in the assembly of metabolic routes? How the major metabolic pathways actually originated is still an open question, but several different theories have been suggested to account for the establishment of metabolic routes. All these ideas are based on gene duplication. Two of them are discussed in the following paragraphs.
Gene duplication has also been invoked in another model, the so-called patchwork hypothesis (Ycas 1974; Jensen 1976), according to which metabolic pathways may have been assembled through the recruitment of primitive enzymes that could react with a wide range of chemically related substrates. Such relatively slow, non-specific enzymes may have enabled primitive cells containing small genomes to overcome their limited coding capabilities. Figure 9 shows a schematic three-step model of the patchwork hypothesis: (a) an ancestral enzyme E0 endowed with low substrate specificity is able to bind to three substrates (S1, S2, and S3) and catalyze three different, but similar, reactions; (b) a duplication of the gene encoding E0 and the subsequent divergence of one of the two copies leads to the appearance of enzyme E2 with an increased and narrowed specificity; and (c) a further gene duplication event, followed by evolutionary divergence, leads to E3. In this way, the ancestral enzyme E0 belonging to a given metabolic route is “recruited” to serve other novel pathways.
The patchwork hypothesis is also consistent with the possibility that an ancestral pathway may have had a primitive enzyme catalyzing two or more similar reactions on related substrates of the same metabolic route and whose substrate specificity was refined as a result of later duplication events.
In this way, primordial cells might have expanded their metabolic capabilities. Additionally, this mechanism may have permitted the evolution of regulatory mechanisms coincident with the development of new pathways (Fani 2004; Lazcano et al. 1995).
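The three-step patchwork scheme described above (an ancestral enzyme E0, then E2 and E3 arising by duplication and divergence) can be written down as a minimal sketch. The `Enzyme` class and the `duplicate_and_specialize` helper are hypothetical names introduced only for illustration. The assignment of E2 to substrate S2 and of E3 to S3 is an assumption suggested by the naming and is not stated in the text; catalytic rates are not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Enzyme:
    name: str
    substrates: frozenset  # substrates the enzyme can act on

    def catalyzes(self, substrate):
        return substrate in self.substrates

def duplicate_and_specialize(ancestor, new_name, substrate):
    """Duplicate the ancestor's gene; the copy diverges toward one substrate."""
    if not ancestor.catalyzes(substrate):
        raise ValueError("the copy can only specialize on an ancestral substrate")
    return Enzyme(new_name, frozenset({substrate}))

# (a) ancestral, low-specificity enzyme acting on three related substrates
e0 = Enzyme("E0", frozenset({"S1", "S2", "S3"}))
# (b) first duplication and divergence gives the narrower E2 (assumed: S2)
e2 = duplicate_and_specialize(e0, "E2", "S2")
# (c) second duplication and divergence gives E3 (assumed: S3)
e3 = duplicate_and_specialize(e0, "E3", "S3")

genome = [e0, e2, e3]
for s in ("S1", "S2", "S3"):
    print(s, [e.name for e in genome if e.catalyzes(s)])
```

In this toy version the ancestral copy keeps its broad specificity, so each duplication adds a specialist without removing any existing activity. A fuller model would also let the retained copy narrow over time.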
The Reconstruction of the Origin and Evolution of Metabolic Pathways
How can the origin and evolution of metabolic pathways be studied and reconstructed? By assuming that useful hints may be inferred from the analysis of metabolic pathways existing in contemporary cells (Peretò et al. 1998), important insights into the evolutionary development of microbial metabolic pathways can be obtained by (1) the use of bioinformatic tools that allow the comparison of genes and genomes from organisms belonging to the three cell domains (Archaea, Bacteria, and Eukarya), and (2) laboratory studies in which new substrates are used as carbon, nitrogen, or energy sources. The latter are the so-called directed-evolution experiments, in which a microbial (typically bacterial) population is subjected to a (strong) selective pressure that leads to the establishment of new phenotypes capable of exploiting different substrates (Clarke 1974; Mortlock and Gallo 1992). By assuming that the processes involved in acquiring new metabolic abilities are comparable to those found in natural populations, directed-evolution experiments can provide useful insights into early cellular evolution (Fani 2004).
Histidine Biosynthesis: A Paradigm for the Study of the Origin and Evolution of Metabolic Pathways
Histidine biosynthesis is a metabolic crossroad and plays an important role in cellular metabolism, being interconnected with both the de novo synthesis of purines and nitrogen metabolism. The connection to purine biosynthesis results from an enzymatic step catalyzed by imidazole glycerol phosphate synthase, a heterodimeric protein composed of one subunit each of the hisH and hisF products (Alifano et al. 1996). Chemical and biological data suggest that histidine was present in the primordial soup and that this biosynthetic route is ancient. It has also been suggested that histidine-containing small peptides could have been involved in the prebiotic formation of other peptides and nucleic acid molecules, once these monomers accumulated in primitive tidal lagoons or ponds (Fani and Fondi 2009 and references therein). If primitive catalysts required histidine, then the eventual exhaustion of the prebiotic supply of histidine and histidine-containing peptides imposed a selective pressure favoring those microorganisms capable of synthesizing histidine. Hence, this metabolic pathway might have been assembled long before the appearance of the LUCA (Brilli and Fani 2004a, b; Fani et al. 1994, 1995; Alifano et al. 1996; Fondi et al. 2009b), but once the entire pathway was assembled, it underwent major rearrangements during evolution, as suggested by the wide variety of clustering strategies of his genes that has been documented.
How the his pathway originated remains an open question, but the analysis of the structure and organization of the his genes, together with their phylogenetic analysis in (micro)organisms belonging to different archaeal, bacterial, and eukaryal lineages, reveals that different molecular mechanisms played an important role in shaping this pathway. Indeed, an impressive series of well-documented duplication (Fani et al. 1994), elongation (Fani et al. 1994), and fusion (Brilli and Fani 2004a, b; Fani et al. 2007) events has shaped this pathway. Therefore, the histidine biosynthetic pathway represents an excellent model for understanding the molecular mechanisms driving the assembly and refinement of metabolic routes.
The Refinement and Expansion of Metabolic Abilities Through a Cascade of Gene Elongation and Duplication Events: hisA and hisF
Gene Fusion in the Assembly of Histidine Biosynthesis
It has been recognized that (at least) seven (hisD, N, B, H, F, I, and E) out of the ten his biosynthetic genes (hisGDCNBHAFIE) underwent different single or multiple fusions in diverse prokaryotic and eukaryal phylogenetic lineages, demonstrating that gene fusion represents one of the most important routes for the evolution of his genes. Recently (Fani et al. 2007), the amino acid sequences of all the available His proteins have been analyzed for (1) gene structure, (2) phylogenetic distribution, (3) timing of appearance, (4) horizontal gene transfer, (5) correlation with gene organization, and (6) biological significance. The data obtained allowed the reconstruction of the evolutionary history of three interesting gene fusions. Quite interestingly, it has been demonstrated that fusion events involving different histidine biosynthetic genes that gave rise to genes coding for bifunctional or multifunctional enzymes, such as hisNB, hisIE, and hisHF, occurred on different evolutionary timescales and in different (micro)organisms, and that they have very different phylogenetic distributions (see below).
The whole body of data permitted the depiction of a likely scenario for the origin and evolution of histidine biosynthetic genes. According to the model proposed (Fani et al. 2007; Fondi et al. 2009b) on the basis of the available data, the complete histidine biosynthetic pathway was assembled long before the appearance of the LUCA, which possessed mono-functional his genes. Concerning the organization of these genes in LUCA, it is not yet possible to establish whether they were (1) scattered throughout its genome, (2) organized in a single more-or-less compact operon, or (3) arranged in a mixed organization (i.e., with some genes scattered and others organized in several mini-operons).
However, it is quite clear that after the divergence from LUCA, the organization of histidine biosynthetic genes underwent several different rearrangements.
Concerning the structure of his genes, the only “universal” gene fusion concerns the hisA and hisF genes, which are the outcome of a cascade of (at least) two gene elongation events followed by a paralogous gene duplication. This suggests that the two elongation events as well as the paralogous duplication event leading to hisA and hisF are very ancient, i.e., they predate the appearance of LUCA. During the early steps of molecular evolution, hisA and its copies underwent multiple duplication events leading to a paralogous gene family. The fusion between hisI and hisE occurred more than once in Bacteria, indicating a phenomenon of convergent evolution. Moreover, this gene might have been horizontally transferred (Fani et al. 2007). The hisNB fusion is a relatively recent evolutionary event that occurred in the γ-branch of proteobacteria. This fusion was parallel to the introgression of hisN into an already formed and more or less compact his operon. Having once occurred, the fusion was fixed and transferred to other proteobacteria and/or the CFB group along with the entire operon or part thereof. The fusions involving hisH and hisF were found only in two bacteria.
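The cascade proposed above for hisA and hisF (two gene elongation events followed by a paralogous duplication) can be pictured with a very small sketch. Genes are represented here as lists of abstract modules rather than real sequences. The names `elongate`, `duplicate`, and `starter` are hypothetical, and the sketch assumes exactly two elongation events, whereas the text says "at least" two.

```python
def elongate(gene):
    """In-tandem duplication followed by fusion: the gene doubles in size."""
    return gene + gene

def duplicate(gene):
    """Paralogous duplication: two independent copies of the same gene."""
    return list(gene), list(gene)

starter = ["module"]             # hypothetical ancestral mini-gene
half = elongate(starter)         # first elongation event
full = elongate(half)            # second elongation event
his_a, his_f = duplicate(full)   # paralogous duplication (ancestors of hisA, hisF)

print(len(starter), len(half), len(full))  # 1 2 4
```

Under these assumptions each paralog carries four copies of the starting module. Subsequent divergence of the two copies, which is not modeled here, would then account for their different functions.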
Metabolic pathways of the earliest heterotrophic organisms arose during the exhaustion of the prebiotic compounds present in the primordial soup.
In the course of molecular and cellular evolution, different mechanisms and different forces might have concurred in the emergence of new metabolic abilities and the shaping of metabolic routes. However, duplication of DNA regions represents a major force of gene and genome evolution. The evidence for gene elongation, gene duplication, and operon duplication events suggests, in fact, that the ancestral forms of life might have expanded their coding abilities and their genomes by “simply” duplicating a small number of mini-genes (the starter types) via a cascade of duplication events involving DNA sequences of different size. In addition, gene fusion also played an important role in the construction and assembly of chimeric genes.
The dissemination of metabolic routes between microorganisms might have been facilitated by horizontal transfer events. The increasing frequency of protein phylogenies that are in conflict with the conventional universal tree (Brown and Doolittle 1997) and the finding that the horizontal transfer of genetic information is pervasive among microbial lineages and may occur across different phylogenetic kingdoms (Gogarten et al. 1996; Lazcano and Miller 1996) indicate that this mechanism played a major role in shaping genome architectures and in fostering genetic adaptation and evolution. The horizontal transfer of entire metabolic pathways or part thereof might have had a special role during the early stages of cellular evolution.
There are many different schemes that can be proposed for the emergence and evolution of metabolic pathways, depending on the available prebiotic compounds and the enzymes that had previously evolved. Even though most data coming from the analysis of completely sequenced genomes and directed-evolution experiments strongly support the patchwork hypothesis, we do not think that all the metabolic pathways arose in the same manner. In our opinion, the different schemes might not be mutually exclusive. Thus, some of the earliest pathways may have arisen from the Horowitz scheme, some from the semi-enzymatic proposal, and later ones from Jensen’s enzyme recruitment hypothesis. However, other ancient pathways, including histidine biosynthesis, might have been assembled using (at least) two different schemes (Horowitz and Jensen).
- Alifano P, Fani R, Liò P, Lazcano A, Bazzicalupo M, Carlomagno MS, Bruni CB. Histidine biosynthetic pathway and genes: structure, regulation and evolution. Microbiol Rev. 1996;60:44–69.
- Brilli M, Fani R. Molecular evolution of hisB genes. J Mol Evol. 2004a;58:225–37.
- Brilli M, Fani R. The origin and evolution of eukaryal HIS7 genes: from metabolon to bifunctional proteins? Gene. 2004b;339:149–60.
- Brown JR, Doolittle WF. Archaea and the prokaryote-to-eukaryote transition. Microbiol Mol Biol Rev. 1997;61:456–502.
- Clarke PH. The evolution of enzymes for the utilization of novel substrates. Cambridge: Cambridge University Press; 1974.
- Copley RR, Bork P. Homology among (βα)8 barrels: implications for the evolution of metabolic pathways. J Mol Biol. 2000;303:627–41.
- Fani R. Gene duplication and gene loading. In: Microbial evolution: gene establishment, survival, and exchange. Washington, DC: ASM; 2004.
- Fani R, Fondi M. Origin and evolution of metabolic pathways. Phys Life Rev. 2009;6:23–52.
- Fani R, Chiarelli I, Liò P, Bazzicalupo M. The evolution of the histidine biosynthetic genes in prokaryotes: a common ancestor for the hisA and hisF genes. J Mol Evol. 1994;38:489–95.
- Fani R, Liò P, Lazcano A. Molecular evolution of the histidine biosynthetic pathway. J Mol Evol. 1995;41:760–74.
- Fani R, Brilli M, Fondi M, Liò P. The role of gene fusions in the evolution of metabolic pathways: the histidine biosynthesis case. BMC Evol Biol. 2007;7 Suppl 2:S4.
- Fondi M, Emiliani G, Fani R. Origin and evolution of operons and metabolic pathways. Res Microbiol. 2009a;160:502–12.
- Fondi M, Emiliani G, Liò P, Gribaldo S, Fani R. The evolution of histidine biosynthesis in Archaea: insights into his genes structure and organization in LUCA. J Mol Evol. 2009b;69:512–26.
- Gogarten JP, Hilario E, Olendzenski L. Gene duplications and horizontal gene transfer during early evolution. In: Roberts DML, Sharp P, Alderson G, Collins MA, editors. Evolution of microbial life. Cambridge: Cambridge University Press; 1996. p. 1996.
- Holliday GL, Fischer JD, Mitchell BO, Thornton JM. Characterizing the complexity of enzymes on the basis of their mechanisms and structures with a bio-computational analysis. FEBS J. 2011;278:3835–45.
- Horowitz NH. On the evolution of biochemical syntheses. Proc Natl Acad Sci USA. 1945;31:153–7.
- Horowitz NH. The evolution of biochemical syntheses—retrospect and prospect. In: Bryson V, Vogel HJ, editors. Evolving genes and proteins. New York: Academic; 1965. p. 15–23.
- Jensen RA. Enzyme recruitment in evolution of new function. Annu Rev Microbiol. 1976;30:409–25.
- Lazcano A, Miller SL. How long did it take for life to begin and evolve to cyanobacteria? J Mol Evol. 1994;34:546–54.
- Lazcano A, Miller SL. The origin and early evolution of life: prebiotic chemistry, the pre-RNA world, and time. Cell. 1996;85:793–8.
- Lazcano A, Diaz-Villagomez E, Mills T, Orò J. On the levels of enzymatic substrate: implications for the early evolution of metabolic pathways. Adv Space Res. 1995;15:345–56.
- Lewis EB. Pseudoallelism and gene evolution. Cold Spring Harb Symp Quant Biol. 1951;16:159–74.
- Li WH, Graur D. Fundamentals of molecular evolution. Sunderland: Sinauer; 1991.
- Miller SL. Production of amino acids under possible primitive earth conditions. Science. 1953;117:528–9.
- Mortlock RP, Gallo MA. Experiments in the evolution of catabolic pathways using modern bacteria. In: Mortlock RP, Gallo MA, editors. The evolution of metabolic functions. Boca Raton: CRC; 1992.
- Ohno S. Evolution by gene duplication. Berlin: Springer; 1970.
- Oparin AI. Proiskhozhdenie zhizny. Moscow: Izd. Moskovhii RabochiI; 1924.
- Peretò J, Fani R, Leguina JI, Lazcano A. Enzyme evolution and the development of metabolic pathways. In: Cornish-Bowden A, editor. New beer in an old bottle: Eduard Buchner and the growth of biochemical knowledge. Valencia: Universitat de Valencia; 1998. p. 173–98.
- Xie G, Keyhani NO, Bonner CA, Jensen RA. Ancient origin of the tryptophan operon and the dynamics of evolutionary change. Microbiol Mol Biol Rev. 2003;67:303–42.
- Yanai I, Wolf YI, Koonin EV. Evolution of gene fusions: horizontal transfer versus independent events. Genome Biol. 2002;3.
- Ycas M. On earlier states of the biochemical system. J Theor Biol. 1974;44:145–60.