
Attenuation regulation of amino acid biosynthetic operons in proteobacteria: comparative genomics analysis

Alexey G. Vitreschak, Elena V. Lyubetskaya, Maxim A. Shirshin, Mikhail S. Gelfand, Vassily A. Lyubetsky
DOI: http://dx.doi.org/10.1111/j.1574-6968.2004.tb09555.x 357-370 First published online: 9 January 2006


Candidate attenuators were identified that regulate operons responsible for biosynthesis of branched amino acids, histidine, threonine, tryptophan, and phenylalanine in γ- and α-proteobacteria, and in some cases in low-GC Gram-positive bacteria, Thermotogales and Bacteroidetes/Chlorobi. This allowed us not only to describe the evolutionary dynamics of regulation by attenuation of transcription, but also to annotate a number of hypothetical genes. In particular, orthologs of ygeA of Escherichia coli were assigned the branched chain amino acid racemase function. Three new families of histidine transporters were predicted, orthologs of yuiF and yvsH of Bacillus subtilis, and lysQ of Lactococcus lactis. In Pasteurellales, the single bifunctional aspartate kinase/homoserine dehydrogenase gene thrA was predicted to be regulated not only by threonine and isoleucine, as in E. coli, but also by methionine. In α-proteobacteria, the single acetolactate synthase operon ilvIH was predicted to be regulated by branched amino acids-dependent attenuators. Histidine biosynthetic operons his were predicted to be regulated by histidine-dependent attenuators in Bacillus cereus and Clostridium difficile, and by histidine T-boxes in L. lactis and Streptococcus mutans.

  • Regulation of gene expression
  • Attenuation
  • Branched-chain amino acids
  • Histidine
  • Threonine
  • Aromatic amino acids

1 Introduction

Bacteria use many different regulatory mechanisms to control transcription and translation of genes in response to the concentration of metabolic products. One possible target for regulation is the nascent transcript during transcription elongation. Attenuation or antitermination mechanisms that involve formation of alternative RNA structures have been observed in diverse bacterial groups, with different molecules influencing the choice between these structures [1, 2]. In enteric bacteria, many amino acid biosynthetic operons (trp, his, leu, ilvGMEDA, ilvBN, and thr) as well as the phenylalanyl-tRNA synthetase operon pheST are regulated by transcription attenuation [3]. This mechanism is based on coupling between transcription and translation. The nascent leader transcript contains a short open reading frame that encodes the leader peptide. Soon after transcription initiation, a secondary structure element (1:2) forms that causes RNA polymerase to pause (Fig. 1A). This pause allows the ribosome to initiate translation of the leader peptide. The translating ribosome then disrupts the paused complex and transcription resumes, coupled with translation. From this point, two outcomes are possible, depending on the level of the relevant amino acid in the cell. Under amino acid starvation, the level of charged tRNA is low, which causes the ribosome to stall at codons for this amino acid (regulatory codons). As transcription proceeds, the antiterminator structure (2:3) folds and prevents terminator formation, resulting in transcription readthrough into downstream genes (Fig. 1B). Under amino acid excess, the level of charged tRNA is high and translation proceeds efficiently to the stop codon of the leader peptide. As the ribosome translates the leader peptide, it prevents formation of the antiterminator structure, thereby promoting formation of the terminator (3:4), which causes premature termination of transcription (Fig. 1C).
Thus, the ribosome plays the role of a mediator, sensing the concentration of charged tRNA, which in turn depends on the concentration of the amino acid. Expression of an operon corresponding to a biosynthetic pathway common for several amino acids may be regulated by all of these amino acids, and in this case the leader peptide reading frame contains several types of regulatory codons, for all amino acids.
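
The decision logic described above can be summarized in a toy model. This is only a sketch for illustration: the function name, the numeric scale of the charged tRNA level, and the stalling threshold are assumptions introduced here and do not come from the experimental literature.

```python
def attenuation_outcome(charged_trna_level, stall_threshold=0.5):
    """Toy model of leader-peptide-dependent attenuation.

    charged_trna_level and stall_threshold are hypothetical, unitless values
    used only to separate "starvation" from "excess".
    """
    # Low charged tRNA: the ribosome stalls at the regulatory codons.
    ribosome_stalls = charged_trna_level < stall_threshold
    if ribosome_stalls:
        # The antiterminator (2:3) folds, so the terminator cannot form.
        return {"structure": "2:3", "operon_transcribed": True}
    # The ribosome reaches the stop codon and blocks the antiterminator,
    # so the terminator (3:4) forms and transcription ends prematurely.
    return {"structure": "3:4", "operon_transcribed": False}


starvation = attenuation_outcome(0.1)
excess = attenuation_outcome(0.9)
print(starvation, excess)
```

The model is deliberately binary; real attenuators give a graded response that depends on the number of regulatory codons and on the kinetics of transcription and translation.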

Figure 1

The mechanism of the leader-peptide-dependent transcriptional attenuation of amino acid biosynthetic genes in bacteria. (1:2) – pause hairpin. Two alternative conformations of the 5′ UTR leader mRNA are shown, termination (1:2)/(3:4) and antitermination (2:3).

Comparative analysis of bacterial genomes is a powerful approach to the analysis of regulation on the DNA or RNA levels and reconstruction of metabolic pathways [4–6]. Using available experimental data as a training set, we developed a program for prediction of attenuators (named LLLM [7, 38]) and applied it to the analysis of upstream regions of orthologous amino acid biosynthetic genes. This resulted in identification of candidate attenuators not only in γ-proteobacteria, but also in α- and β-proteobacteria, low-GC Gram-positive bacteria, and bacteria from some other taxa (Table 1). Analysis of regulatory peptide open reading frames allowed for prediction of the regulating amino acids. Finally, analysis of positional clustering of genes and regulatory signals led to identification of new candidate members of the biosynthetic pathways of branched chain amino acids, histidine, threonine, and aromatic amino acids.

Table 1

The list of genomes with taxonomy and abbreviations

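Taxon                   Order              Species                                 Abbreviation
α-Proteobacteria        Rhizobiales        Sinorhizobium meliloti                  SM
α-Proteobacteria        Rhizobiales        Agrobacterium tumefaciens               ATU
α-Proteobacteria        Rhizobiales        Rhizobium leguminosarum                 LE
α-Proteobacteria        Rhizobiales        Mesorhizobium loti                      MLO
α-Proteobacteria        Rhizobiales        Bradyrhizobium japonicum                BJA
α-Proteobacteria        Rhizobiales        Rhodopseudomonas palustris              RPA
α-Proteobacteria        Rhizobiales        Brucella melitensis                     BME
α-Proteobacteria        Sphingomonadales   Sphingomonas aromaticivorans #          SAR
α-Proteobacteria        Rhodobacterales    Rhodobacter sphaeroides #               RS
α-Proteobacteria        Rhodospirillales   Magnetospirillum magnetotacticum #      MMA
α-Proteobacteria        Rhodospirillales   Rhodospirillum rubrum #                 RR
α-Proteobacteria        Rickettsiales      Rickettsia prowazekii                   RP
α-Proteobacteria        Caulobacterales    Caulobacter crescentus                  CO
β-Proteobacteria        -                  Burkholderia pseudomallei #             BPS
β-Proteobacteria        -                  Ralstonia solanacearum                  RSO
β-Proteobacteria        -                  Nitrosomonas europaea                   NE
β-Proteobacteria        -                  Bordetella pertussis                    BP
β-Proteobacteria        -                  Neisseria meningitidis                  NM
γ-Proteobacteria        Enterobacteriales  Escherichia coli                        EC
γ-Proteobacteria        Enterobacteriales  Salmonella typhi                        TY
γ-Proteobacteria        Enterobacteriales  Klebsiella pneumoniae #                 KP
γ-Proteobacteria        Enterobacteriales  Erwinia carotovora                      EO
γ-Proteobacteria        Enterobacteriales  Yersinia pestis                         YP
γ-Proteobacteria        Pasteurellales     Haemophilus influenzae                  HI
γ-Proteobacteria        Pasteurellales     Pasteurella multocida                   VK
γ-Proteobacteria        Pasteurellales     Actinobacillus actinomycetemcomitans #  AB
γ-Proteobacteria        Pasteurellales     Mannheimia haemolytica #                PQ
γ-Proteobacteria        Vibrionales        Vibrio cholerae                         VC
γ-Proteobacteria        Vibrionales        Vibrio vulnificus                       VV
γ-Proteobacteria        Vibrionales        Vibrio parahaemolyticus                 VP
γ-Proteobacteria        Alteromonadales    Shewanella oneidensis                   SH
γ-Proteobacteria        Alteromonadales    Microbulbifer degradans #               MDE
γ-Proteobacteria        Pseudomonadales    Pseudomonas aeruginosa                  PA
γ-Proteobacteria        Pseudomonadales    Pseudomonas putida                      PP
γ-Proteobacteria        Pseudomonadales    Pseudomonas fluorescens #               PU
γ-Proteobacteria        Pseudomonadales    Pseudomonas syringae                    PY
γ-Proteobacteria        Pseudomonadales    Azotobacter vinelandii #                AV
γ-Proteobacteria        Pseudomonadales    Acinetobacter spp. #                    AC
γ-Proteobacteria        Xanthomonadales    Xanthomonas campestris                  XCA
γ-Proteobacteria        Xanthomonadales    Xylella fastidiosa                      XFA
Firmicutes              Bacillales         Bacillus subtilis                       BS
Firmicutes              Bacillales         Bacillus cereus                         ZC
Firmicutes              Bacillales         Bacillus halodurans                     HD
Firmicutes              Bacillales         Bacillus stearothermophilus #           BE
Firmicutes              Bacillales         Oceanobacillus iheyensis                OI
Firmicutes              Lactobacillales    Enterococcus faecalis                   EF
Firmicutes              Lactobacillales    Enterococcus faecium                    EFA
Firmicutes              Lactobacillales    Streptococcus mutans                    SM
Firmicutes              Lactobacillales    Streptococcus pyogenes                  ST
Firmicutes              Lactobacillales    Streptococcus pneumoniae                SPY
Firmicutes              Lactobacillales    Streptococcus equi #                    SEQ
Firmicutes              Lactobacillales    Streptococcus agalactiae                SAQ
Firmicutes              Clostridiales      Clostridium acetobutylicum              CA
Firmicutes              Clostridiales      Clostridium perfringens                 CP
Firmicutes              Clostridiales      Clostridium botulinum                   CB
Firmicutes              Clostridiales      Clostridium difficile #                 DF
Firmicutes              Clostridiales      Clostridium tetani                      CT
Firmicutes              Clostridiales      Clostridium thermocellum                CTE
Bacteroidetes/Chlorobi  -                  Bacteroides fragilis                    BX
Bacteroidetes/Chlorobi  -                  Porphyromonas gingivalis                PFI
Thermotogae             -                  Thermotoga maritima                     TM
Thermotogae             -                  Petrotoga miotherma                     PMI
Deinococcus/Thermus     -                  Deinococcus radiodurans                 DR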
  • Unfinished genomes are marked by #.

Three branched-chain amino acids, leucine, isoleucine and valine, are metabolically coupled in a common biosynthetic pathway, which consists of two parts (Fig. 2A). In the first part, the metabolic pathway starts from pyruvate and proceeds to valine through acetolactate synthase (IlvIH, IlvBN, and IlvGM), ketol-acid reductoisomerase (IlvC), dihydroxy-acid dehydratase (IlvD), and branched-chain amino acid aminotransferase (IlvE). Biosynthesis of leucine starts from one of the intermediates, 2-oxoisovalerate, and proceeds through 2-isopropylmalate synthase (LeuA), 3-isopropylmalate dehydratase (LeuDC), and branched-chain amino acid aminotransferase (IlvE). In the second part, the metabolic pathway starts from 2-oxobutanoate and the same proteins (IlvIH, IlvBN, IlvGM; IlvC, IlvD, and IlvE) are involved in the biosynthesis of another branched-chain amino acid, isoleucine.

Figure 2

Selected amino acid biosynthetic pathways of γ- and α-proteobacteria. (A) ILV (isoleucine, leucine, and valine); (B) HIS (histidine); (C) THR (threonine); and (D) aromatic amino acids (tryptophan, tyrosine, and phenylalanine).

In Escherichia coli, isoleucine, leucine, and valine biosynthetic genes (“ILV genes” below) are clustered in several operons, ilvGMEDA, ilvBN, ilvC, ilvIH, and leuABCD [8]. Three paralogs of acetolactate synthase are encoded by genes ilvBN, ilvIH, and ilvGM from three different transcriptional units. The ilvBN and ilvIH genes are transcribed as separate operons, whereas ilvGM is located within the ilvGMEDA operon. The ilvGMEDA and ilvBN operons are regulated by transcription attenuation, and the leader peptide reading frame of the attenuator contains regulatory codons for all three amino acids, isoleucine, leucine, and valine [9]. The leuABCD operon contains genes for leucine biosynthesis, and expression of this operon is also regulated by transcription attenuation [10]. The leader peptide of the leu transcription attenuator includes regulatory codons for only one amino acid, leucine. These and other operons are also regulated by repressors of transcription: ilvC by IlvY, and the ilvIH and ilvGMEDA operons by LRP [11–14].

The histidine biosynthesis pathway consists of 10 steps and starts from 5-phosphoribosyl diphosphate, a product of the pentose phosphate pathway (Fig. 2B). The histidine biosynthesis in E. coli involves nine enzymes: HisGEIAFHBCD, HisF, and HisH being isozymes [15]. All genes of the histidine pathway are known to form one his operon regulated via transcription attenuation [16]. The leader peptide reading frame of the histidine attenuator includes a run of histidine regulatory codons.

The threonine biosynthesis is linked with biosynthesis of other amino acids, aspartate, lysine, methionine, and branched chain amino acids (Fig. 2C). A part of the pathway, which is common for threonine, methionine, and lysine biosynthesis, starts from aspartate. E. coli has three aspartate kinase isozymes, ThrA, MetL, and LysC, that catalyze the conversion of aspartate to 4-aspartylphosphate [17, 18]. ThrA and MetL have an additional homoserine dehydrogenase (Hom) domain that catalyzes conversion of aspartate 4-semialdehyde to homoserine. The biosynthesis of branched chain amino acids starts at threonine (Fig. 2C).

In E. coli, expression of the three isozyme genes, thrA, metL, and lysC, is under different regulation. Transcription of the thrABC operon is regulated by a threonine-isoleucine-dependent attenuator [19]. Regulation of the thrABC operon by isoleucine is an interesting example of repression by a distant product (biosynthesis of branched-chain amino acids is known to start from threonine). The aspartokinase activity of ThrA is feedback-inhibited by threonine [17]. The metBL operon is regulated by the repressor MetJ in response to the concentration of S-adenosylmethionine [18]. Finally, lysC is possibly regulated by a lysine riboswitch, the LYS-element, in response to the concentration of lysine (mutations in the leader region of lysC release the lysine repression in E. coli [20] and, moreover, a LYS-element is located upstream of lysC [21–23]), whereas the aspartokinase activity of LysC is feedback-inhibited by lysine. Thus, the expression and activity of the ThrA, MetL, and LysC isozymes are controlled by the concentration of the respective amino acids.

Biosynthesis of three aromatic amino acids, tryptophan, phenylalanine, and tyrosine, is metabolically coupled (Fig. 2D) [24]. It starts with the common pathway leading from phosphoenolpyruvate and erythrose 4-phosphate through 3-deoxy-D-arabino-heptulosonate-7-phosphate and shikimate to chorismate. Then the pathway divides into the terminal pathways, specific for each aromatic amino acid [24].

The trp operon of E. coli is regulated both by transcription attenuation and transcription repression. Transcription repressor TrpR regulates transcription initiation [25], whereas premature termination of transcription is under control of an attenuator containing two tryptophan codons [26]. The pheA gene, encoding chorismate/prephenate dehydratase, and the pheST operon, encoding phenylalanyl-tRNA synthetase, are regulated by phenylalanine-dependent attenuation [27, 28]. In the α-proteobacterium Rhizobium meliloti, the trp(E/G) gene is known to be regulated by transcriptional attenuation [29]. In Gram-positive bacteria, tryptophan biosynthetic genes are known to be regulated by the T-box antitermination mechanism or by TRAP [30, 31]. Previously we have analyzed regulation of aromatic amino acids in γ-proteobacteria [32]. Here we extend this analysis, considering newly sequenced genomes from all proteobacteria.

2 Data and methods

Complete and partial sequences of bacterial genomes were downloaded from GenBank [33]. Preliminary sequence data were also obtained from the WWW sites of The Institute for Genomic Research (http://www.tigr.org), University of Oklahoma's Advanced Center for Genome Technology (http://www.genome.ou.edu), the Sanger Centre (http://www.sanger.ac.uk), the DOE Joint Genome Institute (http://www.jgi.doe.gov), and the ERGO Database [34]. The list of genomes with taxonomy and abbreviations is given in Table 1.

Protein similarity search was done using the Smith–Waterman algorithm implemented in the GenomeExplorer program [35]. Orthologous proteins were initially defined by the best bidirectional hit criterion [36] and, if necessary, confirmed by construction of phylogenetic trees for the corresponding protein families. The phylogenetic trees were constructed by the maximum likelihood method implemented in PHYLIP [37]. Multiple sequence alignments were done using CLUSTALW [38]. Transmembrane segments were predicted using TMpred (http://www.ch.embnet.org/software/TMPRED_form.html). The COG [36] and InterPro [39] databases were used to verify the protein functional and structural annotation.
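
The best bidirectional hit criterion can be illustrated with a short sketch. It assumes that pairwise similarity scores between proteins of two genomes are already available as dictionaries; the function names, protein identifiers, and scores are hypothetical and are not taken from GenomeExplorer.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical proteins a1, a2 (genome A) and b1, b2 (genome B).
a_vs_b = {("a1", "b1"): 900, ("a1", "b2"): 200, ("a2", "b1"): 400}
b_vs_a = {("b1", "a1"): 900, ("b1", "a2"): 400, ("b2", "a1"): 200}
print(bidirectional_best_hits(a_vs_b, b_vs_a))
```

Here a2 also hits b1 best, but b1 prefers a1, so only the a1/b1 pair is retained. Paralogs such as a2 are the cases where a phylogenetic tree would be needed for confirmation.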

Attenuators of transcription were found using the LLLM program. This program identifies candidate attenuators defined as alternative RNA hairpins such that the upstream hairpin overlaps a short open reading frame (candidate leader peptide) containing runs of regulatory codons, whereas the downstream hairpin is a candidate terminator followed by a run of Us. For details see [7, 40, 41].
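
The two signals named in this definition can be illustrated with a deliberately simplified sketch. It is not the LLLM algorithm: it only looks for short open reading frames with a run of regulatory codons and for perfect stem-loops followed by a run of T (U in RNA). All function names, thresholds, and the test sequences are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def leader_peptide_candidates(dna, regulatory_codons, min_run=2, max_codons=40):
    """Return (start, end, longest_run) for short ORFs with a run of regulatory codons."""
    dna = dna.upper()
    hits = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        codons = []
        for pos in range(start, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if codon in STOP_CODONS:
                # Longest stretch of consecutive regulatory codons in this ORF.
                run = longest = 0
                for c in codons:
                    run = run + 1 if c in regulatory_codons else 0
                    longest = max(longest, run)
                if len(codons) <= max_codons and longest >= min_run:
                    hits.append((start, pos + 3, longest))
                break
            codons.append(codon)
    return hits


def terminator_like_sites(dna, stem=5, loop_min=3, loop_max=8, t_run=4):
    """Return start positions of perfect stem-loops directly followed by a run of T."""
    dna = dna.upper()
    sites = []
    for i in range(len(dna) - 2 * stem - loop_min - t_run + 1):
        left = dna[i:i + stem]
        for loop in range(loop_min, loop_max + 1):
            j = i + stem + loop
            right = dna[j:j + stem]
            tail = dna[j + stem:j + stem + t_run]
            if right == reverse_complement(left) and tail == "T" * t_run:
                sites.append(i)
                break
    return sites


# Synthetic sequences, not taken from any genome.
toy_leader = "CCATGAAACACCACCACCACTAAGG"
toy_terminator = "CCGCCGCTTCGGCGGCTTTTT"
print(leader_peptide_candidates(toy_leader, {"CAC", "CAT"}))
print(terminator_like_sites(toy_terminator))
```

A realistic search would additionally score imperfect stems by free energy and require that the antiterminator and terminator hairpins share a segment, which this sketch does not attempt.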

3 Results

3.1 Isoleucine, leucine, and valine biosynthesis

Orthologs of the branched-chain amino acids (ILV) genes in genomes of γ-, β- and α-proteobacteria were identified by similarity search. Positional gene clusters corresponding to possible ILV operons are shown in Table 2. Then, the LLLM program was applied to upstream regions of the predicted ILV operons in all proteobacterial genomes. New candidate transcriptional attenuators were identified.
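
Collecting the upstream regions to be scanned can be sketched as follows. This is only an illustration: the function name, the 0-based half-open coordinate convention, and the default window length are assumptions introduced here, not parameters reported for the LLLM analysis.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def upstream_region(genome, gene_start, gene_end, strand, length=300):
    """Return the sequence upstream of a gene, read 5'->3' on the gene's strand.

    Coordinates are 0-based and half-open: the gene occupies genome[gene_start:gene_end].
    The default length of 300 nt is an arbitrary illustrative choice.
    """
    if strand == "+":
        return genome[max(0, gene_start - length):gene_start]
    # For a minus-strand gene, "upstream" lies to the right on the plus strand.
    region = genome[gene_end:gene_end + length]
    return region.translate(COMPLEMENT)[::-1]


toy_genome = "ACGTACGGTT"  # synthetic sequence
print(upstream_region(toy_genome, 4, 7, "+", length=4))
print(upstream_region(toy_genome, 2, 5, "-", length=3))
```

For an operon, only the region upstream of the first gene in the predicted transcriptional unit would be passed to the attenuator search.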

Table 2

Attenuator-like signals were found in upstream regions of candidate ilv operons in γ-proteobacteria (Enterobacteria, Pasteurellales, Vibrionales, Shewanella oneidensis, and Xanthomonadales). In Pseudomonadales and other bacteria, the ilv genes are scattered across the genome, and some of them are also preceded by candidate attenuators. The ilvBN operon, which encodes one of the acetolactate synthase isozymes in Enterobacteria, was also predicted to be regulated by the attenuation mechanism via leucine and valine regulatory codons. Other predicted attenuators include regulatory codons for three amino acids, isoleucine, leucine, and valine, similar to the experimentally studied attenuators of E. coli (Fig. 3).

Figure 3

Alignment of predicted transcription attenuators of branched chain amino acid biosynthetic operons in γ- and α-proteobacteria. Genome abbreviations are as in Table 1. Gene (operon) names are given. Regulatory RNA secondary structures are shown atop of the alignments. Base-paired positions are either indicated by the gray background or underlined. Numbers indicate the number of nucleotides between the aligned regions and the leader peptide start, the latter is set in bold. Regulatory codons in the leader peptides are substituted by single-letter amino acid abbreviations: I (isoleucine), L (leucine), and V (valine).

The structure of the candidate ilv biosynthetic operons varies in the analyzed genomes. For example, the order of genes in the ilv operon is ilvGMEDA in Enterobacteria and Vibrionales, but in Xanthomonadales, the order is ilvCGM–tdcB–leuA. In the latter case, the tdcB gene is possibly co-regulated with the ILV genes. Its product is threonine dehydratase which catalyzes reactions in both serine and ILV metabolism.

Another possible co-regulation event was observed in Pasteurella multocida. A gene with unknown function (orthologous to hypothetical gene ygeA of E. coli) is located within the ilv operon (ilvGM–ygeA–DA), and a candidate attenuator was found upstream of this operon. YgeA is weakly similar to the amino acid racemase protein RacX from B. subtilis, which converts L-aspartate to D-aspartate [42, 43]. Thus, ygeA likely encodes a new kind of racemase, possibly ILV racemase.

The leu operon, which includes only genes for the leucine synthesis, is predicted to be regulated by attenuation in some γ-proteobacteria (Enterobacteria, Pasteurellales, Vibrionales, Alteromonadales, and Xanthomonadales), but not in Pseudomonadales and other species. The leader peptide reading frames of all predicted attenuators include runs of leucine codons.

Little is known about regulation of ILV genes in α-proteobacteria. Expression of the ilvIH genes encoding the two subunits of acetolactate synthase has been studied in Caulobacter crescentus, and the region between ilvIH and the transcription initiation site was shown to have the properties of a transcription attenuator [44] (in the cited paper this operon is called ilvBN, not ilvIH, but phylogenetic analysis of all three acetolactate synthases shows that this gene is located on the branch corresponding to ilvIH, data not shown). We analyzed upstream regions of all ILV genes of available α-proteobacterial genomes and found attenuator-like structures (Table 2). α-Proteobacteria have one acetolactate synthase, IlvIH. The ilvIH operon is possibly regulated by transcription attenuation in Rhizobiales (Sinorhizobium meliloti, Agrobacterium tumefaciens, Mesorhizobium loti, Bradyrhizobium japonicum, Rhodopseudomonas palustris, and Brucella melitensis), Rhodobacter spp., Magnetospirillum magnetotacticum, C. crescentus, and a deeply rooted bacterium Deinococcus radiodurans (Deinococcus/Thermus group). The leader peptide reading frames of predicted attenuators include runs of isoleucine, leucine, and valine regulatory codons (Fig. 3). Conversely, in γ-proteobacteria, operons encoding two other acetolactate synthase isoenzymes, ilvBN (present only in Enterobacteria) and ilvGM, but not ilvIH, are regulated by attenuators.

There exist two groups of homologous 2-isopropylmalate synthases, leuA and leuA2 (approx. 30% sequence identity). The leuA genes, orthologs of leuA from E. coli, were observed in γ-proteobacteria, excluding Pseudomonadales, and in some α-proteobacteria, whereas the leuA2 genes, homologs of well-studied 2-isopropylmalate synthases from Actinobacteria and Fungi, in particular Corynebacterium glutamicum [45] and Saccharomyces cerevisiae [46], respectively, were observed in α-proteobacteria, some β-proteobacteria and Pseudomonadales. In α-proteobacteria, both types of 2-isopropylmalate synthase genes have candidate attenuators in upstream regions (Table 2). Although these attenuators have leader peptide reading frames with runs of leucine regulatory codons, the terminator structures are weak and lack runs of uridines (Fig. 3). Notably, a similar situation was observed in the case of the trpE and trpGDC operons in Pseudomonas putida, where transcripts were attenuated despite the absence of strong ρ-independent terminator structures [47]. Moreover, we found a possible attenuator with a strong G/C-rich terminator upstream of the leuA gene in D. radiodurans.

3.2 Histidine biosynthesis

Orthologs of the histidine biosynthetic (HIS) genes in bacterial genomes were identified by similarity search. Positional gene clusters corresponding to candidate HIS operons are listed in Table 3. The LLLM program with parameters obtained by analysis of known attenuator structures was used to scan the upstream regions of predicted HIS operons in all analyzed genomes (for details see [7]). New candidate transcriptional attenuators were identified, mainly in γ-proteobacteria. We also identified attenuator-like structures in some low-GC Gram-positive bacteria, Bacteroidetes/Chlorobi group and Thermotogales.

Table 3

Positional analysis and analysis of regulation showed that in most γ-proteobacteria (Enterobacteria, Pasteurellales, Vibrionales, and Shewanella oneidensis), all histidine biosynthetic genes are clustered and possibly regulated via the transcription attenuation mechanism (Table 3). All candidate attenuators upstream of the his operons in these bacteria have similar features: a leader peptide reading frame with a run of histidine regulatory codons and terminator/antiterminator structures (Fig. 4). We found no attenuators upstream of HIS genes in Pseudomonadales, Xanthomonadales, and some other γ-proteobacteria.

Figure 4

Alignment of predicted transcription attenuators of histidine biosynthetic operons in various bacteria. Notation as in Fig. 3. H denotes histidine regulatory codons in the leader peptide reading frame.

Analysis of upstream regions of HIS genes in other taxonomic groups revealed attenuator-like structures in the Bacillus/Clostridium group, Bacteroidetes/Chlorobi, and Thermotogales. In those cases, histidine biosynthetic operons, which include most of the HIS genes, are possibly regulated. We observed a diversity of mechanisms for regulation of HIS gene expression. In particular, in Lactococcus lactis and Streptococcus mutans, the his operon is regulated by the T-box antitermination mechanism ([48]; Vitreschak A, unpublished), whereas in Bacillus cereus and Clostridium difficile, the his operon seems to be regulated via transcription attenuation. Other Streptococcus spp. as well as Enterococcus spp. lack histidine biosynthetic genes. Moreover, B. cereus has two copies of the hisZ gene, which are predicted to be regulated by transcriptional attenuation: one as a part of the his biosynthetic operon; the other as a separate gene with a possible histidine attenuator structure in the upstream region (Table 3). The hisS gene in this bacterium, as well as orthologous hisS genes in Bacillus spp., Listeria spp., Enterococcus spp., and L. lactis, are located separately and predicted to be regulated by the T-box antitermination mechanism ([49]; Vitreschak A, unpublished).

Several hypothetical genes were predicted to belong to the histidine regulons. HI0325 from Haemophilus influenzae, which encodes a putative transporter with 10 transmembrane segments, has a candidate histidine attenuator in the upstream region. This gene is widely distributed, but not universal in bacteria. In a number of genomes, in particular in Fusobacterium nucleatum and Bacillus halodurans, this gene is clustered with histidine utilization genes (the hut locus). Thus, HI0325 and its orthologs (yuiF in B. subtilis) possibly constitute a new family of histidine transporters.

Another example is the BC0629 gene from B. cereus, which is also possibly regulated via histidine-dependent attenuation. This gene (yvsH in B. subtilis) is homologous to the arginine:ornithine antiporter arcD from Pseudomonas aeruginosa and lysine permease lysI from Corynebacterium glutamicum. All these proteins belong to the basic amino acid/polyamine antiporter APA family [http://tcdb.ucsd.edu/tcdb/background.php]. B. cereus has two yvsH paralogs, yvsH1 (BC0629) and yvsH2 (BC0865). The former is a candidate lysine transporter whose expression was predicted to be regulated by lysine via the LYS-element riboswitch mechanism [21]. The upstream region of yvsH2 contains a candidate attenuator whose leader peptide reading frame contains a run of histidine regulatory codons (Fig. 4). Thus, yvsH2 (BC0629) is possibly involved in histidine transport. The predicted specificity of this transporter is consistent with experimental data for the homologous HisJ and LAO transporters, which both bind histidine, arginine, lysine, and ornithine, albeit with different affinities towards these ligands [50].

A very similar situation was observed in the case of two paralogous transporter genes in L. lactis, lysP and lysQ. Both proteins are similar (more than 50% identity) to the experimentally identified lysine permease lysP of E. coli [51]. In the L. lactis genome, lysP was predicted to be regulated by a LYS-element and thus to be involved in lysine transport [21]. On the other hand, the upstream region of lysQ contains a candidate histidine attenuator (Fig. 4). Thus, these two transporters may have different affinities for lysine and histidine and, accordingly, be regulated one by lysine and the other by histidine.

All genes required for the histidine biosynthesis were identified in all analyzed bacteria, the only exception being the histidinol-phosphatase domain of HisB in Pseudomonas spp. Neither similarity search nor positional analysis and analysis of regulation provided a candidate for this enzymatic activity.

On the other hand, at least three non-homologous genes with unknown function (shown in Table 3 as vatB, actX2, and actX3 in P. multocida, Mannheimia haemolytica, and Polaribacter filamentus, respectively), encoding putative acetyltransferases, are possibly co-regulated with HIS genes. These candidate acetyltransferases could catalyze conversion of histamine to 4-β-acetylaminoethyl-imidazole. This is one of the steps of the histidine modification (http://www.genome.ad.jp/kegg/metabolism.html), for which only enzymatic activity, EC 2.3.1., is known, but no genes have been assigned yet.

3.3 Threonine biosynthesis

We analyzed regulation of the thr biosynthetic operon in proteobacteria. Orthologs of thr genes were identified by similarity search. Candidate thr operons and possible regulation are shown in Table 4. Enterobacteria, Pasteurellales, Vibrionales, Shewanella oneidensis, and Xanthomonadales have the same gene order thrABC in the threonine biosynthetic loci. In Pseudomonadales and some other genomes, the threonine biosynthetic genes are scattered across the genome. Moreover, in Enterobacteria, Pasteurellales, Vibrionales, S. oneidensis, and Xanthomonadales, thrA encodes a bifunctional protein, aspartate kinase/homoserine dehydrogenase, whereas in Pseudomonadales and some other γ-proteobacteria thrA2 (aspartate kinase) and hom (homoserine dehydrogenase) are located in different loci. Finally, two homoserine kinase genes, thrB2 and thrH [52], neither homologous to thrB of E. coli, were observed in Pseudomonadales (Table 4).

Table 4

Then, we analyzed upstream regions of the predicted thr operons by LLLM trained on known attenuators. New candidate transcriptional attenuators were identified in γ-proteobacteria (Table 4). They have all properties of threonine attenuators: a short leader peptide reading frame with a run of threonine and isoleucine codons, as well as alternative termination and antitermination RNA structures (Fig. 5). Our results predicted that thr operons are regulated by transcription attenuation in Enterobacteria, Pasteurellales, Vibrionales, Shewanella oneidensis, and Xanthomonas campestris.

Figure 5

Alignment of predicted transcription attenuators of thr operons in γ-proteobacteria. Notation as in Fig. 3. T, I, and M denote, respectively, threonine, isoleucine, and methionine regulatory codons in the leader peptide reading frame.

Closer analysis showed that in Pasteurellales (H. influenzae, P. multocida, Actinobacillus actinomycetemcomitans, and M. haemolytica), the leader peptide reading frame contains not only standard regulatory codons for threonine and isoleucine, but also numerous methionine codons (Fig. 5). Thus, the thr operons in Pasteurellales seem to be regulated by the concentration of three amino acids, threonine, isoleucine, and methionine, instead of the former two. Indeed, Pasteurellales have only one copy of the bifunctional aspartate kinase/homoserine dehydrogenase protein, instead of the two isozymes ThrA and MetL in other γ-proteobacteria, where the expression of these isozymes is regulated by threonine/isoleucine and by methionine, respectively. Thus, it makes sense that the single ThrA isozyme of Pasteurellales is regulated not only by threonine and isoleucine, but by methionine as well. One more aspartate kinase, the monofunctional LysC, is present in three of the five Pasteurellales, P. multocida, Haemophilus ducreyi, and M. haemolytica, and the expression of lysC has been predicted to be regulated by lysine via LYS-element riboswitches, as in E. coli [21–23].
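
The reasoning used here, inferring the regulating amino acids from the codon content of the leader peptide reading frame, can be sketched as a simple count. The codon sets are from the standard genetic code; the function name, the 20% threshold, and the leader sequence are assumptions made for illustration and are not taken from Fig. 5.

```python
from collections import Counter

# Standard-code codons for the amino acids discussed in the text.
CODONS_BY_AMINO_ACID = {
    "T": {"ACT", "ACC", "ACA", "ACG"},
    "I": {"ATT", "ATC", "ATA"},
    "M": {"ATG"},
    "H": {"CAT", "CAC"},
    "W": {"TGG"},
    "F": {"TTT", "TTC"},
}


def candidate_regulators(leader_orf, min_fraction=0.2):
    """Return {amino acid: codon count} for amino acids enriched in a leader ORF.

    leader_orf must run from the start codon through the stop codon;
    both are excluded from the count. min_fraction is an arbitrary cutoff.
    """
    orf = leader_orf.upper()
    internal = [orf[i:i + 3] for i in range(3, len(orf) - 3, 3)]
    counts = Counter()
    for codon in internal:
        for amino_acid, codons in CODONS_BY_AMINO_ACID.items():
            if codon in codons:
                counts[amino_acid] += 1
    return {aa: n for aa, n in counts.items() if n / len(internal) >= min_fraction}


# Synthetic leader ORF: M T I T M T I M K stop
toy_orf = "ATGACCATTACCATGACCATTATGAAATAA"
print(candidate_regulators(toy_orf))
```

A composition count ignores the position and clustering of the codons, so it complements rather than replaces inspection of the aligned leader peptides.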

3.4 Tryptophan and phenylalanine biosynthesis

Orthologs of the trp and pheA genes in γ- and α-proteobacteria were identified by similarity search. Candidate trp, pheA, and pheST operons are shown in Table 5. Candidate attenuators were identified upstream of these operons by the LLLM program (Table 5).

Table 5

Candidate trp attenuators found in Enterobacteria, Vibrionales, and Shewanella oneidensis have leader peptide reading frames with tryptophan regulatory codons and antitermination/termination-like structures (Fig. 6). The trp(E/G) gene, which encodes fused components of anthranilate synthase responsible for the first step of the tryptophan biosynthesis, is possibly regulated by transcription attenuation in all analyzed Rhizobiales (order of α-proteobacteria) and in Bordetella pertussis (belonging to β-proteobacteria).

Figure 6

Alignment of predicted transcription attenuators of trp, pheA, and pheST operons in γ- and α-proteobacteria. Notation as in Fig. 3. W and F denote tryptophan and phenylalanine regulatory codons, respectively, except in trp operons of Pseudomonas, where tryptophan codons TGG are retained.

The pheA operon may be regulated by candidate phenylalanine-dependent attenuators in Enterobacteria, Vibrionales, and S. oneidensis, whereas the pheST operon seems to be regulated only in Enterobacteria.

Candidate attenuators of the trpE and trpGDC operons in Pseudomonadales have some peculiar properties. There is experimental evidence that transcription of the trpE and trpGDC operons is regulated by attenuation [47], but no strong ρ-independent transcriptional terminators could be found in the leader regions of these operons. We aligned sequences upstream of the trpE and trpGDC operons from five pseudomonads. The region of sequence conservation corresponds to a possible leader peptide, which contains two nearly adjacent tryptophan codons (Fig. 6). It seems that in this case the terminator and antiterminator structures are less pronounced and maybe less stable than those in other attenuators.

4 Discussion

This analysis allowed us to identify a large number of candidate attenuators and to predict the amino acid(s) responsible for the regulation. It also demonstrated variability of regulatory mechanisms for the amino acid biosynthetic pathways even in closely related genomes, and allowed functional annotation of hypothetical genes encoding transporters and enzymes. In particular, candidate attenuators were found in some taxonomic groups where this mechanism of regulation was studied little (α-proteobacteria, low-GC Gram-positive bacteria) or not at all (Bacteroidetes/Chlorobi group and Thermotogales).

This analysis, as well as other comparative studies, demonstrates the diversity and evolutionary lability of regulatory mechanisms based on formation of alternative RNA structures, especially in low-GC Gram-positive bacteria. Indeed, we observed candidate histidine attenuators regulating his operons in bacilli and clostridia, but T-boxes in streptococci that have this operon. It is known that transcription attenuation and T-box antitermination mechanisms are prevalent in Proteobacteria and Gram-positive bacteria, respectively. We demonstrate that these different mechanisms, based on switching between two conformations of the nascent RNA transcript, are involved in regulation of the his operons in low-GC Gram-positive bacteria. For example, candidate histidine attenuators regulate his operons in B. cereus and C. difficile, but not in L. lactis and S. mutans, where this role is taken by histidine T-boxes. Moreover, in B. cereus both regulatory mechanisms are present: histidine attenuators regulate two operons, his and hisZ2, whereas the third one, hisS, is regulated by a histidine T-box. This situation is similar to the one with the methionine biosynthesis pathway, which is regulated by T-boxes in streptococci, S-box riboswitches in bacilli and clostridia, and by transcription repression in lactobacilli [53].

In the case of transcription attenuation, we suppose an ancient origin of this regulatory mechanism. Indeed, we found possible attenuators of amino acid biosynthetic genes not only in proteobacteria, but also in low-GC Gram-positive bacteria, Bacteroidetes/Chlorobi, and, notably, in deeply rooted bacteria, Thermotogales and D. radiodurans. The hypothesis of an ancient origin of transcription attenuation and some other regulatory mechanisms based on formation of alternative RNA structures is reasonable. In fact, a number of riboswitch elements involved in regulation of genes from various metabolic pathways (vitamin, purine, lysine, and methionine biosynthesis and transport) were identified in a large number of distant bacteria (for review see [52]).

Candidate threonine/isoleucine-dependent attenuators were found upstream of thr operons in Enterobacteria, Pasteurellales, Vibrionales, Shewanella oneidensis, and Xanthomonadales. In Pasteurellales, attenuators of the thr operon were predicted to respond not only to the level of threonine and isoleucine, but also to methionine. Thus, the single bifunctional aspartate kinase/homoserine dehydrogenase ThrA of these species is regulated by all three amino acids. The additional regulation by methionine concentration is of particular interest, and it is reasonable since the enzyme acts just upstream of the methionine biosynthesis pathway.

Finally, several new functional annotations were made by analysis of regulatory mechanisms and positional clusters of genes. Orthologs of ygeA of E. coli were predicted to encode a branched-chain amino acid racemase, based on similarity to other racemases and regulation by an ILV-attenuator in P. multocida. The products of vatB, actX2, and actX3 from P. multocida, M. haemolytica, and P. filamentus, respectively, were predicted to catalyze conversion of histamine to 4-β-acetylaminoethyl-imidazole. The three types of predicted histidine transporters are orthologs of yuiF and yvsH of B. subtilis, and of lysQ of L. lactis. They are regulated by candidate histidine attenuators in some bacteria (HI0325/yuiF in H. influenzae and BC0629/yvsH in B. cereus, lysQ) and are positionally linked to histidine biosynthesis or utilization genes.

This study was partially supported by grants from the Howard Hughes Medical Institute (55000309), the Ludwig Institute for Cancer Research (CRDF RB0-1268), and the Program “Molecular and Cellular Biology” of the Russian Academy of Sciences.


We are grateful to Andrei Mironov and Dmitry Rodionov for useful discussions, and to Lev Leont'ev for programming assistance.


  1. [1].
  2. [2].
  3. [3].
  4. [4].
  5. [5].
  6. [6].
  7. [7].
  8. [8].
  9. [9].
  10. [10].
  11. [11].
  12. [12].
  13. [13].
  14. [14].
  15. [15].
  16. [16].
  17. [17].
  18. [18].
  19. [19].
  20. [20].
  21. [21].
  22. [22].
  23. [23].
  24. [24].
  25. [25].
  26. [26].
  27. [27].
  28. [28].
  29. [29].
  30. [30].
  31. [31].
  32. [32].
  33. [33].
  34. [34].
  35. [35].
  36. [36].
  37. [37].
  38. [38].
  39. [39].
  40. [40].
  41. [41].
  42. [42].
  43. [43].
  44. [44].
  45. [45].
  46. [46].
  47. [47].
  48. [48].
  49. [49].
  50. [50].
  51. [51].
  52. [52].
  53. [53].