
Transcription initiation in the Escherichia coli K-12 malI–malX intergenic region and the role of the cyclic AMP receptor protein

Georgina S. Lloyd, Kerry Hollands, Rita E. Godfrey, Stephen J.W. Busby
DOI: http://dx.doi.org/10.1111/j.1574-6968.2008.01365.x 250-257 First published online: 1 November 2008


The Escherichia coli K-12 malI–malX intergenic region contains two divergent promoters, which have been investigated by both mutational and biochemical analysis. The malX promoter drives transcription initiation from a location that is 43 bp upstream from the malX translation start codon. Expression from the malX promoter is dependent on binding of the cyclic AMP receptor protein (CRP) to a DNA site centred 41.5 bp upstream of the transcript start. The malI promoter drives transcription initiation from a location 85 bp upstream from the malX transcript start and it is active without the CRP. Expression from the malI promoter can be stimulated by the CRP. Mutational analysis suggests that the malI promoter has an unusual organization.

  • Escherichia coli
  • malX
  • malI
  • transcription initiation
  • cyclic AMP receptor protein


The enterobacterial cyclic AMP receptor protein (CRP) is a global transcription factor that plays a major role in controlling bacterial metabolism (Kolb et al., 1993; Barrett et al., 2005), and whole genome studies in Escherichia coli K-12, using a variety of approaches, have now identified nearly 200 target promoters (Gosset et al., 2004; Zheng et al., 2004; Grainger et al., 2005). CRP recognizes a 22-bp target sequence and, at many promoters, activates transcription by making one or more direct contacts with RNA polymerase (Busby & Ebright, 1999). There is a great diversity in the organization of different CRP-activated promoters, and thus, CRP binds upstream of the promoter −35 element at some promoters (Class I promoters) while, at others, CRP binds to a DNA target that overlaps the −35 element (Class II promoters).

In a recent study, Hollands et al. (2007) investigated the effects of CRP at 11 poorly characterized loci in the E. coli K-12 genome. One of these was the regulatory region of the malXY transcription unit that previously had been sequenced and annotated by Reidl & Boos (1991). Hollands et al. (2007) showed that this regulatory region contains a Class II CRP-dependent promoter. However, the location of this promoter is inconsistent with the previously suggested −10 and −35 elements (Reidl & Boos, 1991) and with the malX transcription start point listed in EcoCyc (Karp et al., 2007). Thus, in the first part of this work, we located the malX promoter transcript start and its −10 element. In the second part, we turned our attention to the divergent malI promoter, which drives expression of MalI, a LacI family transcriptional repressor (Reidl et al., 1989), and we exploited genetic analysis to identify sequence elements that are important for its activity.

Materials and methods

Standard recombinant DNA methods were used in this work and all the different synthetic oligodeoxynucleotide primers used are listed in the Supporting Information, Table S1. EcoRI–HindIII fragments carrying the malX and malI promoters were inserted into the cloning vector plasmid, pSR, encoding resistance to ampicillin (Kolb et al., 1995), or into the low copy number lac expression vector plasmid, pRW50, encoding resistance to tetracycline (Lodge et al., 1992), and recombinants were propagated in the Δlac E. coli K-12 strain, M182, as described by Hollands et al. (2007). The activities of the malX and malI promoters cloned in pRW50 were deduced from measurements of β-galactosidase expression after transformation into M182 and its Δcrp derivative, as in Hollands et al. (2007).

The starting material was the EcoRI–HindIII malX100 fragment shown in Fig. 1. The shorter malX300 fragment was generated by PCR using primers D62261 and D42890. The p11G mutation in the malX100 fragment was made by PCR using primer D50644 and mutagenic primer D55439. Other EcoRI–HindIII fragments carrying the malI–malX intergenic region were generated from the malX100 fragment using PCR. DNA sequences are numbered with the transcription start sites labelled as +1, and upstream and downstream sequences are assigned negative and positive coordinates, respectively.
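The numbering convention can be illustrated with a minimal sketch. The helper names are hypothetical, the sequence is a toy example rather than the malX100 fragment, and the sketch assumes the usual convention that there is no position 0 (so −1 is immediately followed by +1), which the text does not state explicitly.

```python
def index_to_coord(index: int, tss_index: int) -> int:
    """Map a 0-based string index to a promoter coordinate (+1 = transcript start)."""
    offset = index - tss_index
    # Downstream bases are shifted by one because there is no position 0.
    return offset + 1 if offset >= 0 else offset


def coord_to_index(coord: int, tss_index: int) -> int:
    """Inverse of index_to_coord; coordinate 0 is not defined."""
    if coord == 0:
        raise ValueError("position 0 does not exist in this numbering")
    return tss_index + (coord - 1 if coord > 0 else coord)


# Toy sequence (hypothetical, not taken from the paper), transcript start at index 12.
toy_seq = "GGCATATCTTAAGCATTGC"
toy_tss = 12

# Base found at coordinate -10 of the toy sequence.
base_at_minus_10 = toy_seq[coord_to_index(-10, toy_tss)]
```

With this convention a hexamer spanning −12 to −7 occupies six consecutive string indices, and any element that crosses the transcript start has to skip the missing zero.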

Figure 1

Base sequence of malX100 promoter fragment. The figure shows the sequence of the nontemplate strand of the malX100 promoter fragment used in this work, from the upstream EcoRI site to the downstream HindIII site. The figure highlights the locations of the divergent malI and malX translation start codons (bold underlined), the two DNA sites for CRP (doubly underlined) and the proposed malX promoter −10 hexamer element (boxed). The sequence is numbered with the malX transcript start proposed here as +1 (underlined and bold). The location of the malX transcription start that is currently listed in EcoCyc is indicated by a grey box at position −52. The filled triangle indicates the upstream limit of the malX300 deletion construct.

Derivatives of the malI100 fragment carrying G for C substitutions at either position −39 (p39G) or position +22 (l22G), with respect to the malI transcript start, that inactivate, respectively, CRP site 1 or CRP site 2, were constructed as described by Hollands et al. (2007). To generate libraries of random mutations in the malI promoter, the malI300 EcoRI–HindIII fragment cloned in pRW50 was amplified by error-prone PCR with the flanking D10520 and D10527 primers using Taq DNA polymerase and conditions as described by Barne et al. (1997). The products of four PCR reactions were restricted with EcoRI and HindIII, purified and each was recloned into pRW50. After transformation into E. coli strain M182, colonies carrying recombinants were screened on MacConkey lactose indicator plates and candidates with altered Lac phenotypes were selected. For each candidate, the entire EcoRI–HindIII insert was sequenced. The p13G mutation in the malI300 fragment was generated by megaprimer mutagenesis using mutagenic primer D56427. The megaprimer was synthesized by PCR with the mutagenic primer and the downstream D10527 flanking primer. This megaprimer was then used in a second-round PCR with the upstream D10520 flanking primer to generate the mutant EcoRI–HindIII promoter fragment.

Transcript starts were located by measuring the lengths of primer extension products, following the protocols of Zyskind & Bernstein (1989). To do this, RNA was purified from M182 cells containing pRW50 carrying either the malX100 or malI100 fragments, using a QIAgen RNeasy mini kit, and hybridized to 5′ end-labelled D49724 primer that corresponds to sequence downstream of the HindIII site in pRW50. The sizes of primer extension products were measured on gels that were calibrated with sequence reactions.

Footprinting experiments at the malX and malI promoters were performed on purified PstI–HindIII fragments, generated after cloning the malX100 or malI350 EcoRI–HindIII fragments into pSR. Fragments were labelled at the HindIII end with [γ-32P] ATP. Purified E. coli RNA polymerase holoenzyme was purchased from Epicentre Technologies, and CRP was purified as described by Ghosaini et al. (1988). Each reaction (20 μL) contained c. 3 nM labelled DNA fragment in 20 mM 4-(2-hydroxyethyl)-1-piperazine ethane sulphonic acid (HEPES) (pH 8.0), 5 mM MgCl2, 50 mM potassium glutamate, 1 mM dithiothreitol (DTT) and 500 μg mL−1 bovine serum albumin, and 0.2 mM cyclic AMP, CRP and RNA polymerase as required. Samples were analysed by denaturing gel electrophoresis. Gels were calibrated with Maxam–Gilbert ‘G+A’ sequencing reactions of the labelled fragment and quantified using a Bio-Rad Molecular Imager FX and Quantity One software (Bio-Rad).

DNase I experiments were performed exactly as described by Savery et al. (1996). For potassium permanganate experiments, reactions were set up, incubated at 37 °C for 30 min, and then treated with 200 mM potassium permanganate for 3 min. Reactions were then stopped with 50 μL of stop solution (3 M ammonium acetate, 100 mM EDTA, 1.5 M β-mercaptoethanol). Following phenol–chloroform extraction and ethanol precipitation, samples were resuspended in 1 M piperidine and incubated for 30 min at 90 °C. Samples were again purified by phenol–chloroform extraction and ethanol precipitation, and analysed by electrophoresis as above.

Results and discussion

Transcription initiation in the malI–malX intergenic region

The starting point of this work was the 265 base pair malX100 EcoRI–HindIII fragment, illustrated in Fig. 1, which carries the segment of the chromosome of E. coli K-12 MG1655 between the starts of the divergent malI and malX genes, including the regulatory intergenic region. This fragment was cloned into the pRW50 lac expression vector plasmid to give a fusion of the malX promoter to the lac genes. Hollands et al. (2007) exploited this fusion to study the malX promoter and showed that it is a Class II CRP-dependent promoter. The malI–malX intergenic region contains two DNA sites for CRP (see Fig. 1) but, using mutational and deletion analysis, it was found that CRP site 1 is essential for malX promoter activity, while CRP site 2, which has an approximately fivefold lower affinity for CRP, plays no role (Hollands et al., 2007; G.S. Lloyd, unpublished data). Previous studies have found that for Class II CRP-activated promoters, the distance between the centre of the DNA site for CRP and the −10 hexamer element is usually 32 bp (Busby & Ebright, 1999) and, thus, we predict that the malX −10 hexamer sequence is 5′-TATCTT-3′ (Fig. 1). To assess the importance of this element, we constructed a derivative of the malX100 fragment with a point mutation (p11G) that changes the hexamer to 5′-TGTCTT-3′. Data listed in Table 1 show that the p11G substitution reduces CRP-dependent activation of the malX promoter by more than 90%. These results argue that the malX transcription start must be c. 50 bp downstream from that currently shown on EcoCyc (see Fig. 1). Thus, we used primer extension to map the malX promoter transcript start in the malX100 fragment cloned in pRW50. Figure 2 shows that the major extension product is 137–139 bases long, which places the transcript start at the location marked as +1 in Fig. 1, just downstream of our suggested malX promoter −10 hexamer.
This experiment also revealed a second, less intense, longer primer extension product, which could be due to a transcript starting upstream near position −90. To investigate this, we constructed the truncated malX300 fragment in which the EcoRI site was moved to position −90 (Fig. 1). Data in Table 1 show that the deletion of upstream sequences in the malX300 fragment has little or no effect on malX promoter activity, suggesting that the contribution of any secondary upstream promoter is minor. This deletion also removed CRP site 2 and hence our result confirms that this site plays little or no role in CRP-dependent activation of the malX promoter.
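The spacing argument used to predict the −10 hexamer can be written out as a small calculation. This is only a sketch under stated assumptions: the function names are hypothetical, the 32 bp distance is taken to be measured from the centre of the CRP site to the centre of the hexamer, and the hexamer span of −12 to −7 is inferred from the fact that p11G changes the second base of 5′-TATCTT-3′.

```python
def span_centre(first: float, last: float) -> float:
    """Centre of an element spanning positions first..last (both on the same side of +1)."""
    return (first + last) / 2


def crp_to_minus10_spacing(crp_centre: float, hexamer: tuple[int, int]) -> float:
    """Distance in bp from the CRP site centre to the centre of the -10 hexamer."""
    return span_centre(*hexamer) - crp_centre


# malX promoter: CRP site 1 is centred at -41.5 (value from the Abstract).
# The hexamer span (-12, -7) is an inference, as explained in the lead-in.
malx_spacing = crp_to_minus10_spacing(-41.5, (-12, -7))
print(f"malX CRP site 1 to -10 hexamer: {malx_spacing} bp")
```

Under these assumptions the malX promoter comes out at the 32 bp spacing quoted for Class II promoters. Applying the same function to the malI values given in the Discussion (CRP site 1 centred at −43.5, hexamer from −12 to −7) gives 34 bp.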

Table 1

Measurement of malX and malI promoter activities

Promoter fragment    Activity in M182    Activity in M182 Δcrp
malX100              1930 ± 52           128 ± 4
malX100 p11G         162 ± 51            98 ± 1
malX300              1966 ± 90           102 ± 11
malI100              2205 ± 95           1266 ± 9
malI100 p39G         1210 ± 68           1229 ± 14
malI100 l22G         2086 ± 68           1197 ± 26
malI300              2602 ± 16           1507 ± 45
malI350              1300 ± 84           900 ± 23
malI500              140 ± 9             149 ± 6
malI600              9 ± 2               10 ± 2
  • The table lists β-galactosidase activities (in Miller units) measured in the Δlac strain M182 or its Δcrp derivative carrying different promoter∷lacZ fusions cloned in pRW50. Cells were grown aerobically in Luria–Bertani (LB) medium containing 35 μg mL−1 tetracycline to exponential phase (OD650 nm 0.3–0.5). Each value is the mean ± 1 SD from at least three independent experiments.
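The CRP dependence of each promoter can be summarized as the ratio of activity in M182 to activity in the Δcrp derivative. The sketch below uses mean values as read from Table 1 for a subset of fragments; the dictionary and function names are hypothetical and the standard deviations are ignored, so the ratios are indicative only.

```python
# Mean β-galactosidase activities (Miller units) as read from Table 1:
# (activity in M182, activity in M182 Δcrp). Standard deviations are omitted.
ACTIVITY = {
    "malX100": (1930, 128),
    "malX300": (1966, 102),
    "malI100": (2205, 1266),
    "malI100 p39G": (1210, 1229),
    "malI100 l22G": (2086, 1197),
}


def crp_fold_activation(fragment: str) -> float:
    """Ratio of activity with CRP to activity without CRP."""
    with_crp, without_crp = ACTIVITY[fragment]
    return with_crp / without_crp


for name in ACTIVITY:
    print(f"{name}: {crp_fold_activation(name):.1f}-fold")
```

On these numbers the malX fragments show more than tenfold activation, malI100 gives a ratio of about 1.7, and the p39G derivative gives a ratio close to 1.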

Figure 2

Primer extension analysis. The figure shows the results of primer extension analysis of malX or malI RNA extracted from Escherichia coli K-12 strain M182 carrying pRW50/malX100 or pRW50/malI100, grown aerobically to mid-exponential phase (OD650 nm 0.4–0.5) in Luria-Bertani medium. The sizes of the malX and malI primer extension products were determined by calibration against sequence reactions (lanes G, A, T and C).

In a parallel primer extension experiment, we also located the start point of the divergent malI transcript. Figure 2 shows that the major malI primer extension product is 196–198 bases long, which places the transcript start at or near position −85 with respect to the malX +1. This is 5 bp upstream from the malI transcript start point determined experimentally by Reidl et al. (1989) using primer extension.
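Because the two promoters are divergent, any position can be expressed in either the malX or the malI numbering. The following sketch shows one way to convert between the two frames. It assumes that the malI start lies exactly at malX position −85 and that neither frame has a position 0; the function names are hypothetical.

```python
MALI_START_IN_MALX_FRAME = -85  # malI +1 expressed in malX coordinates (from the text)


def to_offset(pos: float) -> float:
    """Convert a no-zero promoter coordinate to a continuous offset (+1 -> 0)."""
    return pos - 1 if pos > 0 else pos


def from_offset(offset: float) -> float:
    """Inverse of to_offset."""
    return offset + 1 if offset >= 0 else offset


def malx_to_mali(pos: float) -> float:
    """Re-express a malX-frame position in the divergent malI frame."""
    # The malI transcript runs in the opposite direction, so the sign flips.
    offset = -(to_offset(pos) - to_offset(MALI_START_IN_MALX_FRAME))
    return from_offset(offset)


# CRP site centres quoted in the malX frame elsewhere in the paper.
for site, centre in (("CRP site 1", -41.5), ("CRP site 2", -101.5)):
    print(site, "malX frame:", centre, "malI frame:", malx_to_mali(centre))
```

With these assumptions, the CRP site centres at −41.5 and −101.5 in the malX frame map to −43.5 and +17.5 in the malI frame, which agrees with the positions given for the malI promoter in the Discussion. Half-integer centres that fall between −1 and +1 are not handled cleanly by this simple scheme.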

In vitro study of the malX promoter

To confirm the position of the malX promoter, we used DNase I footprinting to visualize the binding locations of CRP and RNA polymerase, and potassium permanganate footprinting to detect promoter unwinding. The DNase I footprints (Fig. 3, left-hand panel) show that purified CRP alone binds at two targets, which correspond to the predicted Site 1 and Site 2. Each site contains two enhanced bands, which are typical for bound CRP (Belyaeva et al., 1996). Together with CRP, purified RNA polymerase gives a clear footprint that runs from position +15 to upstream of CRP bound at Site 1. This is the footprint that would be expected for RNA polymerase at a Class II promoter dependent on CRP bound at Site 1 (Belyaeva et al., 1996). Note that the enhancements in the DNA sites for CRP remain, indicating the formation of a ternary CRP–RNA polymerase–promoter DNA complex. The potassium permanganate footprinting experiment (Fig. 3, right hand panel) shows that unwinding of the malX promoter −10 region requires the presence of both CRP and RNA polymerase. In the absence of CRP, RNA polymerase gives much weaker protection of the malX promoter fragment with maximum protection around position −80. The following experiments argue that this is due to binding at the malI promoter that is CRP independent.

Figure 3

DNase I and potassium permanganate footprint analysis at the malX promoter. The figure shows the analysis of in vitro footprinting reactions at the malI–malX intergenic region using DNase I (left hand panel) or potassium permanganate (right hand panel). Samples analysed in each lane contained the malX100 promoter (PstI–HindIII fragment, end-labelled on the template strand) and purified CRP (100 nM) and/or holo RNA polymerase (RNAP; 25 nM in lanes 4 and 9, 50 nM in lanes 3, 5, 8 and 10) as indicated. Gels were calibrated using Maxam–Gilbert ‘G+A’ sequencing reactions (lane GA) and are numbered with the proposed malX transcription start as +1.

Investigation of the malI promoter

To study malI promoter activity, the EcoRI and HindIII sites of the malX100 fragment were interconverted to give the malI100 fragment (Fig. 4), and this was cloned into the pRW50 lac expression vector to give a fusion of the malI promoter to the lac genes. Measurements of the expression of this fusion in strain M182, and in its Δcrp derivative, show that, in contrast to the malX promoter, the malI promoter is active in the absence of CRP, with a modest 1.7-fold stimulation by CRP (Table 1). To investigate this stimulation, we constructed derivatives of the malI100 fragment with C to G substitutions at position −39 (p39G) or at position +22 (l22G) that change the key 5′-TCACA-3′ element in CRP site 1 or CRP site 2 to 5′-TGACA-3′. Results listed in Table 1 show that the p39G change, which inactivates CRP site 1, has no effect on CRP-independent expression from the malI promoter but suppresses the CRP-dependent stimulation. In contrast, the l22G substitution, which inactivates CRP site 2, has no effect on malI promoter activity either with or without CRP.

Figure 4

Base sequence of malI100 promoter fragment. The figure shows the sequence of the nontemplate strand of the malI100 promoter fragment used in this work, from the upstream EcoRI site to the downstream HindIII site. The figure highlights the locations of the divergent malX and malI translation start codons (bold underlined) and the two DNA sites for CRP (doubly underlined). The proposed malX and malI transcription starts are indicated as +1 and the corresponding −10 hexamer elements are highlighted. The proposed malI promoter −35 hexamer element is highlighted by underlining. The sequence is numbered with respect to the malI transcript start as +1. The filled triangles indicate the upstream limit of the malI300, malI350, malI500 and malI600 deletion constructs. The arrows below the sequence indicate the locations of the p11G, p40G and p46T ‘down’ mutations, and the arrow above the sequence indicates the location of the p33G ‘up’ mutation.

Next, we used deletion and mutation analysis to attempt to define the sequence elements responsible for malI promoter activity. Hence, a set of nested deletion derivatives of the malI100 fragment was made, in which the EcoRI site was moved closer to the transcript start (shown in Fig. 4). The effect of each deletion on malI promoter activity was then measured, after cloning each truncated fragment into pRW50. Data listed in Table 1 show that deletion to position −93 (malI300) results in a small increase in promoter activity while deletion to position −73 (malI350) reduces promoter activity by 30–40%. Further deletions to positions −31 (malI500) or −9 (malI600) reduce promoter activity to low levels.
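The effect of each deletion can be expressed as a percentage of the activity of the parent malI100 fragment. The sketch below uses mean values as read from Table 1 and the end points given in the text; the names are hypothetical and the standard deviations are ignored.

```python
# (upstream end point, mean activity in M182, mean activity in M182 Δcrp),
# activities in Miller units as read from Table 1.
DELETIONS = {
    "malI300": (-93, 2602, 1507),
    "malI350": (-73, 1300, 900),
    "malI500": (-31, 140, 149),
    "malI600": (-9, 9, 10),
}
PARENT = (2205, 1266)  # malI100 in M182 and in M182 Δcrp


def relative_activity(fragment: str) -> tuple[float, float]:
    """Activity as a percentage of malI100, with and without CRP."""
    _, with_crp, without_crp = DELETIONS[fragment]
    return 100 * with_crp / PARENT[0], 100 * without_crp / PARENT[1]


for name, (end_point, _, _) in DELETIONS.items():
    wt, dcrp = relative_activity(name)
    print(f"{name} (to {end_point}): {wt:.0f}% in M182, {dcrp:.0f}% in Δcrp")
```

On these numbers malI300 is slightly above 100% in both backgrounds, malI350 retains roughly 60–70%, and malI500 and malI600 fall to a small fraction of the parent activity.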

Several point mutations that affected malI promoter activity were then identified (see Fig. 4), after screening libraries of randomly generated changes in the malI promoter (see Materials and methods). A single base substitution at position −33 (p33G) was found to increase activity by approximately fourfold in both M182 and in its Δcrp derivative. In contrast, substitutions at positions −11 (p11G), −40 (p40G) and −46 (p46T) reduced malI promoter activity to 40–50% of the starting activity in Δcrp cells, and suppressed CRP-dependent activation. To explain these results, we propose that the substitutions at positions −33 and −11 alter the malI promoter −35 and −10 hexamer elements, respectively, and that the other substitutions affect its upstream element. Thus, the malI promoter −35 hexamer is 5′-TTACGC-3′ from positions −35 to −30, and, hence, promoter activity is increased by the p33G change to 5′-TTGCGC-3′. Similarly, we propose that the −10 hexamer sequence is 5′-TAAGAT-3′ from positions −12 to −7, and, hence, promoter activity is decreased by mutation at position −11 (p11G) to 5′-TGAGAT-3′. To investigate the possibility, suggested by Reidl et al. (1989), that the overlapping 5′-TATAAG-3′ from positions −14 to −9 functions as the −10 element, we used site-directed mutagenesis at position −13 (p13G) to alter this hexamer to 5′-TGTAAG-3′ and found that this mutation had very little effect on malI promoter activity (R.E. Godfrey, unpublished data).
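The hexamer changes described above can be reproduced mechanically, and each variant can be compared with a consensus element. This is a rough sketch: the function names are hypothetical, the −10 consensus 5′-TATAAT-3′ is the standard σ70 consensus rather than a sequence quoted in this paper, and a simple count of matching positions is only a crude proxy for promoter strength.

```python
CONSENSUS_35 = "TTGACA"  # -35 consensus quoted in the Discussion
CONSENSUS_10 = "TATAAT"  # standard sigma-70 -10 consensus (assumption, not quoted here)


def mutate(hexamer: str, first_pos: int, pos: int, base: str) -> str:
    """Substitute `base` at promoter coordinate `pos` in an element starting at `first_pos`."""
    i = pos - first_pos
    if not 0 <= i < len(hexamer):
        raise ValueError("position lies outside the element")
    return hexamer[:i] + base + hexamer[i + 1:]


def matches(hexamer: str, consensus: str) -> int:
    """Number of positions identical to the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


p33g = mutate("TTACGC", -35, -33, "G")  # proposed -35 hexamer, 'up' mutation
p11g = mutate("TAAGAT", -12, -11, "G")  # proposed -10 hexamer, 'down' mutation
p13g = mutate("TATAAG", -14, -13, "G")  # alternative -10 hexamer of Reidl et al. (1989)

print(p33g, matches("TTACGC", CONSENSUS_35), "->", matches(p33g, CONSENSUS_35))
print(p11g, matches("TAAGAT", CONSENSUS_10), "->", matches(p11g, CONSENSUS_10))
```

By this count p33G moves the proposed −35 hexamer one position closer to the consensus and p11G moves the proposed −10 hexamer one position further from it, which is in line with the direction of the observed effects.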

Confirmation of our assignment of 5′-TAAGAT-3′, from positions −12 to −7, as the malI promoter −10 hexamer element was sought using potassium permanganate footprinting to measure DNA opening in the binary RNA polymerase–malI promoter DNA complex. Results in Fig. 5 show that RNA polymerase induces the appearance of reactive bands at the malI promoter from near position +1 to near position −10 (lane 2). These bands are suppressed by the p11G mutation (lane 6) but unchanged by the p13G substitution (lane 4). The significance of the bands higher up the gel is unclear. They are likely to be due to DNA distortions caused by the binding of other RNA polymerase molecules, and differences due to the p11G substitution are secondary consequences of the occupation or nonoccupation of the malI promoter.

Figure 5

Potassium permanganate footprint analysis at the malI promoter. The figure shows the result of in vitro potassium permanganate footprinting with the malI350 promoter and derivatives, using PstI–HindIII fragments end-labelled on the template strand (lanes 1 and 2, malI350; lanes 3 and 4, malI350 p13G; lanes 5 and 6, malI350 p11G). Samples were incubated with (lanes 2, 4 and 6) or without (lanes 1, 3 and 5) 50 nM purified RNA polymerase holoenzyme. The gel was calibrated using a Maxam–Gilbert ‘G+A’ sequencing reaction (lane GA) and is numbered with respect to the proposed malI transcription start site.


The E. coli malX and malY genes encode proteins for the transport and metabolism of an as yet unidentified substrate (Zdych et al., 1995; Clausen et al., 2000). Hollands et al. (2007) previously identified a Class II CRP-dependent promoter upstream of the malX gene, and this has been confirmed here. CRP is known to be essential for the expression of many genes involved in the transport and catabolism of different substrates (Kolb et al., 1993), and it is unsurprising to find that the expression of the malXY operon is regulated by a completely CRP-dependent promoter. Reidl et al. (1989) found that the divergently expressed gene upstream of malX, malI, encodes a LacI family transcription repressor that represses malXY expression, most likely by binding to operator targets in the malX–malI intergenic region. It is supposed that this repression is modulated by the as yet unidentified substrate (Reidl & Boos, 1991). Here, we report that the malI promoter is active in the absence of CRP. A rationale for this would be to ensure that the MalI repressor is made in all conditions, most likely to prevent unnecessary expression of MalX and MalY, or perhaps also to fulfil other functions. Note that, although our assays were performed in a malI+ genetic background, our conclusions concerning promoter activities and CRP-dependent activation are not ‘distorted’ by repression by MalI, which appears to be present at low levels that are insufficient to repress the malX and malI promoters carried by our expression vector plasmid, pRW50. Thus, the measured activities of both promoters are unchanged by the introduction in trans of higher copy number plasmids carrying the malX–malI intergenic region (R.E. Godfrey, unpublished data).

The malX promoter is a ‘textbook’ Class II CRP-dependent promoter, with the −10 element positioned 32 bp downstream of CRP site 1, which is centred at position −41.5 upstream from the transcription start point. This is the optimal arrangement for activation by CRP at a Class II promoter (Busby & Ebright, 1999). The strong dependence on CRP is due to the lack of promoter elements able to recruit RNA polymerase without help from an activator. The upstream weaker-binding CRP site 2 plays little or no role in activation, despite being centred at a position (−101.5) where, potentially, it could function synergistically with the downstream bound CRP (Belyaeva et al., 1998).

The malI promoter contrasts sharply with the malX promoter, and carries elements that can recruit RNA polymerase without help from an activator. Potentially, CRP could both repress and activate the malI promoter. However, our data (Table 1) show that CRP binding to site 2, centred at position +17.5, causes little or no repression, while CRP binding to site 1, centred at position −43.5, causes only weak activation. The modesty of the activation is partly due to the activator-independent strength of the malI promoter, but principally it is because CRP site 1 is located 34 bp upstream of the −10 hexamer element, which is 2 bases upstream of the optimal position for Class II activation (Busby & Ebright, 1999). Several pieces of evidence suggest that the malI promoter has an unusual organization. First, the proposed −35 hexamer, 5′-TTACGC-3′, bears little resemblance to the consensus, 5′-TTGACA-3′. Second, despite four independent screens, we were unable to isolate any single ‘down’ mutation that reduced activity by more than 75%. Third, unlike at other E. coli promoters (Spassky et al., 1988; Jayaraman et al., 1989; Miroslavova & Busby, 2006), a G substitution at position 2 of the −10 hexamer element failed to completely suppress promoter activity in vivo. Finally, upstream sequences between positions −93 and −73 contribute to promoter activity (Table 1). Thus, we propose that malI expression is driven by an unusual factor-independent promoter, and we suggest that this may be a consequence of its peculiar sequence. For example, the sequence flanking the transcript start point from position −14 to +9 contains only three G : C base pairs. Also, starting at position −74, the base sequence consists of the motif 5′-TAN8-3′ repeated seven times.
Recall that a T–A base step in DNA produces a positive roll that can cause a discrete kink, that many DNA-binding proteins bend DNA at such steps, and that phased T–A base pairs can facilitate the creation of higher order structures (Dickerson, 1998). Hence, we speculate that the sevenfold iteration of 5′-TAN8-3′, which is unique in the E. coli K-12 genome, creates a curved structure that can capture RNA polymerase in spite of single point changes. Clearly, further biochemical and genetic investigations are essential.
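A phased repeat such as the 5′-TAN8-3′ iteration described above can be searched for with a simple scan. The sketch below is illustrative only: the function name is hypothetical and the test sequence is a synthetic toy, because the malI promoter sequence itself is shown only in the figures.

```python
def longest_phased_ta_run(seq: str, period: int = 10) -> tuple[int, int]:
    """Return (start index, repeat count) of the longest run of consecutive
    5'-TAN8-3' units, i.e. TA steps spaced exactly `period` bp apart."""
    seq = seq.upper()
    best_start, best_count = -1, 0
    for start in range(len(seq) - 1):
        count = 0
        i = start
        # Each unit needs TA followed by (period - 2) further bases.
        while i + period <= len(seq) and seq[i:i + 2] == "TA":
            count += 1
            i += period
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count


# Synthetic example (not the malI sequence): seven TAN8 units after a 3 bp leader.
toy = "GGC" + "TACGGCCGGC" * 7 + "GG"
print(longest_phased_ta_run(toy))
```

A period of 10 bp is close to one helical turn of B-form DNA, which is why TA steps in this phasing would be expected to bend the helix in a consistent direction. Running the same scan over a genome sequence would be one way to examine how unusual a sevenfold iteration is.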

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Table S1. Oligonucleotide primers used in this work.


This work was funded by a Wellcome Trust program grant and a BBSRC DTA studentship. We thank Christine Webster and undergraduate project students, Josh Lilley, Jennifer Crossley and Maria Jesus Pina, for some of the constructions.


  • Editor: Stephen Smith

