
Transcript analysis of Escherichia coli K-12 insertion element IS5

R. Gary Sawers
DOI: http://dx.doi.org/10.1016/j.femsle.2005.02.019; pages 397–401. First published online: 1 March 2005

Abstract

The mobile insertion element IS5 is a relatively small but genetically compact DNA sequence of 1195 bp found in variable copy number in the genome of Escherichia coli strains. This study presents a detailed transcript analysis of the population of IS5 elements present in E. coli strains MC4100 and MG1655. The findings indicate that the ins5A gene comprising 978 bp is transcribed from its own promoter, which is located close to the right-hand end of the element. The two divergently transcribed genes ins5C and ins5B form an operon, and this transcript is fully contained within the borders of the ins5A transcript. Although transcription out of IS5 from element-internal promoters was negligible, in the case of MG1655 a major transcript was found to extend into the insertion element. This suggests that IS5-specific transcription can be influenced by the specific location of the element in the chromosome, the orientation it adopts and the gene it interrupts.

Keywords
  • Insertion element
  • IS5
  • Overlapping transcripts
  • Transcriptional read-through

1 Introduction

In contrast to Salmonella species, which generally do not contain insertion elements, the genomes of Escherichia coli strains contain variable numbers of insertion sequences (IS). These are mobile bacterial DNA elements that can transpose to many sites on the chromosome, and their activity can result in various genetic rearrangements. One of the most common IS elements is the 1195 bp IS5 [13]. IS5 can be localised to a number of conserved positions in the genome and the copy number can vary from 11 in the sequenced E. coli strain MG1655 [4] to 23 in W3110 [5]. The IS5 elements in the chromosome of W3110 have been named is5A through is5W. They are found inserted within genes, as well as in intergenic regions, and can have both enhancer and silencer functions [6].

A remarkable feature of IS5 is its compact coding structure [7,8]. Three overlapping structural genes, comprising a total of 519 codons, are encoded on the 1195 bp element (Fig. 1A). The extreme right-hand and left-hand ends of the element have 16 bp terminal inverted repeat sequences and these are adjacent to the 4 bp duplicated target site, which has the consensus CA/TAG/A [13]. The ins5A gene product is essential for transposition function; however, the roles of the ins5B and ins5C gene products are less clear.

Figure 1

RT-PCR analysis of transcripts generated within IS5. (A) Schematic representation of the genetic organisation of the 1195 bp insertion element IS5. The three genes ins5A, B, and C are depicted as block arrows. The orientation of the element is based on that described previously [2,3] with the left end having the nucleotide position 1. The ins5A gene encompasses the sequence positions 1127–147 bp, the ins5B gene 495 bp (or possibly 525 bp, see [3]) to 851 bp, and the ins5C gene 205–462 bp. The dotted arrows indicate transcripts that are detected within IS5, as demonstrated in this study. The approximate hybridisation locations and orientations of the oligonucleotide primers used in the RT-PCR experiments are indicated (see also Table 1). (B) Ordered addition RT-PCR analysis of transcripts within IS5. Total RNA (2.5 μg) isolated from aerobically grown cultures was used in each RT-PCR reaction. The lanes labelled M show a 100 bp DNA ladder (Amersham Biosciences); lane 1, primer IS5-RT used for the RT reaction and then primer IS5-3 was added for PCR; lane 2, primer IS5-3 used for the RT reaction and then primer IS5-RT was added for PCR; lane 3, PCR using 10 ng of chromosomal DNA isolated from MC4100 and primers IS5-3 and IS5-RT; lane 4, primer IS5-3 used for the RT reaction and then primer IS5-B was added for PCR; lane 5, primer IS5-B used for the RT reaction and then primer IS5-3 was added for PCR; lane 6, PCR using 10 ng of chromosomal DNA isolated from MC4100 and primers IS5-3 and IS5-B.

Through the construction of gene fusions with galK it was deduced that the ins5A, ins5B and ins5C genes have promoters and that expression occurs at a low level [7,8]. However, examination of the in vivo expression levels suggests that the ins5B gene is unlikely to be expressed from its own promoter [8]. Moreover, attempts to overproduce the 5B protein using the native ribosome-binding site of the ins5B gene failed [9], suggesting that the ins5B gene is transcribed together with ins5C. Although IS5 was characterised more than 20 years ago, no detailed transcript analysis of the element has been undertaken. In this study, it is shown that two major transcripts could be identified. One of these transcripts originates from the ins5A promoter, while the origin of the second indicates that the ins5C and ins5B genes form an operon.

2 Materials and methods

2.1 Bacterial strains, plasmids and oligonucleotides

The bacterial strains, plasmid and oligonucleotides used in this study are listed in Table 1.

Table 1

Strains, plasmids and oligonucleotides used in this study

Strain, plasmid or oligonucleotide | Relevant characteristics | References
MC4100 | F− araD139 Δ(argF-lac) U169 ptsF25 deoC1 relA1 flbB530 rpsL150 λ− | [10]
MG1655 | a.k.a. NCM3629; prototroph | [4]
MG1655 hfq | Like MG1655, but hfq-1::Ω(cmr) | This work
GS081 | Like MC4100, but hfq-1::Ω(cmr) | [12]
pD3 | A pUC19 derivative including IS5; AmpR | [13]
IS5-3 | 5′-CCCCTTGTATCTGGCTTTCAC-3′; sequence position 243–263 bp in IS5 | This work
IS5-A | 5′-GGCGCTTACTGCTGAATTCACTGTCGG-3′; sequence position 1079–1105 bp in IS5 | This work
IS5-B | 5′-GACCCACAGCCTGGTCACCACCGCGG-3′; sequence position 561–536 bp in IS5 | This work
IS5-C | 5′-GGGTTGCTGAAAAACGATAACCAACTGG-3′; sequence position 254–218 bp in IS5 | This work
IS5-RT | 5′-CATCATGAGTCATCAACTTACCTTC-3′; sequence position 1131–1107 bp in IS5 | This work

2.2 Isolation of chromosomal DNA and total RNA

Chromosomal DNA was isolated according to Sambrook and Russell [14]. Total RNA was isolated from bacterial cultures grown to mid-exponential phase using the Qiagen RNeasy kit according to the manufacturer's instructions (Qiagen manual, Qiagen Ltd., Crawley, UK).

2.3 Analysis of mRNA transcripts

Primer extension analysis was performed exactly as described [15] with 25 μg of total RNA and 0.2 pmol of [32P]-labelled oligonucleotides IS5-A and IS5-C (Table 1).

RT-PCR was performed with 2.5 μg of total RNA using the Access RT-PCR System (Promega, Southampton, UK). RNA samples were treated with RNase-free DNase and re-purified prior to use in RT-PCR experiments. The conditions used for the RT-PCR were exactly as described by the manufacturer. Control experiments in which reverse transcriptase was omitted yielded no PCR product, indicating that the RNA was free of chromosomal DNA contamination. The nucleotide sequences of the oligonucleotides that were used for RT-PCR are shown in Table 1. DNA sequences were determined using the method of Sanger et al. [16] with the same labelled oligonucleotide primer that was used for the primer extension reaction.

3 Results

3.1 The ins5C and ins5B genes are co-transcribed

Previous studies on the expression of the ins5A, B, and C genes made use of transcriptional fusions [7,8]. However, a detailed transcript analysis of IS5 has not been performed. To determine the transcripts that are expressed within IS5, initial experiments were performed using RT-PCR. By performing the reverse transcriptase (RT) reaction with a single oligonucleotide primer and adding a second oligonucleotide primer after the RT reaction is completed, it is possible to define whether a transcript is made and the direction in which transcription occurs. Analysis of the ins5A gene transcript using primers IS5-3 and IS5-RT revealed that an 888 bp PCR product was detected only when primer IS5-3 was used in the RT reaction (Fig. 1B, lane 2). No PCR product was detected when the RT reaction was performed with primer IS5-RT (Fig. 1B, lane 1). This indicates that a single transcript extends from position 1131 bp to position 243 bp (see Table 1 for primer hybridisation positions), and that it is expressed in the right to left direction in Fig. 1A. Moreover, this experiment reveals that no transcript initiating upstream of ins5C extends to the other end of the IS5 element. A control PCR with chromosomal DNA as template yielded an 888 bp DNA product as anticipated (Fig. 1B, lane 3).
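The reasoning behind the ordered-addition experiment can be written out as a short sketch. It is illustrative only and is not part of the original analysis. It assumes that the order of the coordinates given in Table 1 reflects the 5′ to 3′ direction of each primer on the map in Fig. 1A. The function names are hypothetical.

```python
# Sketch of the ordered-addition RT-PCR inference (illustrative only).
# A primer can start cDNA synthesis only on RNA that is complementary to it.
# A primer that points rightward on the IS5 map therefore detects RNA that
# was transcribed right-to-left, and vice versa.

# Coordinates as listed in Table 1. ASSUMPTION: the order (first, second)
# gives the 5'->3' direction of the primer on the map in Fig. 1A.
PRIMERS = {
    "IS5-3": (243, 263),
    "IS5-RT": (1131, 1107),
    "IS5-B": (561, 536),
}


def primer_points_right(name):
    """True if the primer runs 5'->3' towards higher IS5 coordinates."""
    first, second = PRIMERS[name]
    return first < second


def transcript_direction(rt_primer):
    """Direction of the transcript that this primer can reverse-transcribe."""
    return "right-to-left" if primer_points_right(rt_primer) else "left-to-right"


def interpret(results):
    """Map {RT primer: product observed?} to the set of directions detected."""
    return {transcript_direction(p) for p, seen in results.items() if seen}


# Fig. 1B, lanes 1 and 2: a product only when IS5-3 primes the RT reaction.
print(interpret({"IS5-RT": False, "IS5-3": True}))
```

Under these assumptions the sketch reports a single right-to-left transcript for lanes 1 and 2, which is the direction of the ins5A transcript described above.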

Primers IS5-3 and IS5-B hybridise within the ins5C and ins5B genes, respectively (Table 1). A 318 bp DNA product was delivered regardless of whether primer IS5-3 or IS5-B was used in the RT reaction (compare Fig. 1B, lanes 4 and 5). This indicates that transcription across this region of the IS5 insertion element proceeds in both directions. Control RT-PCR experiments in which the reverse transcriptase enzyme was omitted from the reactions delivered no DNA product, indicating that the RNA preparations were devoid of chromosomal DNA contamination. These findings also indicate that the ins5C and ins5B genes are co-transcribed and form an operon.
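The expected sizes of the RT-PCR products follow from the primer positions in Table 1. The sketch below is a simple arithmetic check and not part of the original methods. It assumes that the outermost coordinate of each primer marks the end of the amplified fragment. The helper names are hypothetical.

```python
# Arithmetic check of the RT-PCR product sizes (illustrative only).
# ASSUMPTION: the outer coordinate of each primer (Table 1) marks the
# end of the amplified fragment.
OUTER_EDGE = {"IS5-3": 243, "IS5-B": 561, "IS5-RT": 1131}


def coordinate_span(primer_a, primer_b):
    """Difference between the outer coordinates of the two primers."""
    return abs(OUTER_EDGE[primer_a] - OUTER_EDGE[primer_b])


def inclusive_length(primer_a, primer_b):
    """Fragment length when both terminal nucleotides are counted."""
    return coordinate_span(primer_a, primer_b) + 1


for pair in [("IS5-3", "IS5-RT"), ("IS5-3", "IS5-B")]:
    print(pair, coordinate_span(*pair), inclusive_length(*pair))
```

The coordinate differences, 888 and 318, match the product sizes reported in the text. Counting both terminal nucleotides gives 889 and 319, a difference that cannot be resolved on an agarose gel.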

3.2 Determination of the transcription initiation sites of the ins5A gene

In a previous study, galK gene fusions were used to delimit the ins5A gene promoter to between nucleotide 1097 and the end of IS5 at position 1195 [7] (see Fig. 1A). To localise the transcription initiation site for ins5A precisely, total RNA was isolated from two independent strains of E. coli, MC4100 and MG1655 (Table 1), and from each strain both the wild type and an hfq mutant were analysed (Fig. 2A). The levels of transcript in an hfq mutant were analysed because recent studies have shown that the RNA chaperone Hfq influences fnr transcription in the neighbourhood of is5F in MC4100 [11] and it was possible that IS5 transcription might be altered in an hfq mutant. A single transcription initiation site at cytosine 1155 was determined for MC4100 strains, while for MG1655 strains, a second transcription initiation site at thymine 1151 was also observed (Fig. 2A, compare lanes 1 and 2 with 3 and 4). No effect of the hfq mutation on ins5A transcript levels was noted. Careful examination of the autoradiogram in Fig. 2A reveals a further weak transcription initiation site at nucleotide position 1162 in MG1655 strains. Potential −10 RNA polymerase recognition sequences are located between 1170 and 1159 (Fig. 2B). Although only poor −35 promoter sequences are identifiable, they nevertheless fall within the 16 bp terminal inverted repeat at the right-hand end of the insertion element (see Fig. 2B). The location of these transcription initiation sites correlates reasonably well with the earlier prediction, based on DNA sequence analysis, of a promoter between positions 1175 and 1139 [3,7].

Figure 2

The transcription initiation site of the ins5A gene. (A) Primer extension analysis was performed on total RNA as described in Section 2 using [32P]-labelled oligonucleotide primer IS5-A. The DNA sequence reactions are labelled G, A, T, and C. Lane 1, total RNA isolated from MC4100 (wild type); lane 2, total RNA isolated from GS081 (MC4100 hfq); lane 3, total RNA isolated from MG1655 (wild type); lane 4, total RNA isolated from MG1655 hfq. The locations of two transcription initiation sites identified for the ins5A gene are indicated on the right side of the panel and the DNA sequence of the sites is shown on the left. The unlabelled arrow at the top right of panel A indicates a transcript identified in MG1655 strains that initiates outside the boundary of the insertion element and proceeds into ins5A. (B) The DNA sequence of the transcription initiation sites of ins5A is shown relative to the end of the insertion element (position 1195) and the AUG translation initiation codon of the ins5A gene. The transcription initiation sites are shown as angled arrows and the putative −10 promoter recognition sequences for RNA polymerase are highlighted by dotted underlining. The 16 bp terminal inverted repeat sequence is over-lined with an arrow.

It is noteworthy from Fig. 2A that in MG1655 strains a transcript initiating outside of IS5 extends into the insertion element. Since these analyses determine the combined transcripts of the whole population of IS5 elements in the respective strains, it is likely that this represents transcription into a single IS5 element in the MG1655 chromosome that is not present in the same location in MC4100 strains.

3.3 Determination of the transcription initiation site of the ins5C operon

High-resolution primer extension analysis identified two weak transcription initiation sites for the ins5C transcript, which are separated by one base and are located at nucleotide positions 181 and 183 (Fig. 3A). Analysis of the same RNA samples with a different oligonucleotide primer yielded the same findings (data not shown). Both transcription initiation sites were observed in MC4100 and MG1655 and in the wild type and hfq mutant strains. It can be concluded that Hfq does not influence IS5 transcription initiation.

Figure 3

The initiation site of the ins5CB transcript. (A) Primer extension analysis was performed on total RNA as described in Materials and methods using [32P]-labelled oligonucleotide primer IS5-C. The DNA sequence reactions are labelled G, A, T, and C. Lane 1, total RNA isolated from MC4100 (wild type); lane 2, total RNA isolated from GS081 (MC4100 hfq); lane 3, total RNA isolated from MG1655 (wild type); lane 4, total RNA isolated from MG1655 hfq. The locations of two transcription initiation sites identified for the ins5CB transcript are indicated on the right side of the panel and the DNA sequence of the sites is shown on the left. (B) The DNA sequence in the region of the transcription initiation sites of ins5C is shown relative to the GUG translation initiation codon of the ins5C gene. The transcription initiation sites of ins5C are shown as angled arrows and the putative −10 promoter recognition sequence for RNA polymerase is highlighted by dotted underlining.

The ins5C promoter has been previously localised to the DNA region between nucleotides 94 and 199 on IS5 [8] and therefore the findings presented here are in accord with this observation. The weak transcriptional signal identified in this experiment reflects the weak promoter expression identified previously, which approximated only 40% of that determined for the ins5A promoter [8].

As with the ins5A promoter, the ins5C promoter has a clearly identifiable −10 promoter sequence (Fig. 3B). The sequence TATCAT matches the consensus sequence TATAAT in five of the six positions. However, no clear −35 promoter consensus sequence could be identified, although a sequence (TAGTGA) matching three of the six consensus (TTGACA) nucleotides is located 17 bp upstream of the −10 box.
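The match counts quoted for the −10 and −35 elements can be reproduced by a position-by-position comparison with the respective consensus hexamer. The following is a minimal sketch of that comparison; the function name is hypothetical.

```python
# Position-by-position comparison of a promoter hexamer with its consensus.
def count_matches(candidate, consensus):
    """Number of positions at which two equal-length sequences agree."""
    if len(candidate) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(candidate, consensus))


# -10 element of the ins5C promoter against the TATAAT consensus.
print(count_matches("TATCAT", "TATAAT"))  # 5

# Candidate -35 element against the TTGACA consensus.
print(count_matches("TAGTGA", "TTGACA"))  # 3
```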

4 Discussion

Two overlapping, divergent transcripts have been identified for the insertion element IS5. The longer transcript encodes the 5A protein, while the shorter transcript encodes both the 5C and 5B proteins. Remarkably, the shorter ins5CB transcript is completely contained within the longer ins5A transcript (see Fig. 1A).
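The containment of one transcript within the other can be checked against the coordinates reported in this study. The sketch below is illustrative only. The 5′ ends are the mapped initiation sites (1155 for ins5A, 181 for ins5CB). The 3′ ends were not mapped, so the sketch assumes that each transcript extends at least to the last coordinate given for its final gene in the legend to Fig. 1 (147 for ins5A, 851 for ins5B). The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    start: int  # mapped 5' end (transcription initiation site)
    end: int    # ASSUMED minimal 3' extent: last coordinate of the final gene

    @property
    def left(self) -> int:
        return min(self.start, self.end)

    @property
    def right(self) -> int:
        return max(self.start, self.end)

    @property
    def direction(self) -> str:
        return "left-to-right" if self.start < self.end else "right-to-left"


def contained_in(inner: Transcript, outer: Transcript) -> bool:
    """True if the inner transcript lies within the borders of the outer one."""
    return outer.left <= inner.left and inner.right <= outer.right


ins5a = Transcript("ins5A", start=1155, end=147)
ins5cb = Transcript("ins5CB", start=181, end=851)

print(ins5a.direction, ins5cb.direction)  # right-to-left left-to-right
print(contained_in(ins5cb, ins5a))        # True
```

The conclusion holds only as far as the assumed 3′ ends are correct. A longer ins5CB transcript would remain contained provided it terminated before position 1155.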

One notable finding of this study was that the transcription initiation sites determined for both ins5A and ins5CB were at pyrimidine residues. Purine residues are more often, but not exclusively, used as sites of transcription initiation. Despite extensive efforts, no definitive transcription initiation site for ins5B could be identified (data not shown). Indeed, although it had been previously suggested that the ins5B gene is independently transcribed, promoter activity could be detected only in vitro and not in vivo [7,8]. This observation is in accord with the findings of the current study and suggests that the ins5B gene probably does not have its own promoter. Moreover, recombinant overproduction of the 5B protein could only be achieved by creating an artificial ribosome-binding site in front of the ins5B gene [9], which indicates that ins5C and ins5B are translationally coupled.

The detectable transcript levels for both ins5A and ins5CB were very low and it was only possible to determine the initiation sites by primer extension because the IS5 element is present in multiple copies on the chromosome of E. coli strains. No transcripts could be detected downstream of the ins5B gene, which suggests that termination of transcription probably occurs downstream of ins5B. This correlates well with the prediction, based on nucleotide sequence analysis, of transcription termination structures downstream of the ins5CB operon [3].

It was notable that transcription into one of the IS5 elements located on the chromosome of E. coli strain MG1655 was detected. This finding correlates well with previous findings [6] that transcription can continue into IS5 from promoters located outside the element. What consequence this transcript has on IS5 function is currently unclear.

The promoter activity reading into the IS5 in MG1655 was not detected when RNA from MC4100 was analysed, which indicates that MC4100 probably does not possess the same complement of IS5 elements that is found in MG1655 (see also [11]). It is unlikely that the particular IS5 element affected in MG1655 has the opposite orientation in MC4100, as this transcript was not detected when the ins5CB primer extension experiment was performed (data not shown).

Acknowledgements

Work in the author's laboratory was supported by the Biotechnology and Biological Sciences Research Council (UK) through a competitive strategic grant to the John Innes Centre.

References

  1. [1].
  2. [2].
  3. [3].
  4. [4].
  5. [5].
  6. [6].
  7. [7].
  8. [8].
  9. [9].
  10. [10].
  11. [11].
  12. [12].
  13. [13].
  14. [14].
  15. [15].
  16. [16].