
Differential expression of the multiple chaperonins of Mycobacterium smegmatis

Tara Rao , Peter A. Lund
DOI: http://dx.doi.org/10.1111/j.1574-6968.2010.02039.x, pp. 24-31. First published online: 1 September 2010


Mycobacterium smegmatis contains three chaperonin (cpn60) genes homologous to the Escherichia coli groEL gene. One of these (cpn60.1) is required for biofilm formation, but is nonessential, whereas a second (cpn60.2) is essential. Mycobacterium smegmatis is unique among Mycobacteria in having a third chaperonin gene, cpn60.3. The cpn60.1 gene has a gene upstream (cpn10) that is homologous to the gene for the E. coli co-chaperonin GroES. Phylogenetic analysis of the mycobacterial homologues suggests that early gene duplication and sequence divergence gave rise to the cpn60.1 and cpn60.2 genes found in all Mycobacteria species, while cpn60.3 appears to have been acquired by horizontal gene transfer. Here, we show that cpn60.2 and cpn10 are expressed more strongly than cpn60.1, while cpn60.3 shows very low levels of expression. The expression of all the genes, except cpn60.3, is significantly induced by heat shock, but much less so by other stresses. We mapped mRNA 5′-ends for the cpn10 and cpn60.1 genes, and measured the promoter activity of the upstream regions of both genes. The results show that the mRNA for this operon is cleaved between the cpn10 and cpn60.1 genes. These results are consistent with the evolution of a distinct function for the cpn60.1 gene.

  • chaperonin
  • heat shock
  • Mycobacterium smegmatis


Protein structures are fully determined by their amino acid sequences (Anfinsen, 1973). However, in vivo, molecular chaperones are required to assist the folding of many proteins to their native state under normal conditions, where a high protein concentration can lead to aggregation unless transiently exposed hydrophobic regions are protected (Lin & Rye, 2006; Ellis, 2007; Horwich et al., 2007). Chaperones also play a key role during stresses such as heat shock, which can lead to the partial unfolding of proteins. One group of chaperones, the chaperonins (Hemmingsen et al., 1988), is typified by the Escherichia coli GroEL protein, which is the only essential chaperone in that organism (Fayet et al., 1989). Chaperonins are tetradecamers made up of 60 kDa subunits arranged in two heptameric rings, each with a central cavity where protein folding can occur. Each subunit has three domains referred to as the apical, intermediate and equatorial domains (Braig et al., 1994). Bacterial chaperonins interact with a separate heptameric co-chaperonin. In E. coli, the co-chaperonin (GroES) is also essential (Fayet et al., 1989). Generically, chaperonins are referred to as Cpn60 proteins, and the co-chaperonins as Cpn10 proteins (Coates et al., 1993).

Chaperonins bind their client proteins by hydrophobic interactions, initially to the apical domain (Fenton et al., 1994). Binding of the co-chaperonin displaces the bound protein into the cavity, where it can fold without interacting with other proteins with which it might aggregate. The cycle of binding and release of co-chaperonin and client protein is mediated by ATP binding and hydrolysis, via a complex set of allosteric interactions within and between the two rings (reviewed in Saibil et al., 2001; Horwich et al., 2007). The oligomeric structure of the chaperonins is thus essential for them to function because it provides a cavity (the ‘Anfinsen cage’) for protected folding (Ellis, 1996; Weber et al., 1998). Analysis in E. coli showed that while nearly 250 proteins interact with GroEL, only about 85 of these proteins are obligate GroEL clients (Kerner et al., 2005). Thirteen of these proteins were found to be essential proteins, explaining the essential nature of groEL (Kerner et al., 2005). These numbers may be underestimates; other studies imply that a larger subset of the E. coli proteome includes GroEL clients (Chapman et al., 2006).

A survey of 669 complete bacterial genomes showed that 30% have more than one chaperonin gene (Lund, 2009). As GroEL binds and folds a structurally diverse range of proteins, this raises the question of what purposes the additional copies serve. Multiple copies could simply increase the chaperoning capacity of the cell, but a more likely explanation is that following gene duplication, one copy may have retained the essential chaperone function while the others have diverged to take on different roles (Goyal et al., 2006; Lund, 2009). Measurement of the relative rates of evolution of chaperonin homologues supports this model (Hughes, 1993). Genetic analyses in several diverse bacterial species also support the latter model, with additional copies of chaperonins being implicated in functions as diverse as root nodulation and nitrogen fixation in Bradyrhizobium japonicum and Sinorhizobium meliloti (Ogawa & Long, 1995; Fischer et al., 1999); protection of the photosynthetic apparatus against thermal stress in Synechocystis PCC6803 (Glatz et al., 1997; Asadulghani et al., 2003) and Anabaena L-31 (Rajaram & Apte, 2008); and the formation of biofilms and granulomas in Mycobacterium smegmatis and Mycobacterium tuberculosis, respectively (Ojha et al., 2005; Hu et al., 2008).

The Actinobacteria were the first bacteria shown to have multiple chaperonins (Rinke de Wit et al., 1992; Kong et al., 1993). In all Mycobacteria for which complete genome sequences are available, there are two cpn60 genes: one (which we refer to as cpn60.1) in an operon with cpn10 and the other (cpn60.2) elsewhere on the chromosome. The cpn60.2 genes are found in all Actinobacteria, whereas cpn60.1 is sometimes absent, indicating that cpn60.2 encodes the essential chaperonin (Goyal et al., 2006). When cpn60.1 is absent, the cpn10 gene always remains, as would be expected given that the co-chaperonin gene is also essential in E. coli. As predicted from the above observations, cpn60.2 from M. tuberculosis and M. smegmatis is essential, but cpn60.1 is not (Ojha et al., 2005; Hu et al., 2008). The role of M. smegmatis cpn60.1 in biofilm formation is possibly due to its association with KasA, a key component of the FASII complex that is required for long-chain mycolic acid synthesis (Bhatt et al., 2005). The cpn60 genes and cpn10 genes of M. tuberculosis are heat inducible and negatively regulated by the HrcA repressor protein (Stewart et al., 2002; Hu et al., 2008). Mycobacterium smegmatis is unusual among Mycobacteria in that it has a third copy of a cpn60 gene (Ojha et al., 2005). The function of this gene is not known.

In other bacterial species that possess more than one chaperonin gene, differential expression of these genes is generally observed. In particular, in cases where one gene has been shown from genetic analysis to be the essential chaperonin, this gene generally shows the highest level of expression, whereas the other genes that may play additional roles are expressed at lower levels or under more specific conditions (e.g. Fischer et al., 1993; de León et al., 1997; Kovács et al., 2001; Gould et al., 2007; Hu et al., 2008; Sato et al., 2008). As part of our characterization of the three chaperonin genes and the proteins that they encode in the mycobacterial species M. smegmatis, we have measured their expression under normal growth and in response to various stresses, and we report these results here.

Materials and methods

Plasmids, bacterial strains, oligonucleotides and growth conditions

The bacterial strains are shown in Table 1. All oligonucleotides were synthesized by Alta Biosciences or [for use in quantitative real-time PCR (qRT-PCR)] by Applied Biosystems, and are shown in Table 2. Escherichia coli was grown in Luria–Bertani (LB) broth. A solid medium was prepared by adding 1.5% agar to the LB broth. Mycobacterium smegmatis was cultured in Difco Middlebrook 7H9 broth (BD Biosciences) containing ADC and 0.05% Tween 80, or in Difco Middlebrook 7H10 agar with ADC (BD Biosciences) and 0.05% Tween 80. Antibiotics were used at 100 μg mL−1 (ampicillin) or 50 μg mL−1 (kanamycin) for E. coli, and 20 μg mL−1 (kanamycin) and 150 μg mL−1 (hygromycin) for M. smegmatis.

Table 1. Strains and plasmids

| Strain or plasmid | Genotype or description | Source or reference |
| --- | --- | --- |
| mc2155 Δcpn60.1 | Δcpn60.1 | Ojha (2005) |
| DH5α | fhuA2 Δ(argF-lacZ)U169 phoA glnV44 Φ80 Δ(lacZ)M15 gyrA96 recA1 relA1 endA1 thi-1 hsdR17 | Hanahan (1983) |
| pSD5B | E. coli–Mycobacterium shuttle vector containing promoterless lacZ gene | D. Chatterji |
| pSD5B-SF | pSD5B containing 204 bp upstream region of cpn60.1 | This study |
| pSD5B-LF | pSD5B containing 393 bp upstream region of cpn60.1 | This study |
| pSD5B-cpn10 | pSD5B containing 369 bp upstream region of cpn10 | This study |
| pSD5B-cpn60.2 | pSD5B containing 335 bp upstream region of cpn60.2 | This study |
| pSD5B-cpn60.3 | pSD5B containing 335 bp upstream region of cpn60.3 | This study |

Table 2. Oligonucleotides used in this study

| Name | Sequence (5′→3′) | Purpose |
| --- | --- | --- |
| Cpn60.1 gsp1 | GACGTCGTTGGTCTTGGTG | Reverse primer for 5′RACE of cpn60.1 |
| Cpn60.1 gsp2 | GACGGACTTCACCAGCTG | Nested reverse primer for 5′RACE of cpn60.1 |
| Cpn10 gsp2 | GTTGTACTTGATCTCGGTG | Nested reverse primer for 5′RACE of cpn10 |
| SF fwd XbaI | GACGTTCTAGAGGGTGACACCGT | Forward primer to amplify shorter cpn60.1 upstream fragment |
| SF rev SphI | TAAGTCTCTTGCATGCGCCTACG | Reverse primer to amplify cpn60.1 upstream fragment |
| LF fwd XbaI | TCCATCGTCTAGAGCGTGAAC | Forward primer to amplify longer cpn60.1 upstream fragment |
| Cpn10 fwd XbaI | CGTCGTCATCTAGAACACCGAG | Forward primer to amplify cpn10 upstream fragment |
| Cpn10 rev XbaI | CACGCTCGTCTAGATGGAGCCC | Reverse primer to amplify cpn10 upstream fragment |
| Cpn60.2 fwd XbaI | GCCATGCAGGTCTAGAACGCCA | Forward primer to amplify cpn60.2 upstream region |
| Cpn60.2 rev SphI | TGTCTTAGCGCATGCGAAGTGT | Reverse primer to amplify cpn60.2 upstream region |
| Cpn60.3 fwd XbaI | CGTCGGTCTAGAAGGCGCGACC | Forward primer to amplify cpn60.3 upstream region |
| Cpn60.3 rev SphI | CTTTGGGCATGCTCGGAGTCC | Reverse primer to amplify cpn60.3 upstream region |
  • Underlined bases denote restriction sites.

Phylogenetic analysis

Protein sequences were identified and extracted from GenBank, aligned using ClustalW with default values, and phylogenetic trees were drawn using PHYLIP or neighbour joining, using UPGMA for clustering.

RNA extraction and cDNA synthesis

A 10 mL mid-log culture of M. smegmatis (grown in 7H9 and ADC with 0.05% Tween80) was mixed with 4 vol. of 5 M GTC buffer (5 M guanidinium isothiocyanate) lysis solution and mixed rapidly by swirling. Cells were pelleted by centrifugation at 1200 g for 30 min, resuspended in 1 mL of 4 M GTC solution, centrifuged for a minute at 16 000 g and resuspended in 1.2 mL of TRI reagent (Fluka Biochemicals), which was added to 0.5 mL of 0.1 mm ceramic beads in 2-mL screw-capped microcentrifuge tubes. The tubes were spun using a reciprocal shaker (Hybaid Ribolyser) at the maximum speed setting (6.5) for 45 s, and then left at room temperature for 10 min. Chloroform (200 μL) was then added and the tubes were vortexed for 30 s. The tubes were then left at room temperature for 10 min to partition the aqueous and the organic phases and then centrifuged at 16 000 g at 4 °C for 15 min. The lighter aqueous phase was transferred to a fresh tube, mixed with an equal volume of chloroform, vortexed and incubated at room temperature for 10 min before centrifuging at 16 000 g at 4 °C for 15 min. The aqueous phase was transferred to a new tube and 0.8 vol. of isopropanol was added to precipitate the nucleic acid at −20 °C overnight. RNA was pelleted at 16 000 g for 20 min at 4 °C, washed once with 1 mL of 70% ethanol, repelleted and briefly air-dried before being resuspended in 100 μL of RNase-free water. The resuspended RNA was then further purified using the Qiagen RNeasy Mini Kit (Qiagen) according to the manufacturer's instructions. The pure RNA was stored at −80 °C.

RNA was DNase treated using the Ambion TURBO DNA-free kit according to the manufacturer's instructions. cDNA was synthesized using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems), with ∼1.2–1.5 μg of RNA in a 20-μL reaction in all cases and a thermal cycle of 25 °C for 10 min, 37 °C for 120 min and 85 °C for 5 s.


qRT-PCR was performed using custom-made TaqMan gene expression assays (Applied Biosystems). Each 20 μL reaction contained 10 μL of 2× TaqMan gene expression Mastermix (Applied Biosystems), 1 μL of TaqMan gene expression assay (Applied Biosystems) and 9 μL of cDNA (60 ng). Real-time PCR was carried out in an ABI Prism 7000 Sequence Detection System (Applied Biosystems) with the following cycle: 50 °C for 2 min, 95 °C for 10 min and then 40 cycles of 95 °C for 15 s followed by 60 °C for 1 min. The fold change in the expression levels of each of the genes was calculated using the ΔΔCt method (Livak & Schmittgen, 2001).
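The ΔΔCt calculation can be sketched as follows. This is a minimal illustration rather than the authors' analysis script: it assumes amplification efficiency close to 100% (one doubling per cycle), and the reference (normalizer) gene, the function name and all Ct values are hypothetical, because the text does not state which reference gene was used.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (Livak & Schmittgen, 2001).

    Assumes roughly 100% PCR efficiency for both target and reference assays.
    The reference gene is a placeholder; the source does not name one.
    """
    # Normalize the target gene to the reference gene within each condition.
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between conditions.
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 5 cycles earlier
# after stress, while the reference gene is unchanged.
example = fold_change(18.0, 15.0, 23.0, 15.0)  # 32.0
```

A lower Ct means more template, so a ΔΔCt of −5 corresponds to a 32-fold induction under these assumptions.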


RNA was extracted from mid-log cultures of M. smegmatis as described above, and the 5′RACE system for the rapid amplification of cDNA ends (Version 2.0, Invitrogen) was used according to the manufacturer's instructions, using the primers cpn60.1 gsp1, cpn60.1 gsp2 and cpn10 gsp2. cDNA was tailed at the 5′ ends using poly-cytosine and transcriptional start sites were identified by detection of the junction of this poly-C tail in the sequenced cDNA.

Construction of plasmids

The promoterless lacZ E. coli–Mycobacterium shuttle vector pSD5B was used to analyse promoter activity (Jain et al., 1997). Fragments of varying lengths upstream of the cpn60 or cpn10 genes were amplified with primers containing XbaI and SphI sites, or XbaI sites alone. The products were digested as appropriate and ligated into plasmid pSD5B. The resultant recombinant plasmids contained the various promoter regions just upstream of the lacZ gene (Table 1 and Fig. 1).


Fig. 1. (a) Genes cpn10, cpn60.1, cpn60.2 and cpn60.3 with fragment lengths for promoter probe studies, CIRCE sequences and transcriptional start sites (*). (b) Promoter activity in Mycobacterium smegmatis transformed with plasmids pSD5B (1), pSD5B-SF (2), pSD5B-LF (3), pSD5B-cpn10 (4), pSD5B-cpn60.2 (5) and pSD5B-cpn60.3 (6). Assays were performed in triplicate. Error bars show the SD between the replicates.

β-Galactosidase assay

Each of the pSD5B constructs containing a promoter region was electroporated into M. smegmatis mc2155 cells. The strains were grown in liquid media at 37 °C for 2 days, after which their optical density at 600 nm (OD600 nm) was measured. Each culture (100 μL) was added to 900 μL Z buffer (30 °C). A drop each of 0.1% sodium dodecyl sulphate and chloroform was then added to the tubes, which were vortexed to lyse the cells. The reaction was started by adding 200 μL ONPG (4 mg mL−1) and mixing well. When a significant yellow colour developed, the reaction was stopped by addition of 500 μL 0.5 M sodium carbonate. The yellow liquid was then centrifuged at the maximum speed for 5 min, after which the absorbance of the supernatant was measured at 420 nm (OD420 nm). LacZ activity was calculated using the formula OD420 nm/(OD600 nm × culture volume in millilitres × time of incubation in minutes).
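The activity formula above can be written as a small helper. This is a sketch under stated assumptions: the function name, the `scale` parameter and the example readings are hypothetical. The formula as given in the text has no scaling constant, whereas the classical Miller unit definition multiplies by 1000, so `scale` is left at 1.0 by default to follow the text.

```python
def lacz_activity(od420, od600, culture_volume_ml, minutes, scale=1.0):
    """LacZ activity = OD420 / (OD600 x culture volume in mL x minutes).

    `scale` is a hypothetical convenience parameter; set it to 1000.0 to
    express the result on the classical Miller-unit scale.
    """
    if od600 <= 0 or culture_volume_ml <= 0 or minutes <= 0:
        raise ValueError("OD600, culture volume and time must be positive")
    return scale * od420 / (od600 * culture_volume_ml * minutes)


# Hypothetical readings for one replicate: 100 uL of culture (0.1 mL),
# stopped after 10 min.
activity = lacz_activity(od420=0.6, od600=1.2, culture_volume_ml=0.1, minutes=10)
```

With triplicate assays, as in Fig. 1b, the helper would be applied to each replicate before taking the mean and SD.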

Results and discussion

Phylogenetic analysis

To be consistent with the general convention on the naming of chaperonin genes (Coates et al., 1993), we name the three chaperonin genes of M. smegmatis as cpn60.1, cpn60.2 and cpn60.3. In the published genome sequence, these are numbered MSMEG1583, MSMEG0880 and MSMEG1978, respectively. The percentage identities and similarities between the three proteins they encode, and E. coli GroEL for comparison (as determined from BLAST alignments), are shown in Fig. 2a. Cpn10 and E. coli GroES show 65/45% similarity/identity. The arrangement of the genes is shown in Fig. 1. We constructed phylogenies (Fig. 2b) from all the Cpn60 amino-acid sequences available from 11 complete mycobacterial genomes, as identified from BLAST searches of the individual genomes. All of these, with the exception of M. smegmatis, possessed two chaperonin homologues. Proteins were tentatively assigned as either Cpn60.1 or Cpn60.2, based on the presence of either histidine or glycine–methionine repeats at their C-termini (Lund, 2009). As was seen with actinobacterial Cpn60 proteins in general (Goyal et al., 2006), the chaperonins fell into two distinct clades: one of Cpn60.1 proteins and one of Cpn60.2 proteins (Fig. 2). The most parsimonious explanation of this result is that a gene duplication event took place in the common ancestor of present-day Mycobacteria, followed by divergence in sequence and function that has been preserved during subsequent speciation. Cpn60.3 from M. smegmatis was an outgroup to both of these clades. BLAST searches with individual Cpn60 proteins from M. smegmatis confirmed this: Cpn60.1 and Cpn60.2 always had homologues from other Mycobacteria as their best matches, but the best match to Cpn60.3 was from the soil actinomycete Rhodococcus jostii, which also has two other cpn60 genes in its genome (McLeod et al., 2006). It is thus highly likely that a relatively recent horizontal gene transfer event accounts for the presence of the cpn60.3 gene in M. smegmatis, but not in other Mycobacteria.


Fig. 2. (a) Comparison of the levels of similarity between the different Mycobacterium smegmatis cpn60 genes and groEL. Values are derived from BLAST searches. (b) Phylogenetic tree of mycobacterial chaperonins, constructed using ClustalW and PHYLIP as described in Materials and methods.

Expression analysis

We used qRT-PCR to determine the relative levels of expression of the chaperonin genes in M. smegmatis under normal growth conditions and after the following stresses: heat shock (42 °C), osmotic stress (1.5 M NaCl), oxidative stress (10 and 20 mM H2O2) and ethanol stress (5% ethanol). These conditions were chosen to enable a direct comparison with an equivalent analysis on the cpn60.1 and cpn60.2 genes of M. tuberculosis (Hu et al., 2008).

Under nonstressed conditions, cpn60.2 was the most highly expressed gene, followed by the co-chaperonin cpn10 and then cpn60.1, while cpn60.3 expression was barely detectable (Fig. 3a). The relative levels of Cpn60.1 and Cpn60.2 protein are difficult to measure as the proteins do not resolve well on sodium dodecyl sulphate polyacrylamide gel electrophoresis and we were unable to obtain specific antibodies, but we observed that in Western blots, the intensity of the major band cross-reacting with the antichaperonin antibody did not change significantly between the wild type and the strain lacking cpn60.1 (data not shown). Thus, Cpn60.2 appears to be the most abundant chaperonin in the cell.


Fig. 3. (a) Expression levels of chaperonin and co-chaperonin genes relative to cpn60.2, under normal growth conditions. (b) Fold induction of chaperonin gene expression after heat shock (42 °C). Expression levels were measured by qRT-PCR as described in Materials and methods, and fold induction for each gene was calculated relative to its level of expression under normal growth.

Among the various stresses, heat shock produced large increases (typically between 20- and 200-fold) in the expression of all the genes, except for cpn60.3. We monitored heat shock-induced expression at 5, 10, 15 and 30 min after the stress. The levels of expression of all the genes increased steadily and peaked at 15 min postshock (Fig. 3b). Ethanol and osmotic stress produced much smaller changes (typically between five- and 15-fold at 30 and 60 min, respectively, after shocking the cells), and oxidative stress produced no change (data not shown).

These results show several differences from the expression of the equivalent genes in M. tuberculosis under the same stresses (Hu et al., 2008), in particular in the very high induction by heat shock, but this may reflect the fact that expression in that study was measured using microarrays, which have a poorer dynamic range than qRT-PCR. We also measured the expression levels of cpn60.2, cpn60.3 and cpn10 in the strain of M. smegmatis lacking cpn60.1, and found that they were not significantly different from the wild type (data not shown). As the chaperonin level is generally regulated in response to the level of unfolded protein present in the cell, this shows that no significant general chaperoning capacity is lost in the absence of Cpn60.1, supporting the model that this protein plays a more specialized role. It is not possible from these findings to determine whether or not the Cpn60.1 and Cpn60.2 proteins form mixed complexes in the cell, but we consider this to be unlikely on the basis that we have previously shown that two chaperonin proteins from Rhizobium leguminosarum, which show a much higher primary sequence identity than do the two M. smegmatis proteins, preferentially form homo-oligomers when coexpressed (Gould et al., 2007).

In M. tuberculosis, regulation of expression of the duplicated cpn60 genes has been shown to involve the repressor HrcA (Stewart et al., 2002), which is widely implicated in heat shock regulation in diverse bacteria (Zuber & Schumann, 1994), and binding sites for this protein (CIRCE sequences) have been identified upstream of both genes. Mycobacterium smegmatis contains a clear homologue to the M. tuberculosis hrcA gene (MSMEG4505: 86/95% identity/similarity). We searched the entire M. smegmatis genome for matches to the CIRCE sequence CTAGCACTC-N9-GAGTGCTAG (where N9 denotes a spacer of any nine nucleotides), using the patternsearch program implemented in xBASE (Chaudhuri & Pallen, 2006). If no mismatches were allowed, only one occurrence of the CIRCE sequence was found, upstream of cpn10 (see Fig. 1). If up to two mismatches were allowed, a further candidate CIRCE sequence was found upstream of cpn60.2, although two other potential matches were also found upstream of genes that are not usually part of the heat shock regulon (data not shown). It is thus likely that heat shock regulation of cpn10, cpn60.1 and cpn60.2 is mediated by the HrcA protein binding at CIRCE sequences, but this remains to be proven. No CIRCE sequence was found upstream of cpn60.3, consistent with the observation that it is not induced by heat shock.
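A search of this kind can be illustrated with a naive scan that counts mismatches at the fixed positions of the CIRCE consensus and ignores the nine-nucleotide spacer. This is a sketch only: it is not the algorithm used by patternsearch in xBASE, which the text does not describe, and the function name and toy sequences are hypothetical. Because the consensus is an inverted repeat, it equals its own reverse complement, so scanning one strand is sufficient.

```python
# CIRCE consensus from the text: two 9-nt arms separated by any 9 nt.
CIRCE = "CTAGCACTC" + "N" * 9 + "GAGTGCTAG"


def find_matches(sequence, pattern=CIRCE, max_mismatches=0):
    """Return (start, mismatches) for every window within the mismatch limit.

    'N' in the pattern matches any base. Positions are 0-based.
    """
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(pattern) + 1):
        mismatches = 0
        for offset, expected in enumerate(pattern):
            if expected != "N" and sequence[start + offset] != expected:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches, abandon this window
        else:
            # Inner loop finished without exceeding the limit.
            hits.append((start, mismatches))
    return hits


# Toy sequences (not taken from the M. smegmatis genome).
perfect = "AAAA" + "CTAGCACTC" + "ACGTACGTA" + "GAGTGCTAG" + "TTTT"
degenerate = "GG" + "CTAGCACTT" + "ACGTACGTA" + "GAGTGCTAA" + "GG"

exact_hits = find_matches(perfect)                        # [(4, 0)]
relaxed_hits = find_matches(degenerate, max_mismatches=2)  # [(2, 2)]
```

Relaxing `max_mismatches` from 0 to 2 mirrors the two search stringencies described above, and shows why the relaxed search also returns candidates that need to be judged against their genomic context.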

Transcript and promoter analysis

In M. tuberculosis, although cpn10 and cpn60.1 are adjacent on the chromosome, two putative transcriptional start sites have been proposed (Kong et al., 1993). One of these is upstream of cpn10, in the region containing the CIRCE sequence that binds HrcA to regulate the heat shock response (Zuber & Schumann, 1994; Stewart et al., 2002). A second was identified 29 bp upstream of the cpn60.1 gene. However, a more recent report showed no promoter activity in this intergenic region (Aravindhan et al., 2009), raising the possibility that there is a post-transcriptional cleavage of the mRNA for this operon.

Because of this, and because our results showed that in M. smegmatis the adjacent cpn10 and cpn60.1 genes are expressed at significantly different levels under similar conditions, we used 5′RACE with the primers cpn60.1 gsp1, cpn60.1 gsp2 and cpn10 gsp2 to determine the transcriptional start sites of the cpn10 and cpn60.1 genes. The results showed two potential transcriptional start sites, one 133 bp upstream from the cpn10 gene and the second in the intergenic region 31 bp upstream of the cpn60.1 gene (Fig. 1), similar to earlier findings with M. tuberculosis.

To investigate whether the intergenic region did indeed contain a promoter, varying lengths of upstream regions of the chaperonin genes and the cpn10–cpn60.1 intergenic region (Fig. 1) were cloned into the pSD5B reporter plasmid, and LacZ activity was measured following the transformation of these plasmids into M. smegmatis mc2155. Only the regions upstream of cpn10 and cpn60.2 exhibited promoter activity. Neither the shorter nor the longer intergenic fragment reported any promoter activity, as would have been expected had the putative start site identified shortly upstream of the cpn60.1 gene been genuine (Fig. 1). We therefore conclude that the mRNA 5′-end observed between cpn10 and cpn60.1 is likely to arise from a specific post-transcriptional cleavage event, similar to the situation reported in M. tuberculosis. The lower levels of expression of cpn60.1 compared with cpn10 may thus result from differential stabilities of the mRNAs for these two genes. This may have evolved from a need to match the levels of expression of the essential cpn10 and cpn60.2 genes, despite cpn10 being in an operon with the nonessential cpn60.1. Specific cleavage of mRNA encoding chaperonins and co-chaperonins has also been seen in Agrobacterium tumefaciens (Segal & Ron, 1995), although in this case, cleavage is heat shock induced and leads to the preferential degradation of the groES mRNA.

In conclusion, the results are consistent with a model where, in Mycobacteria, one chaperonin (Cpn60.2) acts as the main housekeeping chaperonin in the cell, folding a range of client proteins both under normal growth conditions and after stresses such as heat shock, while the other (Cpn60.1) has evolved to have more specialized functions that are not essential for viability, although its expression is also induced by heat shock. The role of the Cpn60.3 protein, whose gene appears to have been acquired recently by horizontal gene transfer, is not known but, given its very low expression level, it is unlikely to be significant.


We are grateful for the financial support from the Darwin Trust of Edinburgh (studentship to T.R.). We would like to thank Prof. D. Chatterji (IISc, Bangalore) for the generous gift of plasmid pSD5B.


  • Editor: Roger Buxton

