
The CS6 colonization factor of human enterotoxigenic Escherichia coli contains two heterologous major subunits

Marcia K Wolf, Louise A.M De Haan, Frederick J Cassels, Geraldine A Willshaw, Richard Warren, Edgar C Boedeker, Wim Gaastra
DOI: http://dx.doi.org/10.1111/j.1574-6968.1997.tb10263.x 35-42 First published online: 1 March 1997


The genes encoding the CS6 colonization factor were cloned from two human enterotoxigenic Escherichia coli strains of different serotypes. The DNA sequences from both clones were nearly identical and contained four open reading frames. Two of them have homology to genes encoding molecular chaperones and ushers found in many other operons encoding colonization factors. The two remaining open reading frames encode two heterologous major subunit proteins which makes CS6 unique because other colonization factors have only one major subunit. Upstream and downstream of the CS6 operon the DNA sequences of the clones diverged abruptly.

  • Colonization factor
  • Escherichia coli
  • Enterotoxigenic Escherichia coli

1 Introduction

CS6 is a colonization factor (CF) of many human enterotoxigenic Escherichia coli (ETEC) strains. It is produced either together with other colonization factors (CS4, CS5, CFA/III) or alone [1]. Human ETEC can produce 20 antigenically distinct CFs, and CS6 is one of the seven CFs most commonly found on ETEC [2]. We found CS6 on 31% of ETEC isolated from soldiers in the Middle East [3]. CFs of ETEC function as adhesins that attach bacteria to intestinal epithelial cells. Attached bacteria can then deliver their toxin(s). It has not yet been shown that CS6 is an adhesin for human tissue [4], but a study in rabbits indicated that it is a CF [5]. Furthermore, ETEC that produce only CS6 cause diarrhea in humans [3]. Since protective immunity against ETEC disease is mainly directed against the various CFs, CS6 should be included in a future vaccine [3]. To contribute to the development of future ETEC vaccines and to provide tools for epidemiological studies in endemic areas, we cloned and characterized the operon encoding CS6 from two strains of different serotype.

2 Materials and methods

2.1 Bacterial strains and media

E. coli M56, which contains a 61-MDa plasmid from ETEC strain E8775 (O24:H42) and expresses CS6, has been described [6]. E. coli DH5 was used as the host for cloning and E. coli HB101 as the host for plasmids used for production of heat, saline extracts. Clones from E8775 were grown in L broth. CS6-producing clones (pDEP3, pDEP4 and pDEP5) from ETEC strain E10703 (O167:H5) have been described [7]. The host for these plasmids was E. coli PC2495. Clones from pDEP3 were grown in terrific broth. Ampicillin was added at 50 or 100 μg/ml and chloramphenicol at 30 μg/ml. X-Gal (5-bromo-4-chloro-3-indolyl β-D-galactopyranoside, Sigma) was added at 0.004%. CFA plates were prepared as described [6].

2.2 Cloning of the CS6 operon from strain E8775

The plasmid from E. coli M56 was partially digested with HindIII and ligated to pUC19. The ligation mixture was transformed into E. coli DH5 and plated onto L agar with ampicillin and X-Gal. White colonies were picked to CFA plates with ampicillin and tested for CS6 expression (see below). Plasmids were purified as described [6].

2.3 Detection of CS6 expression

Western blotting of heat, saline extracts (prepared as described [6]) and colony blotting to detect CS6 expression were performed according to standard procedures [8]. Antisera specific for CS6 were prepared as described [6] except that rabbits were inoculated intravenously with live bacteria suspended in normal saline. The secondary antibody was peroxidase-conjugated goat anti-rabbit IgG (Cappel Laboratories, Cochranville, PA) and detection was by TMB substrate (Kirkegaard and Perry Laboratories, Inc., Gaithersburg, MD). Proteins were separated on precast 16% Tricine-sodium dodecyl sulfate polyacrylamide gels (SDS-PAGE, Novex, San Diego, CA).

2.4 Determination of N-terminal sequence

Heat, saline extracts from E8775 grown on CFA agar were ammonium sulfate precipitated at 20%, 40% and 60% saturation. Samples at 40% and 60% saturation were dialyzed against deionized water and loaded onto precast 16% Tricine-SDS-PAGE. Bands of approximately 16 kDa were excised from blots of these gels onto polyvinylidene difluoride (PVDF) membranes for automated gas-phase N-terminal sequencing as described [9].

2.5 DNA sequencing

DNA sequencing, synthesis of primers for DNA sequencing and DNA sequence analysis of clones derived from E8775 were performed as described [10]. Plasmids were purified for use as templates by standard methods [8]. DNA sequencing and computer analysis of sequence data derived from the clones of E10703 were performed as described [11]. Appropriate oligonucleotide primers were supplied by Pharmacia.

2.6 Site-directed mutagenesis of cssB

pDEP5 was partially digested with 0.4 U of PstI for 30 min at 37°C. A linear fragment of 7.2 kb was extracted from agarose and the 3′ sticky ends were made blunt as described [8]. The fragment was recircularized with DNA ligase and transformed into E. coli PC2495. Transformants were selected on L agar supplemented with ampicillin. Since pBR322 has a PstI site in the β-lactamase gene, only pDEP5 derivatives mutated at the PstI site in cssB, and not at the site in the β-lactamase gene, can grow on ampicillin. Two mutants were found to have the sequence CCCGGT in the cssB gene instead of CCCTGCAGGT. The production of CS6 by these plasmids (pIVB3-736 and pIVB3-737) was determined by ELISA as described [7] and by Western blots as described above.
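The sequence change reported for the cssB mutants can be reproduced with a short sketch. It assumes the usual PstI cleavage pattern (CTGCA^G, leaving a 4-nt 3′ overhang) and models blunting as simple removal of that overhang before religation; the function name `blunt_and_religate_psti` is illustrative only and is not part of the original protocol.

```python
PSTI_SITE = "CTGCAG"  # PstI recognition sequence, cut as CTGCA^G


def blunt_and_religate_psti(seq: str) -> str:
    """Model cutting at the first PstI site, trimming the 3' overhangs, and religating.

    Assumption: blunting removes the 4-nt 3' overhang (TGCA) on each strand,
    so only the outer C and G of the site remain after ligation.
    """
    i = seq.find(PSTI_SITE)
    if i < 0:
        raise ValueError("no PstI site found")
    # Keep everything up to and including the first C of the site,
    # then resume at the final G of the site.
    return seq[: i + 1] + seq[i + 5:]


wild_type = "CCCTGCAGGT"  # sequence around the PstI site in cssB (from the text)
mutant = blunt_and_religate_psti(wild_type)
deleted = len(wild_type) - len(mutant)

print(mutant)            # CCCGGT
print(deleted)           # 4
print(deleted % 3 != 0)  # True: the deletion shifts the reading frame
```

A deletion of four bases is not a multiple of three, which is consistent with a frame shift in cssB.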

3 Results and discussion

A stable clone (M233), containing approximately 24 kb insert DNA, was obtained from a partial digest of the 61 MDa plasmid from E. coli M56. A subclone (M285) containing a 4.9 kb HindIII-KpnI fragment that expressed CS6 was obtained from M233. CS6 expression by pM285 in E. coli HB101 was demonstrated by Western blotting of heat, saline extracts (Fig. 1). CS6 was detected in heat, saline extracts of bacteria grown on various media (Table 1), but not after growth at 17°C. This temperature regulation of CS6 expression is characteristic of other CFs from ETEC [1, 2, 12] and virulence genes in a variety of pathogenic bacteria [13, 14]. However, CS6 expression after growth on a variety of media is in contrast to other CFs that are expressed only if bacteria are grown on CFA agar [12].

Figure 1

CS6 expression from heat, saline extracts of bacteria grown on L agar at 37°C. Lanes: 1, E8775; 2, HB101; 3, M56; 4, HB101 carrying the plasmid from clone M233; 5, HB101 carrying the plasmid from clone M285; 6, HB101/pDEP5. A: Coomassie blue stain. B: Western blot using anti-CS6 sera.

Table 1

Regulation of CS6 expression

Medium      Temperature   CS6 expression
CFA         37°C          +++
CFA         17°C          −
L agar      37°C          +++
MacConkey   37°C          +++

N-terminal amino acid sequencing of CS6 yielded two amino acids at each position (except position 12), indicating that two proteins were present (Fig. 2). From the strength of the two signals, primary and secondary sequences were assigned, with a molar ratio of primary sequence (CssA) to secondary sequence (CssB) of approximately 3:1.

Figure 2

N-terminal amino acid sequences of CS6. A single band of approximately 16 kDa was excised after SDS-PAGE and blotting to PVDF and subjected to Edman degradation for protein sequencing. Two amino acids were detected at each position (except cycle 12; − indicates that no amino acid was detected) and from the strength of the signals a primary and a secondary protein sequence were determined. These proteins are designated CssA and CssB. The protein sequences were compared to the sequences deduced from DNA sequencing. Amino acids within boxes are identical in the protein and DNA sequencing; circled amino acids are mismatches.

The DNA sequences of the CS6 operon from strains E8775 and E10703 are available in GenBank under accession numbers U04846 and U04844. A stretch of DNA of 4219 base pairs was 98% identical between the two clones and contained four ORFs, designated cssA, cssB, cssC and cssD (for CS six) (Fig. 3). The DNA sequences diverge abruptly on both sides of the common region. All four genes include a signal sequence typical for exported proteins. The GC content of the DNA is 34% and the codon usage is consistent with E. coli genes that are expressed at low or very low levels [15]. This is common for virulence-associated genes of E. coli [1, 11]. It has therefore been suggested that these genes may have originated in other bacterial species [1]. The four open reading frames are preceded by consensus sequences for binding RNA polymerase and ribosomes. The first ORF, cssA (Fig. 3), was identified as the gene for the primary protein based on the N-terminal amino acid sequence (Fig. 2). The CssA proteins from E8775 and from E10703 differ in 11 amino acid residues (Fig. 4). The CssA sequence of strain E3440A (O25:H-LT+), which coexpresses CFA/III fimbriae, is more similar to CssA from E10703 than to CssA from E8775 (Fig. 4). No homologous proteins were found in the databases. It is unclear whether these differences reflect antigenic variation.
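For readers who wish to check the base composition against the deposited sequences, GC content can be computed as sketched below. This is a minimal illustration and not the analysis software used in this study; the example fragment is hypothetical and is not taken from the operon.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of a DNA sequence."""
    counted = [b for b in seq.upper() if b in "ACGT"]  # ignore gaps and ambiguity codes
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for b in counted if b in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical AT-rich fragment, used only to show the call.
example = "ATGAAAAAGACAATTGGTTTAATTCTA"
print(f"GC content: {gc_content(example):.1f}%")
```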

Figure 3

Organization of the CS6 operon deduced from DNA sequencing from ETEC strains E8775 and E10703. A, B, C, D are genes cssA, cssB, cssC, and cssD. The second and fourth lines show restriction enzyme sites for BamHI (B), ClaI (C), EcoRI (E), EcoRV (R), HindIII (H), KpnI (K), PstI (P), PvuII (V), SacI (S), XbaI (X), and XhoI (O).

Figure 4

The deduced amino acid sequences of cssA (A) and cssB (B), the two CS6 structural proteins. Arrows indicate the site of cleavage of the signal sequence. DNA sequences of the entire operons are available in GenBank under accession numbers U04844 (E8775) and U04846 (E10703).

The cssB gene begins 17 bases downstream from cssA and was identified as the gene for the secondary protein based on the N-terminal amino acid sequence. The deduced mature proteins from both clones differ in five residues (Fig. 4). A region of dyad symmetry, which commonly acts as a transcription terminator, is present six bases downstream from cssB. The calculated free energy value of this structure is −14.8 kcal. Termination at this site would yield a transcript with cssA and cssB such that the CssA and CssB proteins would be translated in equal amounts. Other CF operons also have stem loops downstream of the gene for the major structural subunit. This is considered a regulatory mechanism for overexpression of subunit genes relative to other genes in the operons [16, 17]. For CS6, this would allow overexpression of both CssA and CssB. The occurrence of two major structural proteins is unusual because normally CFs have a single major subunit and a number of minor subunits [1, 13]. CS3 also has two major subunits, but in contrast to CssA and CssB, which show no identity, CS3 and CS3a are nearly identical [18].
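A region of dyad symmetry is an inverted repeat whose two arms can pair to form a stem loop. The sketch below scans a sequence for perfect inverted repeats; the stem and loop lengths are arbitrary assumptions, the test sequence is hypothetical, and no free energy is calculated, so it only illustrates the idea and is not the method used to obtain the values reported here.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def find_dyads(seq: str, stem: int = 6, min_loop: int = 3, max_loop: int = 8):
    """Return (start, stem_length, loop_length) for each perfect inverted repeat.

    Assumptions: both arms have exactly `stem` bases, pair perfectly,
    and are separated by min_loop..max_loop unpaired bases.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        right_arm = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the candidate right arm
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == right_arm:
                hits.append((i, stem, loop))
    return hits


# Hypothetical sequence: arm GCCCGC, loop TTTT, arm GCGGGC.
for start, stem_len, loop_len in find_dyads("AAGCCCGCTTTTGCGGGCTT"):
    print(start, stem_len, loop_len)
```

Candidate stem loops found in this way would still need a thermodynamic calculation before they could be compared with the free energy values given in the text.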

The cssC gene starts 48 bases downstream from cssB. The deduced proteins from both clones have 212 residues with seven differences (not shown, see GenBank database). CssC belongs to the chaperone protein family that functions by protecting the fimbrial subunits from proteolytic degradation and transporting them across the periplasm to the outer membrane [1, 19, 20]. The 3D structure of the chaperone protein for P-fimbriae is known and its important structural domains have been identified [13]. CssC conforms to the consensus of these domains.

The cssD gene begins 14 bases upstream of the end of cssC. The protein from E8775 is truncated by 8 residues relative to the protein from E10703 (not shown, see GenBank database) and there are 28 differences between the proteins. The deduced protein from cssD is homologous to molecular ushers located in the outer membrane [2, 13, 19, 20] that accept subunits from the chaperone and escort them to the bacterial surface. CssD and other usher proteins are approximately 30% identical and 50% similar [19]. Apparently the entire cssD gene is not necessary for CS6 expression, since CS6 is detected from clones carrying pDEP5, which contains only the N-terminal one-third of cssD. It is not known whether bacteria carrying pDEP5 produce CS6 in amounts comparable to bacteria carrying an intact usher, since our assay for CS6 expression is not quantitative. A region of dyad symmetry is present 347 bp into the cssD gene in both clones. The calculated free energy value of these structures is −7.2 kcal.
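Percent identity figures such as those quoted for the usher proteins depend on how the alignment columns are counted. The following sketch shows one common convention, assuming the two sequences have already been aligned with "-" as the gap character; the function name and the example peptides are hypothetical, and percent similarity, which requires a substitution matrix, is not covered.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of alignment columns,
    excluding columns where both sequences have a gap.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned peptides: 5 identical columns out of 7.
print(f"{percent_identity('MKKT-LA', 'MKRTALA'):.1f}% identical")
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so values computed in different ways are not directly comparable.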

The DNA sequences of the two clones diverge immediately downstream from cssD and 96 bases upstream of cssA. This area contains sequences homologous to five insertion sequences, but no complete insertion sequences (Fig. 5). The homology to IS91 in E10703 and Iso-IS1 in E8775 continues beyond the sequenced clones, and these may be complete insertion sequences in the native plasmids. Insertion sequences are often found surrounding virulence-associated genes in ETEC and it was suggested that the surrounding DNA contains ‘hot spots’ for insertion sequences [21]. Pieces of insertion sequences are found around the operons encoding CFs, LT and ST of human and animal ETEC [1, 11, 21-23]. It is not clear whether these sequences are remnants of transposition events that delivered these operons to ETEC at ‘hot spots’ that are especially susceptible to integration and loss of insertion sequences.

Figure 5

Organization of DNA flanking the CS6 operons. Regions with homology to insertion sequences are shown.

The structure of CS6 remains undefined. It may be a single fine fibrillar structure composed of CssA and CssB subunits, or CssA and CssB may comprise individual fine fibrillae, or CS6 may be afimbrial. In order to investigate the requirement for the CssB subunit for CS6 expression, the cssB gene in pDEP5 was mutated at its PstI site as described above. This results in a frame shift such that a truncated CssB protein of 52 residues is made. The mutant did not express CS6 as tested in an ELISA using monoclonal antibody (Anna Helander, personal communication) and in a Western blot using polyclonal rabbit antiserum. This suggests that both CssA and CssB are required for CS6 expression and implies that CssA and CssB are necessary components of one structure. Alternatively, the chaperone and usher may not be transcribed in the mutant, so that CssA protein is not properly assembled into its usual structure. Hoschützky et al. [24] suggest that nonfimbrial adhesins are in fact fibrillar-like polymers that cannot be resolved by electron microscopy, so that all adhesins associate into linear structures, but of varying diameters, some very thin. Following this idea, we favor the hypothesis that CS6 consists of very fine fibrillae composed of CssA and CssB subunits.

The characterization of the CS6 operon from two ETEC strains presented here yields information useful for constructing vaccines containing CS6 as well as probes that can be used in epidemiological studies.


This work was supported by the U.S. Army Medical Research and Development Command. The opinions or assertions contained herein are those of the authors. They are not to be construed as official or as reflecting the views of the Department of the Army or the Department of Defense. We thank Myron M. Levine for providing E8775 and many helpful suggestions and discussions. We thank Sonya Smith-Lewis and Els van Gestel for their help in DNA sequencing. We also thank J. Mark Carter and Carolyn Deal for N-terminal amino acid sequencing and Jeffrey Nauss for helpful discussions of protein structure.

