
Allelic subtyping of the intimin locus (eae) of pathogenic Escherichia coli by fluorescent RFLP

David W. Lacher, Hans Steinsland, Thomas S. Whittam
DOI: http://dx.doi.org/10.1111/j.1574-6968.2006.00328.x 80-87 First published online: 1 August 2006


Intimin is a highly polymorphic protein encoded by the eae gene and plays a crucial role in the attaching-effacing phenotype of diarrheagenic Escherichia coli and related pathogens. We have developed a method to quickly and accurately uncover allelic variation at the eae locus through the use of fluorescent RFLP (fRFLP). Application of fRFLP to 151 eae-positive strains (including the newly described Escherichia albertii) revealed 26 different fRFLP types that correspond to 20 of the 28 previously described eae alleles. Two sequence variants of the γ, ι, κ, and ζ alleles and three variants of ɛ were also observed. In addition to being reliable and accurate, the method can be easily adapted to accommodate new eae allelic sequences, as they become known.

  • intimin
  • genetic variation
  • enteropathogenic Escherichia coli
  • EPEC


Strains of attaching and effacing Escherichia coli (AEEC) are capable of intimately attaching to intestinal epithelial cells (Finlay et al., 1992; Kaper, 1998). Once the bacteria attach to the host cells, they manipulate the host cytoskeleton to form pedestal structures beneath the bacterial cells and, in the process, efface the microvilli of the intestinal mucosa (Hartland et al., 2000; Cleary et al., 2004). This process creates a characteristic intestinal histopathology, termed attaching effacing (AE) lesions. The ability of pathogenic E. coli to form AE lesions is encoded in a pathogenicity island referred to as the locus of enterocyte effacement or LEE (McDaniel et al., 1995; McDaniel & Kaper, 1997). The LEE island is ∼35 kb in length and can be inserted into one of several chromosomal locations in AEEC clonal groups (Wieler et al., 1997; Elliott et al., 1998; Perna et al., 1998; Sperandio et al., 1998). The LEE island comprises ∼40 genes including those that encode the structural components of a type III secretion system (TTSS) (Elliott et al., 1998; Perna et al., 1998). The TTSS acts as a molecular delivery system to translocate bacterial effector proteins, many of which are also LEE-encoded, into the host cell (Deng et al., 2005).

Intimin plays a crucial role in the AE phenotype and is encoded by eae, one of the most highly polymorphic genes of the LEE island (Castillo et al., 2005). Intimin is composed of six domains: a periplasmic domain, a transmembrane domain, three extracellular immunoglobulin-like domains, and an extracellular lectin-like domain (Luo et al., 2000). Much of the transmembrane domain is homologous to the invasins of pathogenic Yersinia and this part of the molecule has been termed the central conserved domain (McGraw et al., 1999). The extracellular domains of intimin interact with cellular receptors (Sinclair et al., 2006) including Tir, the translocated intimin receptor that is encoded in LEE and moves to the eukaryotic cell via the LEE-encoded TTSS (Kenny et al., 1997). An analysis of 27 intimin alleles from GenBank indicates that the four C-terminal extracellular domains contain more than 75% of the nucleotide variation present within those sequences (unpublished data). The AE strains represent a variety of enteric pathotypes of both humans and animals, including both typical and atypical enteropathogenic E. coli (EPEC), enterohemorrhagic E. coli (EHEC), Escherichia albertii, and Citrobacter rodentium (Donnenberg, 2002; Wales et al., 2005). Among the AEEC, there is a striking association of different eae alleles with specific clones or clonal groups of pathogenic E. coli (Wieler et al., 1997; Adu-Bobie et al., 1998; Tarr & Whittam, 2002). This association of eae alleles with pathogenic lineages has been used to help classify strains into pathotypes, so a rapid and reliable intimin typing scheme is a valuable tool.

Here we describe a method to identify allelic variants of the eae locus through the use of fluorescent RFLP (fRFLP; Lazzaro et al., 2002). In this method, the entire highly variable 3′ half of eae is amplified in a standard PCR reaction using primers that are located in the conserved central domain of eae and in the conserved downstream gene escD. The exact size of the amplicon depends on the specific eae allele, but is typically about 2 kb in length. The PCR amplicon is then digested with one or more restriction enzymes that leave a 5′ overhang that acts as a template for the incorporation of a fluorescent dye-terminator nucleotide from a standard cycle sequencing kit. The multiple labeled restriction fragments are then separated on a capillary-based sequencer and their sizes estimated to within a few base pairs in length. This fRFLP system provides a rapid method for uncovering allelic variation in eae and for classifying eae subtypes based on complex restriction digests.

Materials and methods

Bacterial strains and DNA isolation

The strains in this study included 144 AEEC representing 38 O-serogroups, many of which comprise EPEC serotypes, and seven strains from the Escherichia albertii and Shigella boydii 13 clonal lineage (Huys et al., 2003; Hyma et al., 2005). All 151 AE strains were grown overnight at 37°C in 10 mL of LB broth with moderate shaking. Genomic DNA was isolated using the Puregene DNA isolation kit (Gentra Systems Inc., Minneapolis, MN). DNA concentrations were determined using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies Inc., Rockland, DE). Working template concentrations of 50 ng μL−1 were used for PCR.

PCR and eae-escD primer design

PCR primers were designed in the central conserved domain of eae (eae-F1: 5′-ACT CCG ATT CCT CTG GTG AC-3′) and the conserved downstream gene escD (escD-R1: 5′-GTA TCA ACA TCT CCC GCC CA-3′) based on available sequences for the α, β, γ, ɛ, ζ, and θ intimin alleles. The eae-F1 and escD-R1 primers are located at positions 25986–26005 and 27918–27937, respectively, of the complete LEE sequence from strain E2348/69 (GenBank accession number AF022236). Each 25 μL reaction contained 2.5 μL 10 × buffer II (Applied Biosystems, Foster City, CA), 2.5 μL 2 mM dNTP, 2.0 μL 25 mM MgCl2, 0.5 μL 10 μM eae-F1 primer, 0.5 μL 10 μM escD-R1 primer, 1.5 U AmpliTaq Gold (Applied Biosystems), 1 μL 50 ng μL−1 genomic DNA template, and 15.7 μL ddH2O. Amplification utilized an initial denaturing step at 94°C for 10 min, followed by 35 cycles of 92°C for 1 min, 55°C for 1 min, and 72°C for 2 min. A final extension step of 72°C for 5 min was used to complete any partially extended product. PCR products (5 μL) were visualized on ethidium bromide-stained 0.8% agarose gels by illumination with UV light.
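As a quick consistency check on the primer coordinates given above, the expected amplicon length for the E2348/69 reference can be worked out directly. This is a minimal sketch: the primer sequences and positions are copied from the text, while the helper names (`span_length`, `amplicon_length`) are illustrative and not part of the published method.

```python
# Primer sequences and 1-based inclusive positions on the E2348/69 LEE
# sequence (AF022236), as stated in the text.
EAE_F1 = ("ACTCCGATTCCTCTGGTGAC", 25986, 26005)
ESCD_R1 = ("GTATCAACATCTCCCGCCCA", 27918, 27937)


def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive coordinate range."""
    return end - start + 1


def amplicon_length(forward, reverse) -> int:
    """Amplicon runs from the first base of the forward primer site
    to the last base of the reverse primer site."""
    return span_length(forward[1], reverse[2])


# Each 20-mer should occupy exactly the coordinate range quoted for it.
for seq, start, end in (EAE_F1, ESCD_R1):
    assert len(seq) == span_length(start, end)

print(amplicon_length(EAE_F1, ESCD_R1))  # 1952
```

The result, 1952 bp, agrees with the "about 2 kb" amplicon described for this locus; other alleles will differ in length.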

fRFLP development

In silico digestions were performed to find suitable restriction endonucleases for fRFLP. To be considered for fRFLP, the restriction enzymes had to (1) leave a 5′ overhang, (2) mainly produce fragments in the range of the labeled size standard (60–640 bp), and (3) produce fragments that could be labeled with fluorescent tags that were different from that of the size standard. Out of 532 restriction enzymes tested in silico, MseI was chosen for fRFLP because it meets all of the above criteria and was found to have the greatest power to discriminate known eae alleles (Figs 1 and 2).
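The in silico screening step can be illustrated with a small digest routine. This is only a sketch of the general idea, not the authors' software: the function names are hypothetical, the test sequence is an artificial toy string rather than an eae amplicon, and only the MseI site (T^TAA) and the 60–640 bp window are taken from the text and standard enzyme data.

```python
import re


def digest(seq: str, site: str, cut: int) -> list:
    """Fragment lengths after cutting `seq` at every occurrence of `site`.

    `cut` is the number of site bases that stay on the upstream fragment
    (1 for MseI, T^TAA). A lookahead is used so overlapping sites are found.
    """
    cuts = [m.start() + cut for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def scorable(fragments, low: int = 60, high: int = 640) -> list:
    """Keep fragments that fall inside the size-standard range."""
    return [f for f in fragments if low <= f <= high]


# Toy sequence (hypothetical): two MseI sites separated by filler bases.
toy = "G" * 70 + "TTAA" + "C" * 100 + "TTAA" + "G" * 20

fragments = digest(toy, "TTAA", 1)
print(fragments)            # [71, 104, 23]
print(scorable(fragments))  # [71, 104]
```

Running the same routine over each candidate enzyme and each known allele sequence, then comparing the resulting fragment lists, is one way the discriminating power of an enzyme could be ranked.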

Figure 1

Location of PCR primers and predicted MseI restriction sites for four major intimin alleles associated with human disease. The forward primer (eae-F1) is located in the central conserved domain of eae and the reverse primer (escD-R1) is located in the conserved downstream gene escD. The resulting amplicon is ∼1800–2100 bp depending on the allele.

Figure 2

Fragment sizes of DNA resulting from MseI digestion of the eae-escD amplicon based on in silico analysis of known sequence of the eae locus. The number of restriction fragments ranges from 16 to 22 with an average of 19.1 fragments per amplicon. Two additional patterns were resolved for lane 5 (γ1.1 and γ1.2) and lane 7 (ɛ1.2 and ɛ1.3) by AseI/DdeI/SalI restriction digestions. The gray line denotes the 60 nt length that is the lower limit of fragment size determined by fRFLP.

To confirm the eae allele assignments based on the MseI findings, a second restriction digestion was designed. Multiple enzymes were needed to give a similar amount of pattern variation as seen with MseI. For simplicity, restriction enzymes were selected that use the same reaction conditions (reaction buffer, incubation temperature, and labeled ddNTP). Three sets of triple-enzyme digests were found. The combination of AseI, DdeI, and SalI was chosen over the others because of its lower enzyme cost and because it uses the same ddNTP as the MseI digest. The two digests (MseI and AseI/DdeI/SalI) were used to subtype the intimin alleles from a diverse set of 151 AE strains.
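Why all four enzymes can share one labeled ddNTP follows from their cut sites. The sketch below assumes the standard recognition sequences (MseI T^TAA, AseI AT^TAAT, DdeI C^TNAG, SalI G^TCGAC), which are not spelled out in the text, and uses hypothetical helper names. It derives the 5′ overhang and the first nucleotide that a polymerase would add to the recessed 3′ end.

```python
# Recognition site (top strand, 5'->3') and the number of site bases that
# remain on the upstream fragment after the top-strand cut.
# Cut positions are standard enzyme data, assumed here rather than quoted.
ENZYMES = {
    "MseI": ("TTAA", 1),    # T^TAA
    "AseI": ("ATTAAT", 2),  # AT^TAAT
    "DdeI": ("CTNAG", 1),   # C^TNAG
    "SalI": ("GTCGAC", 1),  # G^TCGAC
}


def overhang(site: str, cut: int) -> str:
    """Single-stranded 5' overhang left by a symmetric staggered cut."""
    return site[cut:len(site) - cut]


def first_fill_in_base(site: str, cut: int) -> str:
    """First base added to the recessed 3' end of the upstream top strand.

    That strand ends just before the cut, so fill-in restores the base the
    top strand originally carried at index `cut`.
    """
    return site[cut]


for name, (site, cut) in ENZYMES.items():
    print(name, overhang(site, cut), first_fill_in_base(site, cut))
# MseI TA T
# AseI TA T
# DdeI TNA T
# SalI TCGA T
```

Every enzyme gives T as the first templated base, which is consistent with a single dye-labeled ddUTP (the T analog in the kit) serving both the MseI and the triple digest.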

PCR clean-up and restriction enzyme digestion

Samples that were positive for the eae-escD PCR were treated with ExoSAP-IT (USB Corporation, Cleveland, OH) to remove unincorporated dNTPs and PCR primers (5 μL of PCR product and 2 μL of ExoSAP-IT). All restriction endonucleases, reaction buffers, and bovine serum albumin (BSA) solutions were obtained from New England Biolabs Inc. (Ipswich, MA). MseI digests were set up so that each 30 μL reaction contained 7.0 μL ExoSAP-treated PCR product, 0.3 μL 100 × BSA, 0.5 μL 10 U μL−1 MseI, 3.0 μL 10 × NEB2 buffer, and 19.2 μL ddH2O. AseI/DdeI/SalI digests were set up so that each 30 μL reaction contained 7.0 μL ExoSAP-treated PCR product, 0.3 μL 100 × BSA, 0.5 μL 10 U μL−1 AseI, 0.5 μL 10 U μL−1 DdeI, 0.25 μL 20 U μL−1 SalI, 3.0 μL 10 × NEB3 buffer, and 18.45 μL ddH2O. Reactions were incubated overnight at 37°C.
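The two digest recipes can be checked and scaled with a few lines of bookkeeping. This is a convenience sketch only: the volumes are those listed above, while the `master_mix` helper and the choice of eight reactions are illustrative assumptions, not part of the protocol.

```python
from decimal import Decimal as D

# Per-reaction volumes in microlitres, copied from the text.
TEMPLATE = "ExoSAP-treated PCR product"
MSEI_DIGEST = {
    TEMPLATE: D("7.0"),
    "100x BSA": D("0.3"),
    "MseI (10 U/uL)": D("0.5"),
    "10x NEB2 buffer": D("3.0"),
    "ddH2O": D("19.2"),
}
TRIPLE_DIGEST = {
    TEMPLATE: D("7.0"),
    "100x BSA": D("0.3"),
    "AseI (10 U/uL)": D("0.5"),
    "DdeI (10 U/uL)": D("0.5"),
    "SalI (20 U/uL)": D("0.25"),
    "10x NEB3 buffer": D("3.0"),
    "ddH2O": D("18.45"),
}


def total_volume(recipe) -> D:
    """Sum of all component volumes for one reaction."""
    return sum(recipe.values())


def master_mix(recipe, n_reactions: int) -> dict:
    """Pooled volumes of the shared components for n reactions.

    The template is left out because it is added to each tube separately.
    """
    return {k: v * n_reactions for k, v in recipe.items() if k != TEMPLATE}


# Both recipes should add up to the stated 30 uL reaction volume.
assert total_volume(MSEI_DIGEST) == D("30")
assert total_volume(TRIPLE_DIGEST) == D("30")

print(master_mix(TRIPLE_DIGEST, 8)["ddH2O"])  # 147.60
```

Decimal arithmetic is used so that values such as 18.45 sum exactly, which plain floating point does not guarantee.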


fRFLP was performed using the CEQ DTCS standard kit (Beckman Coulter Inc., Fullerton, CA). Each reaction contained 2.0 μL of unpurified restriction enzyme digest, 1.5 μL 10 × reaction buffer, 0.1 μL ddUTP, 0.1 μL Taq DNA polymerase, and 11.3 μL ddH2O. Samples were incubated at 60°C for 1 h, purified with Sephadex G-50 Fine columns (Amersham Pharmacia Biotech Inc., Piscataway, NJ), dried under vacuum centrifugation (Savant Instruments Inc., Holbrook, NY), and suspended in 10 μL of deionized formamide. Of this, 2 μL were mixed with 0.6 μL CEQ DNA Size Standard 600 (Beckman Coulter Inc., Fullerton, CA) and 39.4 μL deionized formamide, and run on a CEQ2000XL (Beckman Coulter Inc.) using a capillary temperature of 50°C, a denature step at 90°C for 2 min, injection at 2.0 kV for 30 s, and separation at 4.8 kV for 65 min (MseI digest) or 90 min (AseI/DdeI/SalI digest). Fragment sizes were determined with the CEQ2000XL software, version 4.3.9.

DNA sequencing

The complete eae gene was sequenced in at least one representative strain for each fRFLP pattern identified. The 5′ half of eae was amplified using primers cesT-F9 (5′-TCA GGG AAT AAC ATT AGA AA-3′) and eae-R3 (5′-TCT TGT GCG CTT TGG CTT-3′) using the same PCR conditions described above. PCR products were purified using the QIAquick PCR purification kit (QIAGEN Inc., Valencia, CA) and quantified with a NanoDrop ND-1000 spectrophotometer. Cycle sequencing reactions contained 6.0 μL CEQ DTCS Quick Start premix (Beckman Coulter Inc.), 1.5 μL 20 μM primer, c. 180 ng of cesT/eae product or 250 ng of eae/escD product, and ddH2O to 15 μL. Amplification utilized an initial denaturing step at 94°C for 1 min, followed by 35 cycles of 96°C for 30 s, 52°C for 30 s, and 60°C for 2 min. Upon completion of cycle sequencing, samples were purified with Sephadex G-50 Fine columns, dried under vacuum centrifugation, suspended in 40 μL of deionized formamide, and sequenced using a Beckman CEQ2000XL DNA sequencer. Samples were analyzed using the CEQ2000XL software and then exported for further analysis with the SeqMan and MegAlign modules of the Lasergene software (DNASTAR Inc., Madison, WI). Internal sequencing primers were designed as new sequence data were generated.

Results

The ability of the eae fRFLP method to resolve intimin alleles was first tested on 53 strains based on the expected patterns from the in silico digestion of available eae sequences from GenBank. This initial study included 34 AE strains, most of which had previously been known to have the α, β, or γ intimin alleles, most often found in strains associated with human infection (Table 1). Nineteen additional AE strains were selected and tested by the fRFLP method because their intimin allele had been determined by other researchers or by previous work in our laboratory based on DNA sequencing or conventional RFLP (Table 2). The intimin alleles of all 53 strains were confirmed by both the MseI and AseI/DdeI/SalI digests.

View this table:
Table 1

Allelic variation in eae based on fRFLP

Accession number | Strain name | Serotype | eae fRFLP pattern | Source and pathotype (reference)
TW00588 | DEC 1a (572-56) | O55:H6 | α 1.1 | Human EPEC (Rodrigues et al., 1996)
TW07884 | E851/71 | O142:H6 | α 1.1 | Human EPEC (Cravioto et al., 1979)
TW07923 | RN587/1 | O157:[h45] | α 1.1 | Human EPEC (Blank et al., 2003)
TW04262 | TB269C | O145:[h34] | α 2.1 | Human atypical EPEC (Bokete et al., 1997)
TW01120 | B170 | O111:[h2] | β 1.1 | Human EPEC (Moyenuddin et al., 1989)
TW05355 | 13180-25 | O111:H11 | β 1.1 | EHEC from food (Feng et al., 1998)
TW00148 | 3448-87 | O114:H2 | β 1.1 | Human EPEC
TW00389 | 29315 (AC-C12) | O119:H2 | β 1.1 | Human EPEC (Goncalves et al., 1997)
TW07099 | LT119-80 | O119:H2 | β 1.1 | Human EPEC (Goncalves et al., 1997)
TW01266 | C342-62 | O126:H2 | β 1.1 | Human EPEC (Orskov et al., 1990)
TW07896 | E56/54 | O128:H2 | β 1.1 | Human EPEC (Robins-Browne et al., 1993)
TW01664 | DEC 10i (87-1713) | O145:H16 | β 1.1 | Human EHEC
TW05149 | BCL73 | O145:[h-] | β 1.1 | Bovine STEC
TW07860 | 314-S | O145:[h16] | β 1.1 | Bovine STEC
TW08894 | 02-3422 | O145:[h2] | β 1.1 | Rabbit EPEC (Garcia et al., 2002)
TW09153 | IH 16 | O145:[h-] | β 1.1 | Human STEC
TW07924 | Z188-93 | O110:H6 | β 2.1 | Avian EPEC (Blank et al., 2000)
TW01225 | 1396/69 | O119:H6 | β 2.1 | Human EPEC (Cravioto et al., 1979)
TW03293 | ECOR-37 | O-:[h7] | γ 1.1 | Marmoset atypical EPEC (McGraw et al., 1999)
TW00947 | DEC 5d (C586-65) | O55:H7 | γ 1.1 | Human atypical EPEC (McGraw et al., 1999)
TW03064 | B6820-C1 | O145:[h28] | γ 1.1 | Bovine STEC
TW07596 | GS G5578620 | O145:[h28] | γ 1.1 | Human STEC (Fey et al., 2000)
TW07865 | IHIT0304 | O145:H28 | γ 1.1 | Bovine STEC
TW08087 | MT#66 | O145:[h28] | γ 1.1 | Human STEC (Jelacic et al., 2003)
TW09356 | 4865/96 | O145:[h28] | γ 1.1 | Human STEC (Unkmeir & Schmidt, 2000)
TW08881 | 3556-77 | Boydii 13 | γ 1.2 | Human atypical B13 (Hyma et al., 2005)
TW08887 | 3557-77 | Boydii 13 | γ 1.2 | Human atypical B13 (Hyma et al., 2005)
TW08889 | 3053-94 | Boydii 13 | γ 1.2 | Human atypical B13 (Hyma et al., 2005)
TW07618 | 98ST607 | O110:H28 | ζ 1.1 | Human STEC
TW00964 | 75-83 | O145:[h25] | ζ 1.1 | Human STEC (Bopp et al., 1987)
TW05307 | LTO55-43 | O55:H7 | θ 1.1 | Human atypical EPEC (Rodrigues et al., 1996)
TW07960 | DA-34 | O103:[h25] | θ 1.1 | Human STEC
TW00970 | DEC 8b (3030A-86) | O111:H8 | θ 1.1 | Human EHEC (Tarr & Whittam, 2002)
TW07888 | 010-311082 | O76:H51 | μ 1.1 | Human EPEC (Blank et al., 2000)
  • * Lower case H-types in brackets were inferred from fliC allele determination.

  • fRFLP, fluorescent restriction fragment length polymorphism.

View this table:
Table 2

 Reference strains for defined intimin alleles and patterns determined by fRFLP and confirmed by DNA sequencing

Pattern no. | Intimin allele | eae fRFLP pattern | Reference strain | Species and serotype | GenBank number (reference)
1 | α | α 1.1 | TW06375 (E2348/69) | E. coli O127:H6 | AF022236 (Elliott et al., 1998)
2 | α2 | α 2.1 | TW01270 (C712-65) | E. coli O125:H6 | DQ523600 (this study)
3 | β | β 1.1 | TW07862 (413/89-1) | E. coli O26:[h11] | AJ277443
4 | β2 | β 2.1 | TW07894 (0659-79) | E. coli O119:H6 | DQ523605 (this study)
5 | γ | γ 1.1 | TW08264 (Sakai) | E. coli O157:H7 | NC_002695 (Hayashi et al., 2001)
6 | γ | γ 1.2 | TW08888 (3092-94) | S. boydii type 13 | DQ523608 (this study)
7 | ɛ | ɛ 1.1 | TW08101 (MT#80) | E. coli O103:H2 | DQ523606 (this study)
8 | ɛ | ɛ 1.2 | TW08023 (MT#2) | E. coli O121:H19 | AY186750 (Tarr et al., 2002)
9 | ɛ | ɛ 1.3 | TW10363 (83F4) | E. coli O-:[h8] | DQ523612 (this study)
10 | ɛ2 | ɛ 2.1 | TW10371 (98B3) | E. coli O116:[h9] | DQ523614 (this study)
11 | ζ | ζ 1.1 | TW07863 (537/89) | E. coli O84:[h2] | AJ298279 (Jores et al., 2003)
12 | ζ | ζ 1.2 | TW04892 (921) | E. coli O111:H9 | AF449417 (Tarr & Whittam, 2002)
13 | η | η 1.1 | TW07892 (012-050982) | E. coli O142:[h21] | DQ523604 (this study)
14 | θ | θ 1.1 | TW01387 (CL-37) | E. coli O111:H8 | AF449418 (Tarr & Whittam, 2002)
15 | ι | ι 1.1 | TW01933 (1252-59) | E. coli O55:[h34] | DQ523601 (this study)
16 | ι | ι 1.2 | TW04174 (TB227C) | E. coli O86:[h8] | DQ523602 (this study)
17 | ι2 | ι 2.1 | TW08839 (C-425) | S. boydii type 13 | AY696842 (Hyma et al., 2005)
18 | κ | κ 1.1 | TW06584 (C295-53) | E. coli O86:H34 | DQ523603 (this study)
19 | κ | κ 1.2 | TW10337 (64B4) | E. coli O49:[h10] | DQ523611 (this study)
20 | λ | λ 1.1 | TW10327 (57A1) | E. coli O33:[h34] | DQ523609 (this study)
21 | μ | μ 1.1 | TW08260 (MA551/1) | E. coli O55:[h51] | DQ523607 (this study)
22 | ν | ν 1.1 | TW10376 (106A5) | E. albertii | DQ523615 (this study)
23 | ξ | ξ 1.1 | TW10334 (60A3) | E. coli O5:[h2] | DQ523610 (this study)
24 | o | o 1.1 | TW07627 (Albert 19982) | E. albertii | AY696838 (Hyma et al., 2005)
25 | ρ | ρ 1.1 | TW10366 (93I4) | E. coli O21:[h5] | DQ523613 (this study)
26 | τ | τ 1.1 | TW08933 (K-1) | S. boydii type 7 | AY696839 (Hyma et al., 2005)
  • * Lower case H-types in brackets were inferred from fliC allele determination.

  • fRFLP pattern discovered in AEEC strains from Guinea-Bissau study (Valentiner-Branth et al., 2003).

  • Variants of an intimin allele that are indistinguishable by fRFLP with the MseI digest.

  • § Variants of an intimin allele that are indistinguishable by fRFLP with the AseI/DdeI/SalI triple digest.

  • Benkel P. and Chakraborty T. (unpublished).

  • fRFLP, fluorescent restriction fragment length polymorphism; E. coli, Escherichia coli; S. boydii, Shigella boydii; E. albertii, Escherichia albertii; AEEC, attaching and effacing E. coli.

To further evaluate the new method, we then tested 98 eae-positive strains for which no allelic subtyping data existed. These strains were originally recovered from two separate populations: a pediatric population in Seattle, Washington (Denno et al., 2005) and a cohort study of childhood diarrheal disease in Guinea-Bissau, West Africa (Valentiner-Branth et al., 2003). Among these 98 eae-positive strains, most strains (85%) exhibited known digestion patterns with both MseI and the triple digest, and therefore could be easily subtyped to an allele. The 15 strains for which the eae allele could not be determined had one of seven previously unobserved digestion patterns. A representative of each pattern was sequenced and the alleles were either previously described or variants of previously described alleles (ɛ1.3, ɛ2.1, κ1.2, λ1.1, ν1.1, ρ1.1, and ξ1.1).

Among the 151 strains examined, there were a total of 24 different fRFLP patterns observed for the MseI digests (Fig. 2). The AseI/DdeI/SalI digest also resolved 24 distinct fRFLP patterns (data not shown). Two pairs of variants, γ1.1/γ1.2 and ɛ1.2/ɛ1.3, are resolved by the AseI/DdeI/SalI digest but are indistinguishable with MseI digestion. Conversely, two pairs that are resolved by MseI (ι1.1/ι1.2 and ζ1.1/ζ1.2) are indistinguishable with the triple digest. Combined, there are 26 distinct fRFLP patterns (see Table S1). In all, the new fRFLP method identified 20 of the 28 previously described alleles and differentiated two variants each of the γ, ι, κ, and ζ alleles and three variants of ɛ.
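The pattern counts can be reproduced from the 26 patterns listed in Table 2 together with the pairs that each digest cannot separate (γ1.1/γ1.2 and ɛ1.2/ɛ1.3 for MseI, per the Fig. 2 legend; ι1.1/ι1.2 and ζ1.1/ζ1.2 for the triple digest). The sketch below is bookkeeping only and the function name is hypothetical.

```python
# The 26 fRFLP patterns named in Table 2.
PATTERNS = [
    "α1.1", "α2.1", "β1.1", "β2.1", "γ1.1", "γ1.2", "ɛ1.1", "ɛ1.2", "ɛ1.3",
    "ɛ2.1", "ζ1.1", "ζ1.2", "η1.1", "θ1.1", "ι1.1", "ι1.2", "ι2.1", "κ1.1",
    "κ1.2", "λ1.1", "μ1.1", "ν1.1", "ξ1.1", "o1.1", "ρ1.1", "τ1.1",
]

# Pairs that a given digest cannot tell apart.
MSEI_TIES = [("γ1.1", "γ1.2"), ("ɛ1.2", "ɛ1.3")]
TRIPLE_TIES = [("ι1.1", "ι1.2"), ("ζ1.1", "ζ1.2")]


def digest_classes(patterns, ties):
    """Map each pattern to the label of the profile it shows in one digest."""
    alias = {second: first for first, second in ties}
    return {p: alias.get(p, p) for p in patterns}


msei = digest_classes(PATTERNS, MSEI_TIES)
triple = digest_classes(PATTERNS, TRIPLE_TIES)

# A composite pattern is the pair (MseI profile, triple-digest profile).
combined = {(msei[p], triple[p]) for p in PATTERNS}

print(len(set(msei.values())), len(set(triple.values())), len(combined))
# 24 24 26
```

Because the two digests fail on different pairs, the composite of both profiles separates all 26 patterns even though each digest alone resolves only 24.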

To gauge the accuracy of the fragment size estimation, we compared the expected fragment sizes based on in silico digestions to those observed from the capillary sequencer. Overall the fragment scoring was accurate, with most fragments over 100 nt in length differing by less than 2% from their expected sizes (Fig. 3). Examination of the plot reveals that for fragments shorter than ∼100 nt or longer than ∼550 nt, the estimated fragment size deviates more from the expected size (Fig. 3). For the 83 fragments observed in the MseI digest, the average deviation from expected size is 1.76%. The triple digest was more accurate, with an average deviation of 1.01% across the 70 distinct fragments.

Figure 3

Accuracy of fragment sizes estimated by fRFLP. The percentage observed difference from the expected fragment size is plotted against the expected size. The lines mark the percentage deviation for 1, 2, and 5 nt, respectively.
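The quantities plotted in Fig. 3 can be written out explicitly. The sketch below shows the percentage deviation of an observed size from its expected size, and the curve that a fixed error of 1, 2, or 5 nt traces as fragment size changes. Function names and the example sizes are hypothetical, not measurements from the study.

```python
def percent_deviation(expected: float, observed: float) -> float:
    """Absolute size difference as a percentage of the expected size."""
    return 100.0 * abs(observed - expected) / expected


def guide_line(nt_error: int, expected: float) -> float:
    """Percentage deviation that a fixed error of `nt_error` nt represents."""
    return 100.0 * nt_error / expected


def mean_deviation(pairs) -> float:
    """Average percentage deviation over (expected, observed) pairs."""
    values = [percent_deviation(e, o) for e, o in pairs]
    return sum(values) / len(values)


# Hypothetical example: a 200 nt fragment scored as 202 nt.
print(percent_deviation(200, 202))  # 1.0

# The same 2 nt error weighs more heavily on a short fragment.
print(guide_line(2, 100), guide_line(2, 400))  # 2.0 0.5
```

This makes clear why a constant error of a few nucleotides appears as a larger percentage deviation for short fragments than for long ones.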

Discussion

Current eae typing schemes either focus on allele-specific PCR amplification (Adu-Bobie et al., 1998; Reid et al., 1999; Oswald et al., 2000; Zhang et al., 2002; Blanco et al., 2003; Jores et al., 2003; Orden et al., 2003) or conventional RFLP analysis (Schmidt et al., 1993; Oswald et al., 2000; Jenkins et al., 2003; Ramachandran et al., 2003). Sequence analysis of intimin alleles has revealed that many eae alleles are mosaics with segments having different evolutionary histories (McGraw et al., 1999; Tarr & Whittam, 2002). Therefore, allele-specific PCR amplification can lead to erroneous typing results. For example, the μ allele would be erroneously typed as γ by the Reid method (Reid et al., 1999) because the γ allele primer is located in a region that is shared between these two alleles (data not shown). In addition, as new alleles are discovered, new primers are necessary to amplify these alleles and the subtyping scheme becomes more complex.

Conventional RFLP analysis also has its limitations. The 5′ half of eae is relatively conserved among the different alleles, so there may not be sufficient variation in the amplicons to accurately and reliably differentiate the alleles. For example, EPEC strain 1396/69 (O119:H6) was typed by Jenkins et al. (2003) as possessing the γ allele of eae instead of the β2 allele. In silico analysis revealed that the β2, γ, and λ alleles of eae are virtually indistinguishable under the Jenkins system. Another limitation of conventional RFLP is that a system based on the highly variable 3′ half of the gene may be difficult to score since small differences in the banding patterns of different alleles may not be easily discernible under standard electrophoretic conditions.

The fRFLP method described here addresses many of the limitations of the existing eae typing methods. A drawback of the devised method, however, is that it requires the location of escD relative to eae to be conserved. If this location changes such that escD is either upstream or far downstream of eae, PCR amplification will not occur and the strain will be nontypeable by this method. However, most of the known alleles for eae have been amplified and observed, so this situation does not appear to be common. We have tested the new subtyping method against a diverse panel of 87 eae-positive isolates originally recovered from children in West Africa and also uncovered seven additional variants. The remaining alleles (β3, ɛ3, ɛ4, η2, π, σ) still need to be tested. Another limitation is that some of the eae alleles in GenBank do not contain their associated downstream escD sequence, so the expected amplicon size and fRFLP profile cannot be fully determined in all cases. When new fRFLP patterns are observed, eae needs to be completely sequenced to verify the allele, but then the new pattern can be added to the fRFLP database for future reference.
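The lookup against a pattern database described above might be sketched as follows. Everything here is an assumption made for illustration: the 3 nt tolerance, the matching rule, the function names, and the database entries (placeholder names and invented sizes, not real eae profiles).

```python
def matches(observed, reference, tol: int = 3) -> bool:
    """True when both profiles have the same number of fragments and each
    size-ordered pair differs by at most `tol` nucleotides."""
    if len(observed) != len(reference):
        return False
    pairs = zip(sorted(observed), sorted(reference))
    return all(abs(o - r) <= tol for o, r in pairs)


def classify(observed, database, tol: int = 3) -> list:
    """Names of all reference patterns that the observed profile matches.

    An empty list signals a previously unobserved pattern, in which case
    the allele would need to be sequenced and then added to the database.
    """
    return [name for name, ref in database.items()
            if matches(observed, ref, tol)]


# Hypothetical reference profiles (fragment sizes in nt).
DATABASE = {
    "pattern_A": [100, 250, 400],
    "pattern_B": [100, 310, 400],
}

print(classify([101, 250, 399], DATABASE))  # ['pattern_A']
print(classify([150, 300], DATABASE))       # []

# After sequencing confirms a new pattern, it can be registered for reuse.
DATABASE["pattern_C"] = [150, 300]
print(classify([150, 300], DATABASE))       # ['pattern_C']
```

A real implementation would also need to handle fragments outside the scorable size range and would combine the results of both digests before assigning an allele.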

The fRFLP method of eae allelic subtyping is easy to perform, cost-effective, and sensitive. The principal advantages are that it makes use of existing technology that is widely available, and can reveal previously unknown allelic variants. In the future, we expect more sensitive, high throughput methods for detecting nucleotide polymorphisms, such as those based on liquid microsphere suspension (Gilmour et al., 2006), will provide quantitative technologies for rapid allelic identification of intimin and other crucial virulence factors.


This project has been funded in part with Federal funds from the NIAID, NIH, DHHS, under NIH Research Contract # N01-AI-30058.

