
Comparison of partial tuf gene sequences for the identification of lactobacilli

Frederic Chavagnat, Monika Haueter, Juan Jimeno, Michael G. Casey
DOI: http://dx.doi.org/10.1111/j.1574-6968.2002.tb11472.x
Pages 177-183. First published online: 1 December 2002


Comparative analysis of partial tuf sequences was evaluated for the identification and differentiation of lactobacilli. Comparison of the amino acid sequences allowed differentiation between species and also between the subspecies of Lactobacillus delbrueckii. The nucleotide sequence comparison allowed differentiation between other subspecies and between some strains. Lactobacilli from several collections and isolates from dairy samples were clearly identified by comparison of short tuf sequences with those of the type strains. In evaluating the taxonomy of the Lactobacillus casei-related taxa, different tuf amino acid signatures are in favour of a classification into three distinct species. The type strain designation for the L. casei species is discussed.

  • Lactobacillus
  • tuf
  • Identification
  • Taxonomy

1 Introduction

Lactobacilli are very important in the food industry since they are used as starter cultures for fermentation processes and as probiotics in fermented milks or yoghurts. Therefore, precise identification of lactobacilli to the species level is required. Their classification into three groups based on metabolism and physiological characteristics [1] is not in agreement with the results of the phylogenetic studies by sequencing of 16S rRNA [2]. Schleifer et al. [3] reviewed the phylogeny of the genus Lactobacillus based on the 16S rRNA sequences but did not clarify the taxonomy of the L. casei group or the L. delbrueckii subspecies which are of special importance for the dairy industry. Recently, several systematic studies of lactobacilli have been reported [4–6]. Investigations by many authors using several molecular approaches focused on closely related species such as the L. casei-related taxa [7–14], the L. acidophilus group [13,15–18], the L. delbrueckii subspecies [19–22] and the L. plantarum-related species [23]. Nevertheless, the discrimination of very closely related species, and especially of subspecies, is often critical. Thus the sequencing of several other genes has been used for the discrimination of lactobacilli [24–26].

The tuf gene has been used as a target gene for phylogenetic studies [27]. This gene encodes the elongation factor Tu, involved in protein biosynthesis, which facilitates the elongation of polypeptides from the ribosome and aminoacyl-tRNA during translation. It is universally distributed and in most Gram-positive bacteria only one tuf gene per genome has been found [28], thus it is ideally suited for phylogenetic studies. To discriminate between very closely related species and also between subspecies, DNA sequencing furnishes the greatest sensitivity and accuracy. In this paper, we describe the partial sequencing of the tuf gene from lactobacilli and the comparative analysis of the resulting sequences: the phylogeny of the Lactobacillus genus is investigated, the accurate identification of lactobacillus strains is accomplished using the sequences from the type strains to construct a reference database, and the classification of the L. casei-related taxa is discussed.

2 Materials and methods

2.1 Strains and media

Ninety-six bacterial strains from various collections were used in this study (Table 1). They were cultured in 10 ml of MRS [29] broth anaerobically for 24 h at either 30 or 37°C.

Table 1

Strains of lactobacilli selected for this study

Species | Collection number
L. acidophilus | ATCC 4355, ATCC 314, ATCC 9224, ATCC 832, ATCC 11975, ATCC 4357, CIP 103601, CIP 103600
L. amylovorus | ATCC 33198, CIP 103610
L. casei | ATCC 334, ATCC 4646, NCDO 173
L. caseiᵀ or zeae | ATCC 393, NCDO 161, JCM 1134
L. crispatus | ATCC 33197, CIP 103602, CIP 103603, CIP 103604, CIP 103606, CIP 103608, CIP 103605, DSM 20356, NCIMB 702172
L. gallinarum | CIP 103650, CIP 103612
L. gasseri | ATCC 4963, ATCC 9857, ATCC 19992, ATCC 29601, CIP 103613, CIP 103614, CIP 103615, CIP 103616, CIP 103617, CIP 103618, CIP 103619, CIP 103784, CIP 103785, CIP 103786, NCIMB 8819
L. johnsonii | ATCC 332, ATCC 11506, CIP 103654, CIP 103781, CIP 103782, DSM 20553
L. paracasei subsp. paracasei | ATCC 335, ATCC 11582, ATCC 25180, ATCC 25302, ATCC 25303, ATCC 25598, ATCC 27092, ATCC 27216, ATCC 29599, NCDO 20006, NCDO 20207, NCFB 205, NCFB 206
L. paracasei subsp. tolerans | ATCC 25599, NCDO 20012
L. pentosus | FAM 1282, FAM 6408, FAM 1280, FAM 4023
L. plantarum | ATCC 8014, ATCC 10012, ATCC 10241, NCDO 1193, NCFB 963, NCFB 965, NCFB 1042, NCFB 1204, NCFB 1206, NCFB 1988, NCFB 2171
L. rhamnosus | ATCC 11981, ATCC 11443, ATCC 7469
L. zeaeᵀ | ATCC 15820
NI | NCIMB 4504, NCIMB 8821, NCIMB 701360, NCIMB 701417, NCIMB 702173, NCIMB 702174, NCIMB 702470, NCIMB 702471, NCIMB 702472, NCIMB 702473, NCIMB 702658, NCIMB 702659, NCIMB 702660, NCIMB 702661, NCIMB 702662, NCIMB 702663, NCIMB 702664, NCIMB 702665
  • NI: clear identification currently not available.

2.2 DNA extraction

Genomic DNA was prepared from 1 ml of stationary-phase cultures. Bacterial cell lysis was performed by incubation first in 0.05 N NaOH for 15 min at room temperature, then in TES buffer (0.1 M Tris–HCl, 10 mM EDTA, 25% saccharose, pH 8.0) with 1 mg ml−1 lysozyme and 100 U ml−1 mutanolysin for 1 h at 37°C. Genomic DNA was purified using the High Pure PCR Template Preparation Kit (Roche) and resuspended in 200 µl final volume.

2.3 PCR amplification

Amplifications of 25 µl were performed using 1 µl of purified DNA solution with the AmpliTaq Gold PCR Master Mix (Roche). After pre-incubation at 95°C for 8 min, amplifications were carried out in a GeneAmp PCR system 2400 (Applied Biosystems) for 35 cycles, each with 30 s denaturation at 95°C, 1 min annealing at 55°C and 30 s extension at 72°C. The final elongation step was at 72°C for 10 min. The tuf universal primers were the forward Keu1 (5′-AAY ATG ATI ACI GGI GCI GCI CAR ATG GA-3′) and reverse Keu2 (5′-AYR TTI TCI CCI GGC ATI ACC AT-3′) primers described by Ke et al. [30]. The PCR products were purified using the QIAquick PCR Purification Kit.
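The degenerate positions in Keu1 and Keu2 can be tallied with a short script. The sketch below is illustrative only: the function names are hypothetical, and it assumes that inosine (I) is a single synthesized base that pairs broadly, so it does not multiply the number of distinct oligonucleotides in the primer pool.

```python
from math import prod

# Number of bases each IUPAC nucleotide code stands for.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 1,  # assumption: inosine is one base, not a mixture
}

# Primer sequences as printed in the text (5' to 3').
KEU1 = "AAY ATG ATI ACI GGI GCI GCI CAR ATG GA"
KEU2 = "AYR TTI TCI CCI GGC ATI ACC AT"


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return prod(IUPAC_SIZE[base] for base in bases)


def inosine_count(primer: str) -> int:
    """Number of inosine positions in the primer."""
    return primer.upper().count("I")


for name, primer in (("Keu1", KEU1), ("Keu2", KEU2)):
    print(name, "degeneracy:", degeneracy(primer),
          "inosines:", inosine_count(primer))
```

Under this assumption each primer is only a four-fold mixture, because most of the variable third-codon positions are covered by inosine rather than by mixed bases.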

2.4 Nucleotide sequencing

Sequencing of PCR products was performed using the BigDye Terminator Cycle sequencing kit and analysed with a 47-cm capillary in an ABI Prism®310 Genetic Analyser (PE Applied Biosystems). The PCR primers were also used for the cycle sequencing. For complete double strand sequencing, two additional internal degenerate primers were used: the forward tuf371 (5′-CWG GTC GTG GKA CIG TTG-3′) and the reverse tuf384r (5′-NGT MCC ACG ACC WGT IAT-3′) primers were deduced from the alignment of sequences previously obtained with Keu1 and Keu2.
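As printed, the two internal primers appear to target the same region on opposite strands. The following sketch checks this by reverse-complementing tuf384r with IUPAC codes and looking for the longest run it shares with tuf371. This is an observation on the printed sequences, not a statement made by the authors; the helper names are hypothetical, and inosine is treated as N when complemented (an assumption).

```python
# IUPAC complements; inosine (I) is mapped to N as a simplifying assumption.
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "K": "M", "M": "K",
    "W": "W", "S": "S", "N": "N",
    "B": "V", "V": "B", "D": "H", "H": "D",
    "I": "N",
}

# Primer sequences as printed in the text (5' to 3').
TUF371 = "CWG GTC GTG GKA CIG TTG"
TUF384R = "NGT MCC ACG ACC WGT IAT"


def clean(primer: str) -> str:
    """Remove spacing and normalise case."""
    return primer.replace(" ", "").upper()


def reverse_complement(primer: str) -> str:
    """Reverse complement of a primer written with IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(primer)))


def longest_shared_run(a: str, b: str) -> str:
    """Longest substring of a that also occurs in b (exact symbol match)."""
    best = ""
    for start in range(len(a)):
        for end in range(start + 1, len(a) + 1):
            if end - start > len(best) and a[start:end] in b:
                best = a[start:end]
    return best


rc = reverse_complement(TUF384R)
shared = longest_shared_run(clean(TUF371), rc)
print(rc, shared, len(shared))
```

A shared run of this length suggests that the forward and reverse internal primers anneal at one site, so that each strand is read outward from the middle of the fragment.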

The alignments of the nucleotide and the translated amino acid sequences of the partial tuf gene sequences were performed with the program ClustalW 1.8 [31]. Phylogenetic trees were also constructed with ClustalW 1.8. Bootstrap confidence analysis was performed by generating 1000 replicates and trees were inferred with the neighbour-joining method of Saitou and Nei [32].
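The resampling step behind a bootstrap analysis can be sketched as follows. This is not ClustalW's implementation, only a minimal illustration of the idea: each replicate draws alignment columns with replacement, and a tree would then be inferred from every replicate. The toy alignment and all names are hypothetical.

```python
import random

# Hypothetical toy alignment (equal-length, already aligned sequences).
TOY_ALIGNMENT = {
    "strain_1": "ACGTACGT",
    "strain_2": "ACGTACGA",
    "strain_3": "ACCTACGA",
}

N_REPLICATES = 1000  # number of replicates stated in the text


def bootstrap_replicate(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping row order."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[col] for col in columns)
        for name, seq in alignment.items()
    }


rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(TOY_ALIGNMENT, rng)
              for _ in range(N_REPLICATES)]
print(len(replicates), replicates[0])
```

The bootstrap value at a branching point is then the percentage of replicate trees that contain the same grouping.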

For the identification of strains, the sequences were identified by a FASTA search [33] against a database containing the tuf sequences of type strains.
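The lookup against a type-strain database can be illustrated with a much simpler stand-in for FASTA. The sketch below assumes the sequences are already aligned and of equal length, and scores them by ungapped percent identity; FASTA itself uses a heuristic local alignment. The reference names and sequences are hypothetical placeholders, not real tuf data.

```python
# Hypothetical reference database: name -> aligned partial sequence.
TYPE_STRAINS = {
    "type strain A": "MKTAYIAKQR",
    "type strain B": "MKTAYLAKQR",
    "type strain C": "MRTAYIGKQR",
}


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identify(query: str, references: dict) -> tuple:
    """Return (best reference name, percent identity) for a query."""
    scores = {name: percent_identity(query, ref)
              for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


print(identify("MKTAYLAKQR", TYPE_STRAINS))
```

In this sketch, a query is assigned to a species only when the best score reaches 100% at the amino acid level, which mirrors the criterion applied to every strain in this study.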

3 Results and discussion

Ninety-six lactobacillus strains were tested in this study. An 802-bp fragment of the tuf gene (886 bp) was amplified by PCR with primers Keu1 and Keu2.

The sequences of 761 nucleotides (excluding primers) of the tuf gene were determined on both DNA strands for 37 lactobacillus strains, representing type strains and L. casei-related strains, using primers Keu1 and Keu2 as well as both internal primers tuf371 and tuf384r. The fragments did not reveal any overlapping or ambiguous peaks thus indicating the presence of a single gene per genome in these species. The 761-nt sequences were translated into 253-aa sequences.

Comparison of the similarity values (not shown) of the nucleotide sequences indicated that the tuf gene is slightly less conserved than the 16S rRNA gene in lactobacilli. The results of neighbour-joining analysis on either tuf-aa or tuf-nt sequences of the 37 species are shown in Fig. 1. The topology of the phylogenetic tree from tuf-aa sequences showed a distribution of lactobacillus species similar to that based on 16S rRNA gene sequence analysis: the closely related L. delbrueckii group (comprising L. acidophilus, L. amylovorus, L. crispatus, L. gallinarum, L. helveticus, L. gasseri, L. johnsonii, L. jensenii and the three subspecies of L. delbrueckii) is clearly differentiated from the large and heterogeneous L. casei group. The only topological differences concern L. hilgardii (included in the 16S rRNA L. casei/Pediococcus group of Collins et al. [2] but isolated in the tuf-aa phylogenetic tree), Weissella confusa [34] (included in the small 16S rRNA group 3 of Collins et al. [2] but included in the tuf-aa L. casei group) and L. amylophilus (a peripheral member of the 16S rRNA L. delbrueckii group of Collins et al. [2] but included in the tuf-aa L. casei group). In the tuf-aa L. casei group, closely related species formed small clusters very similar to those of the 16S rRNA L. casei/Pediococcus group of Collins et al. [2]. At lower bootstrap values the relative order of the clusters suffers from a rather large statistical uncertainty, similar to clusters of the 16S rRNA L. casei/Pediococcus group of Collins et al. [2]. The large similarities between tuf-aa and 16S rRNA trees suggest that the tuf gene evolves generally like the 16S rRNA gene in lactobacilli. This is consistent with the fact that both the ribosomes and the elongation factor Tu are involved in amino acid chain elongation; their functional interaction is necessary for protein biosynthesis and the survival of lactobacilli. At the aa level, partial tuf sequencing allows us to infer phylogenetic relationships between the lactobacillus species and subspecies involved in this study. The phylogenetic tree derived from tuf-nt sequences showed that the clear division into three phylogenetic groups based on the 16S rRNA gene [2] is not maintained with the tuf gene. The differences observed between the phylogenetic trees may be explained by synonymous nucleotide mutations, which may occur in the tuf gene without modifying the translated product, but nevertheless confer a higher degree of variability of nucleotide tuf sequences between species.
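The relation between the 761-nt and 253-aa sequences, and the effect of synonymous mutations, can be made concrete with a small translation sketch. It uses the standard genetic code and assumes the fragment is read in frame from its first nucleotide; the actual reading frame of the sequenced fragment is not stated in the text, and the example codons are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(nt: str) -> str:
    """Translate in frame 0, ignoring any trailing partial codon."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def classify_codon_changes(a: str, b: str) -> tuple:
    """Count (synonymous, non-synonymous) codon differences between
    two equal-length, in-frame nucleotide sequences."""
    synonymous = nonsynonymous = 0
    usable = min(len(a), len(b)) // 3 * 3
    for i in range(0, usable, 3):
        codon_a, codon_b = a[i:i + 3], b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# 761 nt give 253 complete codons plus 2 leftover nucleotides.
print(divmod(761, 3))
# GGT -> GGC keeps glycine; GTT -> ATT changes valine to isoleucine.
print(classify_codon_changes("GGTAAAGTT", "GGCAAAATT"))
```

Only the second kind of change is visible in a tuf-aa comparison, which is why the nucleotide sequences can separate taxa whose amino acid sequences are identical.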

Figure 1

Neighbour-joining tree, showing the phylogenetic relationships between lactobacilli, based on a comparison of 253 tuf-aa sequences (a) and of 761 tuf-nt sequences (b). Bootstrap values based on 1000 replications are given at the branching points when they are above 70%. The bar represents 1% sequence divergence. EMBL accession numbers are given in parentheses.

The two subspecies, coryniformis and torquens, of L. coryniformis exhibit identical tuf-aa sequences, but have 0.26% tuf-nt sequence divergence. L. murinus and L. animalis present identical tuf-aa sequences. These two species have very similar characteristics, both phylogenetically (0.3% established divergence of their 16S rRNA sequences, accession numbers M58807 and M58826), and biochemically according to their fermentation profiles [35]. The 1.84% divergence between the tuf-nt sequences permits differentiation of both species.
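The divergence percentages can be related to absolute numbers of differing nucleotides over the 761-nt fragment. In the sketch below, the counts of 2 and 14 differences are back-calculated from the reported percentages and are not stated in the text; the function names are hypothetical.

```python
FRAGMENT_LENGTH = 761  # nucleotides compared, excluding primers


def percent_divergence(a: str, b: str) -> float:
    """Percentage of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


def mismatches_to_percent(mismatches: int,
                          length: int = FRAGMENT_LENGTH) -> float:
    """Divergence (in %, two decimals) for a given number of mismatches."""
    return round(100.0 * mismatches / length, 2)


# Assumed counts: 2 differences for the L. coryniformis subspecies,
# 14 for L. murinus versus L. animalis.
print(mismatches_to_percent(2), mismatches_to_percent(14))
```

Under that reading, the two L. coryniformis subspecies would be separated by only two nucleotide positions, which is why double-strand sequencing matters for such calls.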

To test the potential of tuf gene sequence analysis for lactobacillus identification, the tuf-nt sequences were determined for other collection strains and for strains newly isolated at our institute. The sequences from one sequencing run with the Keu1 primer, of about 450 nucleotides, were translated into sequences of amino acids. The identification of strains was carried out by comparison of these sequences with those of the type strains. A sequencing run with the Keu2 primer was necessary for the unequivocal identification, at either the aa or nt level, of the three subspecies of L. delbrueckii because of their large sequence similarity. Complete double strand sequencing was carried out before some strains were proposed for a re-classification. For every strain tested in this study, the identification was based on a 100% tuf-aa sequence identity with that of the type strain. The species identity of 40 strains of the L. acidophilus group was confirmed and 18 unidentified strains were categorised (Table 2). The strains L. gasseri CIP 103614 and L. crispatus CIP 103605 should be re-classified as L. johnsonii and L. gasseri respectively. The identity of seven strains of the species L. delbrueckii was also confirmed, five as L. delbrueckii subsp. lactis and two as L. delbrueckii subsp. bulgaricus. The identity of 12 strains of L. plantarum and two strains of L. pentosus was confirmed whereas two L. pentosus strains from the collection at our institute were re-classified as L. plantarum and L. paraplantarum (Table 2). Comparison of the tuf-nt sequences yielded scores between 99.1% and 100% sequence identity with the type strain. These results are consistent with synonymous nucleotide mutations, which may occur in the nt sequence without changing the sequence of the translated product. As already discussed above, synonymous nucleotide mutations would result in a higher variability of tuf nucleotide sequences between the different lactobacillus species. Such mutations also allow the differentiation of subspecies (see the L. coryniformis example) and sometimes permit differentiation between strains of a single species.

Table 2

Identification of lactobacilli of the L. acidophilus group by tuf sequence comparison

Current name | Identified as | Collection number
L. crispatus | L. gasseri | CIP 103605
L. gasseri | L. johnsonii | CIP 103614
L. pentosus | L. plantarum | FAM CC1280
L. pentosus | L. paraplantarum | FAM CC4023
NI | L. crispatus | NCIMB 4504
NI | L. crispatus | NCIMB 8821
NI | L. acidophilus | NCIMB 701360
NI | L. crispatus | NCIMB 701417
NI | L. gasseri | NCIMB 702173
NI | L. gasseri | NCIMB 702174
NI | L. gasseri | NCIMB 702470
NI | L. crispatus | NCIMB 702471
NI | L. acidophilus | NCIMB 702472
NI | L. amylovorus | NCIMB 702473
NI | L. amylovorus | NCIMB 702658
NI | L. amylovorus | NCIMB 702659
NI | L. amylovorus | NCIMB 702660
NI | L. amylovorus | NCIMB 702661
NI | L. amylovorus | NCIMB 702662
NI | L. amylovorus | NCIMB 702663
NI | L. gasseri | NCIMB 702664
NI | L. johnsonii | NCIMB 702665
  • NI: clear identification currently not available.

Comparative analysis of the partial tuf-aa sequences is very reliable for the taxonomy and identification of lactobacillus species, and also for the differentiation of the three subspecies of L. delbrueckii. The comparative analysis of the partial tuf-nt sequences must be used in case of identical tuf-aa sequences, for example in order to differentiate between L. animalis and L. murinus or the two subspecies of L. coryniformis, or to detect variability between strains belonging to the same species.

In many recent studies [7–14,25], the taxonomy of the L. casei-related taxa (L. casei, L. paracasei, L. zeae and L. rhamnosus) is still under discussion, particularly concerning three points: the classification of the current L. casei type strain (ATCC 393), the name of the current large L. paracasei species and the designation of its type strain.

The sequences of 761 nucleotides of the tuf gene were determined on both DNA strands for 23 strains of L. casei-related taxa. Both the nucleotide sequences and their translations into amino acid sequences were aligned. In Table 3, the strains have been classified based on the tuf-aa sequences. Three different tuf-aa signatures (representing the amino acids at positions 73, 85, 141, 155, 185, 188, and 249) are in favour of a classification into three distinct species.
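Reading a tuf-aa signature amounts to picking the residues at the seven listed positions. The sketch below assumes the positions are 1-based indices into the 253-aa partial sequence; the filler sequence is synthetic and only the signature residues carry meaning. The mapping to species uses the three signatures and proposed names reported in Table 3, and the function names are hypothetical.

```python
# Signature positions from the text (assumed 1-based within the 253-aa fragment).
SIGNATURE_POSITIONS = (73, 85, 141, 155, 185, 188, 249)

# Signatures and proposed species names as reported in Table 3.
PROPOSED_SPECIES = {
    "LPVLIDD": "L. rhamnosus",
    "LKVLIDE": "L. zeae",
    "IPIIVED": "L. casei",
}


def signature(aa_seq: str, positions=SIGNATURE_POSITIONS) -> str:
    """Concatenate the residues found at the signature positions."""
    return "".join(aa_seq[pos - 1] for pos in positions)


def toy_sequence(sig: str, length: int = 253, filler: str = "A") -> str:
    """Build a synthetic sequence carrying a given signature (illustration only)."""
    residues = [filler] * length
    for pos, residue in zip(SIGNATURE_POSITIONS, sig):
        residues[pos - 1] = residue
    return "".join(residues)


query = toy_sequence("LKVLIDE")
print(signature(query), PROPOSED_SPECIES.get(signature(query), "unassigned"))
```

A sequence whose signature is not in the mapping is reported as unassigned rather than forced into one of the three species.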

Table 3

tuf-aa and tuf-nt signatures of L. casei-related taxa

Collection number | Original name | Current name | Proposed name | Accession number | tuf-aa signature | tuf-nt signature
ATCC 7469 | L. casei subsp. rhamnosusᵀ | L. rhamnosusᵀ | L. rhamnosus | AJ418939 | LPVLIDD |
ATCC 11443 | L. delbrueckii | L. rhamnosus | L. rhamnosus | AJ459828 | LPVLIDD |
ATCC 11981 | L. casei subsp. rhamnosusᵀ | L. rhamnosus | L. rhamnosus | AJ459829 | LPVLIDD |
ATCC 15820 | L. zeaeᵀ | L. zeaeᵀ | L. zeaeᵀ | AJ459387 | LKVLIDE |
ATCC 393 | L. casei subsp. casei | L. caseiᵀ or L. zeae | L. zeae | AJ418933 | LKVLIDE |
NCDO 173 | L. casei subsp. casei | L. casei | L. zeae | AJ459390 | LKVLIDE |
ATCC 27216 | L. casei subsp. alactosusᵀ | L. paracasei subsp. paracasei | L. casei | AJ418937 | IPIIVED | ATGCG
ATCC 11582 | L. casei | L. paracasei subsp. paracasei | L. casei | AJ459394 | IPIIVED | ATGCG
ATCC 25599 | L. casei subsp. toleransᵀ | L. paracasei subsp. tolerans | L. casei | AJ418922 | IPIIVED | GTGCA
DSM 20012 | L. casei subsp. tolerans | L. paracasei subsp. tolerans | L. casei | AJ459395 | IPIIVED | GTGCA
ATCC 334 | L. casei subsp. caseiᵀ | L. casei | L. casei | AJ418923 | IPIIVED | GTGTG
DSM 20207 | L. casei subsp. pseudoplantarum | L. paracasei subsp. paracasei | L. casei | AJ459396 | IPIIVED | GTGTG
ATCC 25303 | L. casei | L. paracasei subsp. paracasei | L. casei | AJ459398 | IPIIVED | GTATG
ATCC 335 | L. casei | L. paracasei subsp. paracasei | L. casei | AJ459399 | IPIIVED | GAGTG
ATCC 25302 | L. casei subsp. casei | L. paracasei subsp. paracaseiᵀ | L. casei | AJ459393 | IPIIVED | GAGCG
ATCC 25598 | L. casei subsp. pseudoplantarumᵀ | L. paracasei subsp. paracasei | L. casei | AJ418912 | IPIIVED | GTGCG
NCFB 205 | L. casei subsp. alactosus | L. paracasei subsp. paracasei | L. casei | AJ459388 | IPIIVED | GTGCG
NCFB 206 | L. casei subsp. alactosus | L. paracasei subsp. paracasei | L. casei | AJ459389 | IPIIVED | GTGCG
ATCC 25180 | L. casei subsp. alactosus | L. paracasei subsp. paracasei | L. casei | AJ459397 | IPIIVED | GTGCG
DSM 20006 | L. casei subsp. fusiformis | L. paracasei subsp. paracasei | L. casei | AJ459391 | IPIIVED | GTGCG
ATCC 27092 | L. casei | L. paracasei subsp. paracasei | L. casei | AJ459391 | IPIIVED | GTGCG
ATCC 29599 | L. casei | L. paracasei subsp. paracasei | L. casei | AJ459385 | IPIIVED | GTGCG
ATCC 4646 | L. acidophilus | L. casei | L. casei | AJ459386 | IPIIVED | GTGCG

The former L. casei subsp. rhamnosus type strain (ATCC 7469), containing the tuf-aa signature LPVLIDD, should be considered the type strain of the species L. rhamnosus which also includes the strains ATCC 11443 and ATCC 11981. This result is in agreement with all recent studies dealing with L. casei-related taxa.

The tuf-aa signature LKVLIDE is displayed by three strains: ATCC 15820, ATCC 393 and NCDO 173. Strain ATCC 15820, called Lactobacterium zeae by Kuznetsov [36], was proposed as the type strain of the species L. zeae by Dicks et al. [9] but was unfortunately not included in the study of Collins et al. [7]. Strain ATCC 393 is designated the neotype strain of the L. casei species and was found to show significant differences from the other L. casei strains [7,8]. On the other hand, strain ATCC 393 was found to be very similar to the L. zeae type strain (ATCC 15820) [4,8–11,25]. For this reason, these authors favour its classification as L. zeae. Strain NCDO 173 (basionym L. casei subsp. casei) was found to be very similar to ATCC 393 and was also classified as L. casei by Collins et al. [7]. The tuf-aa sequences indicate that these three strains should be classified in a single species. The name of this species should not be L. casei because these three strains are clearly different from all the other strains of the original large L. casei species that was renamed L. paracasei by Collins et al. [7]. In agreement with Dellaglio et al. [8,9,25], we prefer the name L. zeae.

The other 17 L. casei-related strains in this study, including the current L. paracasei type strain and the type strains of the four former L. casei subspecies (casei, tolerans, alactosus and pseudoplantarum re-assigned to L. paracasei by Collins et al. [7]), display the tuf-aa sequence signature IPIIVED. We propose that they should be assigned to their original species name L. casei in agreement with Dellaglio et al. [8,9,25] instead of L. paracasei as nominated by Collins et al. [7].

A type strain should also be selected for this species. Collins et al. [7] suggested the current L. paracasei subsp. paracasei type strain NCDO 151 whereas Dellaglio et al. [8,9,25] proposed L. casei subsp. casei ATCC 334. Comparative analysis of the tuf-nt sequences (761 nt) of the 17 strains exhibiting the IPIIVED tuf-aa signature showed divergence at positions 27, 57, 171, 232, and 627 resulting in six different sequence signatures. GTGCG is the 5-nt consensus signature deduced from 17 strains that represent less than half of the approximately 40 strains of the current L. paracasei species and of the former L. casei species available from international collections. The type strains of both the current L. paracasei species and the four original L. casei subspecies exhibit five distinct signatures; nevertheless, their consensus signature is also GTGCG. Thus GTGCG is truly the consensus sequence for the different tuf-nt signatures of this large species. Among the type strains of the four original L. casei subspecies, the only one that contains a signature identical to the consensus is L. casei subsp. pseudoplantarum ATCC 25598. The signatures of both the current L. paracasei subsp. paracasei NCDO 151 type strain and the former L. casei subsp. casei ATCC 334 type strain diverge from the consensus at positions 57 and 232 respectively. Thus tuf-nt sequence analysis indicates that the former L. casei subsp. pseudoplantarum ATCC 25598 type strain is the best candidate to be designated the type strain of the species L. casei.
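The consensus argument can be retraced from the tuf-nt signatures listed in Table 3. The sketch below takes a simple per-position majority vote, which is an assumption about how the consensus was derived; the strain-to-signature mapping is copied from Table 3, and the five type strains are those marked as such in that table (ATCC 25302 being the strain listed there as the current L. paracasei subsp. paracasei type strain).

```python
from collections import Counter

# Variable positions in the 761-nt fragment, as given in the text.
POSITIONS = (27, 57, 171, 232, 627)

# tuf-nt signatures of the 17 IPIIVED strains, copied from Table 3.
SIGNATURES = {
    "ATCC 27216": "ATGCG", "ATCC 11582": "ATGCG",
    "ATCC 25599": "GTGCA", "DSM 20012": "GTGCA",
    "ATCC 334": "GTGTG", "DSM 20207": "GTGTG",
    "ATCC 25303": "GTATG", "ATCC 335": "GAGTG",
    "ATCC 25302": "GAGCG", "ATCC 25598": "GTGCG",
    "NCFB 205": "GTGCG", "NCFB 206": "GTGCG",
    "ATCC 25180": "GTGCG", "DSM 20006": "GTGCG",
    "ATCC 27092": "GTGCG", "ATCC 29599": "GTGCG",
    "ATCC 4646": "GTGCG",
}

# Strains marked as (former or current) type strains in Table 3.
TYPE_STRAINS = ("ATCC 27216", "ATCC 25599", "ATCC 334",
                "ATCC 25302", "ATCC 25598")


def consensus(signatures) -> str:
    """Most frequent nucleotide at each signature position."""
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*signatures))


def divergent_positions(sig: str, reference: str) -> list:
    """Fragment positions at which a signature differs from the reference."""
    return [pos for pos, a, b in zip(POSITIONS, sig, reference) if a != b]


overall = consensus(SIGNATURES.values())
type_only = consensus(SIGNATURES[s] for s in TYPE_STRAINS)
print(overall, type_only)
for strain in TYPE_STRAINS:
    print(strain, divergent_positions(SIGNATURES[strain], overall))
```

With this data, ATCC 25598 is the only type strain that returns an empty list of divergent positions.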

The partial sequencing of the tuf gene of 37 species of the very heterogeneous genus Lactobacillus was carried out on over 200 strains. The comparative tuf-aa sequence analysis was in general agreement with that obtained with the 16S rRNA [2] and was very reliable for the differentiation and identification of the lactobacillus species and even of the three subspecies of L. delbrueckii. The slightly higher variability of the tuf-nt sequences allows the separation of very closely related species, subspecies and sometimes strains within a species. Because it can be analysed at both the nt and aa levels, we propose that the tuf gene be utilised as a reliable marker for inferring phylogeny and taxonomy of either distant or very closely related taxa of the genus Lactobacillus. This study also supports most points of the request for an opinion by Dellaglio et al. [37] concerning the taxonomy of the L. casei-related taxa. Only the choice of the L. casei type strain is different: the tuf-nt sequence analyses are in favour of L. casei subsp. pseudoplantarum ATCC 25598 rather than the former L. casei subsp. casei ATCC 334.


We thank Ms. I. Ryba and E. Wagner for their technical assistance.


ATCC: American Type Culture Collection
CIP: Collection de bactéries de l'Institut Pasteur
DSM: Deutsche Sammlung von Mikroorganismen
FAM: Forschungsanstalt für Milchwirtschaft, Switzerland
JCM: Japan Collection of Microorganisms
NCDO: National Collection of Dairy Organisms (UK)
NCFB: National Collection of Food Bacteria (UK)
NCIMB: National Collection of Industrial and Marine Bacteria (UK)


  1. [1].
  2. [2].
  3. [3].
  4. [4].
  5. [5].
  6. [6].
  7. [7].
  8. [8].
  9. [9].
  10. [10].
  11. [11].
  12. [12].
  13. [13].
  14. [14].
  15. [15].
  16. [16].
  17. [17].
  18. [18].
  19. [19].
  20. [20].
  21. [21].
  22. [22].
  23. [23].
  24. [24].
  25. [25].
  26. [26].
  27. [27].
  28. [28].
  29. [29].
  30. [30].
  31. [31].
  32. [32].
  33. [33].
  34. [34].
  35. [35].
  36. [36].
  37. [37].