
Functional screening of antibiotic resistance genes from human gut microbiota reveals a novel gene fusion

Gong Cheng, Yongfei Hu, Yeshi Yin, Xi Yang, Chunsheng Xiang, Baohong Wang, Yanfei Chen, Fengling Yang, Fang Lei, Na Wu, Na Lu, Jing Li, Quanze Chen, Lanjuan Li, Baoli Zhu
DOI: http://dx.doi.org/10.1111/j.1574-6968.2012.02647.x; pp. 11–16; first published online: 1 November 2012


The human gut microbiota has a high density of bacteria that are considered a reservoir for antibiotic resistance genes (ARGs). In this study, one fosmid metagenomic library generated from the gut microbiota of four healthy humans was used to screen for ARGs against seven antibiotics. Eight new ARGs were obtained: one against amoxicillin, six against d-cycloserine, and one against kanamycin. The new amoxicillin resistance gene encodes a protein with 53% identity to a class D β-lactamase from Riemerella anatipestifer RA-GD. The six new d-cycloserine resistance genes encode proteins with 73–81% identity to known d-alanine-d-alanine ligases. The new kanamycin resistance gene encodes a protein of 274 amino acids with an N-terminus (amino acids 1–189) that has 42% identity to the 6′-aminoglycoside acetyltransferase [AAC(6′)] from Enterococcus hirae and a C-terminus (amino acids 190–274) with 35% identity to a hypothetical protein from Clostridiales sp. SSC/2. A functional study on the novel kanamycin resistance gene showed that only the N-terminus conferred kanamycin resistance. Our results showed that functional metagenomics is a useful tool for the identification of new ARGs.

  • metagenomics
  • antibiotic resistance
  • gut microbiology

Introduction

The human gut microbiota is dominated by bacteria that are mainly in the phyla Firmicutes, Bacteroidetes and Actinobacteria (Rajilic-Stojanovic et al., 2007). These bacteria benefit human health by fermenting nondigestible dietary residues, breaking down carcinogens and synthesizing biotin, folate, and vitamin K (O'Hara & Shanahan, 2007). Because more than 80% of human gut microbiota are unculturable (Eckburg et al., 2005), culture-independent methods such as PCR and DNA microarrays are used to identify and isolate antibiotic resistance genes (ARGs) from human fecal metagenomes (Gueimonde et al., 2006; Seville et al., 2009; de Vries et al., 2011).

However, these sequence-based methods can only access genes that are similar to known sequences and provide little information on their activity. In contrast, functional metagenomics, which directly clones microbial DNA into a host organism followed by screening for a desired function, can identify completely new genes (Ferrer et al., 2009). In previous work, Sommer et al. (2009) characterized ARGs in the human microbiota using both culture-based and functional metagenomic methods; most ARGs identified through functional metagenomics had not been identified previously, whereas nearly half of the ARGs identified through the culture-based method had been characterized.

To further investigate the diversity of ARGs and mine novel ARGs in human gut microbiota, a metagenomic library of healthy human fecal samples was constructed and screened for ARGs using a functional approach. Instead of using a plasmid library, as in the work of Sommer et al. (2009), we first screened a fosmid library with relatively large inserts and then subcloned the resistant clones.

Materials and methods

Sample collection and DNA extraction

Fecal samples were obtained from four healthy unrelated volunteers who had not been treated with antibiotics for at least 6 months prior to sampling. Study information was given to the volunteers and informed consent for research was obtained. DNA was extracted from 1 g of each fecal sample within 24 h of collection, following the SDS-based extraction method described previously (Zhou et al., 1996). The rest of each sample was frozen at −20 °C for future use.

Metagenomic library construction

Metagenomic DNA from the four fecal samples was combined together and loaded on a preparative pulsed-field gel [Bio-Rad CHEF DR®III; 0.1–40 s switch time, 6 V cm−1, 0.5 × Tris/Borate/EDTA buffer, 120° included angle, 16 h], and DNA of 36–48 kb was isolated, electroeluted, and dialyzed against 0.5 × Tris/EDTA (TE) buffer for 24 h. The resulting DNA was end-repaired and ligated into the pCC2FOS fosmid vector, packaged into phage, and introduced into the EPI300 strain of Escherichia coli using a CopyControl fosmid library production kit (Epicentre). The library was plated onto Luria–Bertani (LB) medium containing chloramphenicol (12.5 µg mL−1) and incubated at 37 °C for 24 h. All colonies were washed from the plates and combined into an amplified library stock.

Screening and subcloning antibiotic resistant clones

For screening, the metagenomic library was plated onto media containing inhibitory concentrations of amoxicillin (8 µg mL−1), cephalexin (16 µg mL−1), kanamycin (32 µg mL−1), amikacin (64 µg mL−1), tetracycline (4 µg mL−1), d-cycloserine (128 µg mL−1) or fosfomycin (128 µg mL−1). Concentrations that prevent growth of both E. coli EPI300 and E. coli DH5α were chosen. Plates were incubated at 37 °C for 24 h. Antibiotic-resistant clones were selected and fosmid DNA from each clone was purified and digested with EcoR I (Takara). Only clones with unique restriction fragment length polymorphism patterns were selected. For subcloning, fosmid DNA was extracted from the selected resistant clones (E.Z.N.A. Plasmid Mini Kit I; Omega) and partially digested with Sau3A I (Takara). DNA fragments of 1–5 kb were recovered from an agarose gel and, for all clones except those resistant to amoxicillin, ligated into pUC118 BamH I/BAP (Takara). For amoxicillin-resistant fosmid clones, the kanamycin-resistance vector pHSG298 (Takara) cut with BamH I (Takara) and treated with alkaline phosphatase (Takara) was used instead of pUC118, which cannot be used for amoxicillin-resistance screening because it carries the ampicillin resistance marker ampr. Ligation products were transformed into E. coli DH5α (Invitrogen) and spread onto LB agar plates containing either 100 µg mL−1 ampicillin for pUC118 or 50 µg mL−1 kanamycin for pHSG298, plus the antibiotic under selection: 8 µg mL−1 amoxicillin, 32 µg mL−1 kanamycin, 4 µg mL−1 tetracycline or 128 µg mL−1 d-cycloserine. After 24 h at 37 °C, a single resistant subclone from each plate was selected.

Subclone sequencing and bioinformatic analysis

Positive subclones were sequenced from both directions using M13 primers. Primers were then designed from each read to close the insert sequence. Sequences were assembled with SeqMan software (DNAStar). Putative open reading frames (ORFs) were identified with ORF Finder (http://www.ncbi.nlm.nih.gov/projects/gorf/). All predicted ARGs were compared with each other to exclude redundant ARGs (> 99% identity at the nucleotide level), and the unique ARGs were analyzed as described previously (Sommer et al., 2009). Phylogenetic analysis was conducted with the neighbor-joining method using MEGA5 (Tamura et al., 2011). Bootstrapping (1000 replicates) was used to estimate the reliability of phylogenetic reconstructions (Felsenstein, 1985).
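To illustrate what the ORF-prediction step involves, the following is a minimal sketch of a six-frame scan for ATG-to-stop ORFs in Python. It is not the NCBI ORF Finder algorithm: the function names `find_orfs` and `reverse_complement`, the `min_len` cutoff, and the restriction to ATG start codons are all assumptions made for this example.

```python
# Minimal six-frame ORF scan (illustrative sketch; NOT the NCBI ORF Finder).
# Assumptions: only ATG counts as a start codon; input contains A/C/G/T only.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=30):
    """Return (strand, start, end, length) tuples for ATG..stop ORFs.

    Coordinates are 1-based and inclusive on the strand that was scanned;
    length is in nucleotides and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # 0-based index of the first ATG in the open frame
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    length = i + 3 - start
                    if length >= min_len:
                        orfs.append((strand, start + 1, i + 3, length))
                    start = None  # look for the next ORF in this frame
    return orfs
```

Calling `find_orfs(insert_sequence)` on an assembled subclone insert would list candidate ORFs, which could then be compared against known resistance genes by a homology search.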

Kanamycin-resistance fused gene cloning and minimum inhibitory concentration determination

The kanamycin-resistance fused gene was amplified using the following primers: EcoRI-KM2-F, 5′-CCGGAATTCATGGAAAACAGGGCTGTG-3′ and XhoI-KM2-R, 5′-CGCTCGAGTTATTCTTCCTCCCCCGG-3′. The N-terminal domain of KM2 was amplified using primers EcoRI-KM2-F and XhoI-KM2-N-R, 5′-CCGCTCGAGTTACTTTCCTCCTAGTTTTTC-3′. The C-terminal domain of KM2 was amplified using primer XhoI-KM2-R with EcoRI-KM2-C-F, 5′-CCGGAATTCATGAATGACGTTAAGGCA-3′. The original fosmid DNA was used as the PCR template. The products were cut with EcoRI and XhoI (Takara), ligated into the expression vector pGEX-5X-3 (GE Healthcare) digested with the same enzymes, and transferred into E. coli DH5α. The integrity of the cloned sequences in the recombinant plasmids was confirmed by sequencing. Minimum inhibitory concentrations (MICs) of kanamycin for strains expressing the full-length protein KM2 and its N-terminal and C-terminal domains were determined by broth microdilution according to Clinical & Laboratory Standards Institute (CLSI) (2010) guidelines. Escherichia coli DH5α carrying the vector pGEX-5X-3 was used as the negative control and E. coli ATCC 25922 as the quality control strain.
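The primer design can be sanity-checked with a short script. The sketch below, in Python, confirms that each primer carries the restriction site named in its label, followed directly by a start codon (forward primers) or by the reverse complement of a TAA stop codon (reverse primers). The recognition sequences GAATTC (EcoRI) and CTCGAG (XhoI) are standard; the helper name `check_primer` and the reading of `TTA` as an engineered stop codon are assumptions of this example, not statements from the methods.

```python
# Check that each cloning primer has its restriction site followed by the
# expected codon. Helper names are hypothetical; the sequences are those
# listed in the text (any whitespace inside a sequence is removed first).
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG"}

PRIMERS = {
    "EcoRI-KM2-F": "CCGGAATTCATGGAAAACAGGGCTGTG",
    "XhoI-KM2-R": "CGCTCGAGTTATTCTTCCTCCCCCGG",
    "XhoI-KM2-N-R": "CCGCTCGAGTTACTTTCCTCCTAGTTTTTC",
    "EcoRI-KM2-C-F": "CCGGAATTCATGAATGACGTTAAGGCA",
}


def check_primer(name, seq):
    """Return True if the site named in the primer label is present and is
    immediately followed by ATG (forward) or TTA (reverse, i.e. TAA stop)."""
    seq = seq.replace(" ", "").upper()
    enzyme = name.split("-")[0]
    site = SITES[enzyme]
    pos = seq.find(site)
    if pos < 0:
        return False
    following = seq[pos + len(site):pos + len(site) + 3]
    # Assumption: forward primers start the ORF, reverse primers end it.
    expected = "ATG" if enzyme == "EcoRI" else "TTA"
    return following == expected


results = {name: check_primer(name, seq) for name, seq in PRIMERS.items()}
```

Under these assumptions, all four primers pass the check, which is consistent with directional cloning between the EcoRI and XhoI sites of the vector.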

Nucleotide sequence accession numbers

Sequence data from this work were deposited in GenBank with the following accession numbers: JN086157–JN086173.

Results

Identification of ARGs

One metagenomic library from four human fecal samples was created, containing c. 415 000 clones. The average insert size was c. 30 kb, corresponding to about 12.5 Gb of metagenomic DNA. We identified 17 unique subclones: four resistant to amoxicillin, eight to d-cycloserine, two to kanamycin, and three to tetracycline (Table 1). All four resistance genes of the amoxicillin-resistant subclones (pAC1 to pAC4) encoded β-lactamases. The resistance genes of pAC2 and pAC3 were nearly identical to ARGs recently identified from human gut microbiota using functional metagenomics (Sommer et al., 2009). The pAC4 subclone harbored a new resistance gene, encoding a protein with only 53% identity to a β-lactamase from the newly sequenced pathogen Riemerella anatipestifer RA-GD (Yuan et al., 2011). All eight resistance genes in the d-cycloserine-resistant subclones (pCY1 to pCY8) encoded d-alanine-d-alanine ligases. Except for the resistance genes in pCY3 and pCY6, all of these resistance genes were new, with identities ranging from 73% to 81% to known d-alanine-d-alanine ligases. Two kanamycin-resistant subclones (pKM1 and pKM2) were obtained. In pKM1, the resistance gene encoded a protein identical to the first reported bifunctional antibiotic-resistance enzyme 6′-aminoglycoside acetyltransferase-2″-aminoglycoside phosphotransferase from Enterococcus faecalis (Ferretti et al., 1986). In pKM2, a new fused resistance gene was identified, encoding a protein (designated KM2) of 274 amino acids. The N-terminus of KM2 (amino acids 1–189) exhibited 42% identity to a previously characterized AAC(6′) from Enterococcus hirae (Del Campo et al., 2005). The C-terminus (amino acids 190–274) was 35% identical to a hypothetical protein (GenBank accession number: CBL37632) from Clostridiales sp. SSC/2. Three different clades were reported previously in AAC(6′) enzymes and the N-terminus of KM2 was assigned to clade B with other proteins from this family (Fig. 1; Salipante & Hall, 2003; Mulvey et al., 2004; Riesenfeld et al., 2004; Donato et al., 2010; Partridge et al., 2011). Three tetracycline-resistant subclones (pTE1–pTE3) were obtained. All harbored known ribosomal protection-type resistance genes, including tet(O), tet(W), and tet(32). The tetracycline efflux gene tet(40) was also found in pTE1.
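The library size quoted at the start of this section follows directly from the clone count and the average insert size. A minimal Python check is sketched below; the function name `library_size_gb` is hypothetical, and the clone count and insert size are the approximate values given in the text.

```python
def library_size_gb(n_clones, avg_insert_kb):
    """Total cloned DNA in gigabases, given clone count and mean insert size in kb."""
    return n_clones * avg_insert_kb / 1_000_000  # 1 Gb = 1,000,000 kb


# Approximate values reported for the fosmid library.
total_gb = library_size_gb(415_000, 30)
```

This gives 12.45 Gb, which rounds to the reported figure of about 12.5 Gb.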

Fig. 1. Phylogenetic tree of AAC(6′) enzymes. Clade definition is according to Salipante & Hall (2003). Numbers at the branching points represent the percent occurrence in 1000 random bootstrap replications. Only values of more than 50% are shown. For proteins in which AAC(6′) is fused to other domains, namely AAC(6′)-Ie-APH(2″)-Ia (AAA88548) and ANT(3″)-Ih-AAC(6′)-IId (AAL51021) from Salipante & Hall (2003), Kan4 (ACS83738) from Donato et al. (2010) and KM2 (AET35514) from this study, only the AAC(6′) portion was included. Scale bar: 0.1 expected amino acid replacements per site.

Table 1

Annotation table of ARGs predicted in the antibiotic-resistant clones from the human gut metagenomic libraries

Subclone ID | GenBank ID | ORF range* | ORF (bp)† | Best match (accession no.) | Organism | Aa % identity
pAC1 | JN086157 | 155–1030 | 876 | ORN-1 β-lactamase (AAS88600) | Raoultella ornithinolytica strain 55-1 | 99
pAC2 | JN086158 | 1035–1919 | 885 | HGA-1 β-lactamase (ACT97453) | Uncultured organism | 96
pAC3 | JN086159 | 566–1456 | 891 | CblA-1 β-lactamase (ACT97374) | Uncultured organism | 100
pAC4 | JN086160 | 1583–2407 | 825 | Class D β-lactamase (ADZ11543) | Riemerella anatipestifer RA-GD | 53
pCY1 | JN086161 | 2815–3870 | 1056 | d-alanine-d-alanine ligase (CBL14041) | Roseburia intestinalis XB6B4 | 80
pCY2 | JN086162 | 396–1451 | 1056 | d-alanine-d-alanine ligase (CBL20417) | Ruminococcus sp. SR1/5 | 74
pCY3 | JN086163 | 1406–2461 | 1056 | d-alanine-d-alanine ligase (CBK93148) | Eubacterium rectale M104/1 | 100
pCY4 | JN086164 | 74–1132 | 1059 | d-alanine-d-alanine ligase (ACT97546) | Uncultured organism | 74
pCY5 | JN086165 | 1311–2369 | 1059 | d-alanine-d-alanine ligase (ACT97548) | Uncultured organism | 74
pCY6 | JN086166 | 2617–3672 | 1056 | d-alanine-d-alanine ligase (CBL14041) | Roseburia intestinalis XB6B4 | 100
pCY7 | JN086167 | 1564–2628 | 1065 | d-alanine-d-alanine ligase (YP_003820787) | Clostridium saccharolyticum WM1 | 81
pCY8 | JN086168 | 132–1193 | 1062 | d-alanine-d-alanine ligase (YP_003820787) | Clostridium saccharolyticum WM1 | 73
pKM1 | JN086169 | 207–1646 | 1440 | 6′-aminoglycoside acetyltransferase-2″-aminoglycoside phosphotransferase (YP_004149647) | Staphylococcus pseudintermedius HKU10-03 | 100
pKM2 | JN086170 | 564–1388 | 825 | 6′-aminoglycoside acetyltransferase (CAE50925) | Enterococcus hirae | 42
pTE1‡ | JN086171 | 874–2793 | 1920 | Tet(O) (ADV69685) | Streptococcus suis JS14 | 100
pTE1‡ | JN086171 | 2844–4064 | 1221 | Tet(40) (ADV69686) | Streptococcus suis JS14 | 100
pTE2 | JN086172 | 1831–3750 | 1920 | Tet(W) (ADH00696) | Bifidobacterium longum | 99
pTE3 | JN086173 | 440–2359 | 1920 | Tet(32) (ABV82120) | Eubacterium saburreum | 100

  * Nucleotide range of the ORF within the DNA insert of individual subclones most likely responsible for the antibiotic resistance phenotype.

  † Length of the predicted ARGs in base pairs.

  ‡ Two tetracycline resistance genes were found in the same subclone pTE1.
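The ORF coordinates and lengths in Table 1 can be cross-checked against each other, since an inclusive nucleotide range from start to end spans end − start + 1 bases. The Python sketch below does this for every row; the names `ROWS`, `orf_length` and `protein_length` are hypothetical, and the protein-length rule assumes that each ORF length includes one terminal stop codon.

```python
# (subclone, ORF start, ORF end, reported ORF length in bp), from Table 1.
ROWS = [
    ("pAC1", 155, 1030, 876), ("pAC2", 1035, 1919, 885),
    ("pAC3", 566, 1456, 891), ("pAC4", 1583, 2407, 825),
    ("pCY1", 2815, 3870, 1056), ("pCY2", 396, 1451, 1056),
    ("pCY3", 1406, 2461, 1056), ("pCY4", 74, 1132, 1059),
    ("pCY5", 1311, 2369, 1059), ("pCY6", 2617, 3672, 1056),
    ("pCY7", 1564, 2628, 1065), ("pCY8", 132, 1193, 1062),
    ("pKM1", 207, 1646, 1440), ("pKM2", 564, 1388, 825),
    ("pTE1", 874, 2793, 1920), ("pTE1", 2844, 4064, 1221),
    ("pTE2", 1831, 3750, 1920), ("pTE3", 440, 2359, 1920),
]


def orf_length(start, end):
    """Length in bp of an inclusive 1-based nucleotide range."""
    return end - start + 1


def protein_length(orf_bp):
    """Amino acids encoded, assuming the ORF length includes one stop codon."""
    return orf_bp // 3 - 1


# Rows whose reported length disagrees with the coordinates (expected: none).
mismatches = [r for r in ROWS if orf_length(r[1], r[2]) != r[3]]
km2_aa = protein_length(825)
```

Under the stop-codon assumption, the 825 bp ORF in pKM2 corresponds to 274 amino acids, matching the length reported for KM2 in the text.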

Characterization of the kanamycin-resistance fused gene

To determine whether both domains of KM2 identified in this study were involved in kanamycin resistance, sequences encoding the two domains and the full-length protein were individually cloned and the MIC values of the three different recombinant strains were determined. The results showed that the N-terminal domain conferred kanamycin resistance, with the same MIC value as the full-length protein (256 µg mL−1), whereas the MIC value of the C-terminal domain was the same as the vector control strain (2 µg mL−1). These results indicated that only the N-terminal domain of the novel protein conferred kanamycin resistance.
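The size of the effect can be expressed as a fold change over the vector control, or as the number of two-fold dilution steps in the broth microdilution series. A short Python sketch follows; the helper names are hypothetical and the MIC values are those reported above.

```python
import math

# MIC values in µg/mL as reported for the recombinant strains.
MIC = {
    "KM2 full length": 256,
    "KM2 N-terminal domain": 256,
    "KM2 C-terminal domain": 2,
    "vector control": 2,
}


def fold_increase(mic, control):
    """Ratio of a strain's MIC to the control MIC."""
    return mic / control


def doubling_steps(mic, control):
    """Number of two-fold dilution steps separating the MIC from the control."""
    return math.log2(mic / control)


control = MIC["vector control"]
summary = {
    name: (fold_increase(value, control), doubling_steps(value, control))
    for name, value in MIC.items()
}
```

The full-length protein and the N-terminal domain each give a 128-fold increase (seven doubling steps) over the vector control, whereas the C-terminal domain gives no increase.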

Discussion

To date, four classes of β-lactamase (A–D), conferring a broad range of resistance, have been recognized (Bush & Jacoby, 2010). Here, we identified four β-lactamase genes, three of which were assigned to Class A and one to Class D; no genes belonging to Class B (metallo-β-lactamases) or Class C were found (Supporting Information Fig. S1). We cannot conclude from these results that Class B and C β-lactamases are absent from the human gut; further efforts should be made to delineate the whole profile of β-lactamase genes in the human gut.

The eight d-alanine-d-alanine ligase genes encoding resistance to d-cycloserine fell into two distinct groups in the phylogenetic tree, and the genes within each group were closely related to each other. This suggests that the d-cycloserine resistance genes we identified were probably derived from closely related gut bacteria belonging to two major taxa (Fig. S2).

Four bifunctional proteins with both domains involved in resistance to aminoglycoside antibiotics have been reported previously (Ferretti et al., 1986; Centron & Roy, 2002; Dubois et al., 2002; Mendes et al., 2004). In all cases, these bifunctional proteins had expanded substrate specificity. Pathogenic bacteria with these proteins would have a selective advantage in a clinical environment. Recently, the kanamycin-resistance protein Kan4, which has an AAC(6′) domain fused to an acetyltransferase domain, was identified from soil using functional metagenomics. Functional analysis showed that only the AAC(6′) domain conferred kanamycin resistance (Donato et al., 2010). In this study, we used a functional metagenomic method to characterize ARGs in human gut microbiota. A novel kanamycin-resistance protein with an AAC(6′) domain fused to a hypothetical protein domain was identified. The kanamycin resistance of the N-terminal domain of this novel protein was confirmed, but the function of the C-terminus remains unknown. In a conserved domain search through NCBI, the C-terminus matched only a domain of unknown function (DUF2007). Therefore, it is unclear whether the C-terminus of this protein affects substrate specificity or has some other role, and its exact function needs to be investigated further.

In our screen for tetracycline resistance, three known ribosomal protection-type genes were obtained: tet(O), tet(W), and tet(32). A tetracycline efflux gene tet(40) was also found in the same clone as tet(O). In a previous study using microarray analysis, tet(O) and tet(W) were the most prevalent tetracycline-resistance genes in fecal samples from adults from six European countries (Seville et al., 2009). In another study, numerous tet(W) sequences were uncovered through a functional metagenomic screen of antibiotic resistance in gut bacteria from two adult individuals in the USA (Sommer et al., 2009). The tetracycline efflux gene tet(40) was first identified in a human bacterial isolate and in a human gut metagenomic library. In both cases, it was linked to the mosaic tet(O/32/O) (Kazimierczak et al., 2008). The gene tet(40) was also found in a pig gut metagenome as a single tetracycline-resistance gene or linked to tet(W) (Kazimierczak et al., 2009). In our study, tet(40) was located in tandem with tet(O).

Sequence homology search showed that the ARGs we identified in this study were of diverse bacterial origin, including nonpathogenic species such as Bifidobacterium longum, as well as opportunistic pathogens such as Streptococcus suis and Staphylococcus pseudintermedius. Because the potential for gene transfer in the human gut is very high due to the dense microbial population (Kazimierczak & Scott, 2007), it is worth addressing in the future to what extent these bacteria serve as donors, disseminating the ARGs to other bacteria, especially the incoming pathogenic bacteria.

The fosmid-based method has some potential disadvantages in ARG screening. Genes on smaller plasmids (< 30 kb) might not be represented in the metagenomic library. Moreover, only ARGs that are properly expressed in E. coli with their own promoters will be identified. However, the fosmid-based method also has advantages. The larger insert size increases the likelihood of cloning complete ARGs. In fact, nearly one-third of resistant fosmid clones could not be subcloned, even after several trials. This could be because different vectors were used for cloning (pCC2FOS) and subcloning (pUC118 or pHSG298) or because some resistance determinants fall outside the size range chosen for subcloning (1–5 kb). Our further work will focus on full-length sequencing to elucidate the resistance mechanisms conferred by the clones that failed to be subcloned.

It is worth noting that although the human subjects we used in this study were not exposed to antibiotic treatment for at least 6 months prior to sampling, we cannot exclude their antibiotic consumption history. As antibiotic-resistant strains can persist in the human host environment in the absence of selective pressure for a long time (Jernberg et al., 2010), the ARGs we identified cannot be considered intrinsic; they are probably the results of selective pressure conferred by antibiotics that the gut microbes previously encountered and somehow managed to maintain in the gut.

In summary, we constructed a metagenomic library from the gut microbiota of four humans and screened it for ARGs, uncovering diverse new genes, including a new kanamycin resistance gene fusion. This work helps us to further understand the ARG reservoir of the human gut microbiota, and we believe that other new ARGs will be mined from the human gut in the near future. However, to what degree these ARGs in our gut are linked to the potential emergence and dissemination of antimicrobial resistance genes in human pathogens is unclear.

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Fig. S1. Phylogenetic tree of β-lactamases.

Fig. S2. Phylogenetic tree of D-cycloserine resistant D-alanine-D-alanine ligases.


This work was supported in part by the National Basic Research Program of China (973 Program grants 2007CB513002 and 2009CB522605).

