
Isolation of high molecular weight DNA from soil for cloning into BAC vectors

Andrew E. Berry, Claudia Chiocchini, Tina Selby, Margherita Sosio, Elizabeth M.H. Wellington
DOI: http://dx.doi.org/10.1016/S0378-1097(03)00248-9, pp. 15–20. First published online: 1 June 2003


Isolation of high molecular weight DNA fragments from soil, in excess of 1 Mb, and of sufficient quality for cloning into an Escherichia coli–streptomycete artificial chromosome vector is described. The combination of indirect extraction of cells, using a nycodenz extraction technique, followed by lysis of biomass immobilised in agarose plugs, allowed fragments in excess of 1 Mb to be purified.

  • Soil environment
  • DNA isolation
  • Nycodenz gradient
  • Bacterial artificial chromosome cloning
  • Escherichia coli–streptomycete artificial chromosome

1 Introduction

Isolation of high molecular weight (HMW) DNA from soil has facilitated the cloning of DNA isolated from environmental samples into BACs (bacterial artificial chromosomes). This technology has allowed the characterisation of large regions of the genomes of as yet uncultured bacteria, and in one case, where a 16S rDNA gene and a bacteriorhodopsin gene were present on the same insert, phylogenetic affiliation could be linked to a phototrophic lifestyle [1]. This approach has been termed metagenomics and thus far is the only route to study the genomes of ‘unculturable’ bacteria [2–4].

Metagenomics has the inherent potential to uncover and study the biosynthesis of secondary metabolites. This is only made possible because the genes encoding the biosynthesis of secondary metabolites are frequently clustered. Soil microbes represent an important source of bioactive compounds: antibiotics, antitumorals, immunosuppressors and other important bioactive agents used in human therapy and agriculture.

One approach to maximise the probability of finding novel bioactive compounds is to concentrate efforts on groups of bacteria that are known producers of bioactive metabolites, but have not been intensively screened in the past because of difficulties in isolation and cultivation rather than their scarcity in the environment [5]. Actinomycetes fulfil both of these criteria and devote a large part of their genomes to the synthesis of secondary metabolites. The average strain may well have the genetic potential to produce 10–20, or possibly more, secondary metabolites [6–8]. Thus it has become important to develop tools to allow the identification of environmental samples that are rich in actinomycetes [9]. A second consideration is whether the gene clusters will be expressed in a genetically amenable host. To this aim, Escherichia coli–streptomycete artificial chromosome (ESAC) vectors have been developed [10].

The final requirement is a methodology to purify fragments of DNA from soil or sediments in excess of 100 kb. Methods to isolate such fragments directly from sediments and soils are hampered by mechanical shearing, caused by the physical forces imposed on the sample during isolation (e.g. bead beating or ribolysing) to ensure that most cells are lysed. Furthermore, nucleases released during cell lysis must be effectively inhibited to avoid DNA degradation. Even gentle lysis methods do not yield fragments in excess of 100 kb [11,12]. Secondly, direct lysis methods frequently suffer from the co-purification of humic acids, depending on the soil or sediment type, although a recent description of an uncultivated crenarchaeote outlined a methodology to isolate DNA fragments (30–100 kb in size and sufficiently pure for BAC cloning) from a soil rich in humic and fulvic acids [28]. DNA isolated from cells that are first separated from the soil or sediment matrix before lysis (indirect DNA isolation) would allow much larger fragments to be obtained, because the cells could be immobilised in low melting point (LMP) agarose and lysed in situ, thus protecting the DNA from the physical forces that otherwise result in shearing. This approach was successfully used to purify large DNA fragments from oceanic bacterioplankton [13]. The method described, which is a combination of direct cell extraction techniques using a nycodenz gradient [14], followed by lysis of collected biomass in agarose plugs, is, to our knowledge, the first description of extraction of HMW DNA from soil with evidence of fragment sizes in excess of 1 Mb. Furthermore, indirect methods are less likely to purify extracellular DNA from environmental samples.

In the current study the nycodenz extraction technique was quantitatively studied for two soil types: Cryfield soil [15] and Bormio 11709, an Italian alpine forest soil with a high content of humic matter. The cell yields from the nycodenz extractions, and the effects of drying and of reducing the ionic strength of the homogenate upon these yields, were investigated. DNA extracted from a third soil from a forest habitat in Gerenzano (Italy) was used for cloning in ESAC.

2 Materials and methods

2.1 Sample collection

Soil from Cryfield, a site near the University of Warwick, was collected in early June 2000 and the Bormio sample was collected in September 2000. In both cases material from between 0 and 15 cm depth was sampled and passed through a 2-mm mesh. All visible roots were removed. The pH of the Cryfield sample was 6.5 and that of the Bormio sample was 6.3, determined by equilibrating 20 g of the soil sample with 50 ml of deionised water after stirring for 1 h [16].

2.2 Collection of cellular biomass from a soil homogenate via a nycodenz gradient

A 40-g sample of soil was added to 140 ml of disruption buffer (0.2 M NaCl, 50 mM Tris–HCl pH 8.0) in a Waring blender. The suspension was blended on a low speed setting for 3×1-min periods with cooling on ice for 1 min between blending. A 25-ml portion of the soil homogenate was transferred to a 38-ml Beckman ultracentrifuge tube and 9 ml of nycodenz (1.3 g ml−1; Nycomed Pharma AS) was carefully pipetted to form a layer below the homogenate.

The tubes were placed in a Beckman SW28 swing-out rotor and centrifuged at 8 700 rpm (10 000×g) in a Beckman L8 Ultracentrifuge for 20 min at 4°C. A faint whitish band containing bacterial cells was resolved at the interface between the nycodenz and the aqueous layer. This band was transferred into a sterile oakridge tube. Sufficient phosphate buffered saline (PBS), approximately 35 ml, was added to fill the oakridge tubes. The cells were pelleted by centrifugation in a Beckman JA-21 centrifuge at 10 000×g for 10 min. The cell pellet was resuspended in 0.5 ml PBS.
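The relationship between the rotor speed and the relative centrifugal force quoted above can be checked with the standard conversion RCF = 1.118×10⁻⁵ × r × rpm², with r in cm. The short Python sketch below is only an illustration: the function names are hypothetical, and the radius is back-calculated from the two figures given in the text (8 700 rpm and 10 000×g) rather than taken from the rotor specification.

```python
# Standard conversion between rotor speed and relative centrifugal force.
# RCF (x g) = 1.118e-5 * radius (cm) * rpm^2
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_from_rcf(rcf: float, rpm: float) -> float:
    """Radius (cm) at which a given speed produces the stated RCF."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


# Values quoted in the text: 8 700 rpm corresponding to 10 000 x g.
# The radius below is inferred from those two numbers (an assumption,
# not a rotor specification).
implied_radius_cm = radius_from_rcf(10_000, 8_700)
print(f"Implied radius: {implied_radius_cm:.1f} cm")
```

The same function can be used to convert the stated force to a speed setting for a different rotor once its radius is known.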

A small scale extraction method was also tested. One-ml samples of the soil homogenate were transferred to a 2-ml Eppendorf. 650 µl of nycodenz was pipetted beneath the soil homogenate, and the tube centrifuged for 6 min at 10 000×g in an Eppendorf swing-out centrifuge 5417. A white band of cells was visible at the interface. This band was removed and transferred to a fresh 2-ml Eppendorf. The transferred material was washed by adding 10 mM sodium pyrophosphate to fill the Eppendorf, before centrifugation at 10 000×g for 6 min to pellet the collected cells. The cells were then resuspended in 500 µl PBS. For counting purposes the cell suspension was diluted 100-fold in 0.01 M sodium pyrophosphate and 200 µl filtered onto a 25-mm-diameter Amicon filter.

2.3 Collection of cells prior to lysis in LMP agarose plugs

An aliquot of 0.5 ml of cells was mixed with 0.5 ml of 1% (w/v) molten LMP agarose in 0.5×TBE. The mixture was cast into plug moulds of 100 µl. These plugs were stored at 4°C in 50 mM EDTA pH 8.0. Lysis of the immobilised cells was achieved according to the protocol described in [13]. A plug was placed in the well of a 1.2% agarose gel in 0.5×TBE and the well sealed with molten 1.0% (w/v) LMP agarose.

2.4 Cloning into ESAC

2.4.1 Vector preparation

Vector DNA was prepared from pPACS1-containing E. coli using Qiagen maxi columns. Preparation was performed according to [17] with only minor modifications, as follows. pESAC (20 µg) was digested with 20 units of BamHI (Roche) for 2 h. A further 20 units were added and the vector was incubated for a further 2 h. In order to test the extent of BamHI digestion and the integrity of the vector ends, about 10 ng of the vector was self-ligated and used to electroporate 25 µl of ElectroMax DH10B electrocompetent cells. The preparation was dephosphorylated using CIP and resolved by pulsed field gel electrophoresis (PFGE). A 19-kb fragment, the pPACS1 vector with the pUC19-containing region removed by the restriction digestion, was excised and eluted from the gel at 3 V cm−1 for 3 h. The polarity of the current was inverted for 30 s to release DNA from the wall of the dialysis bag. The solution was then dialysed against TE pH 8.0 at 4°C for 2 h, recovered and concentrated using a Y-30 Centricon column (Amicon). To further test the dephosphorylation efficiency, a self-ligation using 50 ng of the treated vector was set up and used for electroporation as previously described.

Since the cloning efficiency is seriously affected by the quality of the vector preparation, it was assessed before proceeding with the ligation step. The integrity of the cloning site and the dephosphorylation efficiency were both examined. Two self-ligations were performed using 50 ng of the final preparation. One reaction was used to transform E. coli DH10B ElectroMax; the other was analysed by electrophoresis in a 0.8% agarose gel.

An absence of amp-R colonies indicated the absence of uncut vector, and the number of kan-R, sac-R, amp-S colonies obtained from the electroporation was used to assess the efficiency of the dephosphorylation step. Electrophoresis was also used to test whether the vector remained linearised after self-ligation. To test the integrity of the cloning site, the same amount of vector was rephosphorylated with kinase, self-ligated and electrophoresed. The profile obtained was comparable with that obtained for uncut vector, indicating that the cloning site had not been damaged during digestion with BamHI. The pESAC preparation (about 40 ng µl−1) was used for all subsequent cloning experiments.

2.4.2 Preparation and ligation of insert DNA

Agarose plugs, containing biomass and treated using the lysis procedure described, were thoroughly washed with sterile water. Each plug was transferred to a sterile 1.5-ml Eppendorf tube and incubated for 4 h at 4°C with 160 µl of buffer M and 4 U of Sau3A. MgCl2 was added to a final concentration of 15 mM, and the digestion was performed at 37°C for 15 min. The reaction was stopped by incubating for 1 h at 37°C with 70 mM EDTA pH 8.0 and 1.5 mM Proteinase K. The plugs were then washed with buffer A (10 mM Tris–HCl, 1 mM EDTA, 50 mM NaCl) prior to size-fractionation. Alternatively, plugs could be stored in 10 mM Tris–HCl, 1 mM EDTA pH 8.0 at 4°C. Partially digested DNA was loaded onto a 1% PFGE agarose gel and DNA markers were applied to the flanking wells. The initial direction of the field drives the DNA towards the upper edge of the gel, about 1 cm away from the wells. The electric field is then reversed so that the remaining DNA migrates back toward the original starting well. A new marker was loaded onto the gel and the DNA was resolved at 6 V cm−1 with 0.1–40 s pulse time. The flanking regions of the gel were removed and stained with EtBr to indicate the location of the size ranges. Gel slices were cut at 0.5-cm intervals, ranging from 50 to 500 kb. The gel slices were stored in 0.5 M EDTA pH 8.0 at 4°C until use. Recovery and concentration of HMW DNA was performed using electroelution.

2.4.3 Ligation conditions

The concentration and average size of partially digested DNA were estimated by electrophoresis and ligations were set up using an insert to vector ratio of 1:10.
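Setting up a ligation at a fixed molar ratio requires converting that ratio into DNA masses, because fragments of different length contribute different numbers of molecules per nanogram. The Python sketch below illustrates the conversion under stated assumptions: the 19-kb vector size and the 1:10 insert to vector ratio are taken from the text, whereas the 50 ng of vector, the 100-kb average insert size and the function name are hypothetical values chosen only for illustration.

```python
def insert_mass_ng(vector_mass_ng: float,
                   vector_kb: float,
                   insert_kb: float,
                   insert_per_vector: float) -> float:
    """Mass of insert DNA (ng) giving the requested molar ratio.

    For double-stranded DNA the number of molecules per unit mass is
    inversely proportional to fragment length, so
    mass_insert = mass_vector * (insert_kb / vector_kb) * (insert : vector).
    """
    return vector_mass_ng * (insert_kb / vector_kb) * insert_per_vector


# From the text: 19-kb vector fragment, insert to vector ratio of 1:10.
# Hypothetical for this example: 50 ng vector, 100-kb average insert.
mass = insert_mass_ng(vector_mass_ng=50, vector_kb=19,
                      insert_kb=100, insert_per_vector=1 / 10)
print(f"Insert DNA required: {mass:.1f} ng")
```

Because the average insert size is itself an estimate from the gel, the resulting mass should be read as a guide rather than an exact requirement.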

2.5 Cell counting

Direct microscopic counting was used to enumerate the total cell population in Cryfield and Bormio soils [18]. The number of cells recovered from the nycodenz extraction technique was determined by first diluting the cells to the appropriate density to result in less than 20 cells per field of view. Fifteen randomly chosen fields of view were counted.
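Converting field-of-view counts into cells per gram of soil involves scaling by the filter area, the dilution and the amount of soil extracted. The Python sketch below shows one way this arithmetic could be arranged; it is not the authors' calculation. The 100-fold dilution, 200 µl filtered volume, 500 µl resuspension volume and fifteen fields are taken from the methods, while the field-of-view area, effective filtration area, soil mass and per-field counts are hypothetical placeholders.

```python
from statistics import mean


def cells_per_gram(field_counts, field_area_mm2, filter_area_mm2,
                   filtered_volume_ml, dilution_factor,
                   suspension_volume_ml, soil_mass_g):
    """Estimate cells per gram of soil from microscopic field counts."""
    # Scale the mean count per field up to the whole filtration area.
    cells_on_filter = mean(field_counts) * filter_area_mm2 / field_area_mm2
    # Cells per ml of the undiluted cell suspension.
    cells_per_ml = cells_on_filter / filtered_volume_ml * dilution_factor
    # Total cells recovered, normalised to the mass of soil extracted.
    return cells_per_ml * suspension_volume_ml / soil_mass_g


# Hypothetical counts for fifteen fields (kept below 20 cells per field).
counts = [10] * 15
estimate = cells_per_gram(
    field_counts=counts,
    field_area_mm2=0.01,       # hypothetical field-of-view area
    filter_area_mm2=100.0,     # hypothetical effective filtration area
    filtered_volume_ml=0.2,    # 200 µl filtered (from the methods)
    dilution_factor=100,       # 100-fold dilution (from the methods)
    suspension_volume_ml=0.5,  # 500 µl resuspension (from the methods)
    soil_mass_g=0.1,           # hypothetical: 1 ml of a 0.1 g ml-1 homogenate
)
print(f"Estimated cells per g of soil: {estimate:.2e}")
```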

3 Results

During the physical separation of the bacterial fraction using a nycodenz cushion a whitish band of microbial biomass was observed at the interface between the nycodenz and aqueous layers (Fig. 1). Both large and small scale methods using air-dried Cryfield soil were found to result in a cell yield, as measured by direct counting, of around 0.9% (Table 1). In both cases the ratio of sample to disruption buffer was 0.1 g ml−1. The small scale extraction method was applied to both fresh and dried Bormio soil using homogenates of 0.1 and 0.01 g ml−1, resulting in yields of c. 16% and 30% respectively, for the fresh sample, and c. 2% and 15% respectively for the dried sample (Table 1).

Figure 1

Resolution of bacterial biomass by a nycodenz gradient. A: Pre-centrifugation, a layer of nycodenz is pipetted below the soil homogenate. B: A photograph and schematic to show the appearance of a large scale nycodenz gradient post-centrifugation. A whitish band containing cell biomass is observed at the interface between the nycodenz and aqueous layers.

Table 1

Cell yield data from nycodenz extraction from Cryfield and Bormio soils

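Soil                  | Scale       | Homogenate (g ml−1) | Total cells g−1          | Yield g−1               | Yield %
----------------------|-------------|---------------------|--------------------------|-------------------------|------------------
Cryfield soil (dried) | Small scale | 0.1                 | 1.86±0.64×10¹⁰ (n=3)     | 1.68±1.20×10⁸ (n=5)     | 0.91±0.44 (n=5)
Cryfield soil (dried) | Large scale | 0.1                 | 1.86±0.64×10¹⁰ (n=3)     | 1.70±1.64×10⁸ (n=5)     | 0.90±0.52 (n=5)
Bormio 11709 (fresh)  | Small scale | 0.1                 | 2.66±1.46×10¹⁰ (n=5)     | 4.34±0.44×10⁹ (n=5)     | 16.3±1.7 (n=3)
Bormio 11709 (fresh)  | Large scale | 0.01                | 2.66±1.46×10¹⁰ (n=5)     | 8.09±3.21×10⁹ (n=5)     | 30.4±12.1 (n=3)
Bormio 11709 (dried)  | Small scale | 0.1                 | 1.09±0.49×10¹⁰ (n=6)     | 1.74±0.67×10⁸ (n=5)     | 1.6±0.61 (n=3)
Bormio 11709 (dried)  | Large scale | 0.01                | 1.09±0.49×10¹⁰ (n=6)     | 1.59±0.88×10⁹ (n=5)     | 14.6±8.7 (n=3)

  • Values are ±1 S.D.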
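The percentage yields in Table 1 follow from dividing the cells recovered per gram by the total cell count per gram for the same soil. The Python sketch below recomputes them from the tabulated means as a consistency check; the function and variable names are illustrative, and the standard deviations are not propagated.

```python
def yield_percent(recovered_per_g: float, total_per_g: float) -> float:
    """Cell yield as a percentage of the total direct count."""
    return 100.0 * recovered_per_g / total_per_g


# Mean values transcribed from Table 1: (label, yield g-1, total cells g-1).
TABLE_1_MEANS = [
    ("Cryfield dried, column 1", 1.68e8, 1.86e10),
    ("Cryfield dried, column 2", 1.70e8, 1.86e10),
    ("Bormio fresh, column 1", 4.34e9, 2.66e10),
    ("Bormio fresh, column 2", 8.09e9, 2.66e10),
    ("Bormio dried, column 1", 1.74e8, 1.09e10),
    ("Bormio dried, column 2", 1.59e9, 1.09e10),
]

for label, recovered, total in TABLE_1_MEANS:
    print(f"{label}: {yield_percent(recovered, total):.2f}%")
```

Because these figures are ratios of means rather than means of per-replicate ratios, they can differ slightly in the second decimal place from the tabulated percentages.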

PFGE analysis of the size of DNA fragments obtained from the bacterial fraction immobilised in agarose plugs and subjected to a lysis procedure revealed an upper band that was in excess of 500 kb, although some smearing down to lower molecular weights, particularly around 50 kb, was observed (Fig. 2B). Further gels showed that the upper bands co-migrated approximately with size markers that were 1.6 and 1.9 Mb in size, suggesting that this upper band contained DNA fragments in excess of 1 Mb (Fig. 2A).

Figure 2

PFGE analysis of biomass-containing plugs subjected to a gentle lysis procedure. A: Lane 1, Sigma pulse markers; lanes 2–4, DNA from air-dried Cryfield soil. B: Lane 1, yeast chromosome PFGE marker (NEB); lane 2, DNA isolated from air-dried Bormio soil; lane 3, chromosomal DNA prepared from Streptomyces coelicolor; lane 4, Lambda PFGE marker (NEB).

DNA isolated from forest soil in Gerenzano (Italy) was cloned into pESAC vector. Initially five different ligations were performed with various molar ratios of vector to insert. The number of clones showing the phenotype expected for transformants containing recombinants (kanR, sacR, ampS) was extremely variable, ranging from 15 to 200 clones per ligation. This corresponded to a transformation frequency of about 3×10²–4×10³ µg−1 of insert DNA. In the first experiment the digested DNA was resolved with a single size-fractionation and the DNA used for the ligation was purified from an agarose slice cut in the range of 50–100 kb. Inserts and vector were ligated using a molar ratio of vector to insert of 1:5. Approximately 50 kanR, sacR, ampS colonies were obtained; 40 were analysed by digestion with DraI, revealing that just 17% contained inserts, with an average size of 30 kb, while the majority (83%) were non-insert clones.
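Transformation frequency is the number of colonies obtained divided by the mass of DNA used, expressed per µg. The Python sketch below illustrates the conversion for the range reported above. The mass of insert DNA per ligation is not stated in the text; the 50 ng used here is an inference from the reported figures (15 to 200 clones corresponding to about 3×10² to 4×10³ µg−1) and should be treated as an assumption, as should the function name.

```python
def transformants_per_ug(colonies: int, insert_dna_ng: float) -> float:
    """Transformation frequency in colonies per µg of insert DNA."""
    return colonies * 1000 / insert_dna_ng


# Assumed, not stated in the text: about 50 ng insert DNA per ligation,
# inferred from the reported colony numbers and frequencies.
ASSUMED_INSERT_NG = 50

for colonies in (15, 200):
    freq = transformants_per_ug(colonies, ASSUMED_INSERT_NG)
    print(f"{colonies} colonies -> {freq:.0f} per µg")
```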

In order to improve the insert size, the second cloning was carried out using DNA from agarose slices cut in the ranges of 200–300 kb and 300–400 kb, and a vector to insert ratio of 1:10 was tested. Lower frequencies of positive clones were obtained and no significant improvement in insert size was achieved.

In the third experiment environmental DNA was resolved by double size-fractionation [17], and the ligations were set up using a vector to insert ratio of 1:10. From each ligation c. 1×10² transformants were obtained, corresponding to a transformation frequency of c. 4×10³ µg−1 of DNA. Around 20–25% of the colonies with the correct phenotype (kanR, sacR, ampS) were analysed by restriction digestion with DraI. A high proportion of kanR, sacR, ampS clones (about 70%) did not carry any insert. The average insert size was 20–30 kb, and a few colonies carrying large inserts (50–70 kb) were also found (Fig. 3). It is remarkable that none of the large inserts contained any DraI sites, suggesting a high frequency of high-GC DNA in the clones. This observation was confirmed by end-sequencing of five ESAC clones containing inserts of c. 50 kb.

Figure 3

Analysis of environmental ESAC clones. DraI restriction profiles of eight randomly picked clones obtained after ligation of DNA isolated from a forest soil in Gerenzano (Italy) with vector pPAC-S1. Lane 1: Lambda pulse markers; lane 2: Sigma PFGE markers; lanes 3–11: pPAC-S1 digested with DraI; lane 12: Sigma pulse markers.

4 Discussion

To separate bacterial cells from the soil matrix a nycodenz gradient method was used. This is an established method for physically separating bacterial cells from soils and sediments by means of their buoyant density. The potential for the nycodenz extraction technique to bias towards particular cellular morphologies and particular phylogenetic groupings exists [19]. This issue has been quantitatively addressed by probing DNA isolated by both direct techniques and via the nycodenz procedure [20].

Two types of soil have been quantitatively examined with respect to yield; the yields obtained for the Bormio soil are comparable to previous measurements, where yields of 20–30% and 30–50% were observed for clay loams and peat soils respectively [14]. However, the observed yield from air-dried Cryfield soil was considerably lower (0.9%; Table 1). This difference is likely to be due to the effect of drying and is predicted by the colloid stability theory. In dried samples stronger attachments are formed because the desiccation process forces the cells into close contact with particles against the electrostatic repulsion barrier, where stronger attachments develop [20–24]. These data suggest that fresh soil samples should be used in preference to dried ones to obtain higher cell yields when performing nycodenz extractions. The yields obtained for dried Cryfield soil and dried Bormio soil are relatively low, 0.9% and 1.6% respectively. Other described methods, such as dispersion followed by shaking in distilled water or dispersion by mild ultracentrifugation, report yields of 10–20% [25–27]. However, previous work in this area, which compared different procedures on the same clay loam, found dispersion by Waring blender to be superior to dispersion by shaking or ultracentrifugation, since it resulted in cell yields between 20% and 30% [27].

We examined the effect of reducing the ionic strength of the homogenate on cell yield. The results for both soils are also consistent with the colloid stability theory: that a reduction in the ionic strength of the soil homogenate, by decreasing the amount of soil per ml of distilled water, results in a significant increase in cell yield by promoting dissociation of the cells from soil particles (Table 1).

Analysis of the size of DNA fragments obtained using this method indicates that a significant and distinct part of the size distribution appears to be in excess of 1 Mb. This is a satisfactory starting size for restriction digestion to obtain fragments suitable for ligation in the range of 100–300 kb for cloning into ESAC. Some clones contained inserts in the region of 70 kb.

In conclusion we have demonstrated a method to extract HMW DNA from soil suitable for cloning into ESAC. It may provide a viable strategy for the analysis and exploitation of secondary metabolite gene clusters originating from ‘unculturable’ soil bacteria with particular emphasis on actinomycetes.


Acknowledgements

We would like to thank Jane Green for her invaluable technical assistance. We are also grateful to Professor Lars Bakken for his advice. C.C. was supported by a FEMS scholarship. The research was supported by the European Union under ACTAPHARM, project number 01783.


  1. [1].
  2. [2].
  3. [3].
  4. [4].
  5. [5].
  6. [6].
  7. [7].
  8. [8].
  9. [9].
  10. [10].
  11. [11].
  12. [12].
  13. [13].
  14. [14].
  15. [15].
  16. [16].
  17. [17].
  18. [18].
  19. [19].
  20. [20].
  21. [21].
  22. [22].
  23. [23].
  24. [24].
  25. [25].
  26. [26].
  27. [27].
  28. [28].