
A genome sequence survey of the mollicute corn stunt spiroplasma Spiroplasma kunkelii

Xiaodong Bai, Saskia A. Hogenhout
DOI: http://dx.doi.org/10.1111/j.1574-6968.2002.tb11153.x. Pages 7-17. First published online: 1 April 2002


The mollicute corn stunt spiroplasma (Spiroplasma kunkelii) is a leafhopper-transmitted pathogen of maize. Sequencing of the ∼1.6-Mb genome of S. kunkelii was initiated to aid understanding of the genetic basis of spiroplasma interactions with their plant and leafhopper hosts. In total, 144 712 nucleotides of non-redundant, high-quality S. kunkelii genome sequence were obtained. Sequence tags were searched against the Mycoplasmataceae and Bacillus/Clostridium databases. Results showed that, in addition to spiroplasma phage SpV1 DNA insertions, spiroplasma genomes harbor more purine and amino acid biosynthesis, transcription regulation, cell envelope and DNA transport/binding genes than Mycoplasmataceae genomes. This investigation demonstrates that survey sequencing is an efficient procedure for gene discovery and genome characterization. The results of the S. kunkelii sequencing project are available at the Spiroplasma WebPage at http://www.oardc.ohio-state.edu/spiroplasma/genome.htm.

Key words
  • Genome sequence
  • Leafhopper
  • Mollicute
  • Gene loss
  • Evolution
  • Gram-positive bacterium
  • Plant pathogen
  • Spiroplasma kunkelii

1 Introduction

The mollicute Spiroplasma kunkelii is a member of the family Spiroplasmataceae within the order Mycoplasmatales. Spiroplasmas are primarily associated with insects and plants in epiphytic, symbiotic or pathogenic interactions. Three spiroplasma species evolved as plant pathogens: the citrus stubborn spiroplasma Spiroplasma citri, the corn stunt spiroplasma (CSS) S. kunkelii, and the periwinkle yellowing spiroplasma Spiroplasma phoeniceum. Plant-pathogenic spiroplasmas are restricted to the sieve tubes of their plant hosts and are transmitted from plant to plant by phloem-feeding leafhoppers in a persistent propagative manner [1–3]. CSS is one of the most important threats to maize. Typical symptoms of CSS infection include chlorosis, stunted plants with reduced internode length and proliferation of ears that do not mature [1].

Mollicutes are thought to have diverged from a Gram-positive Clostridium-like ancestor and differ phenotypically from other bacteria in their minute size (0.3–0.5 μm) and lack of a cell wall [4,5]. Genomes of Mollicutes are smaller than those of most other prokaryotes as a result of degenerative or reductive evolution. However, gene loss in spiroplasmas was not as extensive as in other members of the class Mollicutes [4,6]. Interestingly, the spiroplasma morphology differs from that of other Mollicutes. All members within the genus Spiroplasma have pleomorphic shapes varying from spherical or slightly ovoid, 100–250 nm, to helical fragments that are about 120 nm in diameter and 2–4 μm long during active growth and up to 15 μm in later stages of growth, whereas members of other mollicute genera typically have a spheroidal to ovoid shape and commonly do not have a helical elongated stage in their life cycle.

The genomes of Mycoplasma genitalium, Mycoplasma pneumoniae, Ureaplasma urealyticum and Mycoplasma pulmonis of the family Mycoplasmataceae within the order Mycoplasmatales have been sequenced to completion [7–10], and close to seven full genome sequences from the Bacillus/Clostridium group, the most closely related walled bacteria to the class Mollicutes, are available as well. The sample-sequencing project of the S. kunkelii genome and subsequent comparison of S. kunkelii sequence data with Mycoplasmataceae species and Bacillus/Clostridium sequence databases as described herein revealed interesting differences in gene content between S. kunkelii and members of the Mycoplasmataceae.

2 Materials and methods

2.1 Selection of the S. kunkelii strain

The S. kunkelii strain CSS-M was selected for genome sequencing. The strain was originally isolated from infected corn plants in Tlaltizapan, Mexico in 1992[11]. It has been maintained at the Ohio Agricultural Research and Development Center (OARDC) by serial transfers with the CSS leafhopper vector (Dalbulus maidis) as described in Ebbert and Nault[12]. CSS-M was isolated from infected corn stems, propagated in liquid LD8A3 medium, and plated onto LD8A3 agar plates[13]. A culture derived from a single colony was used for genomic DNA isolation. The S. kunkelii clone was transmitted to maize seedlings (Zea mays L. ‘Early Sunglow’) by D. maidis[1], indicating the clone kept its characteristics of leafhopper transmission and pathogenesis of plants.

2.2 Construction of S. kunkelii genomic DNA libraries

Genomic DNA was isolated from S. kunkelii using the Qiagen (Valencia, CA, USA) Whole Genomic DNA isolation kit following the manufacturer's procedures. The isolated genomic DNA was used for library construction. The DNA was digested to completion with EcoRI or HindIII, ligated into appropriately digested, phosphatase-treated pUC18, and transformed into electrocompetent XLblue Escherichia coli (Stratagene, La Jolla, CA, USA) cells. For construction of a random sheared DNA library, DNA was fragmented with the Hydroshear™ (GeneMachines) into pieces with a distribution centered on 1.5 kb. Blunt-ended DNA was then ligated into pPCR-Script Amp Sk(+) plasmid (Stratagene), and plasmid DNA was introduced into chemically competent XL10-Gold Kan E. coli following the manufacturer's procedures (Stratagene). Insert-carrying plasmids were identified in transformants by detecting white colonies after growth on X-Gal/IPTG [14].
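The effect of a complete restriction digest on fragment sizes can be illustrated in silico. The following is a minimal sketch, not part of the original protocol: it assumes a linear top-strand sequence (the real genome is circular), uses the standard recognition sites of EcoRI (G^AATTC) and HindIII (A^AGCTT), and the function name `digest` and the example sequence are hypothetical.

```python
# Recognition site and cut offset (position of the cut within the site)
# for the two enzymes used for library construction.
SITES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def digest(seq, enzyme):
    """Return the fragments produced by a complete digest of a linear sequence."""
    site, offset = SITES[enzyme]
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    # Fragment boundaries: start of sequence, every cut position, end of sequence.
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


example = "AAGAATTCTTTAAGCTTGG"  # toy sequence, not S. kunkelii data
eco_fragments = digest(example, "EcoRI")
hind_fragments = digest(example, "HindIII")
```

Because fragment size in such a digest depends entirely on where the sites happen to fall, restriction libraries give a wide insert size range, whereas sheared libraries are centered on a chosen size.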

2.3 Sequencing and sequence analysis

Colonies were grown overnight at 37°C in single wells of 96-well microtiter plates containing 150 μl LB freeze (4 mM MgSO4, 360 mM K2HPO4, 132 mM KH2PO4, 17 mM Na-citrate, 68 mM (NH4)2SO4, 4.4% glycerol in LB, pH 7.0) and 100 μg ml⁻¹ ampicillin and transferred to LB agar plates containing 100 μg ml⁻¹ ampicillin after 18 h using a 96-well plate replicator. The inoculated agar plates were sent to MWG-Biotech (High Point, NC, USA) for one-pass sequencing of the inserts using the M13 forward and reverse primers for pUC18, and T7 and T3 primers for pPCR-Script Amp Sk(+) plasmids on an ABI377 automatic sequencer. Trace files were analyzed with the PHRED and CROSS_MATCH algorithms of MacPhred/Phrap [15,16] to translate the ABI377 chromatogram data of the sequence files into accurate quality information for each base call and detection of plasmid sequences, respectively. Plasmid sequences were removed from each sequence tag and high quality sequence data (>20 phred score) were collected into a database and searched against the non-redundant (nr) database at National Center for Biotechnology Information (NCBI) using nucleotide–nucleotide BLAST (blastn) or the translating BLAST (blastx) algorithms [17]. To screen for redundant sequence tags, the S. kunkelii sequence database was also searched against itself with the blastn algorithm. Nucleotide sequences with significant similarities (E-value ≤10⁻⁵) to sequences in the NCBI database were collected, translated into proteins and searched against the full non-redundant protein database and non-redundant databases of Mycoplasmataceae and Bacillus/Clostridium at NCBI with the protein–protein BLAST (blastp) algorithm. All sequence analyses were performed on local Linux workstations.
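To make the quality cut-off concrete: a phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so a score of 20 means roughly one error per 100 bases. The sketch below illustrates this relationship together with one possible trimming rule (keep the longest run of bases scoring above 20). The trimming rule and the function names are assumptions for illustration only; the text does not state how PHRED output was trimmed.

```python
def phred_to_error_prob(q):
    """Estimated probability that a base call with phred score q is wrong."""
    return 10 ** (-q / 10)


def longest_high_quality_run(quals, threshold=20):
    """Return (start, end) of the longest run of bases with quality > threshold.

    The end index is exclusive. This trimming rule is an illustrative
    assumption, not the procedure reported in the paper.
    """
    best = (0, 0)
    start = None
    # A sentinel below any real score closes a run that reaches the read end.
    for i, q in enumerate(list(quals) + [float("-inf")]):
        if q > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


toy_quals = [10, 25, 30, 22, 15, 40, 41]  # made-up scores, not real trace data
run = longest_high_quality_run(toy_quals)
```

With the toy scores above, the retained window covers the three bases at positions 1 to 3.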

3 Results and discussion

3.1 Library construction, sequencing and sequence analysis

To confirm the identity of the isolated DNA, the spiralin gene was amplified using primers described by Foissac et al. [18]. The nucleotide sequence of the spiralin gene amplification product was identical to that reported earlier [18], thus confirming the identity of the CSS-M clone of S. kunkelii (data not shown). Insert sizes of clones from the shotgun library ranged from 0.5 to 4 kb and clones from the EcoRI or HindIII libraries contained fragment sizes ranging from 150 bp to 10 kb.

In total, 94 inserts from the EcoRI and HindIII libraries, and 188 inserts from the sheared DNA library were sequenced from flanking primer sites, after which the sequences were collected into a database. Low quality and cloning vector sequences were removed from the database. The database was then searched against itself with the blastn algorithm to analyze redundancy. Mollicute genome sequences show the presence of highly repeated regions and spiroplasma genomes have many copies of spiroplasma phage SpV1 DNA [19]. Therefore, redundant sequence tags were not assembled into contigs; instead, within each set of redundant clones, the forward and the reverse sequence tag with the best phred quality scores were kept in the database and the others were removed. This resulted in a database of 144 712 nucleotides (396 sequence tags) of non-redundant high-quality (>20 phred score) S. kunkelii genome sequences representing 9% of the S. kunkelii genome, based on an estimated genome size of 1600 kb [6]. All sequences were deposited in the random single pass read genome survey sequence database (dbGSS) of GenBank (accession Nos. BH234783 to BH235178).
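The redundancy filter and the coverage estimate described above can be sketched as follows. This is an illustration under stated assumptions: the grouping of tags into redundant sets is taken as given (in the paper it came from the blastn self-search), the tuple layout and the function names are hypothetical, and the example tags are invented.

```python
def keep_best_per_group(tags):
    """Keep the highest-quality tag per (redundancy group, read direction).

    tags: iterable of (tag_id, group_id, direction, mean_quality) tuples.
    Returns the retained tag IDs in sorted order.
    """
    best = {}
    for tag_id, group_id, direction, quality in tags:
        key = (group_id, direction)
        if key not in best or quality > best[key][1]:
            best[key] = (tag_id, quality)
    return sorted(tag_id for tag_id, _ in best.values())


def genome_fraction(sequenced_nt, genome_size_nt):
    """Fraction of the genome represented by non-redundant sequence."""
    return sequenced_nt / genome_size_nt


# Hypothetical tags: two forward reads and one reverse read in group 1,
# plus one forward read in group 2.
toy_tags = [
    ("tagA.x", 1, "fwd", 31.0),
    ("tagB.x", 1, "fwd", 38.5),
    ("tagB.y", 1, "rev", 29.0),
    ("tagC.x", 2, "fwd", 25.0),
]
kept = keep_best_per_group(toy_tags)

# Figures from the text: 144 712 nt against an estimated 1600-kb genome.
coverage_percent = 100 * genome_fraction(144_712, 1_600_000)
```

Using the reported figures, the computed coverage is about 9.0%, matching the 9% stated in the text.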

The 396 sequence tags were searched against the complete NCBI nr database with the blastn and blastx algorithms. The overall percentage of sequence tags with significant similarity (E-value ≤10⁻⁵) to open reading frames (ORFs) in the NCBI nr database was ∼40% (150/396), which is in agreement with previous findings that biological functions can be assigned to ∼50% of the ORFs in completed genome sequencing projects [20].
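The significance filter amounts to counting the tags whose best hit meets the E-value cut-off. A minimal sketch, assuming the best E-value per tag has already been extracted from the BLAST reports into a dictionary (the dictionary, tag names and function name are hypothetical):

```python
def significant_fraction(best_evalues, total_tags, cutoff=1e-5):
    """Count tags whose best E-value is at or below the cutoff.

    best_evalues: mapping of tag ID to the best (lowest) E-value found;
    tags without any hit are simply absent from the mapping.
    Returns (number of significant tags, fraction of all tags).
    """
    n_significant = sum(1 for e in best_evalues.values() if e <= cutoff)
    return n_significant, n_significant / total_tags


# Invented example: four tags in total, three with hits, two significant.
toy_hits = {"tag1": 2e-19, "tag2": 0.008, "tag3": 1e-5}
n_sig, frac = significant_fraction(toy_hits, total_tags=4)

# Proportions reported in the text, recomputed from the raw counts.
orf_percent = 100 * 150 / 396    # tags with significant similarity to ORFs
phage_percent = 100 * 17 / 396   # tags similar to SpV1 DNA
```

Recomputed from the counts, 150/396 is about 37.9% and 17/396 is about 4.3%, which the text rounds to ∼40% and ∼5%.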

3.2 DNA phage sequences

Unlike the Mycoplasmataceae, spiroplasma genomes harbor many spiroplasma phage SpV1 DNA insertions [7–10,19,21,22]. In this survey ∼5% (17/396) of the S. kunkelii sequence tag database had significant similarity to spiroplasma virus SpV1 DNA (Table 1). The percentage of phage sequences in the S. kunkelii sequence tag database is comparable to the 7% DNA phage sequences found in the genome of the Gram-negative leafhopper-transmitted vascular plant pathogen, Xylella fastidiosa [20].

Table 1

Sequence tags with significant similarity (E-value ≤10⁻⁵) to spiroplasma virus SpV1 and S. citri putative virulence proteins

Identity | Sequence tag ID | Acc. No. of best hit | E-value
Spiroplasma virus SpV1 ORFs
ORF1, capsid protein | MSAC_D10.y | 9626113 | 2e-19
ORF3, transposase gene | MEAA_B07.y | 1143021 | 2e-39
S. citri putative virulence proteins
  • Deduced protein sequences were searched against the non-redundant database at NCBI. Identity and sequence tag identity (ID) are indicated and for each sequence tag the accession number and E-value of entry with highest similarity are listed.

3.3 Spiroplasma-specific sequences

In total, 133 sequence tags had significant similarities to prokaryotic and/or eukaryotic protein sequences in the NCBI nr database. Included were four sequences unique to spiroplasmas with similarity to putative S. citri virulence genes encoding P123, P58, P54, or P18 (Table 1) [23,24]. These genes are part of a 9.5-kb S. citri genome segment that is deleted from a non-transmissible line of S. citri.

3.4 Comparative genome analysis

As a preliminary assessment of the extent to which the S. kunkelii genome content differs from those of Mycoplasmataceae species, sequence tags were translated into proteins to ensure that the sequences were part of ORFs and, subsequently, the protein sequences were searched against the Mycoplasmataceae, Bacillus/Clostridium and complete nr protein databases of GenBank (Table 2). The Mycoplasmataceae database was selected because it contains the full genome sequences of three mycoplasma and one ureaplasma species [7–10], whereas the Bacillus/Clostridium database was selected because it contains many completed genome sequences and Bacillus/Clostridium species are thought to be the closest walled relatives of Mollicutes [4,25]. Interesting gene content differences among S. kunkelii, and Mycoplasmataceae and Bacillus/Clostridium species are discussed below.

Table 2

Sequence tags with significant similarities (E-value ≤10⁻⁵) to GenBank nr protein sequences

Sequence tag ID | Identity | Sequence length (aa) | Best entry Mycoplasmataceae, accession No. (E-value) | Best entry Bacillus/Clostridium, accession No. (E-value) | Best entry GenBank organism, accession No. (E-value)
Amino acid biosynthesis
MEAA_A03.yThymidylate kinase9614089465 (5e-08)2632295 (3e-12)B. subtilis, 16077096 (2e-10)
MSAC_C05.yFolylpolyglutamate synthase/dihydrofolate synthetase (folC)1274930039 (3e-09)Str. pneumoniae, 15900133 (4e-09)
MSAD_B10.xMethionine aminopeptidase (MAP) (peptidase M)17914089981 (3e-37)11131429 (3e-41)B. halodurans, 15612719 (1e-35)
PE_10.xSerine hydroxymethyl transferase (glyA)2391673936 (2e-78)12723496 (5e-84)Str. pneumoniae, 15902972 (2e-64)
Cell envelope
MHAA_E07.xCell shape determining protein (MreB-like protein)17210176363 (2e-33)B. halodurans, 15616301 (4e-07)
PH_05.xGcpE protein1021730252 (3e-26)B. subtilis, 16079569 (1e-27)
Fatty acid and phospholipid metabolism
MEAA_D03.xProbable N-acetylglucosamine-6-phosphate deacetylase7314089782 (1e-06)A69664 (4e-08)St. aureus, 15893481 (9e-06)
MHAA_C12.x1-Acyl-sn-glycerol-3-phosphate acyltransferase11213508038 (2e-10)10174252 (0.008)M. pulmonis, 15828585 (2e-6)
MHAA_H07.xorfa6514089575 (1e-09)2633961 (8e-12)S. citri, 1143008 (3e-29)
MSAD_G09.y1-Acyl-sn-glycerol-3-phosphate acyltransferase11713508038 (5e-22)2633289 (3e-05)M. pneumoniae, 13508038 (6e-20)
Cellular processes
MEAA_C12.yDnaK protein (hsp 70)1898920287 (2e-51)P45554 (5e-58)E. rhusiopathiae, 1169374 (4e-58)
MHAA_B03.xGTP-binding membrane protein (LepA)806899301 (2e-17)12724067 (1e-18)La. lactis, 15673090 (2e-05)
Energy metabolism
MEAA_B07.xDihydrolipoamide dehydrogenase431674136 (1e-10)12722900 (3e-07)Pseudomonas putida, 1706442 (7e-16)
MEAA_C10.xPhosphomannomutase (PMM)1331352196 (3e-19)C69835 (8e-29)B. halodurans, 15613669 (8e-19)
MHAA_E06.yGlycerol-3-phosphate dehydrogenase16714089537 (2e-15)1146220 (8e-27)St. aureus, 15924464 (2e-24)
MHAA_A06.yATP synthase β chain21314089679 (1e-94)10176378 (1e-86)M. pulmonis, 15828737 (1e-76)
MHAA_D08.yFructose-biphosphate aldolase10712044873 (2e-23)10944298 (1e-21)Cl. acetobutylicum, 15894114 (2e-25)
MHAA_F07.yTransketolase6314089925 (6e-08)7328298 (6e-06)M. pulmonis, 14089925 (7e-08)
MHAA_G06.xPyruvate kinase8214089653 (1e-07)nsS. citri, 2384686 (3e-17)
MSAC_G02.yATP synthase B chain precursor882146068 (1e-11)12061042 (0.001)M. pneumoniae, 13508341 (1e-04)
MSAD_B06.xPhosphopyruvate hydratase/enolase5514089932 (2e-17)8670811 (8e-16)A. aeolicus, 6015091 (1e-14)
Purines, pyrimidines, nucleosides, and nucleotides
MEAA_B12.xAdenylosuccinate lyase188WZBSDS (2e-37)Lactobacillus sakei, 15217116 (1e-32)
MEAA_C08.xAdenylosuccinate synthetase12310176653 (1e-30)S. citri, 1709937 (1e-49)
MEAA_D08.yDeoxyguanosine kinase131586859 (1e-08)M. mycoides, 16040925 (7e-36)
MHAA_B06.xThymidine kinase10614089558 (4e-16)2636243 (4e-26)B. subtilis, 16080759 (9e-21)
MHAA_C05.yCytidine deaminase113D53312 (2e-24)10173982 (3e-19)M. pirum, 1345713 (3e-22)
MHAA_H10.yDeoxyguanosine kinase1444033719 (3e-11)M. mycoides, 16040925 (2e-32)
MSAC_D04.yAdenine phosphoribosyltransferase14914089736 (2e-29)12723536 (9e-33)T. maritima, 15644136 (2e-33)
MSAC_F08.xGMP synthetase (glutamine amidotransferase)893483135 (2e-50)X. fastidiosa, 15839020 (6e-56)
Regulatory functions
MHAA_A11.yRNA polymerase σ factor (rpoD)11412045103 (2e-06)O66381 (1e-07)Cl. acetobutylicum, 15894582 (3e-06)
MSAC_A08.xTranscriptional regulator involved in nitrogen regulation (NifR3 family)12110172709 (1e-29)B. halodurans, 15612660 (3e-28)
MSAD_H07.yPredicted transcription regulator SinR8010174744 (3e-06)Cl. acetobutylicum, 15894128 (3e-19)
MEAA_B05.yDNA-directed DNA polymerase I193A32949 (3e-35)Str. pyogenes, 15674390 (2e-32)
MEAA_F12.xDNA gyrase subunit B10414089786 (4e-13)2558946 (4e-16)M. capricolum, 17008093 (2e-19)
MHAA_G05.yChain A, helicase product complex17114090113 (7e-11)2781090 (4e-17)G. stearothermophilus, 9257172 (1e-18)
MHAA_H04.xDNA-directed RNA polymerase β subunit96600226 (5e-21)12724825 (3e-19)S. citri, 1350848 (3e-46)
MSAC_A09.xParA family protein14112045330 (2e-06)9968459 (4e-09)S. citri, 10432498 (7e-16)
MSAC_B02.yGlucose-inhibited division protein A7814089666 (2e-22)P25812 (7e-18)M. pulmonis, 14089666 (3e-20)
MSAC_C06.yCell division protein FtsH17414090194 (2e-43)S66099 (4e-42)M. pulmonis, 15829250 (9e-44)
MSAD_B12.yDNA primase9613508092 (1e-15)664755 (6e-25)L. innocua, 16800560 (4e-24)
MSAD_G04.yATP-dependent helicase PcrA11214090183 (4e-15)P56255 (6e-16)My. tuberculosis, 15840373, (2e-12)
MHAA_E11.xTranscription antitermination factor (nusG)11214089595 (1e-06)O08386 (2e-19)L. monocytogenes, 16802292 (2e-07)
MHAA_H05.yPolynucleotide phosphorylase (PNPase)2061184680 (3e-82)B. subtilis, 16078732 (4e-71)
MSAC_H02.yDNA-directed RNA polymerase α chain1146601578 (5e-22)12725120 (1e-18)M. capricolum, 629301 (1e-28)
MSAD_B09.xTranscription antitermination protein NusG68ns12725158 (5e-09)Str. coelicolor, 1709420 (2e-06)
MEAA_B02.xATP-dependent protease (lon-protease)1221674198 (8e-14)B42375 (1e-15)V. cholerae, 15641922 (1e-13)
MEAA_B09.x50S ribosomal protein L213814089744 (8e-05)12724034 (3e-08)Str. pyrogenes, 15674860 (4e-08)
MEAA_B09.yValine-tRNA ligase1211351181 (3e-20)10175660 (9e-28)St. aureus, 15927242 (2e-21)
MEAA_C04.yCysteinyl tRNA synthetase761351147 (5e-12)12724882 (6e-12)Cl. stricklandii, 6899996 (4e-10)
MEAA_C06.x50S ribosomal protein L34014090004 (4e-08)P42920 (2e-05)M. capricolum, 132957 (2e-06)
MEAA_D03.y50S ribosomal protein L210914090000 (7e-36)P04257 (3e-36)M. capricolum, 71083 (4e-20)
MEAA_D09.xTranslation initiation factor 2 (infB)1082497279 (5e-40)10175033 (3e-35)M. genitalium, 12044994 (7e-49)
MEAA_G12.yProlyl-tRNA synthetase21714089596 (7e-63)13633967 (5e-64)B. burgdorferi, 15594747 (1e-50)
MHAA_A08.yTranslation elongation factor G (EF-G)18014089842 (5e-78)10172743 (3e-92)B. halodurans, 15612694 (2e-80)
MHAA_A11.xHypothetical protein similar to O-sialoglycoprotein endopeptidase6814089531 (2e-18)1945110 (1e-19)St. aureus, 15927624 (8e-15)
MHAA_B08.y50S ribosomal protein L4382766504 (9e-36)S24364 (3e-52)M. capricolum, 132981 (2e-45)
MHAA_C07.xAsparaginyl-tRNA synthetase11914090186 (1e-28)12724857 (4e-14)Cl. acetobutylicum, 15896505 (1e-32)
MHAA_D07.ySeryl-tRNA synthetase901361847 (8e-26)12724729 (9e-28)A. aeolicus, 15605830 (7e-24)
MHAA_C09.x50S ribosomal protein L5883844757 (2e-29)4512416 (3e-33)B. halodurans, 15612709 (5e-27)
MHAA_C09.y30S ribosomal protein S3593914904 (1e-04)nsS. citri, O31161 (7e-26)
MHAA_C11.x30S ribosomal protein S88514089988 (4e-17)P56209 (7e-17)M. capricolum, 134021 (2e-23)
MHAA_C11.y50S ribosomal protein L17 (fragment)11914089975 (8e-27)P07843 (3e-31)M. capricolum, 7674204 (1e-38)
MHAA_D12.y50S ribosomal protein L196614089881 (1e-14)10175098 (6e-20)B. halodurans, 15615041 (1e-18)
MHAA_E03.yIsoleucyl-tRNA synthetase10114090082 (3e-12)437916 (2e-23)St. aureus, 1174521 (1e-24)
MHAA_E05.yThreonyl-tRNA synthetase13213508292 (1e-38)143766 (8e-42)U. urealyticum, 13358098 (5e-45)
MSAC_A11.xGlutamyl-tRNA synthetase13113508417 (4e-16)289282 (4e-20)B. subtilis, 16077160 (2e-15)
MSAC_B10.xPhenylalanyl-tRNA synthetase β chain175ns40054 (3e-06)C. pneumoniae, BAA98801.1 (4e-10)
MSAC_C04.yIsoleucyl-tRNA synthetase18014090082 (2e-32)10175165 (3e-41)La. sakei, 15487790 (5e-27)
MSAC_H02.yDNA-directed DNA polymerase (α chain)1076601579 (2e-25)10172773 (2e-30)M. capricolum, 629301 (1e-28)
MSAD_B06.yTryptophanyl-tRNA synthetase14514090160 (7e-26)10175491 (2e-34)B. halodurans, 15615433 (1e-28)
MSAD_B12.xGlycyl-tRNA synthetase2016899491 (1e-63)4584090 (1e-30)St. aureus, 15924555 (1e-55)
MSAD_E10.xHeat shock protein GroEL8312045254 (4e-23)12723267 (5e-35)En. faecalis, 15625350 (1e-33)
MSAD_H08.xRibosomal large subunit pseudouridine synthase B9514089751 (7e-04)410137 (1e-15)B. subtilis, 466190 (1e-31)
PE_05.xHistidyl-tRNA synthetase22012044885 (7e-29)3915057 (4e-45)L. innocua, 16800623 (2e-07)
PE_14.yPeptide chain release factor 1 (RF-I)1441350577 (5e-45)S55437 (2e-44)M. capricolum, 2500137 (6e-55)
PE_21.y50S ribosomal protein L219514090000 (5e-66)P04257 (5e-72)M. capricolum, 71083 (7e-77)
PH_04.xGlycyl-tRNA synthetase6814089865 (1e-17)4584090 (2e-24)B. cereus, 4584090 (2e-17)
PS_02.y30S ribosomal protein S178514089993 (4e-22)P23828 (7e-31)S. citri, 3122807 (3e-39)
Transport and binding proteins
MEAA_F10.xPhosphotransferase EII (PTS system)13814089430 (8e-14)2633144 (6e-15)M. capricolum, 530422 (9e-15)
MEAA_G05.yPhosphate ABC transporter, permease protein1771361743 (4e-25)4530449 (1e-30)V. cholerae, 15600843 (1e-23)
MEAA_H04.xABC transporter1682146659 (1e-51)12723139 (8e-46)U. urealyticum, 13358103 (2e-47)
MSAC_A07.yMethylgalactosidase permease ATP-binding protein2294914644 (3e-47)12724309 (6e-47)U. urealyticum, 13357571 (2e-36)
MSAC_A08.yABC transporter, ATP-binding protein7012044917 (3e-15)12724060 (1e-12)Cl. acetobutylicum, 15894109 (2e-13)
MSAC_C02.xHighly similar to phosphotransferase system (PTS) fructose-specific enzyme IIABC component871045736 (1e-05)2633811 (4e-09)L. innocua, 16801491 (5e-07)
MSAC_D03.yABC transporter, ATP-binding protein11514089609 (2e-11)10173618 (3e-21)T. maritima, 15643786 (5e-19)
MSAC_D06.xHighly similar to Mg(2+) transport ATPase14014089568 (2e-15)12724231 (8e-31)L. monocytogenes, 16804726 (5e-20)
MSAC_F10.ySimilar to ABC transporter (ATP-binding protein)13114090034 (7e-36)D70009 (5e-39)B. subtilis, 16080207 (2e-38)
MSAD_A04.ySimilar to ABC transporter ATP-binding protein – oligopeptide transport10814089827 (9e-17)S11153 (5e-31)Str. pneumoniae, 15903745 (3e-30)
MSAD_C02.yPhosphotransferase system, glucose-specific IIABC component11114089430 (4e-32)66867 (2e-21)M. pulmonis, 14089430 (4e-32)
MSAD_D03.xOligopeptide permease (ATP-binding protein)5013507956 (8e-07)1420862 (2e-10)Str. pyogenes, 15674468 (6e-09)
MSAD_D12.yTransfer complex protein trsK protein (traK)756470167 (1e-09)B. anthracis, 6470167 (1e-07)
MSAD_E06.yCation-transporting P-ATPase12514089568 (3e-05)12724231 (4e-21)La. lactis, 15673239 (3e-16)
MSAD_F05.xPhosphate ABC transporter, permease protein9613508349 (7e-14)4530449 (2e-20)Str. pneumoniae, 15901902 (8e-19)
Other categories
MEAA_A02.xAmidase622146059 (1e-06)nsM. capricolum, 530426 (6e-10)
MEAA_A03.xConserved hypothetical16514090195 (1e-08)467456 (6e-04)U. urealyticum, 13357633 (2e-10)
MEAA_A06.ySpoE family protein/cell division protein141S09411 (2e-31)Str. pneumoniae, 15900761 (2e-33)
MEAA_B03.yProbable GTP-binding protein5613508214 (5e-12)1146219 (9e-17)B. subtilis, 1730915 (6e-15)
MEAA_D11.xNitrogen fixation protein NifU6910176042 (6e-13)B. halodurans, 15615981 (7e-11)
MEAA_D12.yPredicted SAM-dependent methyltransferase1451045939 (7e-10)12724027 (1e-16)Cl. acetobutylicum, 12724027 (1e-16)
MEAA_E07.xRNA-binding Sun protein1002633946 (4e-08)B. subtilis, 16078637 (4e-04)
MEAA_E08.yConserved hypothetical protein9810173873 (1e-21)St. aureus, 10173873 (1e-21)
MEAA_E09.xHypothetical protein112Chl. tepidum, 10039641 (2e-24)
MEAA_E12.x199 aa long conserved hypothetical protein125S73881 (3e-05)P54501 (1e-08)B. subtilis, P54501 (1e-08)
MEAA_F10.ySimilar to putative phosphoprotein phosphatase567109691 (5e-06)10175125 (1e-06)L. monocytogenes, 10175125 (1e-06)
MHAA_A09.xProbable thiol peroxidase7714090123 (6e-09)P72500 (9e-12)Str. pneumoniae, 15901486 (1e-09)
MHAA_A10.yHypothetical protein1521674179 (7e-10)2634923 (7e-13)A. aeolicus, 7451802 (2e-13)
MHAA_B12.xConserved GTP-binding protein12414089767 (2e-09)12724592 (2e-08)L. monocytogenes, 14089767 (2e-09)
MHAA_B12.ytRNA δ (2) isopentenylpyrophosphate transferase7513701103 (3e-11)St. aureus, 15924294 (2e-11)
MHAA_C06.yP115-like (Mycoplasma hyorhinis) ABC transporter ATP-binding protein16814090129 (8e-52)10175107 (2e-46)M. pulmonis, 14090129 (8e-52)
MHAA_D09.xProbable thiol peroxidase56P31307 (5e-07)Cl. acetobutylicum, 15896549 (5e-07)
MHAA_E04.xAcyl carrier protein phosphodiesterase (ACP phosphodiesterase)11814089726 (2e-39)2619052 (6e-09)M. pulmonis, 14089726 (2e-39)
MHAA_F06.yPartitioning or sporulation protein (ParA) (soj protein)109ns9968459 (3e-12)L. monocytogenes, 9968459, (3e-12)
MHAA_F12.xConserved hypothetical protein8714089574 (3e-09)12723043 (2e-10)Str. pyogenes, 12723043 (2e-10)
MHAA_G03.yProbable type I restriction enzyme restriction chain13414090092 (7e-06)13700111 (2e-24)St. aureus, 15923185 (8e-27)
MHAA_G08.yExodeoxyribonuclease V (α subunit)8414090197 (2e-07)2635193 (4e-12)C. pneumoniae, 15835659 (9e-13)
MHAA_H02.xConserved hypothetical protein992635763 (6e-29)St. aureus, 15923836 (2e-26)
MSAC_A06.xConserved hypothetical protein19512724713 (3e-34)Y. pestis, 16121243 (2e-29)
MSAC_B11.yHypothetical protein1403845056 (6e-16)12724031 (2e-19)La. lactis, 12724031 (2e-19)
MSAC_C09.yConserved hypothetical protein17413508006 (5e-11)13027335 (5e-21)St. aureus, 13027335, (5e-21)
MSAC_E03.xConserved hypothetical protein437429432 (3e-05)Synechocystis sp. PCC 680 7444728, (4e-06)
MSAC_H01.xConserved hypothetical protein128150165 (5e-09)10175107 (3e-10)Str. pneumoniae, 10175107 (3e-10)
MSAD_E03.xNitroreductase1517432647 (2e-05)Ca. jejuni, 15792391 (2e-08)
MSAD_F02.yHypothetical protein133P75273 (2e-06)5420109 (3e-12)Str. thermophilus, 5420109 (3e-12)
MSAD_F01.xHypothetical 35.3 kDa protein, SLR181991P37497 (1e-05)Synechocystis sp. PCC 6803, P73709, (2e-07)
MSAD_G02.xConserved hypothetical protein9914090099 (3e-17)7328260 (8e-15)M. pulmonis, 7328260 (8e-15)
PH_01.yBH2415 – unknown conserved protein8210175035 (4e-19)B. halodurans, 10175035 (4e-19)
PH_05.yHypothetical protein in fibril gene 3′ region31512045292 (6e-14)S. citri, P27712 (e-101)
  • Deduced protein sequences were blastp searched against the non-redundant database and the Mycoplasmataceae and Bacillus/Clostridium protein databases at NCBI. Sequence tag identity (ID) and deduced amino acid (aa) length are indicated and for each sequence tag accession numbers and E-values of entries with highest similarities are listed. The organism of entry with highest similarity is listed for the non-redundant (nr) database search results. A., Aquifex; B., Bacillus; C., Chlamydia; Ca., Campylobacter; Chl., Chlorobium; Cl., Clostridium; E., Erysipelothrix; En., Enterococcus; G., Geobacillus; L., Listeria; La., Lactococcus; M., Mycoplasma; My., Mycobacterium; S., Spiroplasma; St., Staphylococcus; Str., Streptococcus; T., Thermotoga; U., Ureaplasma; V., Vibrio; X., Xylella; Y., Yersinia. −, no significant hit and sequence is absent; ns, E-value >10⁻⁵ but sequence is present in genomes of one or more members of the Mycoplasmataceae or Bacillus/Clostridium group.
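Each best-hit cell in Table 2 follows the pattern "accession (E-value)", optionally preceded by an organism abbreviation. For readers who want to work with the table programmatically, the sketch below parses one such cell. The regular expression and the function name `parse_hit` are illustrative assumptions; the sketch handles the stray comma seen before some E-values and the bare exponent form (for example "e-101"), which is read as 1e-101.

```python
import re

CELL_PATTERN = re.compile(
    r"^(?:(?P<organism>.+?),\s*)?"     # optional organism, e.g. "B. subtilis"
    r"(?P<accession>[A-Za-z0-9_.]+)"   # accession or gi number
    r",?\s*\((?P<evalue>[^)]+)\)$"     # E-value in parentheses
)


def parse_hit(cell):
    """Parse a best-hit cell into (organism, accession, evalue).

    Returns None when the cell does not follow the expected pattern,
    for example the "ns" placeholder.
    """
    match = CELL_PATTERN.match(cell.strip())
    if match is None:
        return None
    raw = match.group("evalue")
    if raw.startswith("e"):  # bare exponent such as "e-101"
        raw = "1" + raw
    return match.group("organism"), match.group("accession"), float(raw)


# Cells copied from Table 2.
hit_plain = parse_hit("14089465 (5e-08)")
hit_organism = parse_hit("B. subtilis, 16077096 (2e-10)")
hit_comma = parse_hit("My. tuberculosis, 15840373, (2e-12)")
hit_bare = parse_hit("S. citri, P27712 (e-101)")
```

A parsed E-value can then be compared directly with the 10⁻⁵ cut-off used throughout the table.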

3.5 Amino acid, purine, pyrimidine, nucleoside and nucleotide metabolism

Mycoplasmataceae species lack most genes involved in de novo biosynthesis of pyrimidines, purines and amino acids [7–10]. However, in contrast to mycoplasmas and U. urealyticum, the S. kunkelii genome seems to harbor the nucleotide and/or amino acid biosynthesis genes encoding adenylosuccinate lyase, adenylosuccinate synthase, GMP synthase, deoxyguanosine kinase, and folylpolyglutamate synthase/dihydrofolate synthetase (folC) (Table 2). Adenylosuccinate lyase is a tetrameric enzyme involved in de novo synthesis of inosine monophosphate (IMP) and adenosine monophosphate [26], adenylosuccinate synthase catalyzes the first step in de novo biosynthesis of AMP [27], and guanine monophosphate (GMP) synthase catalyzes the last step from IMP into GMP [26]. Deoxyadenosine/deoxyguanosine kinase and deoxyadenosine/deoxycytidine kinase are required, together with thymidine kinase, for deoxynucleotide synthesis in Lactobacillus acidophilus [28]. Interestingly, the deoxyguanosine kinase gene is present in the mollicute Mycoplasma mycoides. Within the order Mycoplasmatales, M. mycoides belongs to the Entomoplasmataceae, a family more closely related to the Spiroplasmataceae than the Mycoplasmataceae [4]. The folC gene product is essential for production of glycine, methionine, purine and thymidine [29]. These data suggest that S. kunkelii can synthesize more amino acids and nucleotides de novo than Mycoplasmataceae species do, which is in agreement with experimental evidence that spiroplasma culturing media are less complex than those of the culturable mycoplasmas [13,25].

3.6 Cell envelope

The sequence data indicate that S. kunkelii harbors at least two cell envelope biosynthesis genes that are absent from members of the Mycoplasmataceae. The gcpE gene is involved in the acetylation of peptidoglycans and isoprenoid biosynthesis and is broadly distributed in eubacteria and plants [30,31]. MreB is a cytoskeletal protein and forms a filamentous helical structure close to the cell surface of eubacteria, and has an actin-like role in bacterial cell morphogenesis[32]. The clear morphological differences between spiroplasmas and Mycoplasmataceae and our finding that the mreB gene is absent from Mycoplasmataceae genomes but present in S. kunkelii suggest that MreB may have a critical role in the unique helical cell structure of spiroplasmas.

3.7 Regulatory functions

Our sequence data show that the S. kunkelii regulatory mechanisms are more complex than those of the Mycoplasmataceae. Three genes were identified encoding the regulatory proteins NifR3, SinR and PNPase that were absent in the three sequenced mycoplasmas and U. urealyticum but present in Firmicutes. NifR3 is important for the regulation of the dormant and vegetative cell stages of the ciliate Sterkiella histriomuscorum [33]. The function of NifR3 in bacteria is not known. SinR is involved in the transition of a vegetative stage to sporulation in Bacillus subtilis in response to nutrient depletion [34]. Spiroplasmas do not make spores, but are extremely pleomorphic. It is tempting to speculate that NifR3 and SinR may be involved in S. kunkelii cell shape regulation as a response to nutrient availability. A third regulatory protein, polynucleotide phosphorylase (PNPase), is responsible for mRNA decay, translation activation and transcript stabilization in B. subtilis [35,36]. The loss of PNPase is lethal for E. coli, but affects only competence development in B. subtilis [37,38] and may affect competence of S. kunkelii as well. The discovery of these regulatory factors in S. kunkelii is surprising as, thus far, members of the Mycoplasmataceae are known to lack major regulators of gene expression [7,8,39,40].

3.8 Replication

One surprising finding was that the DNA polymerase I protein of S. kunkelii did not match the DNA polymerases of Mycoplasmataceae, whereas it had significant similarity to the DNA polymerase I proteins of Streptococcus species (E-values: 2e−35 and 2e−32, sequence tag MEAA_B05.y, Table 2). Closer analysis revealed that the 193 amino acid sequence tag of S. kunkelii was similar to the C-terminal polymerase domain of DNA polymerase I. In contrast, putative DNA polymerases I of M. genitalium (GenBank accession No. I64228), M. pneumoniae (S73784), U. urealyticum (C82895) and M. pulmonis (CAC13893) are ∼300 amino acids in size and consist of the N-terminal 5′-3′ exonuclease (proofreading) part but lack the C-terminal 3′-5′ exonuclease and polymerase domains (Klenow fragment) of the enzyme [10]. This finding suggests that, unlike mycoplasmas and U. urealyticum, the S. kunkelii polA gene may encode the full-length DNA polymerase I protein including the proofreading and Klenow domains, similar to that of Streptococcus pneumoniae [41].

3.9 Transport and binding proteins

In contrast to Mycoplasmataceae, the S. kunkelii genome harbors at least one copy of a traK homologue. S. kunkelii traK has the highest similarity to traK of the B. anthracis virulence plasmid pX02.09 (Table 2) [42]. Members of this conserved protein family bind DNA and couple plasmids to membrane proteins for transport to the mating cell, and/or are pathogenicity factors involved in transport of virulence factors to the extracellular environment of bacteria [43,44]. The function of the S. kunkelii TraK protein remains to be investigated.

Two S. kunkelii sequence tags (MSAC_C02.x and MSAD_C02.y, Table 2) harbor sequences similar to fructose permease of the phosphoenolpyruvate:fructose phosphotransferase system (fructose PTS). Mutagenesis of the operon encoding fructose PTS proteins in another leafhopper-transmitted plant-pathogenic spiroplasma, S. citri, significantly decreases plant pathogenicity[45]. The most likely explanation is that utilization of fructose in the plant sieve tubes by S. citri may interfere with the normal physiology of the plant causing chlorosis, stunting and wilting[45]. This may be true for S. kunkelii in sieve tubes of corn plants as well. Homologues of fructose PTS proteins were also identified in Mycoplasmataceae and other Firmicutes (Table 2).

3.10 Genes in other categories

The S. kunkelii genome harbors at least one copy of a spoIIIE homologue that is not found in the Mycoplasmataceae genomes sequenced so far (sequence tag MEAA_A06.y, Table 2). In B. subtilis, the spoIIIE gene product is involved in the coordination of chromosome segregation and in clearing DNA from the site of division during septum formation[46] and, therefore, is likely to be involved in S. kunkelii cell division.

A nifU-like gene of 228 nucleotides in length was identified in this sequence project and harbors solely the C-terminal conserved domain containing two conserved cysteines, whereas functional iron–sulfur cluster-binding NifU proteins contain additional middle domains with four conserved cysteines (sequence tag MEAA_D11.x, Table 2) [47–49]. Several smaller NifU-like genes are also found in the nitrogen fixing Rhodobacter and Azotobacter species and single gene mutagenesis studies show that they are not essential for survival or nitrogen fixation of bacteria[50]. The functions of these shorter NifU-like genes are not known.

A sequence similar to the oxygen-insensitive NAD(P)H nitroreductase was found in the S. kunkelii database (sequence tag MSAD_E03.x, Table 2). This enzyme catalyzes the reduction of a variety of nitroaromatic compounds to highly toxic metabolites[51]. Although absent from the mycoplasmas and U. urealyticum genomes, it is found in the small (∼650 kb) genome of the insect vectored apple proliferation phytoplasma (gi405516)[52]. It is noteworthy that phytoplasmas are the only other group of Mollicutes that infect plants causing characteristic chlorosis and stunting symptoms.

Two sequence tags have identity to the 20 kDa PsaD thiol peroxidase proteins of Streptococcus species [53,54]. Tag MHAA_A09.x contains the N-terminal part of this protein, whereas MHAA_D09.x harbors the C-terminal end. In Str. pneumoniae, the psaD gene is located downstream from the psa locus with the psaA, psaB and psaC genes encoding an ABC-type Mn permease complex[54]. Mutagenesis of each of four psa genes resulted in penicillin tolerance, defective adhesion and reduced transformation efficiency of Str. pneumoniae[54]. The psaA gene encodes an adhesin-like surface protein, and psaA and psaD related genes were identified in Streptococcus sanguis, Streptococcus parasanguis and Streptococcus gordonii[53].

Several sequence tags have identity to conserved hypothetical proteins that are lacking from the mycoplasmas and U. urealyticum genomes sequenced thus far (Other categories, Table 2). We found only one sequence tag with identity to Mollicute sequences but not those of the Bacillus/Clostridium group (sequence tag PH_05.y, Table 2). The deduced protein sequence of this tag is a homologue of a hypothetical protein encoded by a gene in the downstream region of the fibril gene region of S. citri[55]. The fibril protein is important for the helical cell shape and motility of spiroplasmas [56,57] and the gene encoding it is lacking from the genomes of the oval-shaped mycoplasmas and U. urealyticum[7–10]. Because the hypothetical protein gene is localized near the fibril protein gene[55] and is unique to Mollicutes (Table 2), this hypothetical protein may be an important constituent of the mollicute cytoskeleton.

3.11 Ribosomal RNA genes

Clones MEAA_E09 and MHAA_F02 contained part of the 16S and 23S ribosomal RNA (rRNA) genes and the 16S–23S internal spacer with closest similarity to rRNA gene regions from S. citri, as is expected from the S. kunkelii phylogenetic position[4] (Table 3). S. kunkelii rRNA genes have not been sequenced previously.

Table 3

S. kunkelii sequence tags with similarity to rRNA genes

Sequence tag ID | Identity | Accession No., organism | E-value
MEAA_E09.x | 16S rDNA | 46914, S. citri | 0.0
 |  | 175961, S. poulsonii | 0.0
 |  | 175965, S. citri | 0.0
 |  | 175964, S. apis | 0.0
 |  | 175967, S. mirum | 0.0
 |  | 175969, S. monobiae | 0.0
 |  | 175962, S. taiwanense | e-180
 |  | 175970, S. diabroticae | e-179
 |  | 175963, S. gladiatoris | e-171
 |  | 175473, Entomoplasma melaleucae | e-166
MHAA_F02.y | 16S rDNA, 16S/23S spacer region, 23S rDNA | 46914, S. citri | 0.0
 |  | 4456860, Spiroplasma sp. | e-151
 |  | 2707198, S. citri | e-125
 |  | 5821442, M. putrefaciens | 2e-19
  • S. kunkelii sequence tags were searched against the full GenBank nr nucleotide database with the blastn algorithm. Identities, accession numbers and organisms, and E-values are listed for the first 10 entries of MEAA_E09.x and the first four entries of MHAA_F02.y, the only sequence tags with similarity to rRNA genes.
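The E-values in Table 3 use the abbreviated BLAST notation in which "e-180" stands for 1e-180. A minimal Python sketch of how such strings could be parsed and the hits ranked is given below. The rows are the four MHAA_F02.y entries from Table 3, whereas the function names and the 1e-100 cutoff are illustrative assumptions and not part of the original analysis.

```python
def parse_evalue(text: str) -> float:
    """Convert a BLAST-style E-value string ('e-180', '2e-19', '0.0') to a float."""
    text = text.strip().lower()
    if text.startswith("e"):
        # BLAST reports abbreviate 1e-180 as e-180
        text = "1" + text
    return float(text)


# (accession, organism, E-value string) for tag MHAA_F02.y, copied from Table 3
MHAA_F02_HITS = [
    ("46914", "S. citri", "0.0"),
    ("4456860", "Spiroplasma sp.", "e-151"),
    ("2707198", "S. citri", "e-125"),
    ("5821442", "M. putrefaciens", "2e-19"),
]


def rank_hits(rows):
    """Sort hits from most to least significant (smallest E-value first)."""
    return sorted(rows, key=lambda row: parse_evalue(row[2]))


def filter_hits(rows, cutoff):
    """Keep hits whose E-value is at or below the cutoff."""
    return [row for row in rows if parse_evalue(row[2]) <= cutoff]


ranked = rank_hits(MHAA_F02_HITS)
# ASSUMPTION: 1e-100 is an arbitrary illustrative cutoff
strong = filter_hits(MHAA_F02_HITS, cutoff=1e-100)
for accession, organism, evalue in ranked:
    print(accession, organism, parse_evalue(evalue))
```

With this cutoff the three Spiroplasma entries are retained and the M. putrefaciens entry is dropped, which matches the expectation that the closest rRNA matches come from other spiroplasmas.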

3.12 Conclusions

In summary, our data show that, in addition to the large number of spiroplasma phage DNA insertions, S. kunkelii also harbors more amino acid and nucleotide biosynthesis, transcription regulation, cell envelope and DNA transport/binding genes than the genomes of the Mycoplasmataceae species do. Our data also demonstrate that genome comparisons among Mollicutes are extremely informative because of their small genome sizes, broad host range, differences in morphology, and well-defined biology. In addition to the already completed genome sequences of four Mycoplasmataceae species, several genome sequence projects of Mollicutes in other families are ongoing including those of M. mycoides and Mycoplasma capricolum in the family Entomoplasmataceae (http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/bact.html), and S. kunkelii (http://www.genome.ou.edu/spiro.html) and S. citri (http://www.cwu.edu/verheys/s.citri/). Genome comparison of species within a family, among families within the class Mollicutes and between Mollicutes and Firmicutes should prove extremely valuable.


The authors thank Dr. Margareth Redinbaugh for carefully reading the manuscript and Dr. Robert Davis for help with establishing S. kunkelii in vitro cultures at the OARDC. This research was funded by the OARDC research enhancement and competitive grants program.


  1. [1].
  2. [2].
  3. [3].
  4. [4].
  5. [5].
  6. [6].
  7. [7].
  8. [8].
  9. [9].
  10. [10].
  11. [11].
  12. [12].
  13. [13].
  14. [14].
  15. [15].
  16. [16].
  17. [17].
  18. [18].
  19. [19].
  20. [20].
  21. [21].
  22. [22].
  23. [23].
  24. [24].
  25. [25].
  26. [26].
  27. [27].
  28. [28].
  29. [29].
  30. [30].
  31. [31].
  32. [32].
  33. [33].
  34. [34].
  35. [35].
  36. [36].
  37. [37].
  38. [38].
  39. [39].
  40. [40].
  41. [41].
  42. [42].
  43. [43].
  44. [44].
  45. [45].
  46. [46].
  47. [47].
  48. [48].
  49. [49].
  50. [50].
  51. [51].
  52. [52].
  53. [53].
  54. [54].
  55. [55].
  56. [56].
  57. [57].