
Comparative genomics identifies genes shared by distantly related insect-transmitted plant pathogenic mollicutes

Xiaodong Bai , Jianhua Zhang , Ian R. Holford , Saskia A. Hogenhout
DOI: http://dx.doi.org/10.1111/j.1574-6968.2004.tb09596.x, pp. 249-258. First published online: 1 June 2004

Abstract

Phytoplasmas and spiroplasmas are distantly related insect-transmitted plant pathogens within the class Mollicutes. Genome sequencing projects of phytoplasma strain Aster Yellows-Witches' Broom (AY-WB) and Spiroplasma kunkelii are near completion. Complete genome sequences of seven obligate animal and human pathogenic mollicutes (Mycoplasma and Ureaplasma spp.), and OY phytoplasma have been reported. Putative ORFs predicted from the genome sequences of AY-WB and S. kunkelii were compared to those of the completed genomes. This resulted in identification of at least three ORFs present in AY-WB, OY and S. kunkelii but not in the obligate animal and human pathogenic mollicutes. Moreover, we identified ORFs that seemed more closely related between AY-WB and S. kunkelii than to their mycoplasma counterparts. Phylogenetic analyses using parsimony were employed to study the origin of these genes, resulting in identification of one gene that may have undergone horizontal gene transfer. The possible involvement of these genes in plant pathogenicity is discussed.

Keywords
  • Comparative genomics
  • Plant pathogen
  • Mollicutes
  • Spiroplasma
  • Phytoplasma
  • Mycoplasma

1 Introduction

Mollicutes, characterized by small genomes and no cell wall, are believed to have diverged from a Gram-positive bacterial ancestor in the lactobacillus group [1,2]. Within the class Mollicutes, an early evolutionary split occurred between the AAA (Asteroleplasma, Anaeroplasma, and Acholeplasma) branch and the SEM (Spiroplasma, Entomoplasma, and Mycoplasma) branch, both of which independently underwent genome reductions [2]. Apparently, the conversion of UGA from a stop codon to a tryptophan codon in the SEM branch occurred shortly after the split of the two branches. The SEM branch contains several genera, including Spiroplasma, Entomoplasma, Mesoplasma, Mycoplasma, and Ureaplasma. Spiroplasmas are believed to be evolutionarily early mollicutes that did not undergo as many gene loss events as members of other genera [2].

Since the start of the genomic era, mollicutes have attracted much attention because of their small genomes and their clinical and agricultural impact. Six mollicute genomes, five Mycoplasma spp. and one Ureaplasma sp., have been fully sequenced, representing obligate human and mammal pathogens of the genus Mycoplasma of the SEM branch. At the time of preparation of this manuscript, genome sequencing projects of three other mycoplasmas were in progress: the rodent polyarthritis pathogen Mycoplasma arthritidis, the contagious caprine pleuropneumonia (CCPP) pathogen Mycoplasma capricolum, and the contagious bovine pleuropneumonia (CBPP) pathogen Mycoplasma mycoides subsp. mycoides SC (small colony). Later, the complete genome sequences of M. mycoides subsp. mycoides SC and Onion Yellows (OY) phytoplasma were released and published [3,4].

Genome sequencing projects are in progress for Spiroplasma kunkelii (http://www.genome.ou.edu/spiro.html) and the phytoplasma strain Aster Yellows-Witches' Broom (AY-WB, http://www.oardc.ohio-state.edu/phytoplasma). S. kunkelii and phytoplasmas are insect-transmitted plant pathogens that replicate in both insect vectors and plant hosts. Interestingly, S. kunkelii and phytoplasmas are strikingly similar in their infection patterns of insects and plants. Both are restricted to phloem tissues of plant hosts, from where they are acquired by phloem-feeding insects, and subsequently invade and replicate in the cells of insect gut and other tissues. Although Spiroplasma species and all phytoplasmas described so far share similar infection patterns and environmental niches, they are distantly related and fall within two different branches of the class Mollicutes. Based on phylogenies of 16S rDNA and tuf genes, membrane composition, codon usage, and metabolism [2], spiroplasmas were grouped in the SEM branch with Mycoplasma and Ureaplasma spp., while phytoplasmas were grouped in the AAA branch with Acholeplasma spp.

This study was initiated based on the hypothesis that genes shared by evolutionarily divergent insect-transmitted plant pathogens but absent from obligate human and animal pathogens are likely important for insect transmission and/or plant pathogenicity. Using computer-assisted analysis, we have identified at least three open reading frames (ORFs) that were present in S. kunkelii and AY-WB but absent from mycoplasmas. We have also identified ORFs that do not match the 16S rDNA and tuf phylogenies. The involvement of the ORFs in pathogenicity is discussed.

2 Materials and methods

2.1 Genome sequences

The 16 contigs totaling 695 kb of the estimated 800 kb AY-WB genome were obtained from the phytoplasma genome sequencing project website (http://www.oardc.ohio-state.edu/phytoplasma). The 46 contigs totaling 1.5 Mb of the estimated 1.6 Mb S. kunkelii CR2–3x genome were obtained from the publicly accessible S. kunkelii genome sequencing project website (http://www.genome.ou.edu/spiro.html). Complete mycoplasma genome sequences were obtained from GenBank, including Mycoplasma genitalium (NC_000908) [5], Mycoplasma pneumoniae (NC_000912) [6], Ureaplasma urealyticum (NC_002162) [7], Mycoplasma pulmonis (NC_002771) [8], Mycoplasma penetrans (NC_004432) [9], and Mycoplasma gallisepticum (NC_004829) [10].

2.2 Comparative genome analysis

Genome comparisons were conducted as illustrated in Fig. 1. Genome sequences were downloaded onto a Linux workstation and used as input files for the ORF Extractor (http://www.oardc.ohio-state.edu/mcic/bioinformatics/bio_software/bio_software.html). ORFs were defined as starting with ATG and ending with in-frame TAG, TAA, or TGA for AY-WB, or TAG and TAA for S. kunkelii and all Mycoplasma and Ureaplasma spp. [2]. ORFs longer than 90 bp were extracted in FASTA format. Subsequently, only the longest ORF within a set of ORFs having stop codons at the same positions was extracted. ORFs in nucleotide sequences were translated into amino acid (aa) sequences using a Perl translation program, using translation table 11 (bacterial code) for AY-WB and translation table 4 (Mold mitochondria code) for all others [11,12]. This generated datasets AYdb for AY-WB, Skdb for S. kunkelii, and mycoprotdb for the five Mycoplasma spp. and U. urealyticum. Subsequently, AYdb and Skdb were compared using the stand-alone BLAST (Basic Local Alignment Search Tool) package [13] with an expectation (E)-value threshold of 10⁻⁸. Proteins having significant similarity (E < 10⁻⁸) were extracted from AYdb to generate AY_Skdb and from Skdb to generate Sk_AYdb. Subsequently, AY_Skdb and Sk_AYdb were compared to mycoprotdb, and proteins with non-significant hits (E > 10⁻⁸) or no hits were extracted from AY_Skdb to generate AY_Sk-mycoprotdb and from Sk_AYdb to generate Sk_AY-mycoprotdb. Proteins within AY_Sk-mycoprotdb and Sk_AY-mycoprotdb were annotated based on sequence similarity searches against the NCBI non-redundant (nr) database and compared manually to identify common protein sequences. Identified proteins were validated by manual comparison with the annotated genome sequences of mycoplasmas and ureaplasma. After completion of this study, the genome sequences of OY phytoplasma [4] and M. mycoides subsp. mycoides SC strain [3] were released. The identified proteins were searched against the annotated proteins of these organisms using the BLAST algorithm [13].
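The ORF definition above can be illustrated with a minimal Python sketch. It is not the ORF Extractor source: the function name, the forward-strand-only scan, and the choice to count the stop codon toward the 90 bp length are assumptions made for illustration.

```python
# Hypothetical sketch of the ORF rule in Section 2.2 (not the ORF Extractor code).
PHYTOPLASMA_STOPS = {"TAG", "TAA", "TGA"}  # AY-WB: TGA is a stop codon
SEM_STOPS = {"TAG", "TAA"}                 # S. kunkelii, Mycoplasma, Ureaplasma: TGA is read through
MIN_ORF_BP = 90                            # "longer than 90 bp"; counting the stop codon is an assumption


def find_orfs(seq, stop_codons, min_len=MIN_ORF_BP):
    """Return sorted (start, end) pairs, 0-based half-open, on the forward strand.

    For each in-frame stop codon only the longest ORF is kept, i.e. the one
    beginning at the first ATG after the previous stop in the same frame.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        first_atg = None  # earliest ATG seen since the last stop in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and first_atg is None:
                first_atg = i
            elif codon in stop_codons:
                if first_atg is not None and (i + 3 - first_atg) > min_len:
                    orfs.append((first_atg, i + 3))
                first_atg = None
    return sorted(orfs)


# Toy sequence: one ORF that ends in TGA.
toy = "CC" + "ATG" + "AAA" * 30 + "TGA" + "GG"
phytoplasma_orfs = find_orfs(toy, PHYTOPLASMA_STOPS)
sem_orfs = find_orfs(toy, SEM_STOPS)
```

With the toy sequence, the ORF is reported only under the AY-WB stop set, because TGA does not terminate an ORF under the stop set used for S. kunkelii and the mycoplasmas. A real run would also scan the reverse complement, which this sketch omits.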

Figure 1

Algorithms employed to extract proteins that are common between the insect-transmitted plant pathogens AY-WB phytoplasma and S. kunkelii but are absent from five Mycoplasma spp. and U. urealyticum. See Section 2 for details. Similar dataset consists of proteins that are similar in AY-WB and S. kunkelii, while unique dataset consists of proteins that are similar between AY-WB and S. kunkelii but absent from five Mycoplasma spp. and U. urealyticum. Shaded text boxes are operations with the programs indicated in parentheses. Open text boxes are datasets either as input or output of the operations. Bacterial and mold mitochondria genetic codes are from NCBI taxonomy databases [11,12].

Negative logarithmic plots of the best E-value for each query were generated for searches of: (i) AYdb against Skdb and mycoprotdb and (ii) Skdb against AYdb and mycoprotdb. For comparable quantitative assessment, an E-value of 0.0 was set to 10⁻²⁰⁰, and proteins with no significant hits were assigned E-values of 1000.
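The substitution rule for plotting can be written as a short Python sketch. The function name and the use of None to represent "no hit" are assumptions for illustration; the paper does not describe how the values were computed in practice.

```python
import math

ZERO_EVALUE_FLOOR = 1e-200   # value substituted for a reported E-value of 0.0
NO_HIT_EVALUE = 1000.0       # value assigned to queries without a significant hit


def neg_log_evalue(evalue):
    """Map a best-hit E-value (or None for no hit) to its plotted coordinate, -log10(E)."""
    if evalue is None:
        evalue = NO_HIT_EVALUE
    elif evalue == 0.0:
        evalue = ZERO_EVALUE_FLOOR
    return -math.log10(evalue)


# Coordinates for an exact match, a hit at the 1e-8 threshold, and a query with no hit.
coords = [neg_log_evalue(e) for e in (0.0, 1e-8, None)]
```

Under this mapping, stronger hits plot farther from the origin, the 10⁻⁸ threshold sits at 8, and queries without hits fall at -3, which keeps them on the same axis as the real hits.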

2.3 Phylogenetic analysis

Protein sequences for phylogenetic analysis were extracted from NCBI Entrez database. Sequence alignments were produced using ClustalW [14] and used as inputs for phylogenetic analysis using PAUP (Phylogenetic Analysis Using Parsimony) program [15].

2.4 Accession numbers

AY-WB amino acid sequences identified in this study were deposited in GenBank with the Accession Numbers as follows: AAA type ATPase (AtA), AY533109; cmp-binding factor (CBF), AY533110; cytosine deaminase, AY533111; hypothetical protein, AY533112; cation transport P-ATPase, AY533113; polynucleotide phosphorylase (PNPase), AY533114; ppGpp synthetase, AY533115; YlxR protein, AY533116.

3 Results

3.1 Extraction of ORFs

As expected, ORF Extractor generated more putative ORFs than currently annotated for the completed mollicute genomes or predicted based on the estimation that ORFs have an average length of 1 kb [16]. However, for this study ORF Extractor was preferred, because, although it generates more false ORFs, it decreases the chance of omitting putative ORFs [17]. Further, since most ORFs starting with alternative start codons have an in-frame ATG elsewhere, the ORF database generated by ORF Extractor included most of the annotated ORFs (complete or partial) of the completed mollicute genomes currently present in GenBank. Of the 4332 mycoplasma and ureaplasma ORFs downloaded from GenBank, 991 ORFs (22.7%) start with an alternative start codon. Of the ORFs starting with an alternative start codon, 984 ORFs (99.3%) contained an in-frame ATG somewhere in the ORF. Thus, only 0.7% of the putative ORFs starting with alternative start codons present in the GenBank database have been excluded from the ORF Extractor database. This is only 0.2% of all the 4332 annotated mycoplasma and ureaplasma (i.e. members of M. pneumoniae and Mycoplasma hominis groups and U. urealyticum) ORFs present in GenBank.

To minimize the number of false-positives produced by the method, only the longest ORF within a set of ORFs having stop codons at the same position was extracted for subsequent analysis. Translation of the ORFs into amino acid sequences generated AYdb for AY-WB, Skdb for S. kunkelii, and mycoprotdb for the five Mycoplasma species and U. urealyticum.
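The difference between the two translation tables matters only at TGA. The following Python sketch, which is not the Perl program used in the study, builds both tables from the standard NCBI codon ordering and shows the effect; the function and variable names are hypothetical.

```python
from itertools import product

BASES = "TCAG"  # NCBI ordering of bases within codon tables
# Amino acids for table 11 (bacterial code), in TTT, TTC, TTA, TTG, TCT, ... order.
TABLE_11_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def build_table(amino_acids):
    """Pair each of the 64 codons with its amino acid letter ('*' marks a stop)."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


CODE_11 = build_table(TABLE_11_AAS)
CODE_4 = dict(CODE_11, TGA="W")  # table 4 differs from table 11 only in TGA -> Trp


def translate(orf, table):
    """Translate a nucleotide ORF, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = table.get(orf[i:i + 3].upper(), "X")  # X for codons with ambiguous bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


example = "ATGTGAAAATAA"
as_sem = translate(example, CODE_4)           # TGA read as tryptophan
as_phytoplasma = translate(example, CODE_11)  # TGA read as stop
```

Translating an S. kunkelii or mycoplasma ORF with table 11 would truncate the protein at the first TGA, which is why the per-organism choice of table has to match the stop-codon set used during ORF extraction.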

3.2 Identification of four proteins that are present in AY-WB and S. kunkelii but absent from mycoplasmas

Amino acid sequence similarity searches were employed to identify proteins shared between AY-WB and S. kunkelii. AYdb and Skdb were searched against each other using the stand-alone BLAST package [13]. Two hundred and ninety proteins within AYdb had significant similarity (E-value < 10⁻⁸) to proteins within Skdb, whereas 260 proteins within Skdb had significant similarity to proteins within AYdb. E (expectation)-value in BLAST search is defined as “the number of different alignments with scores equivalent to or better than S that are expected to occur in a database search by chance”, and it depends on the size of the search database and the scoring system [18]. Thus, it was expected that the number of proteins with significant similarity for the two independent searches would differ because of different database sizes.
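For reference, the standard Karlin-Altschul form of the E-value makes this dependence explicit. It is textbook BLAST statistics rather than a formula given in this paper: m is the query length, n the total length of the database, S the raw alignment score, and K and λ are constants determined by the scoring system.

```latex
E = K \, m \, n \, e^{-\lambda S}
```

Because n differs between AYdb and Skdb, the same alignment score gives a different E-value in each search direction, so a fixed cutoff of 10⁻⁸ need not select reciprocal sets of equal size.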

To identify shared AY-WB and S. kunkelii proteins that are not present in animal and human pathogenic mycoplasmas and ureaplasmas, AY_Skdb and Sk_AYdb were searched against mycoprotdb using the blastp algorithm. Sequences that had non-significant similarity (E-value > 10⁻⁸) or no similarities were extracted from AY_Skdb and Sk_AYdb. This resulted in two datasets: AY_Sk-mycoprotdb with 14 entries and Sk_AY-mycoprotdb with seven entries.
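The two-step filter can be sketched in Python as set operations over best-hit E-values. The dictionaries, ORF names, and function names below are invented toy data and helpers, and treating a hit exactly at the threshold as non-significant is an assumption, since the text leaves that boundary case open.

```python
E_THRESHOLD = 1e-8


def significant(best_hits):
    """Queries whose best hit is below the E-value threshold."""
    return {query for query, evalue in best_hits.items() if evalue < E_THRESHOLD}


def shared_but_absent(vs_partner, vs_mycoprotdb):
    """Queries with a significant hit in the partner genome but none in mycoprotdb.

    Both arguments map query ID -> best E-value; queries with no hit are simply missing.
    """
    shared = significant(vs_partner)           # corresponds to AY_Skdb or Sk_AYdb
    found_in_myco = significant(vs_mycoprotdb)
    return shared - found_in_myco              # corresponds to AY_Sk-mycoprotdb or Sk_AY-mycoprotdb


# Toy best-hit E-values for four hypothetical AY-WB ORFs.
ay_vs_sk = {"orfA": 1e-50, "orfB": 1e-3, "orfC": 0.0, "orfD": 1e-20}
ay_vs_myco = {"orfA": 1e-40, "orfC": 1e-2}    # orfB and orfD have no hit at all

ay_sk = significant(ay_vs_sk)
ay_sk_minus_myco = shared_but_absent(ay_vs_sk, ay_vs_myco)
```

In the toy data, orfA is shared but also found in mycoprotdb, orfB is not shared, and orfC and orfD survive both filters, one because its mycoplasma hit is weak and the other because it has no hit.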

Plotting the negative logs of the blastp E-values showed that the majority of the predicted protein sequences shared by AY-WB and S. kunkelii had homologs in the five Mycoplasma spp. and U. urealyticum (Fig. 2). However, 9 AY-WB and 8 S. kunkelii proteins did not have significant similarity to proteins in mycoprotdb. Among these, four proteins present in both the AY_Sk-mycoprotdb and Sk_AY-mycoprotdb datasets were analyzed because they were similar in length and had significant similarity to proteins in the NCBI nr database (closed diamonds, Fig. 2). The four proteins were identified as polynucleotide phosphorylase (PNPase), cmp-binding factor (CBF), cytosine deaminase, and YlxR protein (Table 1). The PNPase protein sequences of AY-WB and S. kunkelii were 62% (452/719) similar, the CBFs 59% (138/231), cytosine deaminases 60% (86/141), and YlxR proteins 61% (46/74). To ensure that the sequences are not present in the genomes of mycoplasmas and ureaplasmas, sequences in common between AY-WB and S. kunkelii were searched against the mycoplasma and ureaplasma GenBank databases. Further, the annotated protein databases of Mycoplasma spp. and U. urealyticum were searched by keywords. Both analyses showed that no proteins for these organisms were annotated as PNPase, CBF, cytosine deaminase, or YlxR protein. Thus, these data suggested that these four genes are present in AY-WB and S. kunkelii but absent from Mycoplasma spp. and U. urealyticum genomes. All four proteins have homologs in the OY phytoplasma genome. However, all but PNPase also have homologs in the M. mycoides subsp. mycoides SC strain.

Figure 2

Graphical representation of comparative analysis results. (a) Negative logarithmic plots of the top E-values of the BLAST search using AY_Skdb as query and Skdb (x-axis) and mycoprotdb (y-axis) as databases. (b) Negative logarithmic plots of the top E-values of the BLAST search using Sk_AYdb as query and AYdb (x-axis) and mycoprotdb (y-axis) as databases. Based on criteria described in Section 2, data points in diamonds (♦ or ◊) are proteins shared between AY-WB and S. kunkelii but absent from five Mycoplasma spp. and U. urealyticum, and data points in open triangles (△) are proteins present in AY-WB, S. kunkelii and Mycoplasma spp. and U. urealyticum. Data points in solid diamonds (♦) are proteins having similar lengths and annotations, which are detailed in Table 1. Data points in open circles (○) are AY-WB or S. kunkelii proteins that are more similar to each other than to counterparts in five Mycoplasma spp. and U. urealyticum, which are detailed in Table 2.

Table 1

Four AY-WB and S. kunkelii homologues that were absent from mycoprotdb, which consists of the whole genome sequences of M. genitalium, M. pneumoniae, U. urealyticum, M. pulmonis, M. penetrans, and M. gallisepticum

The Source, ORF ID, and Length columns describe the AY-WB and Spiroplasma kunkelii homologues absent from mycoprotdb; the Accession #, Homology, Organism, and E-value columns describe the best hit against the NCBI nr database.

| ID | Source | ORF ID (a) | Length (b) | Accession # | Homology | Organism | E-value | Cellular location (c) |
|---|---|---|---|---|---|---|---|---|
| 1 | AY-WB | 246_1F | 716 | 29377522 | PNPase | Enterococcus faecalis | 1e-180 | Cytoplasm |
| 1 | Spiroplasma kunkelii | 100_74F | 719 | 15902560 | PNPase | Streptococcus pneumoniae | 0 | Cytoplasm |
| 2 | AY-WB | 247_200F | 321 | 27468441 | cmp-binding factor 1 | Staphylococcus aureus | 1e-39 | Cytoplasm |
| 2 | Spiroplasma kunkelii | 109_633F | 313 | 16078057 | cmp-binding factor 1 | Bacillus subtilis | 3e-59 | Cytoplasm |
| 3 | AY-WB | 247_187F | 161 | 20806575 | cytosine/adenosine deaminases | Thermoanaerobacter tengcongensis | 4e-26 | Cytoplasm |
| 3 | Spiroplasma kunkelii | 98_127R | 159 | 20806575 | cytosine/adenosine deaminases | Thermoanaerobacter tengcongensis | 1e-16 | Cytoplasm |
| 4 | AY-WB | 247_205R | 85 | 541414 | conserved hypothetical protein YlxR | Bacillus subtilis | 2e-14 | Cytoplasm |
| 4 | Spiroplasma kunkelii | 107_113R | 91 | 541414 | conserved hypothetical protein YlxR | Bacillus subtilis | 2e-06 | Cytoplasm |

  • a ORF ID identified by ORF Extractor.

  • b Length of deduced amino acid sequence.

  • c Cellular location was determined by pSORT [37].

3.3 Identification of proteins more closely related between AY-WB and S. kunkelii than to other mollicutes

Four proteins were identified from the negative logarithmic plots that were more similar between AY-WB and S. kunkelii than to mycoplasmas (open circles, Fig. 2). These proteins were identified as ppGpp synthetase, HAD hydrolase, AtA (AAA type ATPase), and P-type Mg2+ transport ATPase (Table 2). Amino acid sequence similarities between AY-WB proteins and S. kunkelii proteins were ppGpp synthetase, 59% (305/503); HAD hydrolase, 59% (449/750); AtA, 88% (362/407); and P-type Mg2+ transport ATPase, 56% (512/902). All proteins have homologs in the genomes of OY phytoplasma and M. mycoides subsp. mycoides SC strain, except for the AtA sequence, which is lacking from OY phytoplasma.

Table 2

Identities of AY-WB and S. kunkelii proteins that are more similar to each other than to proteins in mycoprotdb

The Organism (source), ORF ID, and Length columns describe the proteins shared between AY-WB and Spiroplasma kunkelii; the Accession #, Homology, Organism (best hit), and E-value columns describe the best hit against the NCBI nr database.

| ID | Organism (source) | ORF ID (a) | Length (b) | Accession # | Homology | Organism (best hit) | E-value | Cellular location (c) |
|---|---|---|---|---|---|---|---|---|
| 1 | AY-WB | 235_4R | 414 | 15613820 | BH1257 unknown conserved | Bacillus halodurans | 1e-94 | Cytoplasm |
| 1 | Spiroplasma kunkelii | 94_78R | 414 | 15613820 | BH1257 unknown conserved | Bacillus halodurans | 2e-87 | Cytoplasm |
| 2 | AY-WB | 248_157F | 889 | 15673239 | cation-transporting P-ATPase (EC 3.6.3.2) | Lactococcus lactis | 2e-179 | Membrane |
| 2 | Spiroplasma kunkelii | 77_20F | 910 | 30022224 | Mg2+ transport ATPase, P type (EC 3.6.3.2) | Bacillus cereus | 0 | Membrane |
| 3 | AY-WB | 247_48R | 745 | 10443847 | ppGpp synthetase | Geobacillus stearothermophilus | e-154 | Cytoplasm |
| 3 | Spiroplasma kunkelii | 106_196R | 749 | 6647842 | ppGpp synthetase | Spiroplasma citri | 0 | Cytoplasm |
| 4 | AY-WB | 246_186F | 528 | 28378886 | Hypothetical exported protein/HAD hydrolase | Lactobacillus plantarum | 1e-116 | Membrane or outside |
| 4 | Spiroplasma kunkelii | 96_41R | 509 | 401696 | Hypothetical exported protein/HAD hydrolase | Mycoplasma mycoides | 1e-92 | Membrane or outside |

  • a ORF ID identified by ORF Extractor.

  • b Length of deduced amino acid sequence.

  • c Cellular location was determined by pSORT [37].

3.4 Phylogenetic analysis of proteins present in AY-WB and S. kunkelii but absent from mycoplasmas

Phylogenetic analyses were performed to investigate the origin of the proteins identified in this study. The PNPases from AY-WB and S. kunkelii clustered with those from the Gram-positive Bacillus and Streptococcus spp. and were clearly distinct from those of Gram-negative bacteria (Fig. 3(b)). Thus, the PNPase phylogenetic trees are consistent with the proposed evolutionary status of mollicutes as descendants of Gram-positive bacterial ancestors [1,19]. Phylogenetic analysis of CBFs (Fig. 3(c)) resulted in a tree different from the phylogenetic tree based on 16S rDNA sequences (Fig. 3(a)), with the CBF sequences of AY-WB and S. kunkelii separated by CBF sequences of Gram-positive bacteria. Phylogenetic analyses of cytosine deaminases and YlxR proteins resulted in trees with most branches having low bootstrap values (data not shown).

Figure 3

Phylogenetic analyses of proteins that are present in insect-transmitted plant pathogenic AY-WB and S. kunkelii but absent from animal and human pathogenic mycoplasmas. Phylogenetic trees were generated following the procedure described in Materials and methods. Bars under the trees represent evolutionary distances. (a) Phylogenetic tree derived from 16S rDNA sequences. (b) Phylogenetic tree derived from polynucleotide phosphorylase (PNPase). (c) Phylogenetic tree derived from cmp-binding factor (CBF). Protein sequences were obtained from GenBank and aligned with ClustalW [14]. The alignments were used for parsimony analysis in PAUP version 4.0 [15]. Trees were bootstrapped 1000 times and the bootstrap values above 50% are indicated as a percentage at the branches. Accession numbers for the sequences were as follows: (a) Acholeplasma laidlawii, M23932; Anaeroplasma abactoclasticum, M25050; Asteroleplasma anaerobium, M22351; Bacillus subtilis, AB042061; Mesoplasma entomophilum, AF305693; Mycoplasma capricolum, U26048; M. gallisepticum, M22441; M. genitalium, X77334; M. hominis, AJ002268; M. mycoides, U26050; M. pulmonis, AF125582; Mycoplasma sualvi, AF412988; Streptococcus pneumoniae, AY281083; U. urealyticum, U06098. (b) Actinobacillus pleuropneumoniae, ZP_00134571; Bacillus halodurans, NP_243273; B. subtilis, NP_389551; Buchnera aphidicola, NP_777952; Deinococcus radiodurans, NP_295786; Escherichia coli, NP_312072; Haemophilus influenzae, NP_438401; Mycobacterium bovis, CAD94991; S. enterica, NP_806878; S. typhimurium, AAL22154; Shigella flexneri, NP_708965; Streptococcus agalactiae, CAD45842; Streptococcus mutans, NP_720625; Streptococcus pyogenes, BAC64773; Thermotoga maritima, NP_229146; Thermus thermophilus, CAB06341; Vibrio vulnificus, NP_935490; Xylella fastidiosa, NP_778440; Xanthomonas axonopodis, NP_642994; Yersinia enterocolitica, CAA71697; Yersinia pestis, NP_668031. (c) B. subtilis, CAB12833; Bacillus cereus, NP_830807; Clostridium perfringens, NP_560939; Clostridium tetani, NP_783025; Lactococcus lactis, NP_268079; Methanococcus jannaschii, NP_247831; S. aureus, NP_374949; Sta. epidermidis, NP_765078; Str. mutans, NP_720807; Str. pneumoniae, NP_359386; Str. pyogenes, NP_268621.

3.5 Phylogenetic analysis of proteins more closely related between AY-WB and S. kunkelii than to other mollicutes

Phylogenetic analysis was employed to analyze the possible origins of the four proteins that were more closely related between AY-WB and S. kunkelii than to other mollicutes. Most branches of the phylogenetic trees generated using ppGpp synthetase, HAD hydrolase, and P-type Mg2+ transport ATPase had bootstrap values lower than 50% (data not shown). However, bootstrap values of the phylogenetic tree based on AtA sequences were statistically significant. Interestingly, in the AtA phylogeny, the phytoplasma AtA sequence clustered together with the AtA sequence of S. kunkelii in a cluster of AtA sequences of mycoplasmas belonging to the SEM branch (Fig. 4). Thus the AtA phylogeny is different from the 16S rDNA phylogeny (Fig. 3(a)). The AtA homolog is present in M. mycoides subsp. mycoides SC, which is also a member of the SEM branch of mollicutes, but it is absent from the OY phytoplasma genome.

Figure 4

Phylogenetic analysis for AtA (AAA type ATPase). Phylogenetic trees were generated following the procedure described in Materials and methods. Bars under the trees represent evolutionary distances. Protein sequences were obtained from GenBank and aligned with ClustalW [14]. The alignments were used for parsimony analysis in PAUP version 4.0. Trees were bootstrapped 1000 times and the bootstrap values above 50% are indicated as a percentage at the branches. Accession numbers for protein sequences were as follows: B. halodurans, NP_242123; B. subtilis, NP_390631; Clostridium acetobutylicum, NP_348297; Enterococcus faecalis, NP_815655; Enterococcus faecium, ZP_00036045; Listeria innocua, NP_470885; Listeria monocytogenes, NP_465039; Mycobacterium leprae, CAA19102; M. gallisepticum, NP_853308; M. penetrans, NP_757529; S. aureus, NP_646394; Str. mutans, NP_722348; Str. pneumoniae, NP_346223; Ureaplasma urealyticum, NP_078028.

4 Discussion

In this study, we have identified several proteins that appear to be present in AY-WB and S. kunkelii but absent from Mycoplasma spp. and U. urealyticum. These proteins are PNPase, CBF, cytosine deaminase, and YlxR. These proteins are also present in the genome of OY phytoplasma, another insect-transmitted plant pathogenic mollicute closely related to AY-WB.

PNPase is an exoribonuclease belonging to the PDX family that also includes RNase PH [20]. Most prokaryotes have PNPase homologs; however, thus far, none have been sequenced from mycoplasmas and U. urealyticum. PNPase genes are also present in the genomes of plants [21] and Drosophila [22]. PNPases are highly conserved proteins that are involved in mRNA degradation and regulation of gene expression [23]. PNPase has been shown to be a global regulator of virulence factors of Salmonella enterica, because a single point mutation of the PNPase gene resulted in a significant decrease in efficiency of invasion and intracellular replication of this bacterium [24]. Both AY-WB and S. kunkelii invade and replicate in cells of insects and plants [25] and, consequently, have to adjust their gene expression patterns continuously to different environments. In contrast, the Mycoplasma and Ureaplasma spp. are restricted to animal hosts in which they are able to attach to and mostly invade epithelial cell layers [2]. Thus, PNPases in the plant pathogenic bacteria AY-WB, S. kunkelii, and OY phytoplasma could be important for gene expression regulation allowing adaptation to multiple environmental niches, including insect gut lumen, insect cells, plant phloem, and plant cells. However, the involvement of PNPase in regulation of virulence of the plant pathogenic mollicutes AY-WB and S. kunkelii remains to be investigated. At this time, spiroplasmas are more suitable candidates for such an investigation, because, unlike phytoplasmas, they can be cultured [26] and transformed [27].

CBF is a protein identified in Staphylococcus aureus. It binds to the cmp sequence, a replication enhancer identified in the pT181 plasmid of S. aureus, to stimulate plasmid replication [28]. Spiroplasmas and phytoplasmas have plasmids [2], whereas plasmids have not been reported in members of the M. pneumoniae and M. hominis groups and U. urealyticum, which do not have CBF. Interestingly, a CBF homolog is present within the recently released complete genome of M. mycoides subsp. mycoides SC strain [3]. Although plasmids have not been reported in the SC type strain, plasmids are common in M. mycoides subsp. mycoides [29,30]. It is possible that CBF is required for regulation of plasmid replication in spiroplasmas and phytoplasmas. Interestingly, spiroplasma and phytoplasma plasmids appear to harbor virulence factors [31,32].

Cytosine deaminase is an enzyme involved in nucleotide metabolism and can affect protein synthesis if transiently expressed in human cells [33]. Thus, apparently, S. kunkelii, AY-WB, and OY have an additional housekeeping gene that is absent from other mollicutes sequenced so far. YlxR protein is expressed from the nusA/infB operon in bacteria and proposed to be an RNA-binding protein [34].

We also identified four AY-WB and S. kunkelii ORFs that appear to be more closely related to each other than to their mycoplasma counterparts. Of these, the AtA sequence is most interesting, because the phylogenetic tree suggests that phytoplasmas might have obtained the AtA sequence from spiroplasmas, possibly S. kunkelii, by horizontal gene transfer. This hypothesis is supported by additional data. First, AtA is absent from the OY phytoplasma genome [4]. OY phytoplasma is a plant pathogen in Japan, where S. kunkelii does not occur. In the American continent, however, S. kunkelii and AY-WB co-occur and occasionally share similar insect and plant host ranges [35]. Secondly, AtA sequences of both AY-WB and S. kunkelii are flanked by insertion sequences that are often part of mobile elements [36]. AY-WB AtA is flanked by a truncated transposase gene at its 5′ end and an intact transposase gene at its 3′ end, and S. kunkelii AtA is located in an IS (insertion sequence) element-rich region.

In summary, the comparative genomics study presented herein successfully identified proteins that are common among insect-transmitted plant pathogenic mollicutes. Further studies of these proteins may elucidate their roles in insect transmission and plant pathogenicity.

This research was supported by OSU-OARDC Research Enhancement Competitive Grants Program and MCIC.

Acknowledgements

The authors thank Dr. Sophien Kamoun in the Department of Plant Pathology, OSU-OARDC, for constructive advice; Dr. Tea Meulia for setup of the Linux workstation and design of ORF Extractor; and B.A. Roe, S.P. Lin, H.G. Jia, H.M. Wu, D. Kupfer, and R.E. Davis and the S. kunkelii Genome Sequencing Project funded by US Department of Agriculture, Agricultural Research Service Project Number: 1275-22000-144-02 for the S. kunkelii genome sequences.

References

  1. [1].
  2. [2].
  3. [3].
  4. [4].
  5. [5].
  6. [6].
  7. [7].
  8. [8].
  9. [9].
  10. [10].
  11. [11].
  12. [12].
  13. [13].
  14. [14].
  15. [15].
  16. [16].
  17. [17].
  18. [18].
  19. [19].
  20. [20].
  21. [21].
  22. [22].
  23. [23].
  24. [24].
  25. [25].
  26. [26].
  27. [27].
  28. [28].
  29. [29].
  30. [30].
  31. [31].
  32. [32].
  33. [33].
  34. [34].
  35. [35].
  36. [36].
  37. [37].