
Species-specific PCR detection of the food-borne pathogen Vibrio parahaemolyticus using the irgB gene identified by comparative genomic analysis

Shuijing Yu, Wanyi Chen, Dapeng Wang, Xiaohua He, Xinna Zhu, Xianming Shi
DOI: http://dx.doi.org/10.1111/j.1574-6968.2010.01952.x, pp. 65-71. First published online: 1 June 2010


Abstract

Vibrio parahaemolyticus is an enteric pathogen that can cause acute gastroenteritis in humans after consumption of raw or partially cooked seafood, and specific molecular markers are necessary for its accurate identification by PCR methods. In the present study, 23 protein-coding sequences were identified by comparative genomics as V. parahaemolyticus-specific candidate markers. We targeted the irgB gene (vp2603), which codes for the iron-regulated virulence regulatory protein IrgB, to develop a PCR method for the detection of V. parahaemolyticus. PCR specificity was confirmed by amplification from 293 V. parahaemolyticus templates and by the absence of a PCR product with 11 strains from other Vibrio species and 35 non-Vibrio bacterial strains. The assay yielded a 369-bp fragment and had a sensitivity of 0.17 pg of purified genomic DNA from V. parahaemolyticus. Furthermore, a multiplex PCR assay for the detection of total and virulent strains of V. parahaemolyticus was developed by targeting the irgB, tdh and trh genes. These data indicate that the irgB gene is a new and effective marker for the detection of V. parahaemolyticus. In addition, this study demonstrates that genome sequence comparison is a powerful approach for identifying specific markers for the detection and identification of bacterial pathogens.

  • comparative genomics
  • food-borne pathogen
  • irgB
  • PCR detection of Vibrio parahaemolyticus
  • pathogenicity


Introduction

Vibrio parahaemolyticus is a Gram-negative bacterium commonly found in marine and estuarine environments around the world (Daniels et al., 2000). This organism can cause acute gastroenteritis, characterized by diarrhea, headache, vomiting, nausea and low fever, after consumption of raw or partially cooked fish or shellfish (Tuyet et al., 2002; DePaola et al., 2003). Outbreaks of V. parahaemolyticus infection have been reported from many countries and regions such as China (Liu et al., 2004b), Japan (Alam et al., 2002), the United States (McLaughlin et al., 2005) and some European countries (Martinez-Urtaza et al., 2005). Therefore, early detection and identification of V. parahaemolyticus strains in clinical and food samples is essential for diagnosis and for implementing timely risk management decisions. However, the detection of V. parahaemolyticus using conventional culture- and biochemical-based assays is time consuming and laborious, requiring more than 3 days. Those strains that produce thermostable direct hemolysin (TDH) and/or TDH-related hemolysin (TRH) are considered virulent for humans (Dileep et al., 2003; Zhang & Austin, 2005). It is difficult to detect virulent strains in clinical and food samples by traditional culture methods because virulent strains have no obvious growth characteristics that differ from those of nonvirulent strains, and their populations are at very low levels compared with those of other bacteria (Takahashi et al., 2005). PCR has been used for rapid identification of this species and detection of its virulence genes (Bej et al., 1999; Kim et al., 1999; Bauer & Rorvik, 2007). The major virulence genes, the tdh gene encoding TDH or the trh gene encoding TRH, or both of them, have been widely used as diagnostic markers to identify pathogenic isolates of V. parahaemolyticus by PCR methods (Bilung et al., 2005; Marlina et al., 2007; Nordstrom et al., 2007). However, not all strains of V. parahaemolyticus can be accurately identified by PCR assays based on these virulence genes, because the genes are absent from some strains, such as some nonpathogenic strains. These virulence genes therefore cannot be used as V. parahaemolyticus-specific targets.

There is a need for specific molecular markers to accurately identify V. parahaemolyticus by PCR methods. The genes encoding the thermolabile hemolysin (tl) and the transcriptional regulator (toxR), and the pR72H fragment, have been reported as targets for specific detection methods (Lee et al., 1995; Bej et al., 1999; Kim et al., 1999). However, PCR assays targeting tl, toxR and the pR72H fragment still give both false-positive and false-negative results in the identification of V. parahaemolyticus (Croci et al., 2007). Therefore, accurate identification of V. parahaemolyticus requires newer and more specific targets to reduce the risk of both false-positive and false-negative results in PCR assays. High-throughput basic local alignment search tool (BLAST) (Altschul et al., 1990) searching is one comparative genomics method that has been applied to mine new specific targets for some bacteria, including Salmonella enterica Paratyphi A (Ou et al., 2007) and Streptococcus pneumoniae (Oggioni & Pozzi, 2001). Kim et al. (2008b) successfully employed specific 70-mer oligonucleotide probes identified by comparative genomics for microarray detection of 11 major food-borne pathogens. Recent advances in sequencing technology have enriched genomic sequence resources; complete or partial genome sequences of more than 900 microorganisms are publicly available at the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nih.gov/genomes/lproks.cgi). Such abundant sequence information makes it much more convenient and accurate to identify specific markers of bacterial pathogens using comparative genomics.

The aim of this study was to identify new potential species-specific markers using a comparative genomics method for rapid identification of V. parahaemolyticus, and to evaluate one candidate marker by PCR assay.

Materials and methods

Bacterial strains

A total of 339 bacterial strains were used in this study: 293 strains of V. parahaemolyticus, 11 strains of other Vibrio species and 35 non-Vibrio strains (Table 1). Vibrio parahaemolyticus isolates were obtained from the Shanghai Entry-Exit Inspection and Quarantine Bureau, Shanghai, China; other bacterial strains were kept in our laboratory. Bacteria were grown at their optimum temperatures on brain heart infusion (Difco), heart infusion (Difco) or Luria–Bertani (Difco) agar.

Table 1

List of bacterial strains tested in the study

Bacterial species (n) | Strain(s) | PCR results
Vibrio parahaemolyticus | ATCC 17802 | +
Vibrio parahaemolyticus | ATCC 33846 | +
Vibrio parahaemolyticus (291) | Isolates* | +
Vibrio alginolyticus | ATCC 33787 | −
Vibrio vulnificus | ATCC 27562 | −
Vibrio vulnificus | ATCC 33816 | −
Vibrio campbellii | ATCC 33863 | −
Vibrio damsela | ATCC 33539 | −
Vibrio group/freshwater subgroup | SGL 85-4-2 | −
Vibrio harveyi | ATCC 33842 | −
Vibrio fluvialis | ATCC 33810 | −
Vibrio anguillarum | E-3-11 | −
Vibrio mimicus | ATCC 33653 | −
Vibrio cholerae | Isolates | −
Staphylococcus aureus | ATCC 6538 | −
Staphylococcus aureus (15) | Isolates | −
Salmonella enterica | ATCC 14028 | −
Klebsiella pneumoniae | ATCC 27736 | −
Enterobacter cloacae | ATCC 13047 | −
Enterobacter cloacae | ATCC 700323 | −
Escherichia coli | ATCC 43888 | −
Kocuria rhizophila | ATCC 9341 | −
Listeria monocytogenes (3) | ATCC 7644 | −
Pseudomonas aeruginosa | ATCC 15442 | −
Proteus mirabilis | ATCC 12453 | −
Enterococcus faecium | ATCC 27270 | −
Enterococcus faecalis | ATCC 49452 | −
Shigella flexneri | ATCC 51333 | −
Bacillus subtilis | ATCC 6633 | −
Serratia liquefaciens | ATCC 27592 | −
Citrobacter freundii | ATCC 8090 | −
Enterococcus avium | ATCC 14025 | −
Proteus vulgaris | ATCC 33420 | −
  • * Vibrio parahaemolyticus isolates from 184 clinical, 30 environmental and 77 seafood samples.

  • Clinical isolates from Peking University Health Science Center, Beijing, China.

  • n, number of strains tested; +, PCR positive; −, PCR negative.

Sources of sequence data

All 3080 annotated protein-coding sequences (CDSs) of V. parahaemolyticus RIMD 2210633 chromosome 1 were obtained from GenBank (accession number BA000031). The other 811 non-V. parahaemolyticus bacterial genomes used in this study were downloaded from the NCBI bacterial genome resource on January 11, 2009 (ftp://ftp.ncbi.nih.gov/genomes/bacteria/).

Vibrio parahaemolyticus-specific target mining and primer design

The workflow for selecting V. parahaemolyticus-specific CDSs is illustrated in Fig. 1. To identify V. parahaemolyticus-specific markers, the 3080 CDSs of V. parahaemolyticus were searched against a database of all 811 non-V. parahaemolyticus bacterial genome sequences using blastn (version 2.2.18). CDSs whose lowest (best-hit) e-value in the blastn output was ≥0.1 were identified as V. parahaemolyticus-specific markers. One V. parahaemolyticus-specific CDS with a length of 800–1000 bp was used to design a primer set with the software Primer Premier 5.0 (Premier Biosoft International, Palo Alto, CA). All primers used in this study were synthesized by Shanghai Sangon (Shanghai, China).
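The screening rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes the blastn hits are available in a 12-column tab-separated layout with the query ID in column 1 and the e-value in column 11, and the function names, the `cds_lengths` mapping and the toy identifiers are all hypothetical. Treating a CDS with no hit at all as specific is also an assumption.

```python
def lowest_evalues(tabular_lines):
    """Return {query_id: lowest e-value} from 12-column tabular blastn hits."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, evalue = fields[0], float(fields[10])  # column 11 holds the e-value
        if query not in best or evalue < best[query]:
            best[query] = evalue
    return best


def specific_candidates(cds_lengths, best, min_evalue=0.1, min_len=800, max_len=1000):
    """Select CDSs in the length window whose best hit is no better than min_evalue."""
    selected = []
    for cds, length in cds_lengths.items():
        if not min_len <= length <= max_len:
            continue
        # Assumption: a CDS with no hit at all counts as specific.
        if cds not in best or best[cds] >= min_evalue:
            selected.append(cds)
    return sorted(selected)


def hit(query, subject, evalue):
    """Build one toy tabular line; only columns 1, 2 and 11 matter here."""
    return "\t".join([query, subject, "90.0", "50", "5", "0",
                      "1", "50", "1", "50", evalue, "40.0"])


# Hypothetical toy data, not taken from the study.
hits = [
    hit("cdsA", "genome1", "1e-40"),  # strong match elsewhere: not specific
    hit("cdsB", "genome2", "3.0"),
    hit("cdsB", "genome3", "0.5"),    # best hit for cdsB, still >= 0.1
]
cds_lengths = {"cdsA": 900, "cdsB": 850, "cdsC": 990, "cdsD": 1500}

best = lowest_evalues(hits)
print(specific_candidates(cds_lengths, best))  # ['cdsB', 'cdsC']
```

In this toy run, cdsA is rejected because its best hit is highly significant, cdsD is rejected on length, and cdsC passes only because of the no-hit assumption.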

Figure 1

Scheme for mining Vibrio parahaemolyticus-specific CDSs using genome sequence and comparative genomic analysis.

DNA extraction and PCR analysis

Bacterial DNA was extracted as previously described by Liu et al. (2007). PCR was performed in a 20-μL volume using an Eppendorf PCR system (Eppendorf AG22331, Germany). Each reaction contained 1 U Taq DNA polymerase (Tiangen Biotechnology, Beijing, China), 1 × PCR buffer, 1.875 mmol L−1 MgCl2, 0.1 mmol L−1 of each dNTP, 0.25 μmol L−1 of each primer for the irgB gene, approximately 0.1 ng genomic DNA and sterile distilled water up to 20 μL. The reaction mixture with no template DNA was used as a negative control. The thermal cycling conditions consisted of an initial denaturation at 94 °C for 5 min, followed by 30 amplification cycles (94 °C for 30 s, 62 °C for 30 s and 72 °C for 30 s), and a final extension step at 72 °C for 10 min. The PCR products were examined by 1.5% agarose gel electrophoresis.
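As a quick check on run length, the cycling program above can be written down as data and summed. The sketch below uses only the temperatures and hold times stated in the protocol; the variable and function names are illustrative, and ramp times between steps are ignored, so the real run on a given instrument would take longer.

```python
# (temperature in deg C, hold time in seconds), as stated in the protocol
INITIAL_DENATURATION = (94, 5 * 60)
CYCLE_STEPS = [(94, 30), (62, 30), (72, 30)]  # denature, anneal, extend
N_CYCLES = 30
FINAL_EXTENSION = (72, 10 * 60)


def total_hold_seconds(initial, cycle_steps, n_cycles, final):
    """Sum the programmed hold times; ramping between temperatures is not included."""
    per_cycle = sum(seconds for _, seconds in cycle_steps)
    return initial[1] + n_cycles * per_cycle + final[1]


total = total_hold_seconds(INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION)
print(f"Programmed hold time: {total // 60} min")  # Programmed hold time: 60 min
```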

PCR specificity and sensitivity

Specificity of the primers was tested against a total of 293 strains of V. parahaemolyticus, 11 bacterial strains from other Vibrio species and 35 bacterial strains from non-Vibrio species. Some irgB amplicons were sequenced using an automated DNA sequencer (ABI 3730XL DNA Analyzer). Two primers for the 16S rRNA gene were selected for PCR amplification of the 46 non-V. parahaemolyticus bacterial strains as a positive control (Table 2). For sensitivity testing, purified genomic DNA from V. parahaemolyticus ATCC 17802 was serially diluted 10-fold and tested by PCR.

Table 2

Sequences of primers used in this study

Target gene | Sequence | Source
tdh (F) | 5′-TCCCTTTTCCTGCCCCC-3′ | Nordstrom et al. (2007)
tdh (R) | 5′-CGCTGCCATTGTATAGTCTTTATC-3′ | Nordstrom et al. (2007)
trh (F) | 5′-TTGGCTTCGATATTTTCAGTATCT-3′ | Bej et al. (1999)
trh (R) | 5′-CATAACAAACATATGCCCATTTCCG-3′ | Bej et al. (1999)
16S rRNA (F) | 5′-AGAGTTTGATCMTGGCTCAG-3′ | Kim et al. (2008a)
16S rRNA (R) | 5′-TACGGYTACCTTGTTACGACTT-3′ | Kim et al. (2008a)
  • F, Forward primer; R, Reverse primer.
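The two 16S rRNA primers in Table 2 contain IUPAC degenerate codes (M = A or C, Y = C or T), so each is in practice a mixture of sequences. The following sketch, offered only as an illustration, expands the degenerate positions and reports length and GC fraction for the primers listed in the table. The function names and the `PRIMERS` dictionary are hypothetical helpers, not part of the study.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer.upper()))]


def gc_fraction(seq):
    """Fraction of G and C bases in a non-degenerate sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


# Primer sequences copied from Table 2 (5' to 3').
PRIMERS = {
    "tdh (F)": "TCCCTTTTCCTGCCCCC",
    "tdh (R)": "CGCTGCCATTGTATAGTCTTTATC",
    "trh (F)": "TTGGCTTCGATATTTTCAGTATCT",
    "trh (R)": "CATAACAAACATATGCCCATTTCCG",
    "16S rRNA (F)": "AGAGTTTGATCMTGGCTCAG",
    "16S rRNA (R)": "TACGGYTACCTTGTTACGACTT",
}

for name, seq in PRIMERS.items():
    variants = expand(seq)
    gc_values = [gc_fraction(v) for v in variants]
    print(f"{name}: {len(seq)} nt, {len(variants)} variant(s), "
          f"GC {min(gc_values):.2f}-{max(gc_values):.2f}")
```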

Multiplex PCR amplification

A multiplex PCR for detection of the 293 V. parahaemolyticus strains was carried out by adding the primer pairs for irgB, tdh and trh simultaneously to a single reaction system (Table 2). Testing of different primer concentrations gave optima of 0.125, 0.25 and 0.25 μmol L−1 for the irgB, tdh and trh genes, respectively. Other conditions for PCR amplification remained as described above.


Results

The mining of V. parahaemolyticus-specific targets

To identify V. parahaemolyticus-specific markers, 3080 CDSs were screened for nucleotide sequence similarity against the 811 non-V. parahaemolyticus bacterial genomes available at NCBI. For convenience in the subsequent primer design, we selected V. parahaemolyticus-specific CDSs with a length of 800–1000 bp as candidate targets from the blastn output. As a result, 23 V. parahaemolyticus-specific CDSs with the lowest e-value ≥0.1 were identified. The accession numbers of the 23 V. parahaemolyticus-specific candidate CDSs and their gene products are provided as supporting data (Supporting Information, Table S1). Among these candidate CDSs, the irgB gene and the Ocd2 gene have known functions, and the others encode hypothetical proteins of unknown function. The irgB gene (vp2603) codes for the iron-regulated virulence regulatory protein IrgB and has not been reported as a detection target in previous research. In this study, the irgB gene was selected as the target gene for PCR identification of V. parahaemolyticus, and a pair of primers was designed for this gene (Table 2).

The specificity and sensitivity of irgB-based primers

To evaluate the specificity of the PCR assay, PCR amplifications using irgB-specific primers were performed with 293 V. parahaemolyticus strains and 46 non-V. parahaemolyticus bacterial strains using purified genomic DNA as templates. Amplification of genomic DNA isolated from all 293 V. parahaemolyticus strains resulted in a product with the predicted length of 369 bp, whereas no products were obtained from the 46 non-V. parahaemolyticus bacterial strains. Typical data are shown in Fig. 2a. In the PCR with 16S rRNA gene-based primers, used as a positive control, an amplicon of 1466 bp was seen in all 46 non-V. parahaemolyticus strains tested in this study (Fig. 2b). A minimum of 0.17 pg of purified genomic DNA yielded a detectable irgB amplicon of the expected length of 369 bp (Fig. 3). These results suggest that the irgB gene is a new species-specific marker for rapid identification of V. parahaemolyticus.
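The strain counts reported above can be cross-checked with a short tally. The sketch below is illustrative only: the names `inclusivity` (fraction of target strains amplified) and `exclusivity` (fraction of non-target strains not amplified) are labels chosen here, not terms used in the study, and the numbers are those given in the text.

```python
# Counts taken from the text: strains tested and strains giving an irgB product.
panel = {
    "V. parahaemolyticus": {"tested": 293, "positive": 293},
    "other Vibrio species": {"tested": 11, "positive": 0},
    "non-Vibrio species": {"tested": 35, "positive": 0},
}

total_tested = sum(group["tested"] for group in panel.values())

target = panel["V. parahaemolyticus"]
non_target = [g for name, g in panel.items() if name != "V. parahaemolyticus"]
non_target_tested = sum(g["tested"] for g in non_target)
non_target_negative = sum(g["tested"] - g["positive"] for g in non_target)

inclusivity = target["positive"] / target["tested"]
exclusivity = non_target_negative / non_target_tested

print(total_tested, non_target_tested)  # 339 46
print(f"inclusivity {inclusivity:.0%}, exclusivity {exclusivity:.0%}")
```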

Figure 2

(a) Agarose gel electrophoresis of Vibrio parahaemolyticus-specific DNA PCR products amplified using irgB primers. M, 100-bp DNA ladder. (b) Agarose gel electrophoresis of 16S rRNA gene PCR products amplified using 16S rRNA gene primers as a positive control. M, 200-bp DNA ladder. Lane 1, V. parahaemolyticus ATCC 17802; lane 2, Vibrio alginolyticus ATCC 33787; lane 3, Vibrio vulnificus ATCC 27562; lane 4, Vibrio campbellii ATCC 33863; lane 5, Vibrio damsela ATCC 33539; lane 6, Vibrio group/freshwater subgroup SGL 85-4-2; lane 7, Vibrio harveyi ATCC 33842; lane 8, Vibrio fluvialis ATCC 33810; lane 9, Vibrio anguillarum E-3-11; lane 10, Vibrio mimicus ATCC 33653; lane 11, Citrobacter freundii ATCC 8090; lane 12, Staphylococcus aureus ATCC 6538; lane 13, Salmonella enterica ssp. enterica serovar Typhimurium ATCC 14028; lane 14, Klebsiella pneumoniae ssp. pneumoniae ATCC 27736; lane 15, Enterobacter cloacae ATCC 13047; lane 16, Escherichia coli ATCC 43888; lane 17, Shigella flexneri ATCC 51333; lane 18, Listeria monocytogenes ATCC 7644; lane 19, Pseudomonas aeruginosa ATCC 15442; lane 20, Proteus mirabilis ATCC 12453; lane 21, Enterococcus faecium ATCC 27270; lane 22, Enterococcus faecalis ATCC 49452; lane 23, negative control.

Figure 3

Sensitivity evaluation of PCR primers for the irgB gene as a Vibrio parahaemolyticus-specific marker at various DNA concentrations. M, 100-bp DNA ladder; lane 1, 17 ng; lane 2, 1.7 ng; lane 3, 0.17 ng; lane 4, 17 pg; lane 5, 1.7 pg; lane 6, 0.17 pg; lane 7, 17 fg; lane 8, 1.7 fg; lane 9, 0.17 fg; lane 10, negative control.
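The lane loadings in Fig. 3 form a 10-fold dilution series starting at 17 ng. As a minimal sketch, the series can be regenerated with integer arithmetic in attograms to avoid floating-point drift. The helper names and the unit-selection rule (use the largest unit in which the value is at least 0.1) are assumptions made here to reproduce the caption's labels.

```python
# Attograms per unit, largest unit first.
AG_PER_UNIT = [("ng", 10**9), ("pg", 10**6), ("fg", 10**3)]


def tenfold_series(start_ag, steps):
    """Masses (in attograms) of a 10-fold serial dilution with `steps` points."""
    return [start_ag // 10**i for i in range(steps)]


def label(mass_ag):
    """Format a mass using the largest unit in which the value is >= 0.1."""
    for unit, factor in AG_PER_UNIT:
        if mass_ag * 10 >= factor:
            return f"{mass_ag / factor:g} {unit}"
    return f"{mass_ag} ag"


series = tenfold_series(17 * 10**9, 9)  # lanes 1-9, starting at 17 ng
labels = [label(mass) for mass in series]

for lane, text in enumerate(labels, start=1):
    print(f"lane {lane}: {text}")

# Lane 6 is the lowest amount reported as detectable in the text.
print("detection limit:", labels[5])  # detection limit: 0.17 pg
```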

Development of multiplex PCR method

Amplicons of irgB (369 bp), tdh (233 bp) and trh (500 bp) were simultaneously generated in a multiplex reaction system from genomic DNA of V. parahaemolyticus. This multiplex PCR was applied to 291 V. parahaemolyticus isolates from 184 clinical, 30 environmental and 77 seafood samples. All 291 V. parahaemolyticus isolates showed PCR amplification of the irgB gene, 215 isolates showed amplification of tdh gene and 70 isolates showed amplification of trh gene. In addition, 63 isolates showed simultaneous amplification of both tdh and trh genes (Fig. 4). If irgB and either or both tdh and trh amplicons were generated simultaneously in a single reaction system, it could be concluded that those strains were virulent strains of V. parahaemolyticus. Thus, the multiplex PCR assay for simultaneous amplification of irgB, tdh and trh genes should be of considerable value in detecting the total number of strains and the virulent strains of V. parahaemolyticus of clinical and environmental origins.
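The interpretation rule and the isolate counts above can be made explicit with a short sketch. The `classify` function encodes the rule stated in the text (irgB plus tdh and/or trh indicates a virulent strain); its name and return strings are illustrative. The derived counts follow from inclusion-exclusion on the reported totals and are not figures quoted by the authors.

```python
def classify(irgb, tdh, trh):
    """Interpret the band pattern of one multiplex reaction."""
    if not irgb:
        return "not identified as V. parahaemolyticus"
    if tdh or trh:
        return "virulent V. parahaemolyticus"
    return "V. parahaemolyticus without tdh/trh"


# Reported counts among the 291 isolates (all irgB positive).
TOTAL, TDH, TRH, BOTH = 291, 215, 70, 63

tdh_only = TDH - BOTH        # tdh positive, trh negative
trh_only = TRH - BOTH        # trh positive, tdh negative
either = TDH + TRH - BOTH    # carrying at least one hemolysin gene
neither = TOTAL - either     # irgB positive only

print(tdh_only, trh_only, either, neither)  # 152 7 222 69
print(classify(True, True, False))          # virulent V. parahaemolyticus
```

Under the stated rule, 222 of the 291 isolates would be called virulent and 69 would be identified as V. parahaemolyticus without either hemolysin gene.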

Figure 4

Multiplex PCR assay for detection of Vibrio parahaemolyticus targeting irgB, tdh and trh that produces amplicons of 369, 233 and 500 bp, respectively. M, 100-bp DNA ladder; lane 1, V. parahaemolyticus E206; lane 2, V. parahaemolyticus E128; lane 3, V. parahaemolyticus E95; lane 4, V. parahaemolyticus E87; lane 5, V. parahaemolyticus F6; lane 6, V. parahaemolyticus W2; lane 7, V. parahaemolyticus W3; lane 8, V. parahaemolyticus E769; lane 9, V. parahaemolyticus E921; lane 10, negative control.


Discussion

PCR methods have been applied to the detection of bacterial pathogens for decades (Bej et al., 1999; Liu et al., 2004a, 2005; Bauer & Rorvik, 2007; Kim et al., 2008a). The specificity of target sequences is crucial for accurate identification of these pathogens. Specific genes or universal genes, including toxin genes and the 16S rRNA gene, have been used as target markers for PCR assays (Martinez-Picado et al., 1994; Bej et al., 1999). Unfortunately, there is often significant nucleotide sequence similarity among toxin genes in bacterial species, especially within the same genus, and this sequence similarity has prevented these toxin genes from being useful targets for species-specific identification of bacterial pathogens (Chizhikov et al., 2001). The 16S rRNA gene sequences of the Vibrionaceae family showed >90% nucleotide sequence similarity when this gene was analyzed in 35 Vibrio strains (Urakawa et al., 1997). It seems that such a high degree of sequence identity does not allow reliable discrimination of specific strains using PCR methods. Computational genomics has led the way to efficient and customized mining of genomes for species-specific nucleotide sequences. The BLAST program, a frequently used tool for nucleotide sequence comparisons, has been applied to identify specific targets for the detection and identification of bacterial pathogens (Oggioni & Pozzi, 2001; Kim et al., 2006, 2008b). To mine targets with a high level of specificity, we identified 23 V. parahaemolyticus-specific candidate CDSs by standalone BLAST searching against the local database. Among the 23 V. parahaemolyticus-specific candidate CDSs, seven were designated hypothetical proteins, 14 were identified as putative genes and two were characterized by their function. Revealing the specificity of CDSs might be helpful in understanding the metabolic behaviors unique to V. parahaemolyticus.

The specificity in silico is largely determined by the screening criteria. If a blastn search with a query sequence returns a best match whose e-value (the lowest e-value for that query) is ≥0.001, the query sequence is considered to share little or no sequence similarity with any nucleotide sequence in the database and, for our purposes, should be considered a specific sequence target (LaGier & Threadgill, 2008). Here, we chose a lowest e-value of ≥0.1, a stricter cutoff, as the standard for selecting V. parahaemolyticus-specific CDSs. In general, the identification of specific sequences becomes more reliable as more bacterial genomes are added to the database used for BLAST comparison. In this study, genome sequences of 811 non-V. parahaemolyticus bacteria proved to be sufficient for identifying V. parahaemolyticus-specific CDSs.

In the present study, the specificity of the irgB gene was verified by PCR amplification of templates from 293 V. parahaemolyticus strains and 46 non-V. parahaemolyticus bacterial strains. The iron-regulated virulence regulatory protein IrgB, which is associated with iron utilization, may have profound influences, beyond iron acquisition, on the pathogenesis of V. parahaemolyticus (Wong et al., 1996). Therefore, the identification of the irgB gene in V. parahaemolyticus will not only provide a species-specific target for diagnostic application but may also lead to a better understanding of the genetic mechanisms of its survival in its niche environments as well as its pathogenicity.

The detection of tdh and trh genes in V. parahaemolyticus is necessary to determine the real risk posed to human health by the presence of this microorganism. The species-specific irgB gene can be used to identify V. parahaemolyticus accurately by PCR methods, and the toxin genes indicate the pathogenic potential of a strain. Thus, we developed a multiplex PCR assay targeting the species-specific marker irgB and the toxin genes tdh and trh to detect total and virulent strains of V. parahaemolyticus, which has the potential to reduce V. parahaemolyticus-associated illness in humans.

In conclusion, our results demonstrated the successful application of comparative genomics to mine specific markers used in PCR methods for accurate detection and identification of V. parahaemolyticus. The irgB gene was validated as a new V. parahaemolyticus-specific marker. A multiplex PCR assay targeting the irgB, tdh and trh genes was successfully developed to detect total and virulent strains of V. parahaemolyticus, and it has the potential to be applied in food industries, diagnostics and taxonomic studies. Using this comparative genomics method, it is conceivable that specific targets could be identified for the detection of any bacterium for which a genome sequence is available.

Supporting Information

Table S1. List of 23 CDSs with the lowest e-value ≥0.1 from Vibrio parahaemolyticus.


Acknowledgements

This work was jointly supported by grant No. 2009BAK43B31 from the Ministry of Science and Technology of China and by grant Nos. 08391911000, 08142200700 and 08DZ0504200 from the Science & Technology Commission of Shanghai Municipality.


  • Editor: Jeff Cole

