DNA Research Advance Access published online on April 27, 2008
DNA Research, doi:10.1093/dnares/dsn006
Short Communication
A Variable Gene in a Conserved Region of the Helicobacter pylori Genome: Isotopic Gene Replacement or Rapid Evolution?
1 INSERM U853, Laboratoire de Bactériologie, Université Victor Segalen Bordeaux 2, 146 rue Léo Saignat, F-33076 Bordeaux cedex, France
2 Université Victor Segalen Bordeaux 2, Laboratoire de Bactériologie, Bordeaux F-33076, France
3 Institut Pasteur, Génétique des Génomes Bactériens - CNRS URA2171, Paris F-750015, France
Received 6 November 2007; accepted 3 April 2008.
| Abstract |
|---|
The present study concerns the identification of a novel coding sequence in a region of the Helicobacter pylori genome, located between JHP1069/HP1141 and JHP1071/HP1143 according to the numbering of the J99 and 26695 reference strains, respectively, and spanning three different coding DNA sequences (CDSs). The CDSs located at the centre of this locus were highly polymorphic, as determined by the analysis of 24 European, 3 Asian, and 3 African isolates. Phylogenetic and molecular evolutionary analyses showed that the distribution of the CDSs was not restricted by the geographical origin of the strains. Despite the very high variability observed in the deduced protein sequences, significant similarity was consistently observed with the same protein families, i.e. ATPase and bacteriophage receptor/invasion proteins. Although this variability could be explained by isotopic gene replacement via horizontal transfer of a gene with the same function but coming from a variety of sources, it seems more likely that the very high sequence variation observed at this locus is the result of a strong selection pressure exerted on the corresponding gene product. The CDSs identified in the present study could be used as strain-specific markers.
Key words: Helicobacter pylori; coding DNA sequence; genetic diversity; diversifying selection
Comparative analyses conducted on Helicobacter pylori genome sequences, i.e. from H. pylori strain J99 associated with peptic ulcer,1
Subtractive hybridization is a powerful tool for comparative prokaryotic genomics and has been validated on H. pylori by several authors.10,11 In a previous study, we used subtractive hybridization to compare the genetic content of one H. pylori strain isolated from a patient with gastric MALT lymphoma (strain B34) with that of one strain associated with chronic gastritis only.12 One original 1092 bp sequence was identified that showed no significant nucleotide similarity to the available genomes of the H. pylori reference strains 26695 and J99. The aim of the present study was to localize this sequence in the H. pylori genome, to determine its prevalence, and to analyze its genetic diversity in H. pylori.
Using an in-house genome walking method as previously described,13 the original region was localized in the H. pylori genome and a new CDS was subsequently identified using the CDS finder website (http://www.ncbi.nlm.nih.gov/gCDS/CDSig.cgi). This new CDS, called CDS2, is located between two CDSs homologous to JHP1069/HP1141 and JHP1071/HP1143 according to the numbering of the J99 and 26695 reference strains, respectively.1,3 CDS2 occupies the position held by JHP1070/HP1142, hereafter called CDS1, in the H. pylori reference strains J99 and 26695.
The percentage of identity between the nucleotide sequences of CDS1 and CDS2 was determined using the LALIGN software,14 which identifies multiple matching subsegments in two sequences (http://www.ch.embnet.org/software/LALIGN_form.html). CDS2 showed 54.9% identity in a 2046-nucleotide overlap with JHP1070 and 55.5% identity in a 2083-nucleotide overlap with HP1142. CDS2 encodes a putative polypeptide of 820 residues (GenBank accession number EF492441, EMBL Nucleotide Sequence AM902682). At the protein level, CDS2 shared 23.6% identity with JHP1070 in a 628-amino-acid overlap and 24.4% identity with HP1142 in a 630-amino-acid overlap. Finally, strong nucleotide identity was found with the HPAG1_1080 sequence,4 with 89.3% identity in a 2469-nucleotide overlap.
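As an illustration of how a figure such as "54.9% identity in a 2046-nucleotide overlap" is derived, the following minimal Python sketch counts identical columns in a pairwise alignment that has already been computed. It is not the LALIGN algorithm itself: the function name `percent_identity`, the use of `-` as the gap symbol, and the convention that gap columns count towards the overlap length are assumptions made for illustration.

```python
from typing import Tuple


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> Tuple[float, int]:
    """Return (percent identity, overlap length) for two pre-aligned sequences.

    Assumption: both strings are rows of the same pairwise alignment, so they
    have equal length, and every alignment column (gaps included) counts
    towards the overlap length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    overlap = len(aligned_a)
    if overlap == 0:
        raise ValueError("empty alignment")
    # A column is a match when both residues are identical and neither is a gap.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / overlap, overlap


if __name__ == "__main__":
    # Toy alignment, not data from this study.
    identity, overlap = percent_identity("ACGT-A", "ACCTTA")
    print(f"{identity:.1f}% identity in a {overlap}-nucleotide overlap")
```

The same function applies unchanged to aligned amino acid sequences, since it only compares characters column by column.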
The prevalence and the genetic diversity of the identified genomic locus were first determined for 24 H. pylori strains: 13 strains isolated from gastric MALT lymphoma patients enrolled in two multicentre French protocols and 11 strains isolated from French patients with chronic gastritis only, as previously described.12,15 The locus was amplified by PCR using primers hybridizing to the conserved sequences of the flanking genes (JHP1069/HP1141 and JHP1071/HP1143, according to the numbering of the J99 and 26695 strains, respectively). The primers were designed using the web Primer3 software (http://www.broad.mit.edu/cgi-bin/primer/primer3_www.cgi).16 Direct sequencing was carried out on both strands, and nucleotide and deduced protein sequences were compared with the NCBI Blast program (http://www.ncbi.nlm.nih.gov/BLAST/). A CDS was always present at this locus: CDS1 was found in 54% of the strains, CDS2 in 29% of the strains, and an additional CDS, called CDS3, was identified in 17% of the strains. In the chronic gastritis only H. pylori strain G2, CDS3 showed 53.4% identity in a 2005-nucleotide overlap with CDS1 and 52.9% identity in a 2063-nucleotide overlap with CDS2, and it encodes a putative polypeptide of 861 residues (GenBank accession number EF492442, EMBL Nucleotide Sequence AM902683). CDS3 still has no counterpart in databases. For the three CDSs, no significant association was found with any virulence factor or pathology (data not shown). The presence or absence of these CDSs was also verified by dot blot hybridization, as previously described.12 This analysis showed that the three CDSs were mutually exclusive, i.e. only one was present in any given strain (no local duplication, data not shown).
We first focused on the role of the genes present around the polymorphic locus. According to the revised annotation of the H. pylori genome,17 JHP1069/HP1141 encodes a methionyl-tRNA formyltransferase (fmt) and JHP1071/HP1143 a conserved hypothetical protein. fmt is considered to be an essential gene which links general metabolism with the translation process (protein biosynthesis).18,19 As shown in Fig. 1, JHP1069/HP1141 and JHP1071/HP1143 are surrounded by genes of hypothetical function. All of the CDSs in the region had a G + C content similar to that of the rest of the H. pylori genome (39%), except for the variable CDSs: CDS1, CDS2, and CDS3 had G + C contents of 29, 30, and 31%, respectively. The lower G + C content suggests an external origin of these CDSs or a rapid adaptation.20 Indeed, Saunders et al.,21 using a tetranucleotide and hexanucleotide signature analysis, identified substantial differences between the JHP1070 and HP1142 genes and hypothesized that they were horizontally transferred.
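The G + C comparison above rests on a simple base count. A minimal Python sketch of that calculation is given here for illustration; the function name `gc_percent` and the decision to ignore ambiguous bases (anything other than A, C, G, or T) are assumptions, not details taken from the study.

```python
def gc_percent(sequence: str) -> float:
    """Return the G + C content of a nucleotide sequence as a percentage.

    Assumption: only unambiguous bases (A, C, G, T) are counted, so symbols
    such as N or gap characters do not affect the result.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # Toy sequence, not data from this study.
    print(f"{gc_percent('ATGCATAT'):.1f}% G + C")
```

Applied to each CDS in a region and to the whole genome, such a count makes it possible to flag CDSs whose composition departs from the genomic average, as reported here for CDS1, CDS2, and CDS3.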
CDS1 has been annotated as a predicted coding region JHP1070 with no homolog in the databases. It codes for a putative polypeptide of 759 residues. Using a Blastp search, significant homologies were found with (i) Rlo proteins (R-linked ORF) from Campylobacter (e.g. RloG, E = e-11 in Campylobacter jejuni strain RM1167, or RloC, E = 7e-13 in C. jejuni strain RM11221),22
How can one explain the apparent variability of the locus identified in the present study? One potential hypothesis is that the region is a hot spot for gene insertion/deletion, with a specific selection pressure maintaining a particular function at that precise location in the genome. Suerbaum and Josenhans26 recently reviewed the current data on the genetic diversity of H. pylori and argued that this bacterium uses mutation and recombination processes to adapt to its individual host by modifying molecules that interact with the host.26 Because the three CDSs retain similarity to the same protein families, it is likely that (i) these proteins share the same function or (ii) the gene is subjected to a specific selection pressure that makes it evolve at a very rapid rate. We propose that such a protein could be a phage receptor/translocator or that it could allow phage DNA to enter host cells by remodelling the cell wall.27–29 Indeed, as already described in Escherichia coli, this kind of protein is subjected to strong positive selection.30
Helicobacter pylori genotypes vary markedly with their geographical region, and this is particularly the case for genes under positive selection. Therefore, the corresponding genes were sought in three East Asian strains and three African strains. All three CDSs were found: CDS1 in one Asian strain (strain 8038) and one African strain (strain TALLAN), CDS2 in two Asian strains (strains 12001 and 8033), and CDS3 in one Asian strain (strain 19A) and one African strain (strain BAPOOI) (Fig. 2). A phylogenetic analysis was conducted on the deduced amino acid sequences of the CDSs. Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 4.31 Phylogenetic trees were generated by the neighbour-joining method.32 Molecular distances were determined using the Kimura two-parameter model.33 The tree showed three independent clusters which were clearly separated and corresponded to CDS1, CDS2, and CDS3, respectively (Fig. 2). However, the exact organization of these different CDSs cannot be determined since this consensus tree cannot be rooted using other species. Indeed, no CDS with significant homology has been found in other species in the databases. Interestingly, even though the testing was performed on a limited number of non-European strains, these results indicate that the presence of any one of the three CDSs is not restricted by the geographical origin of the strains.
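For readers unfamiliar with the Kimura two-parameter model cited above, the following Python sketch computes the corresponding distance, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. It is a minimal illustration for two aligned nucleotide sequences and not the MEGA implementation; the function name `k2p_distance` and the rule of skipping columns with gaps or ambiguous bases are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Assumption: columns containing a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    term_a = 1.0 - 2.0 * p - q
    term_b = 1.0 - 2.0 * q
    if term_a <= 0.0 or term_b <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term_a) - 0.25 * math.log(term_b)


if __name__ == "__main__":
    # Toy sequences, not data from this study: one transition in ten sites.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A matrix of such pairwise distances is the input that a neighbour-joining procedure clusters into a tree.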
The type of selection operating at the amino acid level was also evaluated by comparing non-synonymous substitutions (Ka) and synonymous substitutions (Ks).34
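The conventional reading of a Ka/Ks comparison can be sketched in a few lines of Python. The sketch assumes that Ka and Ks have already been estimated by a standard method; the function name `selection_regime` and the `tolerance` band around a ratio of 1 are hypothetical choices made for illustration and are not values used in this study.

```python
def selection_regime(ka: float, ks: float, tolerance: float = 0.05) -> str:
    """Classify the type of selection suggested by a Ka/Ks ratio.

    Assumption: `tolerance` is an arbitrary illustrative band around 1 within
    which the ratio is treated as compatible with neutral evolution.
    """
    if ka < 0:
        raise ValueError("Ka must be non-negative")
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    ratio = ka / ks
    if ratio > 1.0 + tolerance:
        return "positive (diversifying) selection"
    if ratio < 1.0 - tolerance:
        return "purifying selection"
    return "approximately neutral"


if __name__ == "__main__":
    # Toy values, not estimates from this study.
    print(selection_regime(0.30, 0.10))
```

In practice, a statistical test on the difference between Ka and Ks would replace the fixed tolerance used here.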
Finally, we propose that the very high variation observed in the protein sequences reflects the permanent selection pressure exerted by phages or other elements interacting with the organism's cell envelope. If this is the case, this locus could be used as a marker for constraints operating in the environmental niches in which particular H. pylori strains evolve. The presence of phages in H. pylori has rarely been described.37
In summary, a novel polymorphic locus comprising a single gene was identified in the H. pylori genome. Although this variation could be explained by isotopic gene replacement via horizontal transfer of a gene with the same function but coming from a variety of sources, it seems more likely that the very high sequence variation observed at this locus is the result of a strong selection pressure exerted on the corresponding gene product. We propose that the evolution of CDS1, CDS2, and CDS3 is due to the occurrence of a specific environmental event, such as interaction with a biological structure, e.g. bacteriophages, which are involved in surface cell secretion. The genes identified in the present study could be used as strain-specific markers for particular niches. The predicted function of the gene products, although highly speculative, should encourage investigators to explore the presence of phages in the H. pylori environment and to study their relationship to pathogenicity.
| Acknowledgments |
|---|
The authors want to thank Dr Monica Oleastro from the Departamento de Doenças Infecciosas of the Instituto Nacional Saúde Dr Ricardo Jorge (Lisbon, Portugal) for the comparison of non-synonymous substitutions and synonymous substitutions among the three CDSs described in this study and Dr Jorge M. B. Vítor and Dr Vale Filipa from the Faculty of Pharmacy in Lisbon for strains. The study was financially supported by the Institut de Recherche des Maladies de l'Appareil Digestif (IRMAD), the Association pour la Recherche contre le Cancer (ARC), and the Conseil Régional d'Aquitaine, France.
| Footnotes |
|---|
* To whom correspondence should be addressed. Tel. +33 5-57-57-12-86. Fax. +33 5-56-51-41-82. E-mail: philippe.lehours@labhel.u-bordeaux2.fr
| References |
|---|
1. Alm R. A., Ling L. S. L., Moir D. T., et al. Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature (1999) 397:176–180.
2. Alm R. A., Trust T. J. Analysis of the genetic diversity of Helicobacter pylori: the tale of two genomes. J. Mol. Med. (1999) 77:834–846.
3. Tomb J. F., White O., Kerlavage A. R., et al. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature (1997) 388:539–547.
4. Oh J. D., Kling-Backhed H., Giannakis M., et al. The complete genome sequence of a chronic atrophic gastritis Helicobacter pylori strain: evolution during disease progression. Proc. Natl. Acad. Sci. USA (2006) 103:9999–10004.
5. Salama N., Guillemin K., McDaniel T. K., Sherlock G., Tompkins L., Falkow S. A whole-genome microarray reveals genetic diversity among Helicobacter pylori strains. Proc. Natl. Acad. Sci. USA (2000) 97:14668–14673.
6. Falush D., Wirth T., Linz B., et al. Traces of human migrations in Helicobacter pylori populations. Science (2003) 299:1582–1585.
7. Raymond J., Thiberge J. M., Chevalier C., et al. Genetic and transmission analysis of Helicobacter pylori strains within a family. Emerg. Infect. Dis. (2004) 10:1816–1821.
8. Gressmann H., Linz B., Ghai R., et al. Gain and loss of multiple genes during the evolution of Helicobacter pylori. PLoS Genet. (2005) 1.
9. Chanto G., Occhialini A., Gras N., Alm R. A., Megraud F., Marais A. Identification of strain-specific genes located outside the plasticity zone in nine clinical isolates of Helicobacter pylori. Microbiology (2002) 148:3671–3680.
10. Akopyants N. S., Fradkov A., Diatchenko L., et al. PCR-based subtractive hybridization and differences in gene content among strains of Helicobacter pylori. Proc. Natl. Acad. Sci. USA (1998) 95:13108–13113.
11. Kersulyte D., Mukhopadhyay A. K., Shirai M., Nakazawa T., Berg D. E. Functional organization and insertion specificity of IS607, a chimeric element of Helicobacter pylori. J. Bacteriol. (2000) 182:5300–5308.
12. Lehours P., Dupouy S., Bergey B., et al. Identification of a genetic marker of Helicobacter pylori strains involved in gastric extranodal marginal zone B cell lymphoma of the MALT-type. Gut (2004) 53:931–937.
13. Abdelbaqi K., Menard A., Prouzet-Mauleon V., Bringaud F., Lehours P., Megraud F. Nucleotide sequence of the gyrA gene of Arcobacter species and characterization of human ciprofloxacin-resistant clinical isolates. FEMS Immunol. Med. Microbiol. (2007) 49:337–345.
14. Huang X., Miller M. A time-efficient, linear-space local similarity algorithm. Adv. Appl. Math. (1991) 12:337–357.
15. Lehours P., Menard A., Dupouy S., et al. Evaluation of the association of nine Helicobacter pylori virulence factors with strains involved in low-grade gastric mucosa-associated lymphoid tissue lymphoma. Infect. Immun. (2004) 72:880–888.
16. Rozen S., Skaletsky H. Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. (2000) 132:365–386.
17. Boneca I. G., de Reuse H., Epinat J.-C., Pupin M., Labigne A., Moszer I. A revised annotation and comparative analysis of Helicobacter pylori genomes. Nucl. Acids Res. (2003) 31:1704–1714.
18. Meinnel T., Guillon J. M., Mechulam Y., et al. The Escherichia coli fmt gene, encoding methionyl-tRNA(fMet) formyltransferase, escapes metabolic control. J. Bacteriol. (1993) 175:993–1000.
19. Guillon J. M., Mechulam Y., Schmitter J. M., Blanquet S., Fayat G. Disruption of the gene for Met-tRNA(fMet) formyltransferase severely impairs growth of Escherichia coli. J. Bacteriol. (1992) 174:4294–4301.
20. Rocha E. P., Danchin A. Base composition bias might result from competition for metabolic resources. Trends Genet. (2002) 18:291–294.
21. Saunders N. J., Boonmee P., Peden J. F., Jarvis S. A. Inter-species horizontal transfer resulting in core-genome and niche-adaptive variation within Helicobacter pylori. BMC Genom. (2005) 6:9.
22. Miller W. G., Pearson B. M., Wells J. M., Parker C. T., Kapitonov V. V., Mandrell R. E. Diversity within the Campylobacter jejuni type I restriction-modification loci. Microbiology (2005) 151:337–351.
23. Fouts D. E., Mongodin E. F., Mandrell R. E., et al. Major structural differences and novel potential virulence mechanisms from the genomes of multiple Campylobacter species. PLoS Biol. (2005) 3.
24. Lazarevic V., Dusterhoft A., Soldo B., Hilbert H., Mauel C., Karamata D. Nucleotide sequence of the Bacillus subtilis temperate bacteriophage SPbetac2. Microbiology (1999) 145:1055–1067.
25. Kapatral V., Anderson I., Ivanova N., et al. Genome sequence and analysis of the oral bacterium Fusobacterium nucleatum strain ATCC 25586. J. Bacteriol. (2002) 184:2005–2018.
26. Suerbaum S., Josenhans C. Helicobacter pylori evolution and phenotypic diversification in a changing host. Nat. Rev. Microbiol. (2007) 5:441–452.
27. Qasba P. K., Kumar S. Molecular divergence of lysozymes and alpha-lactalbumin. Crit. Rev. Biochem. Mol. Biol. (1997) 32:255–306.
28. Rydman P. S., Bamford D. H. Bacteriophage PRD1 DNA entry uses a viral membrane-associated transglycosylase activity. Mol. Microbiol. (2000) 37:356–363.
29. Blackburn N. T., Clarke A. J. Identification of four families of peptidoglycan lytic transglycosylases. J. Mol. Evol. (2001) 52:78–84.
30. Petersen L., Bollback J. P., Dimmic M., Hubisz M., Nielsen R. Genes under positive selection in Escherichia coli. Genome Res. (2007) 17:1336–1343.
31. Tamura K., Dudley J., Nei M., Kumar S. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. (2007) 24:1596–1599.
32. Saitou N., Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. (1987) 4:406–425.
33. Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. (1980) 16:111–120.
34. Vandamme A. Basic concepts of molecular evolution. In: Salemi M., Vandamme A., eds. The Phylogenetic Handbook: A Practical Approach to DNA and Protein Phylogeny. (2003) Cambridge: Cambridge University Press. 1–23.
35. Nei M., Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. (1986) 3:418–426.
36. Nei M., Kumar S. Synonymous substitutions and non synonymous nucleotide substitutions. In: Nei M., ed. Molecular Evolution and Phylogenetics. (2000) Volume 1. New York: Oxford University Press. 52–61.
37. Schmid E. N., von Recklinghausen G., Ansorg R. Bacteriophages in Helicobacter (Campylobacter) pylori. J. Med. Microbiol. (1990) 32:101–104.
38. Marsich E., Zuccato P., Rizzi S., Vetere A., Tonin E., Paoletti S. Helicobacter pylori expresses an autolytic enzyme: gene identification, cloning, and theoretical protein structure. J. Bacteriol. (2002) 184:6270–6279.
39. Solnick J. V., Hansen L. M., Salama N. R., Boonjakuakul J. K., Syvanen M. Modification of Helicobacter pylori outer membrane protein expression during experimental infection of rhesus macaques. Proc. Natl. Acad. Sci. USA (2004) 101:2106–2111.
40. Colbeck J. C., Hansen L. M., Fong J. M., Solnick J. V. Genotypic profile of the outer membrane proteins BabA and BabB in clinical isolates of Helicobacter pylori. Infect. Immun. (2006) 74:4375–4378.

