DNA Research 2005 12(3):215-220; doi:10.1093/dnares/dsi006

© The Author 2005. Kazusa DNA Research Institute

Short Communications

Complete Nucleotide Sequence of the Chloroplast Genome from the Tasmanian Blue Gum, Eucalyptus globulus (Myrtaceae)

Dorothy A. Steane*

Cooperative Research Centre for Sustainable Production Forestry, School of Plant Science, University of Tasmania Private Bag 55, Hobart, Tasmania 7001, Australia

Received 22 November 2004; revised 4 April 2005


    Abstract
The complete nucleotide sequence of the chloroplast genome of the hardwood species Eucalyptus globulus is presented and compared with chloroplast genomes of tree and non-tree angiosperms and two softwood tree species. The 160 286 bp genome is similar in gene order to that of Nicotiana, with an inverted repeat (IR) (26 393 bp) separated by a large single copy (LSC) region of 89 012 bp and a small single copy (SSC) region of 18 488 bp. There are 128 genes (112 individual gene species and 16 genes duplicated in the inverted repeat) coding for 30 transfer RNAs, 4 ribosomal RNAs and 78 proteins. One pseudogene (ψ-infA) and one pseudo-ycf (ψ-ycf15) were identified. The chloroplast genome of E. globulus is essentially co-linear with that of another hardwood tree species, Populus trichocarpa, except that the latter lacks rps16 and rpl32, and the IR has expanded in Populus to include rps19 (part of the LSC in E. globulus). Since the chloroplast genome of E. globulus is not significantly different from other tree and non-tree angiosperm taxa, a comparison of hardwood and softwood chloroplasts becomes, in essence, a comparison of angiosperm and gymnosperm chloroplasts. When compared with E. globulus, Pinus chloroplasts have a very small IR, two extra tRNAs and four additional photosynthetic genes, lack any functional ndh genes and have a significantly different genome arrangement. There does not appear to be any correlation between plant habit and chloroplast genome composition and arrangement.

Key words: eucalypt; Myrtaceae; chloroplast DNA; pseudogene; gymnosperm


Eucalyptus globulus is one of the most economically important species for hardwood forestry plantations in temperate regions of the world [1]. It has been studied intensively by quantitative, population and evolutionary geneticists and is becoming a model species for genetic research in Eucalyptus. Chloroplast DNA has been essential to many studies of population genetics and phylogeography in Eucalyptus. This paper presents the complete chloroplast genome from E. globulus and compares it with chloroplast genomes from other angiosperm taxa [including the hardwood tree species, Populus trichocarpa (B. Heinz, S. DiFazio, K. Ritland et al., manuscript in preparation)] and softwood tree species (Pinus thunbergii [2] and Pinus koraiensis).

The complete chloroplast genome of E. globulus (GenBank accession no. AY780259) may be represented as a circular chromosome (Fig. 1), although this is likely to be a rare form of the molecule, as most chloroplast DNA is, in fact, linear [3,4]. Comprising 160 286 bp, it ranks among the larger land plant chloroplast genomes. Most land plant plastids sequenced to date have genomes of 116–163 kb, and the longest belongs to Oenothera elata (163 935 bp [5]). The structure of the E. globulus chloroplast genome is typical of most plastids: a large single copy (LSC) region (89 012 bp) and a small single copy (SSC) region (18 488 bp) are separated by an inverted repeat (IR) (26 393 bp). The relative sizes of the LSC, SSC and IR regions remain reasonably constant across genomes of angiosperms (approximately 55, 12 and 16.5% of the total genome size, respectively), regardless of the overall size of the genome. The relative size of the IR in gymnosperms varies much more. For example, in Ginkgo biloba the IR is 17 kb, but in P. thunbergii it is just 495 bp [2], containing trnI-CAU and 83 bp from the 3' end of psbA, but lacking the ribosomal RNA genes that characterize other land plant IRs.
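The region sizes quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original analysis; the function name `region_summary` is hypothetical. It assumes only that the IR is present in two copies, so that LSC + SSC + 2 × IR gives the total genome size.

```python
# Region sizes reported in the text, in base pairs.
LSC = 89_012   # large single copy region
SSC = 18_488   # small single copy region
IR = 26_393    # one copy of the inverted repeat


def region_summary(lsc, ssc, ir):
    """Return the total genome size and each region's share of it (percent).

    The inverted repeat occurs twice, so it is counted twice in the total,
    but its share is reported per copy, as in the text.
    """
    total = lsc + ssc + 2 * ir
    shares = {
        name: round(100 * size / total, 1)
        for name, size in (("LSC", lsc), ("SSC", ssc), ("IR", ir))
    }
    return total, shares


total, shares = region_summary(LSC, SSC, IR)
print(total, shares)
```

With the figures from the text, the total comes to 160 286 bp, and the per-region shares agree with the approximate values given for angiosperms.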



Figure 1. Gene map of the plastid chromosome of Eucalyptus globulus. Genes belonging to different functional groups are color coded (see key). Genes drawn inside the circle are transcribed clockwise; those outside the circle are transcribed anti-clockwise. In cases where two genes overlap, one of them is shifted off the map to show its position. Asterisks indicate genes that contain introns. Pseudogenes are marked by ψ. ORF366 in IRB is a truncated form of ycf1.

 
The Eucalyptus chloroplast genome has a GC-content of 36.9%, which is comparable with that of other vascular plant plastids (e.g. 36.7% in Populus, 37.8% in Nicotiana, 38.4% in Zea, 39.2% in Oenothera and 38.5% in P. thunbergii). The genome is AT-rich in both the non-coding intergenic regions (67% AT) and the coding regions (62% AT), where there is an AT bias (73% ± 4.5%) in the third base positions of all amino acid codons. This phenomenon is also observed in other plastid genomes. In contrast, the tRNA genes show less of an AT bias (58%), and the rRNA genes have a slight GC bias (55%). The latter is characteristic of rRNA genes in other plants [6].
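Base-composition figures of this kind can be computed directly from sequence. The sketch below, in Python, shows one way to calculate overall GC-content and the AT fraction at third codon positions. It is an illustration only, not the method used for the values above; the function names and the toy sequence are hypothetical, and ambiguous bases are simply ignored in the GC calculation.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("no unambiguous bases in sequence")
    return 100 * (seq.count("G") + seq.count("C")) / counted


def third_position_at(cds):
    """Percentage of A or T at the third position of each complete codon.

    `cds` is read in frame from its first base; a trailing partial codon
    is ignored.
    """
    cds = cds.upper()
    thirds = [cds[i + 2] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not thirds:
        raise ValueError("coding sequence shorter than one codon")
    return 100 * sum(base in "AT" for base in thirds) / len(thirds)


# Toy coding sequence (hypothetical, for illustration only).
toy_cds = "ATGGCTAAATTTGGATAA"
print(gc_content(toy_cds), third_position_at(toy_cds))
```

Applied to each annotated coding sequence in turn, `third_position_at` would give a per-gene value from which a mean and spread could be derived.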

Table 1 lists all genes detected in the chloroplast genome of E. globulus. The start codons for the protein-coding genes were inferred from alignments with plastomes of other taxa. The start positions are, therefore, hypothetical and can be confirmed only through analysis of gene transcripts. The genome is essentially co-linear with that of the annual angiosperm Nicotiana tabacum (Fig. 2a), with all the same genes except sprA, which is absent from E. globulus. The chloroplast genome of E. globulus is also virtually co-linear with that from another hardwood tree species, P. trichocarpa (Fig. 2b), except for three notable differences: (i) rps16 and flanking intergenic sequences (~1800 bp of LSC) are missing from Populus; (ii) the gene rpl32 and flanking sequences (~1100 bp of SSC) are absent from Populus; and (iii) the IR in Populus has expanded to include rps19, with the ‘extra’ copy of this gene located close to JLA. As in other angiosperms, the E. globulus plastome has four ribosomal RNA (rRNA) genes and 30 transfer RNA (tRNA) genes (of which seven are located in the IRs) that provide tRNAs for all 20 amino acids (Table 1). There are 78 protein-coding genes, including four conserved open reading frames (ORFs) (‘ycfs’). Approximately 74 protein-coding genes are common to most angiosperm chloroplast genomes, and an additional 5 are present in only some species [7]. Of these five, four (accD, ycf1, ycf2 and rpl23) appear to be functional in the plastome of E. globulus, but the fifth, infA, is a pseudogene (ψ), as in Populus, Nicotiana, Arabidopsis and Oenothera [7]. One other pseudogene was detected, that of a hypothetical chloroplast protein, ψ-ycf15. One open reading frame, ORF113, has high homology to regions of ycf68 in rice, maize and Pinus, as well as to hypothetical proteins ORF119 and ORF58 in the trnI intron of Oenothera. A second open reading frame, ORF366, is found in IRB at the junction with the SSC. It is a truncated inverted repeat of ycf1 and is probably non-functional.


Table 1. List of genes found in Eucalyptus globulus chloroplast genome (GenBank accession no. AY780259; herbarium accession no. HO528199)a.

 


Figure 2. Harr plot analysis comparing chloroplast genomes from an annual angiosperm, hardwood (angiosperm) trees and softwood (gymnosperm) trees: a) Nicotiana tabacum and Eucalyptus globulus; b) E. globulus and Populus trichocarpa; c) Pinus koraiensis and Pinus thunbergii; and d) E. globulus and P. thunbergii. Plots were constructed using COMPARE (GCG) and DOTPLOT (GCG). Each dot represents a position where 45 out of 50 nucleotides match in both sequences. All genomes are available from GenBank, except for that of Populus, which can be viewed on-line (http://genome.ornl.gov/poplar_chloroplast/).
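The window-and-stringency criterion described in the caption can be illustrated with a short Python sketch. This is not the GCG COMPARE or DOTPLOT implementation; the function name `window_matches` is hypothetical, the threshold is interpreted here as "at least" 45 matches out of 50 (an assumption), and only the forward strand is compared, so inverted segments would need a second pass against the reverse complement. The naive double loop is suitable for short test sequences only, not for whole plastid genomes.

```python
def window_matches(seq_a, seq_b, window=50, stringency=45):
    """Return (i, j) pairs where seq_a[i:i+window] and seq_b[j:j+window]
    agree at `stringency` or more of their `window` aligned positions.

    Each returned pair corresponds to one dot in a dot plot.
    """
    a, b = seq_a.upper(), seq_b.upper()
    dots = []
    for i in range(len(a) - window + 1):
        win_a = a[i:i + window]
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(win_a, b[j:j + window]))
            if matches >= stringency:
                dots.append((i, j))
    return dots


# Toy example with a small window (hypothetical sequences).
print(window_matches("ACGTACGT", "ACGTACGT", window=4, stringency=4))
```

In a plot of two co-linear genomes the dots fall along a single diagonal; rearrangements show up as displaced or broken diagonal segments.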

 
There are three classes of ORFs in plastid DNA: (i) genes of known function; (ii) hypothetical chloroplast reading frames (ycfs) that are highly conserved between species; and (iii) species-specific or rapidly diverging ORFs. Four major ycfs have been partially characterized, but their precise functions are not yet understood. Two highly conserved ycfs, ycf1 and ycf2, have been demonstrated to be essential to cellular function in dicots [8]; they are not involved in photosynthesis, but are speculated to be involved in cellular metabolism or to have a structural role in the plastid [8]. Two more ycfs, ycf3 and ycf4, are believed to be involved in the formation of photosystem I [9,10]. The functionality of some other ycfs, however, has been brought into question by the relatively frequent occurrence of pseudo-ycf loci. For example, although ycf15 in tobacco appears to be a potentially functional protein-coding gene, in many other species, including E. globulus, a variable insertion of ~250 bp (295 bp in E. globulus) introduces premature stop codons. Schmitz-Linneweber et al. [11] showed that although the ycf15 cistron may be transcribed, splicing of the two conserved ends does not occur; hence, ycf15 is probably not a protein-coding gene. The ycf15 sequences of E. globulus and Oenothera are very similar after the removal of their insertions. However, both with and without the intervening sequence, ycf15 of both taxa has premature stop codons, providing further evidence that ycf15 is probably not a functional protein-coding gene. Another example of a ycf that has highly conserved domains, but often is not completely conserved, is ycf68. In E. globulus, ORF113 is highly homologous to a small region of ycf68 in rice and maize, ORF75 in P. koraiensis, ORF75a in P. thunbergii and a hypothetical protein in O. elata (ORF58). All these ORFs have some homology to ycf68. Such ORFs and ycfs that have some highly conserved regions may have roles in gene regulation (e.g. as promoter or terminator sequences) or may be genes specifying a structural RNA [11] (as was at first proposed for sprA in tobacco chloroplasts [12], but was later discounted [13]).
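The effect of an insertion on a reading frame, as described for ycf15, can be illustrated by scanning for in-frame stop codons. The following Python sketch is illustrative only: the function name `premature_stops` is hypothetical, the sequences are toy examples and not ycf15 from any species, and the three standard stop codons (TAA, TAG, TGA) are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(orf):
    """Return 0-based nucleotide offsets of in-frame stop codons that occur
    before the final complete codon of `orf`.

    The sequence is read in frame from its first base.
    """
    orf = orf.upper()
    last_codon_start = len(orf) - len(orf) % 3 - 3
    return [
        i for i in range(0, last_codon_start, 3)
        if orf[i:i + 3] in STOP_CODONS
    ]


# Toy reading frame, and the same frame with a hypothetical 6-bp insertion.
intact = "ATGGCTAAATTTTAA"
with_insertion = "ATGGCT" + "TAGCCC" + "AAATTTTAA"
print(premature_stops(intact), premature_stops(with_insertion))
```

An empty list means the frame is open up to its terminal codon; any offsets returned mark where translation would terminate early.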

The psbL gene that codes for a 38 amino acid peptide of photosystem II is highly conserved among many higher plants. This gene is unusual because in Eucalyptus, as well as in some other taxa (e.g. Nicotiana and Spinacia, but not Populus), the gene sequence does not begin with any of the standard chloroplast initiation codons [i.e. leucine (TTG, CTG), isoleucine (ATT, ATC, ATA), valine (GTG) or, the most common, methionine (ATG)]. Instead, ACG appears at the beginning of the gene. It has been shown in Nicotiana that a translatable psbL mRNA containing an AUG initiator codon is formed by C to U editing of the ACG codon [14], and it is possible that a similar mechanism exists in Eucalyptus.
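The logic of the editing step can be written out as a small check. The Python sketch below is a minimal illustration with a hypothetical function name and toy input; it uses the initiation codons listed in the text and models only the single C to U change at the second position of an ACG start, not RNA editing in general.

```python
# Initiation codons listed in the text, as DNA.
STANDARD_STARTS = {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"}


def edited_start_codon(cds):
    """Return (mRNA start codon, edited) for a coding sequence given as DNA.

    If the gene begins with ACG, C-to-U editing of the second base yields
    AUG and `edited` is True. Otherwise the first codon is transcribed
    unchanged and `edited` is False.
    """
    first = cds.upper()[:3]
    if len(first) < 3:
        raise ValueError("sequence shorter than one codon")
    if first == "ACG":
        return "AUG", True
    return first.replace("T", "U"), False


# Toy example: a gene beginning with ACG (sequence is hypothetical).
print(edited_start_codon("ACGACACAA"))
print(edited_start_codon("ATGACACAA"))
```

Whether editing actually occurs at a given ACG start can be established only from transcript data; the function merely shows what the edited codon would be.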

In general, the chloroplast genome of E. globulus is not significantly different from most other angiosperms, so a comparison of hardwood and softwood chloroplasts becomes, in essence, a comparison of angiosperm and gymnosperm chloroplasts. Chloroplast DNA sequences are available for two gymnosperms, P. thunbergii (119 707 bp) and P. koraiensis (116 866 bp). Both genomes are significantly smaller than those of most angiosperms sequenced so far. Pairwise comparisons using Harr plots (Fig. 2c) and DOGMA software [15] (data not shown) show that the chloroplast DNA sequences of the two pine species are very similar. In contrast, those same analytical techniques indicate that the chloroplast genomes of P. thunbergii and E. globulus are arranged very differently (Fig. 2d). Relative to Eucalyptus, rbcL and its neighboring regions in the LSC region are inverted in the pines, and a large region from the LSC, including psaA and psaB, occurs in the SSC [2]. The rRNA genes from rrn16 to trnR-ACG that are in the inverted repeat in angiosperms form a cluster in the middle of the SSC in P. thunbergii [2]. In addition to the 30 tRNA genes found in angiosperms, the two pine species have two unusual tRNAs, trnP-GGG and trnR-CCG. The first of these is also found in hornworts [16] and ferns [17], and trnR-CCG has been found in moss, although it is not essential for plastid function in moss and may not be a functional gene [18]. Angiosperms and pines have the same suite of ribosomal protein genes, except that the pines lack rps16. Pines have an intact infA gene, in contrast to the pseudogene found in Eucalyptus and many other angiosperms (see above). In addition to the 29 genes encoding components of the photosynthetic apparatus in angiosperms, pines have 4 more genes that exist in some lower plants: psaM, chlB, chlL and chlN. The psaM gene (which is duplicated in the LSC of P. thunbergii [2], but not in P. koraiensis) has been found in non-vascular plants, but is absent from ferns and angiosperms, suggesting parallel losses in the latter two groups during their evolution [17]. The genes chlB, chlL and chlN may be associated with the ability of pines to synthesize chlorophyll in the dark (as in Chlamydomonas [19]). A major difference in the gene content between pines and angiosperms is the complete absence of functional ndh genes from pine chloroplasts [2]. It is unclear whether chloroplast ndh genes have been transferred to the nuclear genome of pines, or whether pine chloroplasts lack an NADH dehydrogenase altogether. Eucalyptus and Nicotiana have 21 introns, 5 more than P. thunbergii and P. koraiensis. Of these five, three occur in genes that are absent from pines (rps16, ndhA and ndhB), and two occur in clpP that, in pines, has no introns. The 16 remaining split genes are conserved between pines and angiosperms [2].

In conclusion, there does not appear to be any correlation between plant habit and plastome composition and arrangement. Differences between chloroplast genomes of tree and non-tree angiosperm species are slight. In contrast, although angiosperm and gymnosperm chloroplasts share many genes, there are significant differences in genome size, arrangement and gene content.


    Acknowledgements
The author thanks Peter Wilson and other staff at the Australian Genome Research Facility (AGRF); Natalie Papworth and Alan McFadden (Royal Tasmanian Botanical Garden); Peter Boyer (SouthWind Writing and Publishing Services, Tasmania); Bob Elliott, Adam Smolenski, Natalie Conod, Rebecca Jones, Catherine Phillips, Briony Patterson, Gay McKinnon, Brad Potts and René Vaillancourt (University of Tasmania). This research was funded by the Cooperative Research Centre for Sustainable Production Forestry (CRC-SPF).


    Footnotes
 
*Tel. +61-3-62261828, Fax. +61-3-62262698, E-mail: dorothy.steane@utas.edu.au

Communicated by Katsumi Isono


    References

  1. Eldridge, K. G., Davidson, J., Harwood, C., van Wyk, G. 1993, Eucalypt Domestication and Breeding, Clarendon Press, Oxford.
  2. Wakasugi, T., Tsudzuki, J., Ito, S., Nakashima, K., Tsudzuki, T., Sugiura, M. 1994, Loss of all ndh genes as determined by sequencing the entire chloroplast genome of the black pine Pinus thunbergii, Proc. Natl Acad. Sci. USA, 91, 9794–9798.
  3. Oldenburg, D. J. and Bendich, A. J. 2004, Most chloroplast DNA of maize seedlings in linear molecules with defined ends and branched forms, J. Mol. Biol., 335, 953–970.
  4. Bendich, A. J. 2004, Circular chloroplast chromosomes: the grand illusion, Plant Cell, 16, 1661–1666.
  5. Hupfer, H., Swiatek, M., Hornung, S., et al. 2000, Complete nucleotide sequence of the Oenothera elata plastid chromosome, representing plastome I of the five distinguishable Euoenothera plastomes, Mol. Gen. Genet., 263, 581–585.
  6. Goremykin, V. V., Hirsch-Ernst, K. I., Wolfl, S., Hellwig, F. H. 2003, Analysis of the Amborella trichopoda chloroplast genome sequence suggests that Amborella is not a basal angiosperm, Mol. Biol. Evol., 20, 1499–1505.
  7. Millen, R. S., Olmstead, R. G., Adams, K. L., et al. 2001, Many parallel losses of infA from chloroplast DNA during angiosperm evolution with multiple independent transfers to the nucleus, Plant Cell, 13, 645–658.
  8. Drescher, A., Ruf, S., Calsa, T., Carrer, H., Bock, R. 2000, The two largest chloroplast genome-encoded open reading frames of higher plants are essential genes, Plant J., 22, 97–104.
  9. Boudreau, E., Takahashi, Y., Lemieux, C., Turmel, M., Rochaix, J. D. 1997, The chloroplast ycf3 and ycf4 open reading frames of Chlamydomonas reinhardtii are required for the accumulation of the photosystem I complex, EMBO J., 16, 6095–6104.
  10. Ruf, S., Kossel, H., Bock, R. 1997, Targeted inactivation of a tobacco intron-containing open reading frame reveals a novel chloroplast-encoded photosystem I-related gene, J. Cell Biol., 139, 95–102.
  11. Schmitz-Linneweber, C., Maier, R. M., Alcaraz, J. P., Cottet, A., Herrmann, R. G., Mache, R. 2001, The plastid chromosome of spinach (Spinacia oleracea): complete nucleotide sequence and gene organization, Plant Mol. Biol., 45, 307–315.
  12. Vera, A. and Sugiura, M. 1994, A novel RNA gene in the tobacco plastid genome: its possible role in the maturation of 16S ribosomal RNA, EMBO J., 13, 2211–2217.
  13. Sugita, M., Svab, Z., Maliga, P., Sugiura, M. 1997, Targeted deletion of sprA from the tobacco plastid genome indicates that the encoded small RNA is not essential for pre-16S rRNA maturation in plastids, Mol. Gen. Genet., 257, 23–27.
  14. Kudla, J., Igloi, G., Metzlaff, M., Hagemann, R., Kossel, H. 1992, RNA editing in tobacco chloroplasts leads to the formation of a translatable psbL mRNA by a C to U substitution within the initiation codon, EMBO J., 11, 1099–1103.
  15. Wyman, S. K., Jansen, R. K., Boore, J. L. 2004, Automatic annotation of organellar genomes with DOGMA, Bioinformatics, 20, 3252–3255.
  16. Kugita, M., Kaneko, A., Yamamoto, Y., Takeya, Y., Matsumoto, T., Yoshinaga, K. 2003, The complete nucleotide sequence of the hornwort (Anthoceros formosae) chloroplast genome: insight into the earliest land plants, Nucleic Acids Res., 31, 716–721.
  17. Wolf, P. G., Rowe, C. A., Sinclair, R. B., Hasebe, M. 2003, Complete nucleotide sequence of the chloroplast genome from a leptosporangiate fern, Adiantum capillus-veneris L, DNA Res., 10, 59–65.
  18. Sugiura, C. and Sugita, M. 2004, Plastid transformation reveals that moss tRNA(Arg)-CCG is not essential for plastid function, Plant J., 40, 314–321.
  19. Liu, X. Q., Xu, H., Huang, C. Z. 1993, Chloroplast chlB gene is required for light-independent chlorophyll accumulation in Chlamydomonas reinhardtii, Plant Mol. Biol., 23, 297–308.
  20. Palmer, J. D. 1986, Methods in Enzymology, Academic Press, New York, 167–186.
  21. Steane, D. A., West, A. K., Potts, B. M., Ovenden, J. R., Reid, J. B. 1991, Restriction fragment length polymorphisms in chloroplast DNA from six species of Eucalyptus, Aust. J. Bot., 39, 399–414.
  22. Doyle, J. J. and Doyle, J. L. 1990, Isolation of plant DNA from fresh tissue, Focus, 12, 13–15.
  23. Ewing, B. and Green, P. 1998, Base-calling of automated sequencer traces using Phred. II. Error probabilities, Genome Res., 8, 186–194.
  24. Wakasugi, T., Sugita, M., Tsudzuki, T., Sugiura, M. 1998, Updated gene map of tobacco chloroplast DNA, Plant Mol. Biol. Rep., 16, 231–241.
