DNA Research Advance Access originally published online on September 16, 2006
DNA Research 2006 13(3):111-121; doi:10.1093/dnares/dsl003

© The Author 2006. Kazusa DNA Research Institute
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oxfordjournals.org

Whole-Genome Microarray in Arabidopsis Facilitates Global Analysis of Retained Introns

Hadas Ner-Gaon and Robert Fluhr*

Department of Plant Sciences, Weizmann Institute of Science Rehovot 76100, Israel

Received 2 May 2006; revised 19 July 2006


    Abstract
 
Alternative splicing (AS) is an important post-transcriptional regulatory mechanism that can increase protein diversity and affect mRNA stability. Different types of AS have been observed; these include exon skipping, alternative donor or acceptor sites and intron retention. In humans, exon skipping is the most common type while intron retention is rare. In contrast, in Arabidopsis, intron retention is the most prevalent AS type (~40%). Here we show that direct transcript expression analysis using high-density oligonucleotide-based whole-genome microarrays (WGAs) is particularly amenable to assessing global intron retention in Arabidopsis. By applying a novel algorithm, retained introns are detected in 8% of the transcripts examined. A sampling of 14 transcripts showed that 86% can be confirmed by RT–PCR. This rate of detection predicts an overall total AS rate of 20% for Arabidopsis compared with 10–22% based on EST/cDNA-based analysis. These findings will facilitate monitoring constitutive and dynamic whole-genome splicing on the next generation WGA slides.

Key words: microarray; TILING; alternative splicing; Arabidopsis; intron retention; NPR-1; GIGANTEA


    1. Introduction
 
Alternative RNA processing pathways are a result of combining different splice junctions that are present in pre-mRNA transcripts. In this way, a variety of mRNAs and proteins can be created from the same gene. Alternative splicing (AS) is thought to play a major role in expanding the potential informational content of eukaryotic genomes. Recent evidence indicates a high incidence (32–60%) of AS in the human genome, predominantly in the form of exon skipping, while intron retention is a minor form (5–16%) [1–4]. Although rare, intron retention can play an important biological role. It has been linked to tumor growth [5] and is part of the developmental regulation of proinsulin expression [6].

Plants are thought to exhibit less AS (10–22%) [3,7,8]. Computational analysis of AS in Arabidopsis by EST-pair alignment identified 436 alternatively spliced genes [9]. Unexpectedly, intron retention was found to be the most common type (45%). Retained introns were shown to be present in RNA derived from polyribosomes, demonstrating that these intron retention events are not the byproduct of incomplete splicing but are found in a translatable context in the cytoplasm [9]. Iida et al. [7] aligned 248 514 RAFL (RIKEN Arabidopsis Full-Length) cDNA/EST sequences to the Arabidopsis genome using a BLAST-based method [10]. They identified 15 214 transcription units (TUs) containing at least two sequences each and observed AS for 11.6% of these TUs, of which 44% were retained introns [7]. In a recent study using a large collection of EST/cDNA, 4707 (21.8%) of the transcripts showed 8264 AS events. Approximately 56% of these events were of the intron retention type [8]. These studies confirmed that a low percentage of genes (10–22%) are alternatively spliced and that intron retention is the most prevalent AS type in Arabidopsis.

Calculating the rate of AS using expressed sequence data (ESTs and cDNAs) can underestimate the amount of AS for the following reasons. Collectively, the EST and cDNA database in plants is relatively small and, thus, may lack low-abundance transcripts. In addition, AS predictions based on cDNA data are in many cases biased towards the termini of transcripts owing to the preponderance of end-sequence reads among ESTs and oligo(dT)-based priming for reverse transcription. Recently, high-density oligonucleotide-based whole-genome microarrays (WGAs) have emerged as a novel platform for genomic analysis beyond gene expression profiling. Unbiased WGAs can be used for ‘epigenomic’ mapping [11] and have been used to monitor exon skipping in humans [12,13]. By comparing the expression levels of all the exons within genes, it was shown that a large fraction (~80%) of expressed genes on chromosomes 21 and 22 exhibit exon skipping. This estimate is significantly higher than previous predictions, which were based on computational analysis of ESTs and cDNAs [4]. The array-based approach to analyzing AS also has limitations. For example, the detection of relatively rare splicing events is difficult. Furthermore, in a manner analogous to obtaining multiple EST libraries from different tissues, a large number of distinct tissues or cell populations must be surveyed to obtain sufficient confidence to call a splice variant at a specific splice junction.

In this work, using published WGA data, we monitor intron retention in Arabidopsis and show that it is readily detectable, generally leading to transcripts containing new reading frames that would result in truncated gene products.


    2. Materials and methods
 
2.1. Data sources
Arabidopsis gene, intron and coding sequence annotations and their genomic locations were obtained from the TAIR 6.0 release databases:
ftp://ftp.arabidopsis.org/home/tair/Sequences/blast_datasets/TAIR6_seq_20051108
ftp://ftp.arabidopsis.org/home/tair/Sequences/blast_datasets/TAIR6_intron_20060221
ftp://ftp.arabidopsis.org/home/tair/Sequences/blast_datasets/TAIR6_cds_20051108
ftp://ftp.arabidopsis.org/home/tair/home/tair/Sequences/blast_datasets/TAIR6_3_UTR_20060126
ftp://ftp.arabidopsis.org/home/tair/home/tair/Sequences/blast_datasets/TAIR6_5_UTR_20060126
ftp://ftp.arabidopsis.org/home/tair/Maps/seqviewer_data/sv_gene_feature.data
In order to allocate each probe to its TU, programs were written in PERL. These programs analyzed the BLAST results of the probes against the different databases. Programs are available upon request from H.N-G.

2.2. Normalization of the whole-genome array (WGA)
The hybridization values from a set of 12 oligonucleotide arrays representing 94% of the Arabidopsis genome sequence (110 Mb) were used [14]. Each array contains 835 000 oligos of 25 mer size. Four RNA populations were hybridized to these arrays [14]. The probe sequences and the hybridization values for the 12 slides were taken from the GEO Accession viewer (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?) using the GEO accession numbers of the slides (GPL432–GPL443). Hybridization data were normalized by taking the log2 of the data of all probes and then dividing each probe intensity value in a given experiment by the median intensity value of all probes in that experiment. As reported, the median intensity value of all probes was determined to be a good estimate of the background noise level in a given experiment [14].
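
As an illustration only, the normalization described above can be sketched in a few lines of Python. This is not the PERL code used in the study. The function name and the sample intensities are hypothetical, and the sketch assumes that the median is taken over the log2 values of all probes in one experiment, which the text does not state explicitly.

```python
import math
from statistics import median


def normalize_experiment(intensities):
    """Log2-transform raw probe intensities, then divide by the median.

    Assumption: the median is computed on the log2 values of all probes
    in the same experiment. Intensities must be positive.
    """
    logged = [math.log2(x) for x in intensities]
    med = median(logged)
    return [value / med for value in logged]


# Hypothetical raw intensities for one hybridization experiment.
raw = [16.0, 64.0, 256.0, 1024.0, 4096.0]
normalized = normalize_experiment(raw)
print(normalized)
```

With this scaling, a probe at the median of its experiment has a value of 1, so a threshold such as 1.09 is read relative to the estimated background level.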

2.3. Retained-intron analysis
An intron was defined as candidate constitutive or as retained based on the combined hybridization values of its probes relative to the mean hybridization value of the other introns and exons in the transcript. In order to define retained introns in an expressed transcript, the probes of its exons were treated as one group, while each intron that contained at least three probes was treated as a separate group. A balanced one-way ANOVA for comparing the means of all the groups in a transcript was performed. The ANOVA test evaluates the hypothesis that the samples all have the same mean against the alternative hypothesis that the means are not all the same. To ensure a False Discovery Rate (FDR) of at most 0.05, the Benjamini-Hochberg (BH) method was used [15]. Transcripts in which the exons mean was at least 1.09 and the introns mean was less than 1.09 were ranked according to their ANOVA P-value. The ANOVA P-value was then compared with the BH critical value to select significant values. The BH critical values and the P-value for each ANOVA test are shown in Supplementary Table S2 available at www.dnaresearch.oxfordjournals.org. Using these results, a test was then performed to determine which pairs of means are significantly different and which are not. This test employs Tukey's least significant difference procedure. PERL programs were written in order to differentiate groups that have the same mean from groups that differ from each other. Programs are available upon request from H.N-G.
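
The BH step can be illustrated with a short Python sketch. It takes the ANOVA P-values as given and shows only the ranking against the BH critical values; the function name and the example P-values are hypothetical, and this is a generic rendering of the BH procedure rather than the programs used in the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where a test is called significant.

    The P-values are ranked in ascending order and the i-th smallest is
    compared with its critical value (i / m) * fdr. Every test ranked at
    or below the largest i that passes is declared significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[idx] = True
    return significant


# Hypothetical ANOVA P-values, one per transcript.
p_anova = [0.001, 0.008, 0.039, 0.041, 0.20]
calls = benjamini_hochberg(p_anova)
print(calls)
```

In this example the critical values are 0.01, 0.02, 0.03, 0.04 and 0.05, so only the two smallest P-values are retained.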

2.4. Gene ontology
Gene ontology classification of transcripts was carried out using the first level categories of the Gene Ontology classification of the biological processes of their protein products (ftp://ftp.arabidopsis.org/home/tair/Ontologies/Gene_Ontology/ATH_GO_GOSLIM.20060121.txt). Gene ontology categories that deviate from cumulative binomial distribution with the probability for retained-intron genes P = 0.08 and 1 – P for constitutive introns genes, with significant P-value set at 0.05, are summarized.
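
A minimal Python sketch of the cumulative binomial comparison follows. It assumes that each gene in a category is treated as an independent trial with probability P = 0.08 of carrying a retained intron; the function name and the example counts are hypothetical.

```python
from math import exp, lgamma, log


def binomial_tail_probabilities(k, n, p=0.08):
    """Return (P[X <= k], P[X >= k]) for X ~ Binomial(n, p), with 0 < p < 1.

    The probability mass is computed in log space so that large n does
    not overflow.
    """
    def pmf(i):
        log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        return exp(log_coeff + i * log(p) + (n - i) * log(1 - p))

    lower = sum(pmf(i) for i in range(0, k + 1))
    upper = sum(pmf(i) for i in range(k, n + 1))
    return lower, upper


# Hypothetical category: 100 analyzed genes, 16 of them with a retained intron.
lower, upper = binomial_tail_probabilities(16, 100)
enriched = upper < 0.05          # more retained-intron genes than expected
underrepresented = lower < 0.05  # fewer retained-intron genes than expected
print(enriched, underrepresented)
```

A category would be reported as deviating when either tail falls below the 0.05 threshold.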

2.5. Growth of plants, cell culture, RNA extraction and RT–PCR
Arabidopsis thaliana (ecotype Columbia) seeds were surface sterilized and placed onto sterilized MS-containing Petri dishes. The seeds were cold-treated at 4°C for 2 days and then incubated for 7 days under a 16 h light/8 h dark cycle. Flower buds were collected from mature Arabidopsis plants (5–7 weeks old) grown on soil. Arabidopsis suspension-culture cells were a generous gift from Dr Gideon Grafi. The culture was maintained under a 16 h light/8 h dark cycle with constant shaking, subcultured every week and harvested at that time. Root tissue was obtained by incubating surface-sterilized seeds in flasks containing 100 ml Gamborg's liquid media. The seedlings were incubated for 2 weeks with constant agitation under a 16 h light/8 h dark cycle, and root tissue was collected and frozen.

RNA was extracted from the four different tissues using RNeasy (Qiagen, Hilden, Germany). RT–PCR was conducted as previously published [16]. Oligonucleotide primers were designed either for simultaneous amplification of both the spliced and unspliced variants or as an internal intron primer paired with an opposing primer from the next non-adjacent exon. A complete list of the PCR primers used in this study is shown as Supplementary Table S6 available at www.dnaresearch.oxfordjournals.org.


    3. Results
 
3.1. Division of probes in WGA to accommodate gene structure-criteria
A unique set of whole-genome arrays representing ~94% of the Arabidopsis genome sequence is composed of 12 microarrays (6 for the forward strand and 6 for the reverse complement), each containing ~835 000 oligos of 25 mer size. Each array contains an end-to-end ‘tile’ of oligonucleotides (from head to tail) with no overlap in their DNA sequences. Only ‘perfect match’ probes were included in the whole-genome array design. The standard Affymetrix Arabidopsis gene expression controls are present on each of these chips. Recently, RNA from four sources was used to probe this array to determine by empirical analysis the transcriptional activity of the Arabidopsis genome [14]. Tissues sampled included seedlings, flowers, suspension-culture cells and roots.

We wished to ascertain whether knowledge about alternative transcripts could be refined from these data. In order to parse the data according to the established annotation of genes and their exon and intron structure, the gene structures predicted by TAIR (The Arabidopsis Information Resource; http://www.arabidopsis.org/) were used. The gene structure includes the genomic location and sequence of each exon and intron. In cases in which one gene includes more than one transcript (e.g. known AS), the transcript structure that included the most exons was used so that a gene was divided into as many distinct groups as possible. The oligos were aligned to the exon and intron sequences in order to allocate oligos to exons and introns. The remaining oligos, which include control probes, probes with more than one hit in the genome, intergenic probes and probes that fall on intron/exon borders, were removed. Thus, in total, 1 682 698 unique intron or exon probes were found to be amenable to analysis (Supplementary Table S1 is available at www.dnaresearch.oxfordjournals.org).
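
The allocation rule can be sketched as follows. In the study the probes were assigned by BLAST alignment to exon and intron sequences; the sketch below swaps in a simpler coordinate-containment test to show the same filtering idea, in particular the removal of probes that fall on intron/exon borders. The function name, coordinates and feature labels are hypothetical.

```python
def allocate_probe(probe_start, features, probe_len=25):
    """Return the label of the feature that fully contains the probe.

    `features` is a list of (label, start, end) tuples with 1-based,
    inclusive coordinates. A probe that spans an intron/exon border or
    lies outside every feature returns None and would be discarded.
    """
    probe_end = probe_start + probe_len - 1
    for label, start, end in features:
        if start <= probe_start and probe_end <= end:
            return label
    return None


# Hypothetical transcript structure with two exons and one intron.
features = [("exon1", 1, 120), ("intron1", 121, 260), ("exon2", 261, 400)]
assignments = {s: allocate_probe(s, features) for s in (10, 110, 130, 250, 300)}
print(assignments)
```

Probes starting at positions 110 and 250 straddle a border in this example and are therefore left unassigned.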

3.2. Criteria for differentiating introns and exons in the whole-genome screen
Our aim was to devise statistical tests for each transcriptional unit so that aberrant behavior of introns and exons could be registered. As hybridization values are distributed normally in a log scale, we transformed all the values to log2 and then, as done previously [14], normalized by dividing each probe intensity value in a given experiment by the median intensity value of all probes in that experiment.

We first examined the possibility of assessing retained introns by comparing the expression data of individual introns with the exon hybridization value of the transcript. An intron was defined as a ‘candidate constitutive’ or as a ‘retained’ intron based on its mean hybridization value relative to the mean hybridization value of all introns and exons in a TU. To establish global statistical parameters that can be used to distinguish between intron and exon scores in a transcriptional unit, we first explored the hybridization values of introns and exons in the full genome expression database. Genes that do not contain introns or that have multiple non-unique probes in their introns are not included in this analysis. The hybridization values of 18 275 genes were used (Table 1). The mean hybridization value of all the probes that belong to the exons of a specific gene was calculated, as well as the mean hybridization value of all the probes that belong to the introns of that gene. This assumes that the vast majority of plant introns are constitutively spliced. The distribution of the mean hybridization values of the exons and introns of all the transcripts is shown in Fig. 1. The two samples are statistically different based on the goodness-of-fit hypothesis test (two-sample Kolmogorov–Smirnov, P < 10^-100). A cut-off minimum of 1.09 for the mean hybridization value of the exons of a transcript was chosen for analysis, as at that level 58% (10 574) of the transcripts could be analyzed but only 1.5% of the intron mean hybridization values were above this level.
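
The trade-off behind the choice of cut-off can be illustrated with a small Python sketch that reports, for several candidate cut-offs, the fraction of transcripts whose exon mean passes and the fraction of intron means that also pass. The per-transcript means below are hypothetical and serve only to show the calculation.

```python
def fraction_at_or_above(values, cutoff):
    """Fraction of values that are greater than or equal to the cut-off."""
    return sum(v >= cutoff for v in values) / len(values)


# Hypothetical per-transcript mean hybridization values.
exon_means = [0.95, 1.10, 1.30, 1.05, 1.20]
intron_means = [0.90, 0.98, 1.00, 1.12, 0.95]

for cutoff in (1.00, 1.09, 1.20):
    exon_fraction = fraction_at_or_above(exon_means, cutoff)
    intron_fraction = fraction_at_or_above(intron_means, cutoff)
    print(cutoff, exon_fraction, intron_fraction)
```

A useful cut-off keeps the exon fraction high while keeping the intron fraction low, which is the balance described above for the value 1.09.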


Table 1. Intron and exon classes in the WGA survey

 

Figure 1. Distribution of the mean hybridization values of the probes in the exons and introns of transcripts. Probes were divided into exons and introns. The mean hybridization value of all the exons in a transcript as well as the mean hybridization value of all its introns were calculated.

 
3.3. Examining intron retention in whole-genome arrays
In order to identify retained introns in the 10 574 transcripts, the probes of the exons in each transcript were treated as one group, while each intron that contained at least three probes was treated as a different group. The three-probe limitation was used to increase the significance of the statistical tests. A balanced one-way ANOVA for comparing the means of all the groups in a transcript was performed. To ensure an FDR of at most 0.05, the BH method was used [15]. The P-value of the ANOVA tests was found to be statistically significant in 10 270 transcripts (analyzed transcripts, Table 1), which were used to define three different intron classes.

In Class 1 transcripts, the hybridization values of all the introns are different from those of the exon group (Fig. 2A). In the example shown, the AT2G36530 transcript is defined as containing only constitutive introns, as the exon group is significantly different from all the introns. In Class 2 transcripts, one intron is statistically similar to the exon group. Thus, in the case of transcript AT2G47470, intron 5 is defined as a retained intron (Fig. 2B). Intermediate cases are exemplified in Class 3 (intermediate retained). In this case the exceptional intron is expressed at a level that is between the exon group and all the other introns. In the example shown in Fig. 2C, the hybridization values of intron 6 of the AT5G11200 transcript are between those of the exon group and the other introns.
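
The three classes can be illustrated with a simplified Python sketch that works on comparison intervals around the group means, in the spirit of Fig. 2. It is an assumption of this sketch, not a statement of the method used, that two groups are treated as statistically similar when their intervals overlap; the function names, interval values and intron labels are hypothetical.

```python
def overlaps(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_introns(exon_interval, intron_intervals):
    """Label each intron 'retained', 'intermediate' or 'constitutive'.

    retained:     interval overlaps the exon group interval
    intermediate: lies below the exon group but above every other
                  non-retained intron, with no overlap
    constitutive: everything else
    """
    labels = {}
    for name, interval in intron_intervals.items():
        if overlaps(interval, exon_interval):
            labels[name] = "retained"
            continue
        others = [
            iv for other, iv in intron_intervals.items()
            if other != name and not overlaps(iv, exon_interval)
        ]
        below_exons = interval[1] < exon_interval[0]
        above_others = bool(others) and all(interval[0] > iv[1] for iv in others)
        labels[name] = "intermediate" if below_exons and above_others else "constitutive"
    return labels


# Hypothetical intervals around the mean of each group.
exons = (1.30, 1.50)
introns = {
    "intron1": (0.90, 1.00),
    "intron2": (0.92, 1.02),
    "intron3": (1.35, 1.55),
    "intron4": (1.10, 1.20),
}
labels = classify_introns(exons, introns)
print(labels)
```

A transcript in which every intron is labeled constitutive corresponds to Class 1, one with a retained intron to Class 2 and one with an intermediate intron to Class 3.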


Figure 2. Schematic expression profiles of introns and exons. Each circle represents one intron or exon group defined by at least three probes. The circle represents the mean hybridization value of the group and the lines mark the 95% confidence interval of the mean. (A) Transcript hybridization values of AT2G36530. (B) Transcript hybridization values of AT2G47470. (C) Transcript hybridization values of AT5G11200.

 
Examination of the rest of the transcripts shows that 8206 out of 10 270 (80%) fall into one of the intron classes. In the remaining 2064 cases, there is no clear statistical differentiation of the intron and exon groups. In these cases, the transcript exon and intron groups were compared again using a lower confidence level (α = 0.1 instead of 0.05). This causes the interval around the mean of each group to contract. At this probability level conclusions could be drawn for an additional 620 cases. In the remaining 14% of transcripts the statistical variance was too great and no clear definition of the intron/exon expression boundaries could be made. These transcripts were termed ‘undifferentiated’. The algorithms above as applied to the whole-genome arrays are summarized in Table 1. Thus, ~8% of the transcripts that could be measured contained retained introns. This is slightly higher than previous EST and cDNA-derived estimates for retained introns of between 3 and 6% of total transcripts [3,9]. The detailed analysis of all the transcripts is tabulated in Supplementary Table S2 available at www.dnaresearch.oxfordjournals.org.

3.4. Confirmation of retained introns by RT–PCR analysis
To test the veracity of using WGA expression data to characterize AS, a sample of 14 AS events and 5 constitutive-type introns was examined by RT–PCR. The genes were selected such that they represent low- and high-expression value transcripts, retained introns in different parts of the transcripts and both intermediate and retained-type introns. Each PCR was carried out using combined RNA extracted from four different samples: seedlings, flowers, suspension-culture cells and roots. Retained introns were traced either by primers that flank the intron or by using an internal intron primer paired with an opposing primer from the next non-adjacent exon. In this way, the retained fragment (I) and the smaller spliced fragment (S) could be recovered. However, if a fragment larger than that predicted for ‘I’ was detected, i.e. a fragment that included the non-adjacent but normally constitutively spliced intron, this indicated that the RT–PCR amplification used pre-mRNA as its template and the result was rejected. All fragments containing putative introns were confirmed by sequencing (data not shown). The results shown in Fig. 3 and summarized in Table 2 indicate that 12 out of 14 introns (86%) that were predicted to be retained by the expression analysis were detected by RT–PCR. In a similar manner, when constitutive introns were examined, 4 out of 5 could be verified as being constitutive.


Figure 3. Composite of gene structure, primer choices and gel fractionation of RT–PCR products. The gene structure is from TIGR View (http://www.tigr.org/tigr-scripts/tgi/T_index.cgi?species=arab). Arrows indicate the position of primers used. In cases where primers flanking the intron showed two migrating fragments, one gel insert is shown. In cases using internal intron primers, the appropriate controls are also shown. The results are summarized in Table 2. I and S indicate the retained and spliced fragments, respectively; P indicates pre-mRNA; m indicates the marker lane.

 

Table 2. Detection of retained introns by RT–PCR

 
3.5. Feasibility of detecting ‘exon skip’ in whole-genome arrays
In theory, a modification of the same algorithm should be applicable to estimating exon skip. However, inspection of Fig. 1 shows that the distribution of the intron mean hybridization values is much narrower than that of the exon mean hybridization values. Thus, the lower hybridization values expected from potentially skipped exons would fall well within the normal distribution of exon values, making the direct application of WGA difficult. The limitation of this application is exemplified when the same statistical methods used for detection of retained introns are applied to detect exon skip. To detect exon skip in a transcript, the probes of its introns were treated as one group, while each exon that contained at least three probes was treated as a different group. Exon skips of the first or last exons were excluded as these are probably due to 5' and 3' transcription start/finish differences and do not necessarily arise from splicing. Furthermore, in that case, any data retrieved would be impossible to verify owing to the lack of flanking sequence for RT–PCR priming. The results of the search for exon skips were divided as for the introns. Examination of internal exons using the same statistical tests and parameters as those applied to the introns showed an exon skip level of more than 8% (Table 1). A sample of previously un-annotated AS events in Arabidopsis was experimentally tested by RT–PCR. Each PCR was done using RNA that was extracted from four different samples, as carried out for the intron retention analysis. However, in this case only a low verification rate was achieved, 3 out of 8 (37.5%; Supplementary Table S3 available at www.dnaresearch.oxfordjournals.org). Thus, owing to the difficulty of separating the exon distribution from the intron distribution in the available database, the detection of exon skip is unreliable.

3.6. Comparison of the results for intron retention to EST-derived estimates
Identification of alternative transcripts has relied on comparison of sequences from EST and cDNA databases, a method that is inherently different from the type of analysis carried out here; it is therefore worthwhile to compare the results. The TAIR6 release contains 31 407 putative genes of which 3159 (10%) have annotated splice variants. Using TAIR annotation (ftp://ftp.arabidopsis.org/home/tair/Maps/seqviewer_data/SV_gene_feature.data) the type of each splice variant was determined. Out of the 3159 genes, 2251 can be analyzed in the WGA experiments and were composed of two transcripts that could be compared to one another. Of these, 468 were shown to be of the retained-intron type, a rate that is considerably lower than the more than 40% that was previously observed [3,7–9]. In order to compare the efficiency of both methods, the limitations applied to the WGA analysis were applied to the retained introns in the TAIR database. Of the 468 retained introns, 257 met the criteria employed to analyze the existing WGA data. Of these, 45 or 18% were detected in common by the WGA method (Supplementary Table S4 is available at www.dnaresearch.oxfordjournals.org). These results indicate the overlapping but complementary nature of both approaches as a tool to determine AS.

3.7. Distribution of biological functionalities of transcripts with retained introns
Examining the assigned global functional activity of transcripts with alternatively spliced retained introns will contribute to understanding their function in the biology of the plant. Genes that are analyzed in this study are specific to flowers, root, cell culture or seedling and as a result their gene ontology will be different from the total genes ontology. Therefore, in our case, changes in gene ontology distribution were sought by comparing the retained introns group with the constitutive introns group analyzed in this work rather than to total genes ontology. As nearly 8% of the genes contain retained introns we expect a similar distribution to be emulated in the gene ontology. Deviations from the expected binomial distribution with the P-value set at 0.05 are summarized in Table 3. Strikingly, the groups that are enriched in the retained introns group include mostly RNA processing and signal transduction. In contrast, transcripts related to protein metabolism are particularly underrepresented in the intron retention group.


Table 3. Distribution of assigned gene ontology in genes with retained introns and constitutive introns

 
3.8. Physical characterization of retained introns
Differences in the physical characteristics of introns, such as size or flanking sequence information, may serve to characterize this class of AS. The characteristics that were measured include: the position of the intron in the transcript, i.e. present in the UTR or CDS; whether it contains an open reading frame (ORF); the intron size; its GC content; and the borders of the intron. Interestingly, as a group, the median size of the retained introns is significantly smaller than the size of the constitutive introns, irrespective of the presence of an ORF (117 and 176 bp, respectively). The size distributions are statistically different based on the goodness-of-fit hypothesis test (two-sample Kolmogorov–Smirnov, P = 9.5 × 10^-44).
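
For readers unfamiliar with the two-sample Kolmogorov–Smirnov comparison, the test statistic can be sketched in Python as the largest vertical distance between the two empirical cumulative distributions. The sketch computes only the statistic D and not the P-value; the function name and the intron sizes are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The difference between two step functions can only change at
    # observed values, so it is enough to evaluate it there.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical intron sizes in bp.
retained_sizes = [80, 95, 110, 117, 130]
constitutive_sizes = [120, 150, 176, 200, 240]
d = ks_statistic(retained_sizes, constitutive_sizes)
print(d)
```

A value of D close to 1 means the two size distributions barely overlap, while a value close to 0 means they are nearly indistinguishable.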

We note that the median size of introns calculated from the TAIR database (ftp://ftp.arabidopsis.org/home/tair/Sequences/blast_datasets/TAIR6_intron_20060221) is 99 bp. The higher values in all types of introns measured here are a direct result of the requirement that introns be represented by at least three probes (i.e. 3 probes × 25 bp size/probe). Interestingly, the GC content of retained introns is higher than that of constitutive introns (37% compared with 33%) but lower than the GC content of coding regions (44%) [17]. The two GC content distributions are statistically different based on the goodness-of-fit hypothesis test (two-sample Kolmogorov–Smirnov, P = 1.5 × 10^-139; Supplementary Figure S1 is available at www.dnaresearch.oxfordjournals.org). Higher GC contents have been recently detected in retained introns that were identified by genome EST/cDNA alignments [8]. The possibility exists that the higher GC content reflects an artifact that is associated with enhanced non-specific hybridization of probes, i.e. they may be more sensitive to background noise. In order to assess this possibility, we examined the distribution of the GC content of the intron probes whose hybridization values were at or above 1.09 (Supplementary Figure S2 is available at www.dnaresearch.oxfordjournals.org). In this case, the R^2 for the correlation of GC content and hybridization value was found to be 0.06. Hence, for the range of hybridization values used in our survey, non-specific hybridization is not correlated to GC content and the values measured are a result of true specific hybridization.
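
The two quantities used in this check, the GC content of a probe and the R^2 between GC content and hybridization value, can be sketched in Python as follows. The sketch assumes R^2 is the squared Pearson correlation coefficient; the function names, the probe sequence and the example values are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def r_squared(xs, ys):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Hypothetical 25-mer probe and hypothetical per-probe values.
probe = "ATGCATGCATGCATGCATGCATGCA"
print(gc_content(probe))

gc_values = [0.32, 0.40, 0.48, 0.36, 0.44]
hybridization = [1.15, 1.10, 1.30, 1.25, 1.12]
print(r_squared(gc_values, hybridization))
```

An R^2 near 0, such as the 0.06 reported above, indicates that GC content explains very little of the variation in hybridization value.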

Additional qualities of note in the characterization of retained introns are that relatively more retained introns originate from the 5'- and 3'-UTR regions compared with the constitutive intron distribution (13.4% compared with 7.3%; Table 4). By comparison, the global analysis predicts that at least 9% of Arabidopsis genes have introns in their UTRs [18,19]. Strikingly, inspection of Table 4 shows that 609 (86.6%) of the retained introns are inside the CDS. Yet of these, only 4.6% maintain an in-frame ORF (Table 4). Interestingly, this rate, while low, is still more than 3-fold higher than that noted for constitutive introns (1.5%). Finally, intron borders can play a crucial role in splicing efficiency [20]. However, a comparison between the borders of the constitutive and retained introns showed no differences (Supplementary Figure S3 is available at www.dnaresearch.oxfordjournals.org). Thus, border sequence cannot be used to predict alternative intron splicing.
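
What it means for a retained intron in the CDS to maintain an in-frame ORF can be illustrated with a short Python sketch. The criteria used here are an assumption of the sketch and may differ from those behind Table 4: the intron length must be a multiple of three, and reading through the intron in the frame set by the upstream coding sequence must not meet a stop codon. The function name and sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def retained_intron_keeps_frame(upstream_cds, intron):
    """True if retaining the intron neither shifts the frame nor adds a stop.

    Assumption: `upstream_cds` starts at the first base of the start codon,
    so its length modulo 3 gives the codon phase at the intron boundary.
    A trailing partial codon that would be completed by the downstream
    exon is not checked.
    """
    if len(intron) % 3 != 0:
        return False  # downstream exons would be read out of frame
    phase = len(upstream_cds) % 3
    carried = upstream_cds[len(upstream_cds) - phase:]  # partial codon, if any
    seq = (carried + intron).upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Hypothetical sequences.
print(retained_intron_keeps_frame("ATGGCT", "GTAAGTCTGCAG"))  # no stop in frame
print(retained_intron_keeps_frame("ATGGCT", "GTAAGTTGACAG"))  # TGA in frame
print(retained_intron_keeps_frame("ATGGCT", "GTAAGTCTGCA"))   # length not a multiple of 3
```

Most retained introns fail one of these two conditions, which is consistent with the low proportion that maintain an in-frame ORF.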


Table 4. Characterization of retained and constitutive introns

 

    4. Discussion
 
We demonstrate that direct transcript expression analysis using high-density oligonucleotide-based WGA is particularly amenable to assessing global intron retention in Arabidopsis. However, the detection of exon skip is unreliable owing to the difficulty in separating the distribution of the expression values that distinguish exons from introns. We note that detection of alternative 5' and 3' splicing sites by analysis of WGA would be limited to those alternative sites that span a number of probe lengths (in the present case a minimum of 75 bp). As alternative 5' and 3' splice choices are usually of shorter length, they would not generally be detected.

Genome-wide detection of retained introns in Arabidopsis reveals that at least 8% of genes that can be surveyed by microarray include a transcript with a retained intron. A sample of these transcripts was tested using RT–PCR and 86% could be verified. In our analysis, results of the WGA data were combined to enable the generation of statistically reliable outcomes. Thus, tissue-specific intron retention may be lost, as combining RNA from diverse biological material can obscure the AS signal. In the future, the use of sample repeats from the same tissue will undoubtedly improve the statistical veracity of the results and optimize the detection of aberrant intron behavior. Nonetheless, as intron retention comprises above 40% of all AS events [3,7,9], this indicates that overall the rate of AS in Arabidopsis is 20%. This rate is considerably higher than the 10–14% that was garnered from compilation of EST and cDNA data from different sources [3,7] but similar to the one that was recently reported from a large collection of EST/cDNA [8]. Similarly, the recent application of WGA to human chromosomes 21 and 22 showed a much higher rate of AS than previously detected [13].
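
The arithmetic behind the 20% figure can be written out as a short Python sketch: if retained introns are found in about 8% of transcripts and intron retention accounts for about 40% of all AS events, the implied overall rate is 0.08 / 0.40. The function name is hypothetical, and the second call simply shows how the estimate would change with the 56% share of intron retention cited in the Introduction.

```python
def estimated_total_as_rate(retained_intron_rate, retained_fraction_of_as):
    """Overall AS rate implied by the retained-intron rate and its share of AS."""
    return retained_intron_rate / retained_fraction_of_as


rate_40 = estimated_total_as_rate(0.08, 0.40)
rate_56 = estimated_total_as_rate(0.08, 0.56)
print(round(rate_40, 3), round(rate_56, 3))
```

The estimate is therefore sensitive to the assumed share of intron retention among AS events.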

The retained-intron type is distinct from the other AS types because it represents an absence or failure of spliceosome action rather than an alternative choice of splice junctions. Our analysis detects several features that are specific to retained introns. They are smaller, have higher GC content, are relatively enriched in the UTR, and those in the CDS have a slightly higher chance of containing an ORF than constitutive-type introns. The size of introns has been shown to play a role in AS biology. In Drosophila, the size and location of the flanking introns control the frequency and the type of AS that a pre-mRNA transcript undergoes [21]. Furthermore, it is possible that the higher GC content contributes to a weaker U-rich splicing signal as a result of less efficient AU protein binding [22]. This is because intronic U-rich or AU-rich elements can influence splice site selection and splicing efficiency, particularly in plants (reviewed in Simpson and Filipowicz [23]). These sequences are likely to bind proteins (analogous to animal hnRNP proteins), which may permit intron recognition and recruit or allow access to spliceosomal components and the formation of commitment splicing complexes.
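As an illustration of how two of these features (intron length and GC content) could be compared between groups, a minimal sketch follows. The sequences are short, made-up placeholders rather than real Arabidopsis introns, and the function names are hypothetical. It is not the analysis pipeline used in this work.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(introns):
    """Mean length and mean GC fraction for a list of intron sequences."""
    n = len(introns)
    return {
        "mean_length": sum(len(s) for s in introns) / n,
        "mean_gc": sum(gc_content(s) for s in introns) / n,
    }


# Hypothetical toy sequences, chosen only to show the direction of the trend.
retained = ["GTAAGCGCCGCAG", "GTGCGCGTTCAG"]
constitutive = ["GTAAGTATTTTATTTATAACAG", "GTATGTTTTAAATTTTGTGCAG"]

print("retained:", summarize(retained))
print("constitutive:", summarize(constitutive))
```

With real data, the two groups would be compared with a statistical test rather than by inspecting the means.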

Of particular note is that in 82.6% (581 out of 703) of the transcripts containing retained introns the intron lies in the CDS, where it would shift the reading frame and introduce an early stop codon. In 68% (477 out of 703) of the cases, the retained intron is not the last one (data not shown). The position of these stop codons is important in the context of potential nonsense-mediated decay (NMD) effects. NMD serves as a surveillance mechanism that removes erroneous mRNAs containing a premature termination codon [24,25]. Thus, if an NMD mechanism exists in plants, its selective power of degradation is incomplete, as the transcripts with AS events are readily detected. The limited use of NMD in plants has recently been shown in the analysis of transcripts containing a premature termination codon [26]. However, the coupling of AS and NMD could be an important post-transcriptional regulation mechanism to adjust the level of transcript isoforms [27–29].
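To make concrete how a retained intron in the CDS can introduce an early stop codon, the sketch below scans a toy transcript codon by codon from the start codon. The exon and intron sequences are invented for illustration and the helper name is hypothetical. The intron length (16 nt) is deliberately not a multiple of three, so retention also shifts the downstream reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(seq, start=0):
    """Return the index of the first in-frame stop codon, or None if absent."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical toy gene: two exons separated by one GT...AG intron.
exon1 = "ATGGCTGAA"
intron = "GTAAGTTGACTTGCAG"
exon2 = "CTTTGGAAATAA"

spliced = exon1 + exon2            # intron removed
retained = exon1 + intron + exon2  # intron retained

stop_spliced = first_in_frame_stop(spliced)
stop_retained = first_in_frame_stop(retained)

# A stop that falls inside the retained intron is premature.
intron_start, intron_end = len(exon1), len(exon1) + len(intron)
is_premature = intron_start <= stop_retained < intron_end

print(stop_spliced, stop_retained, is_premature)
```

Whether such a premature stop would trigger NMD depends on its position relative to downstream exon junctions, which this sketch does not model.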

Inspection of the intron retention database provides biological insight. RNA processing and signal transduction transcripts appear to be differentially enriched in the retained-introns group. The serine/arginine-rich (SR) family of proteins is implicated in constitutive splicing and AS of pre-mRNAs. Their transcripts appear in alternatively spliced forms [30,31] and at least one of them, SR1, contains a retained intron [32]. Transcription factors that show AS variants were described for five genes and in one of them, At1g26260, the first intron is retained [33]. Numerous studies have contributed to the view that SR proteins play a general role in splicing and can modulate splice site selection in a concentration-dependent manner (Ref. 34 and reviewed by Manley and Tacke [35]). One conceivable consequence of this is that cells may regulate the expression or activity of individual SR proteins, or their antagonists, to control the expression of one or more target genes in a tissue-specific and/or developmentally regulated fashion. A transcript that retains an intron may serve as a negative regulator of the SR protein concentration.

The potential regulatory power of transcripts with retained introns can be gleaned from examination of the database (Supplementary Table S2 is available at www.dnaresearch.oxfordjournals.org). Two examples are NPR1 (At1g64280) and GIGANTEA (At1g22770). NPR1 is a key regulator of the salicylic acid (SA)-mediated systemic acquired resistance (SAR) pathway and it confers resistance to pathogens. Known mutations of this gene cause loss of SAR induction, loss of expression of PR genes and increased susceptibility to infections [36,37]. WGA analysis predicts that the last intron of this gene is retained and contains an in-frame stop codon. GIGANTEA promotes flowering under long days in a circadian clock-controlled flowering pathway, together with CONSTANS (CO) and FLOWERING LOCUS T (FT). Mutations in this gene cause a late-flowering phenotype [38]. WGA analysis predicts intron 10 (out of 13) to be retained and to contain a stop codon.

The biological function of retained introns remains an enigma. A recent survey of human transcripts shows that one-third of the alternatively spliced transcripts contained premature termination codons that are the apparent targets of NMD. It was proposed that regulated unproductive splicing may be a means of modifying the level of protein expression or of introducing modified gene products [39]. This work provides a novel database of retained introns and their genomic sites and lays a framework for future application of WGA for dynamic screening of changes in splicing during the plant's life and as a result of environmental input.


    Acknowledgements
 
This research was supported by the Israel Science Foundation grant No 388/02 and by the BARD United States-Israel Binational Agricultural Research and Development Fund Grant number IS-3454-03.

Supplementary Data: Supplementary data is available at http://www.dnaresearch.oxfordjournals.org


    Footnotes
 
*To whom correspondence should be addressed. Tel. +972-8-9342175. Fax. +972-8-9344181, E-mail: robert.fluhr{at}weizmann.ac.il

Communicated by Kazuo Shinozaki


    References
 

  1. Kan, Z., States, D., Gish, W. 2002, Selecting for functional alternative splices in ESTs, Genome Res., 12, 1837–1845.
  2. Carninci, P., Kasukawa, T., Katayama, S., et al. 2005, The transcriptional landscape of the mammalian genome, Science, 309, 1559–1563.
  3. Nagasaki, H., Arita, M., Nishizawa, T., Suwa, M., Gotoh, O. 2005, Species-specific variation of alternative splicing and transcriptional initiation in six eukaryotes, Gene, 364, 53–62.
  4. Modrek, B. and Lee, C. 2002, A genomic view of alternative splicing, Nat. Genet., 30, 13–19.
  5. Goodison, S., Yoshida, K., Churchman, M., Tarin, D. 1998, Multiple intron retention occurs in tumor cell CD44 mRNA processing, Am. J. Pathol., 153, 1221–1228.
  6. Mansilla, A., Lopez-Sanchez, C., de la Rosa, E.J., et al. 2005, Developmental regulation of a proinsulin messenger RNA generated by intron retention, EMBO Rep., 6, 1182–1187.
  7. Iida, K., Seki, M., Sakurai, T., et al. 2004, Genome-wide analysis of alternative pre-mRNA splicing in Arabidopsis thaliana based on full-length cDNA sequences, Nucleic Acids Res., 32, 5096–5103.
  8. Wang, B.B. and Brendel, V. 2006, Genomewide comparative analysis of alternative splicing in plants, Proc. Natl Acad. Sci. USA, 103, 7175–7180.
  9. Ner-Gaon, H., Halachmi, R., Savaldi-Goldstein, S., Rubin, E., Ophir, R., Fluhr, R. 2004, Intron retention is a major phenomenon in alternative splicing in Arabidopsis, Plant J., 39, 877–885.
  10. Altschul, S.F., Madden, T.L., Schaffer, A.A., et al. 1997, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res., 25, 3389–3402.
  11. Martienssen, R.A., Doerge, R.W., Colot, V. 2005, Epigenomic mapping in Arabidopsis using tiling microarrays, Chromosome Res., 13, 299–308.
  12. Kapranov, P., Cawley, S.E., Drenkow, J., et al. 2002, Large-scale transcriptional activity in chromosomes 21 and 22, Science, 296, 916–919.
  13. Kampa, D., Cheng, J., Kapranov, P., et al. 2004, Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22, Genome Res., 14, 331–342.
  14. Yamada, K., Lim, J., Dale, J.M., et al. 2003, Empirical analysis of transcriptional activity in the Arabidopsis genome, Science, 302, 842–846.
  15. Benjamini, Y. and Hochberg, Y. 1995, Controlling the false discovery rate: a practical and powerful approach to multiple testing, J. R. Stat. Soc. Ser. B, 57, 289–300.
  16. Savaldi-Goldstein, S., Aviv, D., Davydov, O., Fluhr, R. 2003, Alternative splicing modulation by a LAMMER kinase impinges on developmental and transcriptome expression, Plant Cell, 15, 926–938.
  17. Arabidopsis Genome Initiative. 2000, Analysis of the genome sequence of the flowering plant Arabidopsis thaliana, Nature, 408, 796–815.
  18. Zhu, W., Schlueter, S.D., Brendel, V. 2003, Refined annotation of the Arabidopsis genome by complete expressed sequence tag mapping, Plant Physiol., 132, 469–484.
  19. Haas, B.J., Delcher, A.L., Mount, S.M., et al. 2003, Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies, Nucleic Acids Res., 31, 5654–5666.
  20. Rogozin, B., Sverdlov, V., Babenko, N., Koonin, V. 2005, Analysis of evolution of exon–intron structure of eukaryotic genes, Brief Bioinform., 6, 118–134.
  21. Fox-Walsh, K.L., Dou, Y., Lam, B.J., Hung, S.P., Baldi, P.F., Hertel, K.J. 2005, The architecture of pre-mRNAs affects mechanisms of splice-site pairing, Proc. Natl Acad. Sci. USA, 102, 16176–16181.
  22. Brown, J.W. 1996, Arabidopsis intron mutations and pre-mRNA splicing, Plant J., 10, 771–780.
  23. Simpson, G.G. and Filipowicz, W. 1996, Splicing of precursors to mRNA in higher plants: mechanism, regulation and sub-nuclear organisation of the spliceosomal machinery, Plant Mol. Biol., 32, 1–41.
  24. Nagy, E. and Maquat, L.E. 1998, A rule for termination-codon position within intron-containing genes: when nonsense affects RNA abundance, Trends Biochem. Sci., 23, 198–199.
  25. Baker, K.E. and Parker, R. 2004, Nonsense-mediated mRNA decay: terminating erroneous gene expression, Curr. Opin. Cell Biol., 16, 293–299.
  26. Hori, K. and Watanabe, Y. 2005, UPF3 suppresses aberrant spliced mRNA in Arabidopsis, Plant J., 43, 530–540.
  27. Lejeune, F. and Maquat, L.E. 2005, Mechanistic links between nonsense-mediated mRNA decay and pre-mRNA splicing in mammalian cells, Curr. Opin. Cell Biol., 17, 309–315.
  28. Lareau, L.F., Green, R.E., Bhatnagar, R.S., Brenner, S.E. 2004, The evolving roles of alternative splicing, Curr. Opin. Struct. Biol., 14, 273–282.
  29. Yoine, M., Nishii, T., Nakamura, K. 2006, Arabidopsis UPF1 RNA helicase for nonsense-mediated mRNA decay is involved in seed size control and is essential for growth, Plant Cell Physiol., 47, 572–580.
  30. Reddy, A.S. 2004, Plant serine/arginine-rich proteins and their role in pre-mRNA splicing, Trends Plant Sci., 9, 541–547.
  31. Iida, K. and Go, M. 2006, Survey of conserved alternative splicing events of mRNAs encoding SR proteins in land plants, Mol. Biol. Evol.
  32. Lazar, G. and Goodman, H.M. 2000, The Arabidopsis splicing factor SR1 is regulated by alternative splicing, Plant Mol. Biol., 42, 571–581.
  33. Gong, W., Shen, Y.P., Ma, L.G., et al. 2004, Genome-wide ORFeome cloning and analysis of Arabidopsis transcription factor genes, Plant Physiol., 135, 773–782.
  34. Jumaa, H. and Nielsen, P.J. 1997, The splicing factor SRp20 modifies splicing of its own mRNA and ASF/SF2 antagonizes this regulation, EMBO J., 16, 5077–5085.
  35. Manley, J.L. and Tacke, R. 1996, SR proteins and splicing control, Genes Dev., 10, 1569–1579.
  36. Cao, H., Glazebrook, J., Clarke, J.D., Volko, S., Dong, X. 1997, The Arabidopsis NPR1 gene that controls systemic acquired resistance encodes a novel protein containing ankyrin repeats, Cell, 88, 57–63.
  37. Ryals, J., Weymann, K., Lawton, K., et al. 1997, The Arabidopsis NIM1 protein shows homology to the mammalian transcription factor inhibitor I kappa B, Plant Cell, 9, 425–439.
  38. Eimert, K., Wang, S.M., Lue, W.I., Chen, J. 1995, Monogenic recessive mutations causing both late floral initiation and excess starch accumulation in Arabidopsis, Plant Cell, 7, 1703–1712.
  39. Lewis, B.P., Green, R.E., Brenner, S.E. 2003, Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans, Proc. Natl Acad. Sci. USA, 100, 189–192.
  40. Berardini, T.Z., Mundodi, S., Reiser, L., et al. 2004, Functional annotation of the Arabidopsis genome using controlled vocabularies, Plant Physiol., 135, 745–755.
