DNA Research Advance Access originally published online on October 30, 2009
DNA Research 2009 16(6):371-383; doi:10.1093/dnares/dsp022
Identification and Functional Analyses of 11 769 Full-length Human cDNAs Focused on Alternative Splicing
1 Graduate School of Pharmaceutical Sciences, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
2 Central Research Laboratory, Hitachi, Ltd, Kokubunji, Tokyo 185-8601, Japan
3 Reverse Proteomics Research Institute, 1-9-11 Kaji, Chiyoda-ku, Tokyo 101-0044, Japan
4 National Institute of Advanced Industrial Science and Technology, 2-41-6 Aomi, Koto-ku, Tokyo 135-0064, Japan
5 Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 4-6-1 Shiroganedai, Minato-ku, Tokyo 108-8639, Japan
Received 25 August 2009; accepted 1 October 2009.
We analyzed the diversity of mRNAs produced by alternative splicing in order to evaluate gene function. First, we predicted the number of human genes transcribed into protein-coding mRNAs using the sequence information of full-length cDNAs and 5'-ESTs, and obtained 23 241 such genes. Next, using these genes, we analyzed mRNA diversity and consequently sequenced and identified 11 769 human full-length cDNAs whose predicted open reading frames differed from those of other known full-length cDNAs. Notably, 30% of the cDNAs we identified contained variation in the transcription start site (TSS). Our analysis, which focused particularly on multiple variable first exons (FEVs) formed by the alternative use of TSSs, led to the identification of 261 FEVs expressed in a tissue-specific manner. Quantification of the expression profiles of 13 genes by real-time PCR further confirmed the tissue-specific expression of FEVs; for example, OXR1 showed specific TSS usage in brain and tumor tissues. Finally, based on the results of our mRNA diversity analysis, we created the FLJ Human cDNA Database. Our results provide insight into the mechanisms by which a single gene produces suitable protein-coding transcripts in response to the situation and the environment.
Key words: full-length cDNA; alternative splicing; alternative transcription start site; mRNA diversity; tissue-specific expression
* Corresponding author. E-mail: tisogai{at}mol.f.u-tokyo.ac.jp