DNA Research Advance Access originally published online on April 4, 2008
DNA Research 2008 15(3):115-122; doi:10.1093/dnares/dsn005
Transcriptome Analysis of a cDNA Library from Adult Human Epididymis
1 Shandong Research Center of Stem Cell Engineering/Centre Laboratory, Yu-Huang-Ding Hospital, Yan-Tai, Shan-Dong Province, Shandong 264000, People's Republic of China
2 Shanghai Key Laboratory for Molecular Andrology, State Key Laboratory of Molecular Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, People's Republic of China
Received 28 June 2007; accepted 5 March 2008.
| Abstract |
|---|
The Mammalian Gene Collection (MGC) verified over 9000 human full-ORF genes, and the FLJ Program reported 21 243 cDNAs, of which 14 409 were unique and 5416 appeared to be protein-coding. Unfortunately, no epididymis cDNA library was included among their sequencing targets. The epididymis is an important male accessory sex organ for sperm maturation and storage. Fully differentiated spermatozoa leaving the testis acquire their motility and capacity for fertilization through interactions with the epididymal epithelium and duct lumen during passage through this convoluted duct. Here, we report the sequencing of 20 000 clones from a cDNA library prepared from the epididymis of a healthy male. The sequencing data provided 8234 known sequences and 650 unknown cDNA fragments. Of the 650 unknown cDNA clone inserts, 106 were randomly selected for full sequencing, which yielded 25 unknown unique sequences and 19 released but unreported sequences. By northern blot analysis, four sequences randomly selected from the 19 released sequences with no known function showed positive mRNA signals in epididymis and testis. Three of six sequences from the unknown group were abundant in the epididymis in a region-specific manner but were not detected in the testis or the other tissues tested. All the sequencing data will be available on the website www.sdscli.com.
Key words: human epididymis cDNA library; transcriptomes for human epididymis; sperm maturation
| 1. Introduction |
|---|
In 2002, the Mammalian Gene Collection (MGC) sequenced and verified over 9000 human full-ORF genes and 7800 candidate full-ORF genes.1
13% of the genes expressed in the epididymis differ in their level of expression between segments by at least fourfold. Such a large number of regulated genes within a grossly dissected single organ is unprecedented. In comparison, only 1186 genes are differentially regulated (fourfold, P = 0.01) between the kidney and the liver, two distinct organs with very different functions. This suggests that sperm maturation, transport, and storage in the epididymis are highly complex events. The organ is therefore attracting growing attention, because understanding the molecular mechanisms of sperm maturation in the epididymis will help not only to explain how the expression of region-specific, programmatically expressed genes is controlled but also to advance male contraceptive drug design, personalized diagnosis and treatment of infertility, and sperm health evaluation. To this end, our lab applied the Human Genome U133 Plus 2.0 microarray, which covers over 47 000 transcripts and variants representing 38 500 well-characterized human genes, to obtain a full profile of gene expression in the epididymis of a fertile young male.5
| 2. Materials and methods |
|---|
2.1. Construction of cDNA library
Human epididymis was obtained from a 28-year-old male with a one-child reproductive history, normal spermatogenesis, and no related diseases (the donor died in an accident and the donation was voluntary). The procedure was approved by the Ethics Committee of Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences. Samples were sliced into 1 mm³ pieces at low temperature, frozen in liquid nitrogen, and stored at –70°C. Human epididymis total RNA was extracted with Trizol (Invitrogen). Its quality was determined by electrophoresis on a 1.1% formaldehyde denaturing agarose gel.
The cDNA library of human epididymis was constructed with the Creator SMART cDNA Library Construction Kit (CLONTECH) according to the manufacturer's instructions. The human epididymal cDNA was ligated into the vector pDNR-LIB, and 5 µL of ligation product was transformed into 25 µL of electrocompetent XL1-Blue cells. The entire transformation mixture was transferred to tubes containing 970 µL of LB broth. Serial dilution was used to titer the bacterial suspension.6
After incubation with shaking (225 rpm) for 1 h at 37°C, 1 µL of each transformation mixture was added to a tube containing 1 mL of LB broth, and 1 µL of this dilution was then added to a further 1 mL of LB broth. One microlitre of the final dilution was combined with 50 µL of LB broth and spread onto LB/Cm plates. The plates were inverted and incubated at 37°C overnight. The colonies were counted and the titer was calculated according to the following formula: titer (cfu/mL) = number of colonies × 10³ × 10³.
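The titer formula above can be written out as a short calculation. This is only a sketch of the formula as stated in the text; the function name and the plate count of 20 colonies are hypothetical and chosen for illustration.

```python
def library_titer_cfu_per_ml(colonies, dilution_factors=(10**3, 10**3)):
    """Titer as stated in the text: colonies x 10^3 x 10^3 cfu/mL.

    The two factors correspond to the two successive 1 uL into 1 mL
    dilutions described in the protocol.
    """
    titer = colonies
    for factor in dilution_factors:
        titer *= factor
    return titer


if __name__ == "__main__":
    # Hypothetical colony count, not a value reported by the authors.
    print(library_titer_cfu_per_ml(20))
```

Keeping the dilution factors as a parameter makes it easy to adapt the sketch if a different dilution scheme is plated.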
2.2. Characterization of human epididymis cDNA library
2.2.1. Library amplification and PCR identification
The number of 150 mm plates required was calculated from the library titer. The library, diluted in LB medium, was spread onto LB/Cm plates, which were inverted and incubated at 37°C overnight. Fourteen isolated colonies were picked and boiled to serve as templates for PCR identification using M13 sense and antisense primers.
2.2.2. Sequencing and bioinformatics analysis
One thousand isolated colonies from the human epididymis cDNA library were first picked and sequenced by 5'-end one-pass sequencing on an ABI 3100 sequencing system to evaluate the quality of the library. A total of 20 000 colonies were then sequenced from the 5' end in a single pass. The sequencing results were searched by BLAST against the Nr and EST databases published in March 2004. The non-public cDNAs were identified by bidirectional sequencing. All the sequencing data were subjected to bioinformatics analysis.
Our sequencing results were compared with 11 monkey epididymis genes7 and six human epididymis-specific genes (HE1–HE6)8 using ClustalW software. The unmatched genes were amplified by PCR using the cDNA library as template. The primers used to identify each unmatched gene are listed in supplementary Table S1.
2.3. Gene expression analysis using northern blot and RT–PCR
Selected clones were used as probes for northern analysis. Northern blot analysis was performed according to the procedure described previously.9 Ten micrograms of total RNA from each sample was loaded per lane. An 18S rRNA hybridization signal was used as a loading control. Epididymis-specific expression of con16, con32, and con33 was further evaluated by RT–PCR using cDNA from caput, corpus, and cauda epididymis and seven other human tissues. ESC42 and β-actin primers were used as controls.
| 3. Results and discussions |
|---|
3.1. Evaluation of the human epididymis cDNA library
The quality of the total RNA prepared from an entire human epididymis was assessed by an A260/A280 ratio of 1.85 and a 28S/18S ratio of 2:1 (Fig. 1A). The resulting double-stranded cDNAs formed a smear from 4 kb to 200 bp on the agarose gel (Fig. 1B).
The Creator SMART cDNA Library Construction Kit (CLONTECH) was used to construct the library. The PCR step employed in the SMART kit can bias clone frequencies. In this work, however, we used the minimal number of PCR cycles (18 cycles) recommended by the manufacturer, which should reduce the bias significantly. The titer of this cDNA library was 2 × 10⁷ cfu/mL. Fourteen clones randomly selected from the library were examined to evaluate its quality. Fig. 1C shows that 13 of the 14 clones were positive. The distribution of insert sizes was as follows: >1 kb for two clones, >700 bp for three clones, >500 bp for seven clones, and 400 bp for one clone. The quality of our library might not be very high because it contained many relatively short inserts. However, many epididymis-specific transcripts are quite small, less than 1 kb, such as at least 19 beta-defensins,10
This indicated that the transformation efficiency (92.8%) and the insert length were both sufficiently high. To test the quality of the inserts further, 1000 clones were sequenced at the 5'-terminus. The sequencing reads, averaging 450 bp, showed that 92.5% of the clones were positive. Of these, 92% could be found in the EST database (December 2003) and only 8% could not. Seventy-five percent of the insert sequences were protein coding, corresponding to 693 genes. On this basis, we decided to proceed with large-scale sequencing of the library.
3.2. Sequencing data for 20 000 clones and bioinformatics analysis
The frequency with which clones carrying the same insert appear in the library is directly proportional to the expression level of the corresponding transcript. The number of clones is therefore not equal to the number of genes, and the inserts are not necessarily full-length cDNAs. Owing to financial limitations, we could not sequence all the clones in the library. According to the DNA array transcriptome data from the human epididymis, 26 893 qualifiers with present signal > 1 and 15 946 qualifiers with present signal > 50 were found in the whole epididymis, accounting for 49 and 29% of the total on the array, respectively. Applying these proportions to the 25 000 genes announced by the International Human Genome Sequencing Consortium, around 12 250 and 7250 genes, respectively, would be expressed in the epididymis. The DNA array data also showed that gene expression levels in the epididymis vary widely: 0.4% of genes had present signal > 5000, 6.8% had present signal > 500, and 58.3% had present signal > 50. The remaining 41.7% (with present signal > 1) were expressed at a very low level and might be neglected. We therefore targeted the genes with present signal > 50, which make up 29% of the 25 000 genes, that is, only about 7250 genes expressed in human epididymis. In addition, both the MGC and FLJ programs sequenced only 10 000 clones per cDNA library. We therefore decided to sequence 20 000 clones, which might be enough to reveal the sequences of the 29% of total genes (about 7250 genes) expressed in this library. The route map for analyzing these sequencing data is summarized in Fig. 2.
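The arithmetic behind the sequencing target can be reproduced in a few lines. This is a minimal sketch using only the figures quoted in the text (25 000 genes, 49 and 29%, 20 000 clones); the function names are hypothetical, and the average number of clones per targeted gene is an illustrative derived value, not one reported by the authors.

```python
TOTAL_GENES = 25_000      # gene number quoted in the text
CLONES_SEQUENCED = 20_000  # sequencing depth chosen in the text


def expected_expressed_genes(percent_present, total_genes=TOTAL_GENES):
    """Scale an array 'present' percentage to the quoted gene number."""
    return total_genes * percent_present // 100


def mean_clones_per_gene(clones, genes):
    """Average sequencing depth per targeted gene (illustrative only)."""
    return clones / genes


if __name__ == "__main__":
    genes_signal_gt_1 = expected_expressed_genes(49)
    genes_signal_gt_50 = expected_expressed_genes(29)
    print(genes_signal_gt_1, genes_signal_gt_50)
    print(round(mean_clones_per_gene(CLONES_SEQUENCED, genes_signal_gt_50), 2))
```

Because clone frequency follows expression level, the average depth is only a rough guide: highly expressed transcripts take up many clones and weakly expressed ones may be missed entirely.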
3.2.1. 7450 non-redundant sequences
After removal of recognizable contaminating sequences and of identical sequences from different clones, the 5'-terminal one-pass sequences of the 20 000 clones were narrowed down to 7450. These sequences were searched by BLAST against the NR and EST databases published in March 2003. Six thousand seven hundred and fifty cDNA sequences could be found and the remaining 650 could not. After further analysis using tblastx against the NR and EST databases, GenBank, and the Swiss-Prot protein database, and functional annotation with GeneSpring 7.0, 5390 cDNAs were found to have been reported with various biological functions. Table 1 shows the classification of their functions and the number of genes in each class. Fig. 3 shows the chromosome localization of the 3394 cDNAs reported. Notably, epididymis transcripts appeared at quite low frequency on chromosomes 12 and Y. Although the remaining 1360 cDNA sequences can be found in the databases, they are only sequences predicted by bioinformatics. The 650 ESTs that could not be found in any of the databases were submitted to GenBank (EH041735–EH041927, FE192591–FE193047). The frequency with which each of the 7450 clones appeared among the 20 000 sequenced clones was also evaluated. Most are low-copy clones, with a frequency of no more than 10 among the 20 000 clones. Medium-copy (frequency 11–100) and high-copy (frequency >100) clones together make up only 1.7% of the 7400 non-redundant clones.
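The copy-number classes described above (low, frequency up to 10; medium, 11–100; high, more than 100) can be illustrated with a small tally. This is a sketch under stated assumptions: each sequenced clone is assumed to have already been assigned to a non-redundant sequence, and the identifiers and toy data below are hypothetical, not the authors' data or pipeline.

```python
from collections import Counter


def copy_class(frequency):
    """Assign a clone frequency to the classes used in the text."""
    if frequency <= 10:
        return "low"
    if frequency <= 100:
        return "medium"
    return "high"


def classify_clones(sequence_ids):
    """Count non-redundant sequences per copy class.

    sequence_ids holds one non-redundant sequence identifier per
    sequenced clone (hypothetical input format).
    """
    per_sequence = Counter(sequence_ids)
    return Counter(copy_class(n) for n in per_sequence.values())


if __name__ == "__main__":
    # Toy data: four sequences seen 3, 11, 101, and 1 times.
    toy = ["a"] * 3 + ["b"] * 11 + ["c"] * 101 + ["d"]
    print(classify_clones(toy))
```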
3.2.2. 650 EST
To understand the nature of these so-called novel cDNA fragments further, 106 of the 650 clones were randomly selected for full sequencing. The 106 sequences were submitted to GenBank (DV643899–DV643998, DQ822205, DQ823637, DQ823638, EF426753, EF426754, EF426755). After removal of contaminating and repeated sequences and assembly into the longest possible contigs, 79 independent unknown sequences were identified by BLAST searching against the GenBank NR database. These sequences were further searched against the human genome database. Fifteen of the 79 had significant redundant sequence in the human genome and appeared to be non-coding transcripts; 8 were, surprisingly, not found in the human genome; 19 had already been released during the course of this project, but their function is still entirely unknown; and 25 appeared to be entirely unknown and unique in the human genome (Fig. 2).
To check whether these novel genes are expressed in human epididymis, 6 of the 25 novel clones were selected for northern blot analysis. Con16, con32, and con33 were expressed in human epididymis in a region-specific manner and not in testis (Fig. 4A and Table 2). Con32 and con33 were abundant in the corpus region and con16 was abundant in the cauda epididymis. Their tissue specificity was further tested by RT–PCR (Fig. 5). The epididymis-specific gene ESC42, which has its highest expression in the caput epididymis, was used as a positive control.7 Con32 and con33 were exclusively expressed in human epididymis, whereas con16 was expressed in human epididymis and, among the other tissues tested, only in lung. These three sequences have been extended by electronic cloning (EST overlapping and extension) and 5' RACE and have been deposited in GenBank. Con16 belongs to the defensin family and con32 belongs to the colipase family. They might play important roles in sperm maturation and innate immunity in the male reproductive tract. Northern blot analysis was also used to test whether 4 of the 19 released but unknown sequences were present in human epididymis. Although all of them (Con8, 29, 79, 97) were expressed in human epididymis, their expression appeared neither epididymis-specific nor region-specific (Fig. 4B and Table 2). Since the released sequences were found in cDNA libraries from non-epididymis tissues, it is not surprising that these four clones were not epididymis-specific and were also expressed in testis. The information on these clones is summarized in Table 2.
Next, we used BLAST to compare the sequences of the remaining 544 of the 650 clones again with the relevant data sets published in March 2006. Of these 544 EST sequences, 342 had appeared in the NR and EST databases during the course of this project. The sequences of the remaining 202 clones were regarded as unknown ESTs, and 193 of them were submitted to the EST database (EH041735–EH041927).
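The GenBank accession ranges quoted in this section can be cross-checked against the stated clone counts. The sketch below assumes that each range is contiguous and inclusive and that the DV accessions form a range from DV643899 to DV643998; both points are assumptions, and the helper name is hypothetical.

```python
import re

ACCESSION = re.compile(r"([A-Z]+)(\d+)")


def accession_count(first, last):
    """Number of accessions in an inclusive, contiguous range."""
    start = ACCESSION.fullmatch(first)
    end = ACCESSION.fullmatch(last)
    if not start or not end or start.group(1) != end.group(1):
        raise ValueError("accessions must share the same letter prefix")
    return int(end.group(2)) - int(start.group(2)) + 1


if __name__ == "__main__":
    eh = accession_count("EH041735", "EH041927")  # unknown ESTs
    fe = accession_count("FE192591", "FE193047")  # further ESTs
    dv = accession_count("DV643899", "DV643998")  # fully sequenced clones
    print(eh, fe, eh + fe)  # compare with 193 and 650 in the text
    print(dv + 6)           # plus six single accessions, compare with 106
```

Under these assumptions the EH range alone accounts for the 193 submitted unknown ESTs, and the EH and FE ranges together account for the 650 ESTs.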
3.2.3. Qualification of the library
Fifteen reported human epididymis-specific genes and eight monkey epididymis-specific genes newly discovered by our lab were checked in this cDNA library to assess how representative it is. All of these cDNAs could be detected in the library (Table 3 and Fig. 6). However, only 69.6% of them could be found in our sequencing data. This means that sequencing 20 000 clones is not enough to cover all the transcripts expressed in this library. In other words, owing to financial limitations, roughly one in three genes in this library was not sequenced. Table 3 nevertheless shows that 87% of the 15 reported epididymis-specific genes could be found in our sequencing data, but only 37.5% of the eight newly discovered ones. Notably, the genes that appear in our sequencing data generally have higher expression levels. This makes sense because, as mentioned above, the probability that a cDNA clone appears in the library is directly proportional to its expression level. From this point of view, the sequencing data presented in this report might include the majority of the human epididymal transcriptome with relatively high expression levels. Further work based on this novel cDNA resource will help to reveal the whole transcriptome of an organ, the human epididymis.
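The coverage percentages in this paragraph are mutually consistent, as the following sketch shows. The counts of 13 and 3 genes found are not stated in the text; they are inferred here from the quoted 87% of 15 and 37.5% of 8 and should be read as assumptions. The function name is hypothetical.

```python
def percent(found, total):
    """Percentage of genes recovered in the sequencing data."""
    return 100 * found / total


if __name__ == "__main__":
    reported_total, new_total = 15, 8
    # Inferred counts (assumption): 87% of 15 and 37.5% of 8.
    reported_found, new_found = 13, 3

    print(round(percent(reported_found, reported_total)))  # reported genes
    print(percent(new_found, new_total))                   # new genes
    overall = percent(reported_found + new_found, reported_total + new_total)
    print(round(overall, 1))        # all 23 genes
    print(round(100 - overall, 1))  # share not recovered
```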
In summary, the data provided here are significant in three respects. (i) They provide a list of reported genes that are also expressed in the epididymis. With the aid of the human epididymis cDNA array data on our website, readers can find the expression level and regional localization of particular genes in the epididymis. (ii) They provide a number of unknown ESTs that are expressed in human epididymis, a resource that makes the discovery of novel epididymis-expressed genes much more convenient. (iii) They provide a number of epididymis-specific transcripts that fill gaps in the current human transcriptome.
| Funding |
|---|
This work was supported by the Shandong Province Science & Technology Key Program (032050102), the 973 Program (2006CB504002 and 2006CB944002), the National Natural Science Foundation of China (30230190), Shanghai Science and Technology Funding (05DZ22103), the CAS Knowledge Creative Program (KSCX1-YW-R-54), and the State 863 High Technology R&D Project of China (2004AA221120).
| Supplementary Data |
|---|
Supplementary data are available online at www.dnaresearch.oxfordjournals.org.
| Footnotes |
|---|
* To whom correspondence should be addressed. Tel. + 86 21-54921263. Fax. + 86 21-54921011. E-mail: sdscli{at}126.com (J-Y.L.), ylzhang{at}sibs.ac.cn (Y-L.Z.)
| References |
|---|
- Strausberg R. L., Feingold E. A., Grouse L. H., Derge J. G., Klausner R. D., Collins F. S., Wagner L., Shenmen C. M., Schuler G. D., Altschul S. F., Zeeberg B., Buetow K. H., et al. Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proc. Natl. Acad. Sci. USA (2002) 99:16899–16903.
- Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., et al. Complete sequencing and characterization of 21,243 full-length human cDNAs. Nat. Genet (2004) 36:40–45.
- International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature (2004) 431:931–945.
- Johnston D. S., Jelinsky S. A., Bang H. J., DiCandeloro P., Wilson E., Kopf G. S., Turner T. T. The mouse epididymal transcriptome: transcriptional profiling of segmental gene expression in the epididymis. Biol. Reprod (2005) 73:404–413.
- Zhang J. S., Liu Q., Li Y. M., Hall S. H., French F. S., Zhang Y. L. Genome-wide profiling of segmental-regulated transcriptomes in human epididymis using oligo microarray. Mol. Cell Endocrinol (2006) 250:169–177.
- Ausubel F. M., Brent R., Kingston R. E., Moore D. D., Seidman J. G., Smith J. A., Struhl K. Current Protocols in Molecular Biology (1994) John Wiley and Sons: New York.
- Liu Q., Hamil K. G., Sivashanmugam P., Grossman G., Soundararajan R., Rao A. J., Richardson R. T., Zhang Y. L., O'Rand M. G., Petrusz P., French F. S., Hall S. H. Primate epididymis-specific proteins: characterization of ESC42, a novel protein containing a trefoil-like motif in monkey and human. Endocrinology (2001) 142:4529–4539.
- Kirchhoff C. Molecular characterization of epididymal proteins. Rev. Reprod. (1999) 3:86–95.
- Li P., Chan H. C., He B., So S. C., Chung Y. W., Shang Q., Zhang Y. D., Zhang Y. L. An antimicrobial peptide gene found in the male reproductive system of rats. Science (2001) 291:1783–1785.
- Patil A. A., Cai Y., Sang Y., Blecha F., Zhang G. Cross-species analysis of the mammalian beta-defensin gene family: presence of syntenic gene clusters and preferential expression in the male reproductive tract. Physiol. Genom. (2005) 23:5–17.
- Suzuki K., Lareyre J. J., Sánchez D., Gutierrez G., Araki Y., Matusik R. J., Orgebin-Crist M. C. Molecular evolution of epididymal lipocalin genes localized on mouse chromosome 2. Gene (2004) 339:49–59.
- Kirchhoff C., Osterhoff C., Pera I., Schröter S. Function of human epididymal proteins in sperm maturation. Andrologia (1998) 30:225–232.