Transcriptome data analysis of Stellera chamaejasme based on high-throughput sequencing (基于高通量测序的瑞香狼毒转录组数据分析) Cited by: 2
Transcriptome characterization of Stellera chamaejasme with Illumina sequencing technology
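Document type: Journal article
Chinese title: 基于高通量测序的瑞香狼毒转录组数据分析
English title: Transcriptome characterization of Stellera chamaejasme with Illumina sequencing technology
First author: Yang Yanfang (杨艳芳)
Affiliations: [1] State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091; [2] Hebei Engineering and Technology Center of Microbiological Control on Main Crop Disease, Institute of Biology, Hebei Academy of Sciences, Shijiazhuang, Hebei 050081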
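Year: 2017
Volume: 48
Issue: 22
Pages: 4740-4747
Chinese journal name: 中草药
Foreign-language journal name: Chinese Traditional and Herbal Drugs
Indexed in: CSTPCD; Scopus; Peking University Core Journals (北大核心2014); CSCD (CSCD2017_2018)
Funding: Special Fund for Basic Scientific Research of Central Public Welfare Research Institutes (CAFYBB2014QB001); National Natural Science Foundation of China (31570675)
Language: Chinese
Chinese keywords: 瑞香狼毒; 转录组; 萜类; 佛波酯; 代谢通路
Foreign-language keywords: Stellera chamaejasme L.; transcriptome; terpenoid; phorbol ester; metabolic pathway
Classification number: R282.12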
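Abstract (translated from Chinese): Objective: To obtain a transcriptome database for Stellera chamaejasme, together with metabolic-pathway gene sequences, SSR, and transposon information. Methods: Roots of S. chamaejasme were used as the test material; the transcriptome was sequenced on the Illumina HiSeq 2000 second-generation platform and subjected to systematic bioinformatic analysis. Results: A total of 26 785 872 clean reads were obtained and assembled into 47 053 unigenes with a mean length of 419 nt. The assembled unigene sequences were aligned with BLAST against the Nr, Swiss-Prot, KEGG, COG, and GO databases; 11 138 and 24 744 unigenes were annotated in Nr and Swiss-Prot, respectively. The unigenes fell into 36 GO categories and involved 119 KEGG standard metabolic pathways, and further analysis identified 15 key enzyme genes of terpenoid biosynthesis pathways. MISA detected 3 480 SSRs; mononucleotide repeats were the most abundant type (1 986, 57.07%), and hexanucleotide repeats were the rarest (5, 0.14%). Transposon prediction on the transcriptome sequences with the RepeatMasker online tool found 1 497 transposons, of which 827 had an E-value < 1×10⁻⁵, covering 22 transposon types. LINE/L1 was the most numerous type (405, 48.97%), while DNA/Ginger, DNA/hAT, DNA/PIF-ISL2EU, LINE/Jockey, and LTR/Lenti were the rarest, with one sequence each. Conclusion: High-throughput sequencing of S. chamaejasme yielded a large amount of gene sequence, SSR, and transposon information, providing data resources and a theoretical basis for the future isolation and cloning of key enzyme genes in the biosynthesis of active constituents such as phorbol esters, and for studies of the related molecular mechanisms.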
Objective: To obtain a transcriptome database together with gene sequence, SSR, and transposon information for Stellera chamaejasme. Methods: A root transcriptome dataset of S. chamaejasme was generated on a high-throughput sequencing platform (Illumina HiSeq 2000), and the sequencing results were analyzed with bioinformatic methods. Results: A total of 26 785 872 clean reads were obtained and assembled into 47 053 unigenes. All unigenes were searched with BLAST against the Nr, Swiss-Prot, KEGG, COG, and GO databases; 11 138 and 24 744 unigenes were annotated in the Nr and Swiss-Prot databases, respectively. The unigenes were assigned to 36 GO terms and 119 metabolic pathways. Further analysis showed that 15 unigenes were involved in terpenoid biosynthesis. MISA software identified 3 480 SSRs among the 47 053 unigenes; the most abundant type was the mononucleotide repeat (1 986), with a frequency of 57.07%, whereas hexanucleotide repeats were the rarest, with only five SSRs and a frequency of 0.14%. Transposon analysis of the transcriptome sequences with the RepeatMasker online tool indicated 1 497 transposons, of which 827 had an E-value < 1×10⁻⁵. These transposons were grouped into 22 types, and the LINE/L1 type (405) had the highest frequency (48.97%). The DNA/Ginger, DNA/hAT, DNA/PIF-ISL2EU, LINE/Jockey, and LTR/Lenti types were the rarest, each represented by a single transposon. Conclusion: The rich gene sequence, SSR, and transposon information obtained for S. chamaejasme in this study will support future research on the molecular mechanism of phorbol ester biosynthesis in this species.
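
The frequencies quoted in the abstract can be recomputed from the reported counts. The short Python sketch below does only that arithmetic; the function name `percent` and the dictionary `shares` are illustrative and not taken from the paper. It suggests that the 48.97% share for LINE/L1 is relative to the 827 transposons passing the E-value filter, not to all 1 497 predicted transposons.

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Counts as reported in the abstract.
TOTAL_SSR = 3480             # SSRs detected by MISA
MONO_SSR = 1986              # mononucleotide repeats
HEXA_SSR = 5                 # hexanucleotide repeats
ALL_TRANSPOSONS = 1497       # all predicted transposons
FILTERED_TRANSPOSONS = 827   # transposons with E-value < 1e-5
LINE_L1 = 405                # LINE/L1 transposons

shares = {
    "mononucleotide SSR": percent(MONO_SSR, TOTAL_SSR),
    "hexanucleotide SSR": percent(HEXA_SSR, TOTAL_SSR),
    "LINE/L1 (of filtered set)": percent(LINE_L1, FILTERED_TRANSPOSONS),
    "LINE/L1 (of all predicted)": percent(LINE_L1, ALL_TRANSPOSONS),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

The first three values agree with the figures given in the abstract (57.07%, 0.14%, and 48.97%); the fourth is shown only for comparison.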