Details
滇东南濒危植物长梗杜鹃转录组微卫星特征分析 (Cited by: 12)
Characteristic Analysis of Microsatellites in the Transcriptome of Rhododendron longipedicellatum, an Endangered Species Endemic to Southeastern Yunnan, China
Document type: Journal article
Chinese title: 滇东南濒危植物长梗杜鹃转录组微卫星特征分析
English title: Characteristic Analysis of Microsatellites in the Transcriptome of Rhododendron longipedicellatum, an Endangered Species Endemic to Southeastern Yunnan, China
Authors: 李太强[1] 刘雄芳[1] 万友名[1] 李正红[1] 李钰莹[1] 刘秀贤[1] 马宏[1]
First author: 李太强
Affiliation: [1] Research Institute of Resources Insects, Chinese Academy of Forestry (中国林业科学研究院资源昆虫研究所)
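Year: 2017
Volume: 30
Issue: 4
Pages: 533-541
Chinese journal name: 林业科学研究
English journal name: Forest Research
Indexed in: CSTPCD; Scopus; PKU Core Journals (2014 edition); CSCD (2017-2018)
Funding: "Yunnan Province Technological Innovation Talent" training candidate program (2016HB007)
Language: Chinese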
Chinese keywords: 长梗杜鹃; 转录组; 微卫星特征; 潜在多态性
English keywords: Rhododendron longipedicellatum; transcriptome; microsatellite characteristics; polymorphism potential
Classification number: S685.21
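Abstract: [Objective] To comprehensively characterize the distribution and sequence features of SSR loci in the transcriptome of Rhododendron longipedicellatum, an endangered plant endemic to southeastern Yunnan, in order to provide genetic data for the conservation and rational development and utilization of the species, and to facilitate SSR marker development and genetic research in congeneric and closely related species. [Method] The leaf transcriptome of R. longipedicellatum was sequenced on the Illumina HiSeq 4000 high-throughput sequencing platform, and MISA software was then used to identify and analyze SSR loci in the resulting unigenes. [Result] A total of 17,354 SSR-containing sequences were found, yielding 23,192 SSRs, with an occurrence frequency of 31.30% and an average of one SSR per 3 kb. Dinucleotide and trinucleotide repeats were the main repeat unit types, accounting for 69.25% and 15.07% of all SSRs, respectively. Among the 187 repeat motifs, (AG/CT)n had the highest share (62.01%), followed by (A/T)n (12.34%), (AC/GT)n (4.52%) and (AAG/CTT)n (4.23%). In the genes shared between the SSR and CDS sets, 15,908 SSR loci were found, of which 2,792 lay in coding regions, giving a frequency of 0.076 SSR/kb in coding regions versus 0.344 SSR/kb in non-coding regions; trinucleotide repeats were the most frequent type in coding regions (1,356; 48.57%). Among repeat units of different lengths, dinucleotide SSRs showed the highest degree of length variation, followed by mononucleotide repeats. SSR frequency and SSR length were significantly negatively correlated (P < 0.01), with a correlation coefficient of -0.566. [Conclusion] The SSR loci in the R. longipedicellatum transcriptome occur at high frequency and high density, with diverse motif types, relatively high repeat numbers and many long fragments. They therefore have high polymorphism potential and considerable promise for genetic analysis, and can meet the needs of conservation genetics research on this species.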
[Objective] To comprehensively understand the distribution and sequence characteristics of SSR loci in the Rhododendron longipedicellatum transcriptome, and to provide a theoretical basis for the further development of highly efficient SSR markers. [Method] Transcriptome sequencing was conducted on young leaves of R. longipedicellatum using the Illumina HiSeq 4000 platform. SSR loci in the resulting unigenes were then identified and analyzed using MISA software. [Result] A total of 23,192 SSRs were identified in 17,354 unigenes, with an average density of one SSR per 3 kb. Dinucleotide and trinucleotide repeats were the main SSR types, accounting for 69.25% and 15.07% of all SSRs, respectively. Among the 187 repeat motifs, (AG/CT)n was the most frequent (62.01%), followed by (A/T)n (12.34%), (AC/GT)n (4.52%) and (AAG/CTT)n (4.23%). A total of 15,908 SSRs occurred in the intersection of SSR and CDS, only 2,792 of which occurred in the protein-coding regions of these sequences. The density of SSRs was 0.076 SSR/kb in coding regions, which was significantly lower than that in non-coding regions (0.344 SSR/kb). Moreover, trinucleotide repeats were the most abundant type in coding regions (1,356; 48.57%). Among repeat units of different lengths, dinucleotide SSRs showed the greatest length variation, followed by mononucleotide SSRs. There was a significant negative correlation (P < 0.01) between SSR frequency and SSR length, with a correlation coefficient of -0.566. [Conclusion] The SSR loci in the R. longipedicellatum transcriptome showed a high frequency and density of distribution, diverse repeat motifs, high repeat numbers, many long fragments and considerable polymorphism potential. These SSR loci could be applied in future genetic analysis and conservation genetics of R. longipedicellatum.
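As a rough illustration of the kind of SSR search the abstract describes, the following Python sketch scans a sequence for perfect mono- to hexanucleotide repeats and groups motifs into strand-independent classes such as AG/CT. It is not MISA and does not reproduce the authors' pipeline: the minimum repeat thresholds in `MIN_REPEATS` are assumed values (the abstract does not state the ones used), the function names `find_ssrs` and `motif_class` are hypothetical, and compound or interrupted SSRs are ignored.

```python
import re

# Assumed minimum repeat counts per motif length; the abstract does not
# state the thresholds actually used, so these are illustrative only.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated ("AA", "AGAG").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


def motif_class(motif):
    """Smallest rotation of the motif or its reverse complement (AG == CT)."""
    rc = motif.translate(COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, rc) for i in range(len(m))]
    return min(rotations)


if __name__ == "__main__":
    demo = "GGC" + "AG" * 8 + "TTC" + "AAG" * 5 + "C" + "A" * 12 + "G"
    for start, motif, count in find_ssrs(demo):
        print(start, motif_class(motif), count)
```

Dividing the number of hits by the total sequence length in kilobases gives a density in SSR/kb, the unit used for the coding and non-coding comparison above.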