
Details

GcForest-based compound-protein interaction prediction model and its application in discovering small-molecule drugs targeting CD47 (indexed in SCI-EXPANDED)   Times cited: 1

Document type: Journal article

English title: GcForest-based compound-protein interaction prediction model and its application in discovering small-molecule drugs targeting CD47

Authors: Shan, Wenying[1,2]; Chen, Lvqi[1]; Xu, Hao[3,4]; Zhong, Qinghao[5]; Xu, Yinqiu[6]; Yao, Hequan[1]; Lin, Kejiang[1]; Li, Xuanyi[1]

First author: Shan, Wenying

Corresponding authors: Yao, HQ[1]; Lin, KJ[1]; Li, XY[1]

Affiliations: [1] China Pharmaceut Univ, Sch Pharm, Dept Med Chem, Nanjing, Peoples R China; [2] Univ Macau, Fac Hlth Sci, Macau, Peoples R China; [3] Chinese Acad Forestry, Inst Chem Ind Forest Prod, Nanjing, Peoples R China; [4] Natl Engn Lab Biomass Chem Utilizat, Nanjing, Peoples R China; [5] Chinese Univ Hong Kong, Sch Humanities & Social Sci, Shenzhen, Peoples R China; [6] Nanjing Univ, Nanjing Drum Tower Hosp, Dept Pharm, Affiliated Hosp, Med Sch, Nanjing, Peoples R China

Year: 2023

Volume: 11

Journal: FRONTIERS IN CHEMISTRY

Indexed in: Scopus (accession no. 2-s2.0-85175858746); WOS: SCI-EXPANDED (accession no. WOS:001092371800001)

Funding: Generous financial support from the National Natural Science Foundation of China (81903439) and the Natural Science Foundation of Jiangsu Province of China (BK20190562) is gratefully acknowledged. The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This study was supported by Project 81903439 of the National Natural Science Foundation of China and by the Natural Science Foundation of Jiangsu Province of China (No. BK20190562).

Language: English

Keywords: artificial intelligence; word2vec; GcForest; compound-protein interaction prediction; small-molecule CD47 inhibitors

Abstract: Identifying compound-protein interactions plays a vital role in drug discovery. Artificial intelligence (AI), especially machine learning (ML) and deep learning (DL) algorithms, is playing an increasingly important role in compound-protein interaction (CPI) prediction. However, ML relies on learning from large amounts of sample data, and the CPI data available for a specific target are often scarce. To overcome this dilemma, we propose a virtual screening model in which word2vec is used as an embedding tool to generate low-dimensional vectors from the SMILES of compounds and the amino acid sequences of proteins, and a modified gcForest, based on the multi-grained cascade forest, is used as the classifier. The proposed method is capable of constructing a model from raw data and of adjusting model complexity according to the scale of the dataset, especially for small-scale datasets, and it is robust, with few hyper-parameters and without over-fitting. We found that the proposed model is superior to other CPI prediction models and performs well on the challenging dataset we constructed. Finally, we predicted 2 new inhibitors of cluster of differentiation 47 (CD47), which has few known inhibitors. The IC50 values of the enzyme activities of these 2 new small-molecule inhibitors targeting the CD47-SIRPα interaction are 3.57 and 4.79 μM, respectively. These results demonstrate the competence of this concise but efficient tool for CPI prediction.

