QSAR studies of BBR analogues against coxsackievirus B1
Bulletin of the National Research Centre volume 46, Article number: 14 (2022)
Coxsackievirus group B (CVBs) are common enteroviruses associated with several diseases from etiologically to inflammatory cardiomyopathies and constitute a severe cause of mortality in newborn resulting in severe meningitis, fulminant infection, myocarditis, and encephalitis. While Berberian (BBR) is an effective antivirus and possesses potentials of suppressing CVB replication, Zeng et al. explored a structural modification of BBR by incorporating a substituted primary amine enhance antiviral potency and safety. Based on data set from Zeng et al., we attempted to propose a QSAR model that can predict the bioactivity of unknown compounds as anti-CVB1.
Among many descriptors, four were selected using the Genetic Functional Approximation (GFA). Internal and external validation was carried out on data set using statistical parameters. The QSAR model was seen to meet the minimum requirement with Lack of fit = 0.068744, R2 0.897, Adjusted R2 = 0.8627, cross-validated R2 = 0.76169, R2 predicted = 0.68.
The predictive ability of the model was found to be satisfactory and could be used for designing a similar group of compounds.
Coxsackievirus group B (CVB) is a member of the genus Enterovirus of Picornaviridae family responsible for many heart, liver, pleura, and pancreas infections (Harb et al. 2021). CVBs are an important cause of mortality in newborn and cause severe meningitis, fulminant infection, myocarditis, and encephalitis (Kaplan et al. 1983). Group B coxsackievirus has six serotypes (1–6). CVB infection can be inhibited by some CVB inhibitors to some degrees. Although type 3 of CVBs is mostly responsible for chronic myocarditis and dilated cardiomyopathy, there is no effective drug for treating or method for diagnosing CVB3 infections.
Berberian is a quaternary ammonium salt of an isoquinoline alkaloid and could be extracted from many plants (Gaba et al. 2021). Berberian (BBR) is often used in many Chinese herbal medicines for diarrhoea treatment (Gaba et al. 2021). BBR has been reported to an efficient antiviral against several viruses (Tillhon et al. 2012). BBR has been identified to suppress replication of CVB3 by suppressing c-Jun N-terminal kinase (JNK) and p38 MAPK activation (Dai et al. 2017).
In a recent study by Zeng, 28 BBR analogues were prepared by total or semi-synthesized routine while the activity of BBRs against all CVB serotypes explored by introducing various substituted amine hydrochlorides and tert-butyl carbamate on BBR core (Zeng et al. 2020). Structural modifications were performed to determine structural-activity relationship (SAR) of BBR against CVBs (Zeng et al. 2020). A combination of the half-maximal inhibitory concentration (IC50) and selectivity index (SI) was used to evaluate the potency against CVBs strains. The results showed that introducing the substituted amine on position 3 of BBR core yields higher anti-CVB (Zeng et al. 2020).
QSAR modelling introduced by Corwin Hansch is popularly used in chemistry, toxicology, and pharmaceutical chemistry (Cherkasov et al. 2014). Based on the data set from Zeng et al. (Zeng et al. 2020), we proposed a QSAR model to predict the bioactivity of unknown compounds using Genetic Function Approximation Multi-Linear Regression (GFA-MLR) Technique.
The structural modification carried out by Zeng et al. suggests that compounds with potent anti-coxsackievirus properties could be synthesized via Ligand-based drug design (LBDD). The overall objective of current study is to propose a QSAR model that can predict the bioactivity of BBR analogues against Coxsackie virus B1. The outline followed in developing QSAR model starts with selecting appropriate descriptors for the given structures of BBR analogues, building models to predict activity and validating proposed model (Fig. 1).
Twenty-eight BBR analogues were prepared by introducing each amine hydrochlorides and tert-butyl carbamate on four positions (2, 3, 9 or 10) on the BBR cores (Table 1) to improve the bioavailability (Zeng et al. 2020). Out of the 28 compounds prepared, 24 compounds were randomly selected (which excludes four compounds: 4, 19, 26 and 27) and used for QSAR study. Bioactivities of selected compounds were obtained by taking a negative logarithm of the IC50 (Table 1). The structures of the selected 24 BBR analogues in Table 1 were drawn with ChemDraw 20.0 (Ikwu et al. 2020). Sparta v14 was employed to optimize BBR structures via Density Functional Theory (DFT), while Becke Three Lee Yang Parr (B3LYP) correlation and 6-31G* basis set were adopted (Scalmani and Frisch 2010; Lee et al. 1988). Optimized structures were processed through PaDel descriptor software to generate the 1D, 2D and 3D molecular descriptors.
Optimized molecular descriptors were pre-treated using Drug Theoretical and Cheminformatics Laboratory (DTC Lab) to decrease collinearity and data redundance (Oyeneyin et al. 2021). Kennard and Stone’s algorithm (Kennard and Stone 1969) was used to split the data set into 30% and 70% test and training set, respectively. Training data set was then subjected to Genetic Function Approximation (GFA) through Material Studio 2017 software to generate and internally validate four models. Multi-Linear Regression (MLR) was used to predict relationship between the bioactivity of the compounds (dependent variable) and their molecular descriptors (independent variable).
Built QSAR model was statistically analysed to validate its stability, reproducibility, reliability, and robustness. External validation of proposed model with test data set utilized equations provided in Ikwu et al. 2020. QSAR model predictive ability was then examined through variety of statistical approaches. Variance Inflation Factor (VIF) and Mean effect (ME) analysis clarified the dependability of the molecular descriptors on each other and highlight the descriptor with the highest effect. The applicability domain (AD) technique employed in this study is Leverage method. The technique helps to understand influential and outlier descriptors.
Four Molecular descriptors used in this study include averaged and centred moreau-broto autocorrelation of lag 7 weighted by ionization potential (AATSC7i), Maximum E-State descriptors of strength for potential Hydrogen Bonds of path length 10 (maxHBint10), Charge weighted partial positive surface area * total molecular surface area/1000 (WPSA-3), and Radial Distribution Function-110/ weighted by Sanderson electronegativity (RDF110e).
The relationship between the descriptors was analysed using Pearson correlation (Table 2). In addition to a pairwise correlation between four molecular descriptors examined in Table 2, the Variance Inflation Factor (VIF) was deployed to probe the severity of multicollinearity in regression analysis (Table 3).
To examine the relative importance as well as the contribution of each of the four molecular descriptors considered in the model, the value of the mean effect (ME) was calculated for each descriptor. Its sign indicates the direction in the values of the activities resulting from either an increase or decrease in descriptor values.
Deploying Genetic Function Approximation Multi-Linear Regression (GFA-MLR) technique yielded four model equations for BBR activity (pIC50) shown in Eqs. 1–4, while their respective validation parameters are presented in Table 3.
The respective structure for BBR compounds was presented with their experimental bioactivity, predicted bioactivity and the residual in Table 5. William’s plot was used to show the relationship between the leverages and standardized residual and visualize the applicability domain (Fig. 3). Warning limit denoted as h* was used to identify compounds that are influential or outliers. Compounds with leverage value higher than the warning limit reduce confidence of the model and may be considered outliers or influential. The plot of standardized residual versus experimental activity (training and test set) is presented in Fig. 3.
The four molecular descriptors that contributed to model development comprise of two 2D and two 3D descriptors (Table 2). The relationship between the descriptors was analysed using Pearson correlation (Table 2). A large-to-medium positive correlation (r > 0.5) exists between AATSC7i, maxHBint10, and WPSA-3 (Table 2). Largest positive correlation (r = ~ 0.82) was recorded between AATSC7i, and maxHBint10 which are 2D descriptors. However, correlation results indicate negative correlation between RDF110e and other molecular descriptors. The weakest negative correlation (r = ~ 0.34) was displayed between the 3D descriptors; RDF110e and WPSA-3. Overall, 3D descriptors could affect efficiency of developed QSAR model used in predicting the BBR analogues independently, while the 2D descriptors mostly depend on each other for effectiveness.
VIF above 4 could indicate existence of multicollinearity which may require further investigation (Marquaridt 1970). However, VIF higher than 10 is considered as an indicator for a significant multicollinearity requiring correction (Marquaridt 1970). While most molecular descriptors (AATSC7i, WPSA-3, and RDF110e) showed no multicollinearity (VIF < 4), maxHBint10 displayed some level of collinearity (VIF = 5.83). Overall, across four molecular descriptors considered for modelling, VIF measured was within a range of 1.83 to 5.83 indicating absence of any significant multicollinearity (VIF < 10).
Negative ME values for AATSC7i, maxHBint10, and RDF110e indicate that activity is inverse relationship with activity (Table 3). In contrast to other descriptors, positive ME value for WPSA-3 (ME = ~ 0.78) signifies a direct relationship with activity (Table 3). The significance of the descriptors is indicated by the magnitude of ME values as WPSA-3 > RDF110e > maxHBint10 > AATSC7i. Results show that WPSA-3 and RDF110e which are 3D descriptors contribute more significantly to the built model than the other two 2D descriptors—maxHBint10 and AATSC7i (Table 3). Findings reveal that 3D descriptors played the major role in building the model.
The critical significance of regression (SOR) F-value ~ 3.306 was applicable across all four model equations while considering a confidence interval (COI) of 95%. All four models show SOR F-values larger than the critical SOR F-value (Table 4). The best of the models was selected due to its ability to meet the minimum recommended parameter required for a QSAR model (Veerasamy et al. 2011). Across all four model equations generated, Eq. 1 records the largest adjusted R-squared value ~ 0.863 along with the smallest minimum experiment error (~ 0.09) for non-significant lack of fit (LOF).
The correlation coefficient of the experimental bioactivity and the predicted activity R2 is 0.8474 based on train and test set combined (Fig. 2). This implies that there is a strong correlation between the experimental activity and the predicted activity from the model. Based on test set, model yields a predicted correlation (R2pred) of 0.68 which is greater than the minimum recommended value (R2pred > 0.6) for a successful QSAR model (Veerasamy et al. 2011).
Results show that three compounds from test set yielded leverage higher than the warning limit of 0.88 (Fig. 3). The three compounds 12, 15 and 23 (Table 1) found outside the warning limit are considered influential as they might possess fewer of the major chemical descriptors present in other compounds within the overall data set.
Standardized residual versus experimental activity for test and train data set (Fig. 4) reveals that residual values of the bioactivities of the compounds used for the study are evenly distributed between y-axis (not totally positive or negative values) which makes the model free from bias or systematic error. Standardized residual recorded for this model falls within ± 3 which is a common threshold for accepting predictions (Jaworska et al. 2005).
While BBR presents an alternative pathway to suppressing CVB replication, a structural modification of BBR can further enhance the antiviral potency and safety of BBR. In a bid to enhance potency and safety of BBR in CVB treatment, Zeng et al. deployed a strategy that involved an incorporation of a primary amine hydrochlorides on position 3 of BBR core. Promising results from Zeng et al. provide data set from which activity of formulated BBR could be predicted using QSAR.
A QSAR study of 24 BBR analogues was performed based on the theoretical molecular descriptors calculated by the PaDel descriptor software and selected by GFA. Developed model passed through an internal and external validation to indicate a robust and satisfactory model, while four major descriptors were considered as structural features responsible for activity.
Our findings show that the activity of the studied compounds mainly depends mostly on 3D descriptors with level of significance as WPSA-3 > RDF110e > maxHBint10 > AATSC7i. The QSAR model developed in current study can predict the activity of new compounds and help to design new compounds with improved activity.
Availability of data and materials
Coxsackievirus group B
Genetic function approximation
Quantitative structure activity relationship
Cherkasov A, Muratov EN, Fourches D, Varnek A, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R, Consonni V, Kuzmin VE, Cramer R, Benigni R, Yang C, Rathman J, Terfloth L, Gasteiger J, Richard A, Tropsha A (2014) QSAR modeling: Where have you been? Where are you going to? J Med Chem 57(12):4977–5010. https://doi.org/10.1021/jm4004285
Dai Q, Di Zhang HY, Xie W, Xin R, Wang L, Xu X, He X, Xiong J, Sheng H, Le Zhang KZ, Hu X (2017) Berberine restricts coxsackievirus B type 3 replication via inhibition of c-Jun N-terminal kinase (JNK) and p38 MAPK activation in vitro. Med Sci Monit 23:1448–1455
Gaba S, Saini A, Singh G, Monga V (2021) An insight into the medicinal attributes of berberine derivatives: a review. Bioorg Med Chem 38:116143. https://doi.org/10.1016/j.bmc.2021.116143
Harb J, Mennesson N, Lepetit C, Fourny M, Louvois M, Bosseboeuf A, Allain-Maillet S, Decaux O, Moreau C, Tallet A, Piver E, Moreau P, Salle V, Bigot-Corbel E, Hermouet S (2021) Comparison of monoclonal gammopathies linked to poliovirus or coxsackievirus vs. other infectious pathogens. Cells. https://doi.org/10.3390/cells10020438
Ikwu FA, Isyaku Y, Obadawo BS, Lawal HA, Ajibowu SA (2020a) In silico design and molecular docking study of CDK2 inhibitors with potent cytotoxic activity against HCT116 colorectal cancer cell line. J Genet Eng Biotechnol 18(1):51. https://doi.org/10.1186/s43141-020-00066-2
Jaworska J, Nikolova-Jeliazkova N, Aldenberg T (2005) QSAR applicability domain estimation by projection of the training set in descriptor space: a review. Altern Lab Anim 33(5):445–459
Kaplan MH, Wayne Klein S, McPhee J, Harper RG (1983) Group B coxsackievirus infections in infants younger than three months of age: a serious childhood illness. Rev Infect Dis 5(6):1019–1032. https://doi.org/10.1093/clinids/5.6.1019
Kennard RW, Stone LA (1969) Computer aided design of experiments. Technometrics 11(1):137–148. https://doi.org/10.1080/00401706.1969.10490666
Lee C, Yang W, Parr RG (1988) Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys Rev B Condens Matter 37(2):785–789. https://doi.org/10.1103/physrevb.37.785
Marquaridt DW (1970) Generalized inverses, ridge regression, biased linear estimation, and nonlinear estimation. Technometrics 12(3):591–612. https://doi.org/10.1080/00401706.1970.10488699
Oyeneyin OE, Obadawo BS, Olanrewaju AA, Owolabi TO, Gbadamosi FA, Ipinloju N, Modamori HO (2021) Predicting the bioactivity of 2-alkoxycarbonylallyl esters as potential antiproliferative agents against pancreatic cancer (MiaPaCa-2) cell lines: GFA-based QSAR and ELM-based models with molecular docking. J Genet Eng Biotechnol 19(1):38. https://doi.org/10.1186/s43141-021-00133-2
Scalmani G, Frisch MJ (2010) Continuous surface charge polarizable continuum models of solvation. I. General formalism. J Chem Phys 132(11):114110. https://doi.org/10.1063/1.3359469
Tillhon M, Guaman Ortiz LM, Lombardi P, Scovassi AI (2012) Berberine: new perspectives for old remedies. Biochem Pharmacol 84(10):1260–1267. https://doi.org/10.1016/j.bcp.2012.07.018
Veerasamy R, Rajak H, Jain A, Sivadasan S, Varghese CP, Agrawal RK (2011) Validation of QSAR models-strategies and importance. Int J Drug Des Discov 2(3):511–519
Zeng QX, Wang HQ, Wei W, Guo TT, Yu L, Wang YX, Li YH, Song DQ (2020) Synthesis and biological evaluation of berberine derivatives as a new class of broad-spectrum antiviral agents against Coxsackievirus B. Bioorg Chem 95:103490. https://doi.org/10.1016/j.bioorg.2019.103490
Ethics approval and consent to participate
Consent for publication
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
See Table 5.
About this article
Cite this article
Obadawo, B.S., Asogwa, U. & Ali, A.A. QSAR studies of BBR analogues against coxsackievirus B1. Bull Natl Res Cent 46, 14 (2022). https://doi.org/10.1186/s42269-022-00698-z