Skip to main content

QSAR studies of BBR analogues against coxsackievirus B1



Coxsackievirus group B (CVBs) are common enteroviruses associated with several diseases from etiologically to inflammatory cardiomyopathies and constitute a severe cause of mortality in newborn resulting in severe meningitis, fulminant infection, myocarditis, and encephalitis. While Berberian (BBR) is an effective antivirus and possesses potentials of suppressing CVB replication, Zeng et al. explored a structural modification of BBR by incorporating a substituted primary amine enhance antiviral potency and safety. Based on data set from Zeng et al., we attempted to propose a QSAR model that can predict the bioactivity of unknown compounds as anti-CVB1.


Among many descriptors, four were selected using the Genetic Functional Approximation (GFA). Internal and external validation was carried out on data set using statistical parameters. The QSAR model was seen to meet the minimum requirement with Lack of fit = 0.068744, R2 0.897, Adjusted R2 = 0.8627, cross-validated R2 = 0.76169, R2 predicted = 0.68.


The predictive ability of the model was found to be satisfactory and could be used for designing a similar group of compounds.


Coxsackievirus group B (CVB) is a member of the genus Enterovirus of Picornaviridae family responsible for many heart, liver, pleura, and pancreas infections (Harb et al. 2021). CVBs are an important cause of mortality in newborn and cause severe meningitis, fulminant infection, myocarditis, and encephalitis (Kaplan et al. 1983). Group B coxsackievirus has six serotypes (1–6). CVB infection can be inhibited by some CVB inhibitors to some degrees. Although type 3 of CVBs is mostly responsible for chronic myocarditis and dilated cardiomyopathy, there is no effective drug for treating or method for diagnosing CVB3 infections.

Berberian is a quaternary ammonium salt of an isoquinoline alkaloid and could be extracted from many plants (Gaba et al. 2021). Berberian (BBR) is often used in many Chinese herbal medicines for diarrhoea treatment (Gaba et al. 2021). BBR has been reported to an efficient antiviral against several viruses (Tillhon et al. 2012). BBR has been identified to suppress replication of CVB3 by suppressing c-Jun N-terminal kinase (JNK) and p38 MAPK activation (Dai et al. 2017).

In a recent study by Zeng, 28 BBR analogues were prepared by total or semi-synthesized routine while the activity of BBRs against all CVB serotypes explored by introducing various substituted amine hydrochlorides and tert-butyl carbamate on BBR core (Zeng et al. 2020). Structural modifications were performed to determine structural-activity relationship (SAR) of BBR against CVBs (Zeng et al. 2020). A combination of the half-maximal inhibitory concentration (IC50) and selectivity index (SI) was used to evaluate the potency against CVBs strains. The results showed that introducing the substituted amine on position 3 of BBR core yields higher anti-CVB (Zeng et al. 2020).

QSAR modelling introduced by Corwin Hansch is popularly used in chemistry, toxicology, and pharmaceutical chemistry (Cherkasov et al. 2014). Based on the data set from Zeng et al. (Zeng et al. 2020), we proposed a QSAR model to predict the bioactivity of unknown compounds using Genetic Function Approximation Multi-Linear Regression (GFA-MLR) Technique.


The structural modification carried out by Zeng et al. suggests that compounds with potent anti-coxsackievirus properties could be synthesized via Ligand-based drug design (LBDD). The overall objective of current study is to propose a QSAR model that can predict the bioactivity of BBR analogues against Coxsackie virus B1. The outline followed in developing QSAR model starts with selecting appropriate descriptors for the given structures of BBR analogues, building models to predict activity and validating proposed model (Fig. 1).

Fig. 1
figure 1

Strategy of investigation

Twenty-eight BBR analogues were prepared by introducing each amine hydrochlorides and tert-butyl carbamate on four positions (2, 3, 9 or 10) on the BBR cores (Table 1) to improve the bioavailability (Zeng et al. 2020). Out of the 28 compounds prepared, 24 compounds were randomly selected (which excludes four compounds: 4, 19, 26 and 27) and used for QSAR study. Bioactivities of selected compounds were obtained by taking a negative logarithm of the IC50 (Table 1). The structures of the selected 24 BBR analogues in Table 1 were drawn with ChemDraw 20.0 (Ikwu et al. 2020). Sparta v14 was employed to optimize BBR structures via Density Functional Theory (DFT), while Becke Three Lee Yang Parr (B3LYP) correlation and 6-31G* basis set were adopted (Scalmani and Frisch 2010; Lee et al. 1988). Optimized structures were processed through PaDel descriptor software to generate the 1D, 2D and 3D molecular descriptors.

Table 1 Four positions considered for attachment of Amine hydrochlorides to BBR core and experimental bioactivity (Zeng et al. Jan 2020)

Optimized molecular descriptors were pre-treated using Drug Theoretical and Cheminformatics Laboratory (DTC Lab) to decrease collinearity and data redundance (Oyeneyin et al. 2021). Kennard and Stone’s algorithm (Kennard and Stone 1969) was used to split the data set into 30% and 70% test and training set, respectively. Training data set was then subjected to Genetic Function Approximation (GFA) through Material Studio 2017 software to generate and internally validate four models. Multi-Linear Regression (MLR) was used to predict relationship between the bioactivity of the compounds (dependent variable) and their molecular descriptors (independent variable).

Built QSAR model was statistically analysed to validate its stability, reproducibility, reliability, and robustness. External validation of proposed model with test data set utilized equations provided in Ikwu et al. 2020. QSAR model predictive ability was then examined through variety of statistical approaches. Variance Inflation Factor (VIF) and Mean effect (ME) analysis clarified the dependability of the molecular descriptors on each other and highlight the descriptor with the highest effect. The applicability domain (AD) technique employed in this study is Leverage method. The technique helps to understand influential and outlier descriptors.


Four Molecular descriptors used in this study include averaged and centred moreau-broto autocorrelation of lag 7 weighted by ionization potential (AATSC7i), Maximum E-State descriptors of strength for potential Hydrogen Bonds of path length 10 (maxHBint10), Charge weighted partial positive surface area * total molecular surface area/1000 (WPSA-3), and Radial Distribution Function-110/ weighted by Sanderson electronegativity (RDF110e).

The relationship between the descriptors was analysed using Pearson correlation (Table 2). In addition to a pairwise correlation between four molecular descriptors examined in Table 2, the Variance Inflation Factor (VIF) was deployed to probe the severity of multicollinearity in regression analysis (Table 3).

Table 2 Pearson Correlation of four molecular descriptors used in developing model
Table 3 Variance inflation factor and mean effect of the descriptors

To examine the relative importance as well as the contribution of each of the four molecular descriptors considered in the model, the value of the mean effect (ME) was calculated for each descriptor. Its sign indicates the direction in the values of the activities resulting from either an increase or decrease in descriptor values.

Deploying Genetic Function Approximation Multi-Linear Regression (GFA-MLR) technique yielded four model equations for BBR activity (pIC50) shown in Eqs. 14, while their respective validation parameters are presented in Table 3.

$$ {\text{pIC}}_{50} = \, 3.647808630 \times {\text{AATSC}}7i - 0.715291235 \times {\text{maxHBint}}10 + 0.019923483 \times {\text{WPSA}} - 3 - 0.035633683 \times {\text{RDF}}110{\text{e}} + 5.102395699 $$
$$ {\text{pIC}}_{50} = 0.000607683 \times {\text{ATSC}}7v - 0.518018878 \times {\text{maxHBint}}10 + 0.013005616 \times {\text{WPSA}} - 3 - 0.049090117 \times {\text{RDF}}110e + 5.561767598 $$
$$ {\text{pIC}}_{50} = \, 0.000563608 \times {\text{ATSC}}7v{ + }1.757351294 \times {\text{AATSC}}7i - 0.375415775 \times {\text{maxHBint}}10 - 0.043722625 \times {\text{RDF}}110e \, + \, 5.996304272 $$
$$ {\text{pIC}}_{50} = 0.000079054 \times {\text{ATSC}}7v + 1.146021773 \times {\text{AATSC}}7i + 0.005907707 \times {\text{WPSA}} - 3 - 0.020847632 \times {\text{RDF}}110e + 5.174332017 $$

The respective structure for BBR compounds was presented with their experimental bioactivity, predicted bioactivity and the residual in Table 5. William’s plot was used to show the relationship between the leverages and standardized residual and visualize the applicability domain (Fig. 3). Warning limit denoted as h* was used to identify compounds that are influential or outliers. Compounds with leverage value higher than the warning limit reduce confidence of the model and may be considered outliers or influential. The plot of standardized residual versus experimental activity (training and test set) is presented in Fig. 3.


The four molecular descriptors that contributed to model development comprise of two 2D and two 3D descriptors (Table 2). The relationship between the descriptors was analysed using Pearson correlation (Table 2). A large-to-medium positive correlation (r > 0.5) exists between AATSC7i, maxHBint10, and WPSA-3 (Table 2). Largest positive correlation (r =  ~ 0.82) was recorded between AATSC7i, and maxHBint10 which are 2D descriptors. However, correlation results indicate negative correlation between RDF110e and other molecular descriptors. The weakest negative correlation (r =  ~ 0.34) was displayed between the 3D descriptors; RDF110e and WPSA-3. Overall, 3D descriptors could affect efficiency of developed QSAR model used in predicting the BBR analogues independently, while the 2D descriptors mostly depend on each other for effectiveness.

VIF above 4 could indicate existence of multicollinearity which may require further investigation (Marquaridt 1970). However, VIF higher than 10 is considered as an indicator for a significant multicollinearity requiring correction (Marquaridt 1970). While most molecular descriptors (AATSC7i, WPSA-3, and RDF110e) showed no multicollinearity (VIF < 4), maxHBint10 displayed some level of collinearity (VIF = 5.83). Overall, across four molecular descriptors considered for modelling, VIF measured was within a range of 1.83 to 5.83 indicating absence of any significant multicollinearity (VIF < 10).

Negative ME values for AATSC7i, maxHBint10, and RDF110e indicate that activity is inverse relationship with activity (Table 3). In contrast to other descriptors, positive ME value for WPSA-3 (ME =  ~ 0.78) signifies a direct relationship with activity (Table 3). The significance of the descriptors is indicated by the magnitude of ME values as WPSA-3 > RDF110e > maxHBint10 > AATSC7i. Results show that WPSA-3 and RDF110e which are 3D descriptors contribute more significantly to the built model than the other two 2D descriptors—maxHBint10 and AATSC7i (Table 3). Findings reveal that 3D descriptors played the major role in building the model.

The critical significance of regression (SOR) F-value ~ 3.306 was applicable across all four model equations while considering a confidence interval (COI) of 95%. All four models show SOR F-values larger than the critical SOR F-value (Table 4). The best of the models was selected due to its ability to meet the minimum recommended parameter required for a QSAR model (Veerasamy et al. 2011). Across all four model equations generated, Eq. 1 records the largest adjusted R-squared value ~ 0.863 along with the smallest minimum experiment error (~ 0.09) for non-significant lack of fit (LOF).

Table 4 Important validation parameter of genetic function approximation (GFA) technique

The correlation coefficient of the experimental bioactivity and the predicted activity R2 is 0.8474 based on train and test set combined (Fig. 2). This implies that there is a strong correlation between the experimental activity and the predicted activity from the model. Based on test set, model yields a predicted correlation (R2pred) of 0.68 which is greater than the minimum recommended value (R2pred > 0.6) for a successful QSAR model (Veerasamy et al. 2011).

Fig. 2
figure 2

Predicted versus experimental activity for test and train data set

Results show that three compounds from test set yielded leverage higher than the warning limit of 0.88 (Fig. 3). The three compounds 12, 15 and 23 (Table 1) found outside the warning limit are considered influential as they might possess fewer of the major chemical descriptors present in other compounds within the overall data set.

Fig. 3
figure 3

Standard residual versus leverages for test and train data set

Standardized residual versus experimental activity for test and train data set (Fig. 4) reveals that residual values of the bioactivities of the compounds used for the study are evenly distributed between y-axis (not totally positive or negative values) which makes the model free from bias or systematic error. Standardized residual recorded for this model falls within ± 3 which is a common threshold for accepting predictions (Jaworska et al. 2005).

Fig. 4
figure 4

Standardized residual versus experimental activity for test and train data set


While BBR presents an alternative pathway to suppressing CVB replication, a structural modification of BBR can further enhance the antiviral potency and safety of BBR. In a bid to enhance potency and safety of BBR in CVB treatment, Zeng et al. deployed a strategy that involved an incorporation of a primary amine hydrochlorides on position 3 of BBR core. Promising results from Zeng et al. provide data set from which activity of formulated BBR could be predicted using QSAR.

A QSAR study of 24 BBR analogues was performed based on the theoretical molecular descriptors calculated by the PaDel descriptor software and selected by GFA. Developed model passed through an internal and external validation to indicate a robust and satisfactory model, while four major descriptors were considered as structural features responsible for activity.

Our findings show that the activity of the studied compounds mainly depends mostly on 3D descriptors with level of significance as WPSA-3 > RDF110e > maxHBint10 > AATSC7i. The QSAR model developed in current study can predict the activity of new compounds and help to design new compounds with improved activity.

Availability of data and materials

Not applicable.



Coxsackievirus group B


Genetic function approximation


Multi-linear regression


Quantitative structure activity relationship


Download references


Not applicable.



Author information

Authors and Affiliations



BSO contributed to conceptualization, methodology, investigation, data curation. UA contributed to conceptualization, methodology, investigation, data curation, investigation, writing, review and editing. AAA contributed to conceptualization, methodology, investigation, data curation, review and editing. All authors also read and approved the final manuscript.

Corresponding author

Correspondence to Babatunde Samuel Obadawo.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



See Table 5.

Table 5 Experimental, predicted bioactivity, residual and residual standardized of BBR analogues

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Obadawo, B.S., Asogwa, U. & Ali, A.A. QSAR studies of BBR analogues against coxsackievirus B1. Bull Natl Res Cent 46, 14 (2022).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: