
Chemo-informatics activity prediction, ligand based drug design, Molecular docking and pharmacokinetics studies of some series of 4, 6-diaryl-2-pyrimidinamine derivatives as anti-cancer agents



Breast cancer is the most commonly identified cause of cancer death among women. Several drugs approved by the Food and Drug Administration (FDA) for the treatment of breast cancer may have adverse health effects. This research aims to develop a QSAR model and use it to predict the inhibitory activities of newly designed compounds, to examine their ADMET and drug-likeness properties, and to carry out molecular docking studies between the designed compounds and the VEGFR-2 receptor in order to identify the essential amino acid residues involved in protein–ligand interactions and the possible mechanism of action of the designed compounds.


The first model was selected as the best because of its statistical fitness, with the following assessment parameters: R2train = 0.832, R2adj = 0.79, R2ext = 0.62, Q2 = 0.68, and LOF = 0.14509. Compound 11 was selected as the template for designing new potent compounds because of its low residual and high pIC50 value. The majority of the designed compounds have predicted pIC50 values greater than those of the lead compound and the standard drug (Sunitinib) used as reference. Molecular docking results revealed that the designed compounds have higher docking scores than the template and the reference drug (Sunitinib) and bind to the VEGFR-2 receptor in a manner similar to the reference drug. Pharmacokinetic and ADMET predictions revealed that the designed compounds passed the drug-likeness criteria because they did not violate more than one of the criteria of Lipinski's rule of five; they are uniformly distributed to the brain and are predicted to penetrate the central nervous system; and they were all found to be non-toxic and orally bioavailable.


The developed model was therefore found to be efficient in predicting the pIC50 of anti-breast cancer compounds that are yet to be synthesized, and it also helps to reduce the cost and duration of synthesizing the compounds. The results of this research indicate that the designed compounds may be developed as novel VEGFR-2 inhibitors.


Cancer is one of the leading causes of death worldwide. The mortality rate from the various kinds of cancer continues to rise globally, with an estimated 12 million deaths in 2030 (Solomon et al. 2009). Cancer cells differ from their normal counterparts in a variety of biochemical processes, particularly in cell division and growth control. One attribute of most cancer cells that distinguishes them from normal cells is their high proliferative index. As a result, targeting proliferative pathways so as to cause cell death through apoptosis is regarded as an effective way of fighting this disease (Chandrappa et al. 2009). The most commonly diagnosed cancer in women is breast cancer. Populations prone to breast cancer share common characteristics, which include advanced age, low parity, delayed age at first delivery, short period of breastfeeding, overeating, limited exercise and so on. Breast cancer is a heterogeneous disease and many subtypes have been defined (Liu et al. 2018). Estrogen receptor α (ERα), a member of the nuclear receptor family, has been recognized as the prime internal cause of the disease; it is involved in at least 70% of breast cancer patients, who are identified as ER positive (ER+) (Liu et al. 2018). Tamoxifen and raloxifene, which are selective estrogen receptor modulators (SERMs) that compete with estradiol for binding to ERα in breast tissue, are mostly used for the treatment of ER+ breast cancer (Wang et al. 2009). However, owing to their effects in other tissues, many SERMs may have serious side effects, such as endometrial cancer (Leeuwen et al. 1994). Moreover, approximately 50% of patients with ER-positive tumors either do not respond initially or develop resistance to these medications within the first five (5) years of treatment (Clarke et al. 2015).
Human umbilical vein endothelial cells (HUVECs) are cells derived from the endothelium of veins of the umbilical cord. They serve as a standard model for research on the regulation of endothelial cell function and are suitable for evaluating anti-angiogenic effects through anti-proliferative assays. Vascular endothelial growth factor receptor-2 (VEGFR-2), found in HUVECs, is regarded as playing a significant role in the angiogenesis pathway that takes part in the transformation, progression, and invasion of breast cancer cells (Hicklin and Ellis 2005). Many studies have suggested that the receptor plays a key role in tamoxifen resistance through the Ras/MAPK pathway (Huang et al. 2008). A low dose of brivanib alaninate, a VEGFR-2 inhibitor, in combination with tamoxifen was reported to enhance therapeutic efficacy and to slow the growth of selective estrogen receptor modulator (SERM)-resistant cancer (Patel et al. 2010). The 2-pyrimidinamine scaffold is well established in medicinal chemistry, and many compounds containing it serve as anticancer agents. 4,6-Diaryl-2-pyrimidinamine derivatives show good activity in the field of medicine. The 2-pyrimidinamine scaffold is the core structure of pazopanib and JNJ-17029259, which are regarded as VEGFR-2 inhibitors (Liu et al. 2018).

Ideal anticancer drugs are expected to destroy cancer cells without causing harmful damage to normal tissues (Al-Suwaidan et al. 2016). However, existing drugs can damage or completely destroy some normal proliferating cells; hence, the global search for new, higher-quality drugs that are safe for the prevention and treatment of cancer has become necessary.

Quantitative structure–activity relationships (QSAR) are computational relations that quantitatively correlate the biological activities (response variable) of chemical compounds with their molecular structures (independent variables) (Hansch et al. 1995). The molecular descriptors consist of parameters that account for the conformational, constitutional, thermodynamic, steric, and electronic properties of a molecule (Umar et al. 2019).

The approach is mostly employed to determine the properties of new chemical species prior to their synthesis (Abdulfatai et al. 2017).

Furthermore, the QSAR strategy narrows down the wide range of synthesizable compounds by helping to identify the most promising candidates, thereby reducing the time and cost of drug production (Ibrahim et al. 2020).

The main purpose of this research is to develop a QSAR model from compounds obtained from the literature and use the model to predict the inhibitory activities of compounds prior to their synthesis; to design new compounds and predict their activities using the model; to perform molecular docking studies of the designed compounds with the VEGFR-2 receptor kinase; and to examine the absorption, distribution, metabolism and excretion (ADME) and drug-likeness properties of the designed compounds.


Data collection

A library of thirty (30) 4,6-diaryl-2-pyrimidinamine analogues as anticancer agents, with their inhibitory concentrations (IC50) measured in µM, was obtained from the work of Liu et al. (2018). The IC50 values of all the compounds were normalized by taking the negative logarithm to base 10 using Eq. (1) (Ibrahim et al. 2020).

$${\mathrm{pIC}}_{50} = - \log_{10} \left( {{\mathrm{IC}}_{50} \times 10^{ - 6} } \right).$$
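
The conversion in Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the function name `pic50_from_ic50_um` is hypothetical, and the input is assumed to be an IC50 expressed in µM, as stated above.

```python
import math


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in micromolar."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # The factor 1e-6 converts micromolar to molar, as in Eq. (1).
    return -math.log10(ic50_um * 1e-6)


# An IC50 of 1 µM corresponds to a pIC50 of 6.
print(round(pic50_from_ic50_um(1.0), 2))  # 6.0
```

A tenfold decrease in IC50 raises pIC50 by one unit, which is why the logarithmic scale is convenient for regression.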

Geometry optimization

ChemDraw 12.0 software was employed to sketch the 2-dimensional structures of all 30 molecules in the data set. Prior to energy minimization, the 2-dimensional structures of the molecules were automatically converted to 3-dimensional structures using the Spartan 14 software program. Energy minimization was done to reduce strain in the structures prior to finding the most stable conformation of the molecules on the potential energy surface (Ibrahim et al. 2020).

Density functional theory (DFT) quantum mechanical calculations using the Becke three-parameter Lee–Yang–Parr (B3LYP) functional and the 6-31G* basis set in the Spartan 14.0 software were used for geometry optimization of all thirty (30) compounds. Geometry optimization was conducted in order to locate the most stable structure of each studied molecule at the global minimum on the potential energy surface (Ibrahim et al. 2020). A new folder was created and the fully optimized structures were saved in structure-data file (SDF) format. In order to compute thermodynamic, autocorrelation, topological, electronic, constitutional, and geometric descriptors, the lowest-energy 3D structures in SDF format were then imported into the PaDEL-Descriptor software and the calculation was conducted (Umar et al. 2019).

Data pretreatment and division

The calculated descriptors of all 30 compounds in the data set were pretreated using the Data Pre-treatment 1.2 software and then manually to remove constant and redundant molecular descriptors. Data partitioning software was then employed to divide the pretreated data set into modelling (training) and validation (test) sets. The model was developed using the training set, while the test set was used to validate the selected model. In this work, 22 compounds were used as the training set and the remaining 8 molecules as the test set. This partitioning ensures that a similar principle can be employed to predict the activity of the validation set. The Kennard–Stone algorithm was used to partition the data into the modelling and validation sets (Kennard and Stone 1969; Rajer-Kanduč et al. 2003).
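
The Kennard–Stone selection can be sketched as follows. This is a minimal illustration, not the data partitioning software used in this work: the function name `kennard_stone` is hypothetical, and Euclidean distance on the raw descriptor matrix is an assumption (descriptors are often scaled first).

```python
import numpy as np


def kennard_stone(X, n_train):
    """Split row indices of X into (train, test) with the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of samples")
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_train:
        # Distance from each remaining sample to its nearest selected sample.
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        # Pick the sample whose nearest selected neighbour is farthest away.
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining


# Hypothetical descriptor matrix: 30 compounds, 4 descriptors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 4))
train_idx, test_idx = kennard_stone(X_demo, 22)
print(len(train_idx), len(test_idx))  # 22 8
```

Because the selected samples are spread as evenly as possible over descriptor space, the remaining (test) compounds tend to lie inside the region covered by the training set.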

Model development

The genetic function algorithm (GFA) approach in Materials Studio 8.0 was employed to build the models, with the actual pIC50 values as the response (dependent) variable and the molecular descriptors as the independent variables. The length of the regression equation was 4, and the population and generation were set to 1000 and 1000, respectively. The number of top equations returned was 4. The mutation probability was 0.1, and the smoothing parameter was 0.5. The models were scored primarily on the basis of Friedman's lack of fit (LOF) (Khaled 2011).

A notable feature of GFA is that, instead of creating a single model, it can create a large number of models. The GFA algorithm selects the most relevant descriptors genetically and develops better models than those developed using stepwise regression methods. The models were scored using the LOF, which was measured using a slight variation of the original Friedman formula, so that the best fitness score can be obtained (Abdulfatai et al. 2017). The lack of fit is estimated using the following formula:

$${\mathrm{LOF}} = {\mathrm{SSE}}/\left( {1 - \frac{c + dp}{M}} \right)^{2}$$

where SSE denotes the sum of squares of errors, c is the number of terms in the selected model apart from the constant term, d is a user-defined smoothing parameter, p is the total number of molecular descriptors contained in all model terms (excluding the constant term) and M is the number of samples in the modelling set. For a model to be robust, the sum of squares of errors must be small. Equation (3) below is used to compute the SSE:

$${\mathrm{SSE}} = \frac{{\left( {Y_{{{\mathrm{exp}}}} - Y_{{{\mathrm{pred}}}} } \right)}}{{\sqrt {M - P - 1} }}$$

where Yexp and Ypred are the actual and computed pIC50 values of the modelling set samples, M is the number of samples in the model building data set and P is the number of independent variables present in the generated model (Troyer 2001).
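
The LOF score of Eq. (2) can be illustrated with a short Python sketch. It assumes that SSE is the plain sum of squared residuals over the modelling set, following the definition given in the text; the scaling applied inside the modelling software may differ, and the function name `friedman_lof` is hypothetical.

```python
import numpy as np


def friedman_lof(y_exp, y_pred, c, p, d=0.5):
    """Friedman lack-of-fit: SSE / (1 - (c + d*p)/M)**2 (Eq. 2).

    Assumption: SSE is the plain sum of squared residuals.
    c: number of model terms (without the constant), p: number of
    descriptors in those terms, d: smoothing parameter.
    """
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    m = y_exp.size  # number of samples in the modelling set
    sse = float(np.sum((y_exp - y_pred) ** 2))
    penalty = 1.0 - (c + d * p) / m
    if penalty <= 0:
        raise ValueError("model is too complex for the number of samples")
    return sse / penalty ** 2


# Hypothetical activities for four compounds and a one-descriptor model.
lof_demo = friedman_lof([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], c=1, p=1)
print(round(lof_demo, 3))
```

The denominator shrinks as terms are added, so a larger model must reduce the error substantially to obtain a lower LOF; this is what discourages over-fitting during the genetic search.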

Model validation

Internal and external validation parameters were used to assess the reliability and predictive capability of the developed QSAR models (Table 1).

Table 1 Actual and computed pIC50 values of 4,6-diaryl-2-pyrimidinamine series against HUVEC cancer cell line

Internal and external validation

For the quantitative assessment of the developed QSAR model, the internal and external validation parameters were compared with the minimum recommended values (Veerasamy et al. 2011), as shown in Table 2. The most commonly used internal validation parameter is the squared correlation coefficient (R2); for a good regression equation, the value of R2 should be close to unity. It is calculated using Eq. (4) below:

$$R^{2} = \, 1 - \frac{{\sum \left( {Y_{{{\mathrm{exp}}}} - Y_{{{\mathrm{pred}}}} } \right)^{2} }}{{\sum \left( {Y_{{{\mathrm{exp}}}} - Y_{{{\mathrm{train}}}} } \right)^{2} }}$$

where Yexp, Ypred and Ytrain denote the actual, computed and mean actual biological activities of the modelling set samples, respectively (Abdulfatai et al. 2017).

Table 2 Standard minimum proposed value used to assess a quantifiable QSAR model

Adjusted R2 (R2adj): the value of R2 increases as the number of independent variables increases; thus, R2 alone is not a sufficient criterion for the quality of model fit. Hence, R2 is adjusted for the number of explanatory variables in the model (Umar et al. 2019). The adjusted R2 is defined in Eq. (5):

$$R^{2}_{{{\mathrm{adj}}}} = 1 - (1 - R^{2} )\frac{N - 1}{{N - P - 1}} = \frac{{(N - 1)R^{2} - P}}{N - P - 1}$$

where P is the number of independent variables in the model and N is the number of compounds in the model-building data set (Abdulfatai et al. 2017). The recommended value for this parameter is presented in Table 2. Another important internal validation parameter is the cross-validation coefficient (\({Q}_{\mathrm{cv}}^{2}\)), which is computed using Eq. (6).

$$Q_{{{\mathrm{cv}}}}^{2} = 1 - \frac{{\sum (Y_{{{\mathrm{pred}}}} - Y_{{{\mathrm{exp}}}} )^{2} }}{{\sum (Y_{{{\mathrm{exp}}}} - Y_{{{\mathrm{mtrain}}}} )^{2} }}$$

where Ypred, Yexp and Ymtrain are the predicted, actual and mean experimental activities of the modelling set samples, respectively. Because high values of internal statistical parameters alone are not sufficient to establish the predictive ability of a model, the strategies described by Golbraikh and Tropsha (2002) and Roy et al. (2015) were used to verify the predictive capability of the new QSAR model. The squared correlation coefficient of the test set, R2test, was calculated using Eq. (7):

$$R^{{2}}_{{{\mathrm{test}}}} = 1 - \frac{{\sum (Y_{{{\mathrm{exp}}}} - Y_{{{\mathrm{pred}}}} )^{2} }}{{\sum (Y_{{{\mathrm{exp}}}} - Y_{{{\mathrm{mtrain}}}} )^{2} }}$$

where \({Y}_{\mathrm{pred}}\) and \({Y}_{\mathrm{exp}}\) are the predicted and experimental activities of the test set compounds, respectively, and \({Y}_{\mathrm{mtrain}}\) is the mean biological activity of the training set compounds.
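
The validation parameters of Eqs. (4) to (7) can be sketched with NumPy as shown below. This is an illustrative implementation under stated assumptions: the model is taken to be an ordinary least-squares regression with an intercept, the cross-validation is leave-one-out, and all function names are hypothetical.

```python
import numpy as np


def r2_score_ref(y_exp, y_pred, ref_mean):
    """1 - sum((y_exp - y_pred)^2) / sum((y_exp - ref_mean)^2).

    With the training data and training mean this is R2 (Eq. 4); with
    the test data and the training mean it is R2test (Eq. 7).
    """
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.sum((y_exp - y_pred) ** 2) / np.sum((y_exp - ref_mean) ** 2)


def r2_adjusted(r2, n, p):
    """Adjusted R2 for n samples and p independent variables (Eq. 5)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def q2_loo(X, y):
    """Leave-one-out Q2 (Eq. 6), assuming an OLS model with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # leave sample i out
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef  # predict the left-out sample
    return r2_score_ref(y, preds, y.mean())


# Adjusted R2 for R2 = 0.832, 22 training compounds and 4 descriptors.
print(round(r2_adjusted(0.832, 22, 4), 2))  # 0.79
```

Q2 is always computed from predictions for compounds that were left out of the fit, so a Q2 much lower than R2 is a sign of over-fitting.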

Molecular docking studies of the designed compounds against VEGFR-2 receptors

Molecular docking studies of the template, the designed compounds, and the reference drug (Sunitinib) against the VEGFR-2 receptor were carried out to identify the key amino acid residues responsible for the protein–ligand interactions and the probable mechanism of action of the designed molecules. The geometrically optimized 3D structures of all the ligands were saved in Protein Data Bank (PDB) format. The 3D structure of the VEGFR-2 receptor kinase (pdb code: 4agd) co-crystallized with the Sunitinib ligand was downloaded from the Protein Data Bank and prepared using Molegro Virtual Docker (MVD) software by removing the water molecules and the co-crystallized ligand contained in the X-ray structure prior to the docking process. The template, all the newly designed molecules and Sunitinib were docked into the active site of the VEGFR-2 receptor kinase using Molegro Virtual Docker 6.0. The docking simulation was run a minimum of 50 times for 5 poses, and the best poses were determined based on scoring functions such as the MolDock and rerank scores (Jaworska et al. 2005). Discovery Studio (DS) Visualizer version 3.5 was employed to visualize the various intermolecular interactions, such as H-bond, hydrophobic, and aryl interactions.

ADME properties and drug likeness prediction of some selected designed compounds

pkCSM and SwissADME are accessible online web tools designed to analyze the ADMET and drug-likeness properties of small molecules (Daina and Michielin 2017). They are used to identify novel drug candidates, to reduce the number of experimental studies and to raise the success rate. ADMET and drug-likeness predictions for some selected designed compounds as anti-breast cancer agents were conducted using these web tools. One of the most important criteria at the pre-clinical stage of drug discovery is Lipinski's rule of five, which proposes that for a chemical compound to be permeable or readily absorbed into the body it should not violate more than one of these criteria (molecular weight < 500, number of hydrogen bond donors < 5, number of hydrogen bond acceptors < 10, calculated log P < 5 and polar surface area (PSA) < 140 Å2) (Ismail et al. 2018).
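
The drug-likeness check described above can be sketched as a simple violation counter. The thresholds are taken as written in this section (including the PSA criterion), the acceptance rule of at most one violation follows the criterion reported for the designed compounds, and the class name, function names and example property values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MoleculeProperties:
    mol_weight: float   # molecular weight
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors
    log_p: float        # calculated log P
    psa: float          # polar surface area in square angstroms


def rule_violations(m: MoleculeProperties) -> list:
    """Return the names of the criteria that the molecule violates."""
    checks = {
        "molecular weight < 500": m.mol_weight < 500,
        "H-bond donors < 5": m.h_donors < 5,
        "H-bond acceptors < 10": m.h_acceptors < 10,
        "log P < 5": m.log_p < 5,
        "PSA < 140": m.psa < 140,
    }
    return [name for name, passed in checks.items() if not passed]


def is_drug_like(m: MoleculeProperties, max_violations: int = 1) -> bool:
    """Accept a molecule with no more than max_violations violations."""
    return len(rule_violations(m)) <= max_violations


# Hypothetical property values, for illustration only.
candidate = MoleculeProperties(450.5, 2, 7, 3.4, 95.0)
print(rule_violations(candidate), is_drug_like(candidate))
```

In practice the property values would come from a web tool such as SwissADME; the sketch only shows how the pass or fail decision follows from them.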


To assess the robustness and statistical significance of the developed model, the Kennard–Stone algorithm was employed to partition the data set into modelling and validation sets. Four models were generated using the genetic function algorithm (GFA), among which model 1 was selected because of its statistical fitness.

Model 1

$$\begin{aligned} {\mathbf{pIC}}_{{{\mathbf{50}}}} & = - 5.898694472 \times {\mathbf{AATS2s}} - 7.592394612 \times {\mathbf{AATSC5s}} \\ & \quad - 14.694225303 \times {\mathbf{MATS3e}} - 5.537580107 \times {\mathbf{SpMax8}}\_{\mathbf{Bhi}} \\ & \quad + 38.59916. \\ \end{aligned}$$
$$\begin{aligned} N_{{{\mathrm{train}}}} & = 22,\quad R^{2}_{{{\mathrm{train}}}} = \, 0.832,\quad R^{2}_{{{\mathrm{adj}}}} = \, 0.79,\quad R^{2}_{{{\mathrm{ext}}}} = \, 0.62, \\ Q^{2} & = 0.68,\quad {\mathrm{LOF}} = 0.14509,\quad N_{{{\mathrm{test}}}} = \, 8. \\ \end{aligned}$$
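
Model 1 is a linear equation, so a prediction is a weighted sum of the four descriptor values plus the intercept. The sketch below uses the coefficients reported above; the function name `predict_pic50` and the descriptor values in the example are hypothetical and do not correspond to any compound in the data set.

```python
# Coefficients of model 1 as reported in the equation above.
MODEL_1_COEFFICIENTS = {
    "AATS2s": -5.898694472,
    "AATSC5s": -7.592394612,
    "MATS3e": -14.694225303,
    "SpMax8_Bhi": -5.537580107,
}
MODEL_1_INTERCEPT = 38.59916


def predict_pic50(descriptors: dict) -> float:
    """Predict pIC50 from a dict holding the four model 1 descriptors."""
    missing = set(MODEL_1_COEFFICIENTS) - set(descriptors)
    if missing:
        raise KeyError(f"missing descriptors: {sorted(missing)}")
    return MODEL_1_INTERCEPT + sum(
        coef * descriptors[name] for name, coef in MODEL_1_COEFFICIENTS.items()
    )


# Hypothetical descriptor values, for illustration only.
example = {"AATS2s": 2.1, "AATSC5s": -0.05, "MATS3e": 0.02, "SpMax8_Bhi": 3.4}
print(round(predict_pic50(example), 3))
```

Descriptor values for a new compound would be computed with the same software and settings used for the training set before being passed to such a function.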


Four models were developed, among which model 1 was chosen to predict the inhibitory activity of the molecules because of its statistical quality: it has the best squared correlation coefficient (R2) of 0.83, an adjusted squared correlation coefficient (R2adj) of 0.79, a leave-one-out (LOO) cross-validation coefficient (Q2) of 0.68 and an external validation coefficient (R2ext) of 0.62. With reference to Table 2 above, the internal and external validation parameters of model 1 satisfy the minimum criteria for a credible and robust QSAR model.
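
As a quick arithmetic check (a worked illustration, not an additional result), substituting the reported R2 = 0.832, N = 22 training compounds and P = 4 descriptors into the definition of the adjusted R2 gives

$$R^{2}_{{{\mathrm{adj}}}} = 1 - (1 - 0.832)\frac{22 - 1}{22 - 4 - 1} = 1 - 0.168 \times \frac{21}{17} \approx 0.79$$

which agrees with the reported R2adj of 0.79.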

The selected model was employed to compute the anti-proliferative activity of the test set, and the results are presented in Table 1. The closeness of the squared correlation coefficient (R2) to unity indicates that the model explains a high proportion of the variation in the response variable (pIC50), sufficient for a robust QSAR model. Its value of 0.832 indicates that 83.2% of the variation in pIC50 is explained by the model, leaving 16.8% in the residuals, which implies that the model is promising. The high adjusted R2 (R2adj) of the model and its closeness to R2 imply that the model has good explanatory power with respect to the independent variables (descriptors) in it and reflect the actual influence of the descriptors on pIC50. Also, the high value of Q2cv and its closeness to the internal R2 show that the model is not over-fitted. The high R2test of the model indicates that it is able to deliver reliable predictions for newly designed molecules.

Figure 1 is a plot of the predicted pIC50 for the modelling and validation sets against the experimental pIC50 values for the inhibition of the HUVEC cell line. Additionally, the residuals of the modelling and validation sets were plotted against the experimental pIC50 values, as presented in Fig. 2. The computed pIC50 values agree well with the experimental values, as seen in the table and figures; the model therefore shows no proportional or systematic error, since the residuals are scattered randomly on either side of zero (Table 3).

Fig. 1

Predicted pIC50 versus experimental pIC50 values of the modelling and validation sets

Fig. 2

Plot of residuals versus actual pIC50 values for modelling and validation sets

Table 3 List of independent variables (descriptors), their descriptions and class for model 1

Variance inflation factor

The degree of inter-correlation between descriptors is detected using the variance inflation factor (VIF). For a model to be acceptable, VIF values must lie between 1 and 10. A value equal to 1 indicates that there is no inter-correlation among the descriptors, values from 1 to 5 indicate that the model is acceptable, and a value that exceeds 10 suggests that the model is not acceptable. VIF is computed using Eq. (8) below:

$${\mathrm{VIF}} = \frac{1}{{1 - R^{2} }}$$

where R2 is the squared multiple correlation coefficient obtained by regressing a given descriptor on the other descriptors in the chosen model (Ibrahim et al. 2020). The VIF values of the four molecular descriptors that appear in the selected model are given in Table 4. Since all the descriptors have VIF values less than 10, it can be inferred from the table that the generated model is statistically significant and that the descriptors are reasonably orthogonal (Myers 1990).
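
A VIF calculation can be sketched with NumPy as follows. The sketch assumes the usual definition, in which R2 for a descriptor is obtained by regressing that descriptor on the remaining descriptors with an intercept; the function name `variance_inflation_factors` and the example matrix are hypothetical.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of the descriptor matrix X (n x k)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress descriptor j on all other descriptors plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residual = target - others @ coef
        r2 = 1.0 - residual @ residual / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Two uncorrelated (orthogonal) descriptors give VIF values of 1.
X_orth = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
print(variance_inflation_factors(X_orth))
```

A descriptor that is almost a linear combination of the others has R2 close to 1 and therefore a very large VIF, which is why values above 10 are treated as a warning.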

Table 4 Pearson correlation matrix, VIF and ME of the selected model

Mean effect

The role and contribution of each descriptor in the generated model is assessed using its mean effect value. The mean effect provides key information on the impact of the independent variables (descriptors) on the generated model; the sign and magnitude of a descriptor's mean effect indicate its strength in determining the activity of a molecule (Arthur et al. 2018).

Equation (9) below is used to calculate the mean effect of each descriptor:

$${\mathrm{MF}}_{j} = \frac{{\beta_{j} \mathop \sum \nolimits_{i = 1}^{n} d_{ij} }}{{\mathop \sum \nolimits_{j}^{m} \beta_{j} \mathop \sum \nolimits_{i}^{n} d_{ij} }}$$

where MFj is the mean effect of descriptor j in the model, βj is the coefficient of descriptor j in the model, dij is the value of descriptor j for sample i in the data matrix of the modelling set, m is the number of independent variables that appear in the model and n is the number of samples in the modelling set (Adedirin et al. 2018). Short definitions of the descriptors are given in Table 3, and the corresponding mean effect values are given in Table 4.
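
The mean effect calculation amounts to weighting each column sum of the descriptor matrix by its coefficient and normalizing by the total. A minimal sketch is given below; the function name `mean_effects` and the example values are hypothetical and are not the values of Table 4.

```python
import numpy as np


def mean_effects(coefficients, D):
    """Mean effect of each descriptor (Eq. 9).

    coefficients: length-m array of model coefficients (no intercept).
    D: n x m matrix of descriptor values for the modelling set.
    """
    coefficients = np.asarray(coefficients, dtype=float)
    D = np.asarray(D, dtype=float)
    # Numerator: coefficient_j times the sum of descriptor j over all samples.
    contributions = coefficients * D.sum(axis=0)
    # Denominator: the same quantity summed over all descriptors.
    return contributions / contributions.sum()


# Hypothetical two-descriptor model and three training samples.
me_demo = mean_effects([2.0, -1.0], [[1.0, 0.5], [2.0, 1.0], [3.0, 1.5]])
print(me_demo, me_demo.sum())
```

By construction the mean effects sum to one, so each value can be read as the relative share of a descriptor in the model, with the sign showing the direction of its influence.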

From Table 4, the most important descriptor is AATS2s because it has the highest mean effect value, which suggests that it has a marked impact on the pIC50 values of the molecules. The descriptors were ranked according to their contribution to the pIC50 of the compounds, in the following decreasing order:

$${\mathrm{AATS2s }} > {\mathrm{ SpMax8}}\_{\mathrm{Bhi }} > {\mathrm{MATS3e}} > {\mathrm{AATSC5s}}.$$

The AATS2s descriptor, defined as the average Broto–Moreau autocorrelation of lag 2 weighted by I-state, is an autocorrelation descriptor. According to this descriptor, the atomic masses and electronic distribution of the atoms that make up the molecule have a strong impact on the anticancer activity of the set of molecules. Its largest positive mean effect value indicates that an increase in the value of this descriptor will increase the compound's antiproliferative activity against the HUVEC cell line. The descriptor SpMax8_Bhi also has a positive sign, as indicated in Table 4, which suggests that the anti-breast cancer activity of a molecule varies directly with its value. The other descriptors, MATS3e and AATSC5s, have negative mean effect values, which indicates that the activity of the molecules varies inversely with the values of these descriptors.

Applicability domain

The applicability domain (AD) is one of the approaches used to examine whether a data set contains influential or outlying molecules. A QSAR model is deemed acceptable if it gives reliable predictions of the inhibitory activities of both modelling and validation set samples that lie within its applicability domain (Golbraikh and Tropsha 2002). The leverage approach is among the techniques used to evaluate the AD, and it is expressed in Eq. (10).

$$h_{i} = x_{i} \left( { \, X^{T} X} \right)^{ - 1} x_{i}^{T}$$

The terms xi, X, and XT represent the descriptor row vector of compound i, the n × k descriptor matrix of the modelling set and the transpose of the matrix X used in the model development, respectively. The warning leverage (h*) is the cut-off value of the leverage, given by the equation below:

$$h^{*} = \frac{{3\left( {p + 1} \right)}}{N}$$

where p is the number of independent variables used in developing the model and N is the number of samples used in developing the model.
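
The leverage and warning leverage can be computed as in the sketch below. It assumes that X is the n × k descriptor matrix of the modelling set without an intercept column, as described above; the function names and the random example matrix are hypothetical.

```python
import numpy as np


def leverages(X_train, X_query=None):
    """Leverage h_i = x_i (X^T X)^-1 x_i^T for each row of X_query.

    If X_query is omitted, the leverages of the training compounds
    themselves are returned.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = X_train if X_query is None else np.asarray(X_query, dtype=float)
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form x_i (X^T X)^-1 x_i^T.
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)


def warning_leverage(p, n):
    """Cut-off leverage h* = 3(p + 1)/N."""
    return 3.0 * (p + 1) / n


# For 4 descriptors and 22 training compounds the cut-off is 15/22.
print(round(warning_leverage(4, 22), 3))  # 0.682

# Hypothetical descriptor matrix, for illustration only.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(22, 4))
h_demo = leverages(X_demo)
print(int(np.sum(h_demo > warning_leverage(4, 22))))
```

A compound whose leverage exceeds h* lies far from the centre of the training descriptors, so its prediction is an extrapolation and should be treated with caution.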

A plot of the standardized residuals versus the leverage values (h) is called the Williams plot. This plot is used to analyze the defined applicability domain (AD). A compound whose leverage exceeds the cut-off value has a strong influence on the performance of the model and may be excluded, but if its standardized residual is small it is not necessarily an outlier. Additionally, the cut-off for accepting the prediction of a molecule is a standardized residual within the range of − 3 to + 3, because points that lie within ± 3 standardized residuals of the mean cover ninety-nine percent (99%) of normally distributed data (Jaworska et al. 2005).

Figure 3 shows the Williams plot of the developed model. The cut-off leverage for the selected model was 0.682. Four compounds (22, 1, 23 and 27) from the test set were found to have leverage values greater than the cut-off value (i.e. hi > 0.682); they were therefore identified as structural outliers.

Fig. 3

The Williams Plot of the selected model

Ligand based drug design

Information derived from the model enabled us to design five new potent compounds. Two molecular descriptors, AATS2s and SpMax8_Bhi, were taken as the principal descriptors for the design because they have notable mean effect values compared with the other descriptors. Compound 11 was selected as the template for the design because of its low residual and high pIC50 value and because it lies within the defined applicability domain. Figure 4 shows the structures of the template compound (11) and the standard compound used for our ligand-based design. The compound was modified by adding and exchanging several substituents at the lettered positions (i.e. X, Y and R1) so that experimental synthesis of the new active molecules will be feasible. Table 5 shows the structures of the newly designed molecules and their predicted pIC50 values; from the table it is observed that the majority of the designed molecules have greater predicted activities than the lead molecule (11) modified for this design and the reference drug (Sunitinib). Hence, a simple QSAR model can provide an opportunity to predict and identify molecules with satisfactory activity and to pinpoint structurally modified compounds that lie beyond the defined applicability domain. Lastly, the outcomes of this study show how robust and dependable the selected QSAR model is and indicate that, together with in silico screening techniques, the selected QSAR model is able to identify new potent molecules as synthetic targets for drug development.

Fig. 4

A: Structure of the lead compound (11) used for the design. B: structure of the template used for the design

Table 5 2D structures and predicted activities of the designed 4,6-diaryl-2-pyrimidinamine derivatives

Molecular docking studies of the designed compounds

The potential of the designed compounds to interact with the VEGFR-2 receptor is expressed in terms of the MolDock score and the rerank score, which were adopted as the parameters for evaluating the docking results. The docking results show that the compounds docked at the binding site of the receptor with favorable MolDock and rerank scores compared with Sunitinib. The 3D structures of the template and the VEGFR-2 receptor (pdb id: 4agd) are shown in Figs. 5 and 6, respectively. Additionally, the docking results and the interactions of the template, the designed compounds and Sunitinib with the VEGFR-2 receptor kinase are presented in Table 6.

Fig. 5

3D Structure of stable conformation of compound 11 (Template)

Fig. 6

3D structure of VEGFR-2 (pdb id: 4agd) receptor

Table 6 Docking Results and various interactions of the template, the designed compounds and Sunitinib

The template compound is bound to the receptor via three conventional hydrogen bonds: between the hydrogen atom of the hydroxyl group and PHE845 and LEU1049, and between the hydrogen atom of the nitrogen attached to the carbonyl group and ARG1027. There are four carbon–hydrogen bonds: carbonyl oxygen with LEU1067 and PRO1068, morpholine ring oxygen with MET1072 and morpholine ring hydrogen with ARG1027. One pi-anion electrostatic interaction occurs between the phenyl ring and ASP1028. Two weak alkyl interactions are also observed between the morpholine ring and MET1072 and PRO1068. Other weak pi-alkyl interactions are also observed (ALA844, LEU1049, ILE1053, ALA1065, ARG1027, and PRO1068). These interactions account for the high binding scores between the ligand and the VEGFR-2 receptor. The 3D binding mode of the template with the VEGFR-2 receptor is shown in Fig. 7.

Fig. 7

3D structure of the template interactions with VEGFR-2 receptor

Designed compound 1 has MolDock and rerank scores of − 161.031 and − 70.669, respectively. It is bound to the VEGFR-2 binding pocket via two conventional H-bonds, two alkyl interactions, one electrostatic interaction and six pi-alkyl interactions. The oxygen atom of the morpholine ring forms one hydrogen bond with SER1090, and the other is formed between the hydrogen of the OH group attached to the phenyl ring and LEU1409. An electrostatic interaction exists between the phenyl ring and ASP1028. Weak interactions, namely alkyl interactions with PRO1068 and MET1072 and six pi-alkyl interactions with ALA844, LEU1049, ILE1053, ALA1065, ARG1027 and PRO1068, exist between the ligand and the receptor; these interactions account for its reasonable binding score. Figure 8 shows the 3D binding interactions of designed compound 1 with the VEGFR-2 receptor.

Fig. 8

3D structure of the designed compound 1 interactions with VEGFR-2 receptor

The 3D binding interactions of designed compound 2 are shown in Fig. 9. It has MolDock and rerank scores of − 144.005 and − 97.0301, respectively. It is bound to the receptor through two conventional H-bonds, nine carbon–hydrogen bonds, one electrostatic interaction and eight pi-alkyl interactions. The two conventional H-bonds are formed between the H-atom of the hydroxyl group attached to the benzene ring and ASP1052, and between the nitrogen atom attached to the carbonyl group and SER877. Two C–H bonds are formed between the nitrogen atom and the phenyl ring of the pyrimidine and LEU802; others involve the hydrogen of the morpholine group and that of the nitrogen attached to the carbonyl group with ARG1051, SER877, SER803, ASP1046 and GLY1027, the amino hydrogen, and the alkoxy oxygen with ARG1027. The interactions between the ligand and the receptor residues account for the high rerank score of the ligand.

Fig. 9

3D structure of the designed compound 2 interactions with VEGFR-2 receptor

The 3D interactions of compound 3 with the VEGFR-2 receptor are shown in Fig. 10. It has a MolDock score of −173.02 and a rerank score of −86.399, and forms two conventional H-bonds, four carbon-hydrogen bonds, one electrostatic interaction, five alkyl interactions and six pi-alkyl interactions. The hydrogen atoms of the nitrogen attached to the carbonyl group and of the hydroxyl group form conventional H-bonds with ARG1027 and LEU1049. Carbon-hydrogen bonds are formed between the carbonyl oxygen and PRO1068, between ASP1028 and a morpholine hydrogen, and between ARG1027 and both the morpholine and the hydrogen attached to the alpha carbon of the carbonyl group. An electrostatic interaction occurs between ALA1065 and the delocalized pi electrons of the benzene ring. The other weak interactions occur with PRO1068, MET1072, ILE1053, LEU1067, TYR1054, ALA844, LEU1049, ILE1053, ALA1065, and PRO106.

Fig. 10

3D structure of the designed compound 3 interactions with VEGFR-2 receptor

Compound 4, with a rerank score of −52.7342, is bound to the receptor through 4 conventional H-bonds, 6 carbon-hydrogen bonds, 1 electrostatic interaction and 8 pi-alkyl interactions. Conventional hydrogen bonds are formed between the oxygen atom of the OH group and SER1090, between the nitrogen attached to the carbonyl group and GLY1048 and ARG1051, and between the nitrogen attached to the pyrimidine and ASP1028. Carbon-hydrogen bonds are formed between the oxygen atom of the hydroxyl group and ALA1065, a hydrogen atom of the morpholine group and ASP1052, the hydrogen of the alpha carbon attached to the carbonyl and LEU1049, ARG1051 and both the morpholine and the alpha carbon hydrogen, and SER1086 and the methoxy hydrogen. Other weak interactions include one pi-sulfur interaction (MET1072), four alkyl interactions (ALA844, PRO1068, ARG1027 and MET1072), and ten pi-alkyl interactions (PHE845, TYR1054, TRP1071, TYR1082, ALA1065, LEU1067, PRO1068, ARG1027, ARG1051 and ILE1053). The 3D interactions of designed compound 4 with the VEGFR-2 receptor are depicted in Fig. 11.

Fig. 11

3D structure of the designed compound 4 interactions with VEGFR-2 receptor

Compound 5, with a rerank score of −84.8435, is bound to the receptor through two conventional H-bonds, five carbon-hydrogen bonds, one electrostatic interaction, four alkyl interactions and six pi-alkyl interactions. The two conventional hydrogen bonds are formed between the nitrogen atom attached to the carbonyl group and ARG1027, and between the hydrogen atom of the OH group and LEU1049. Two C-H bonds occur between ARG1027 and the hydrogen atoms of the morpholine and the alpha carbon; further C-H bonds link the carbonyl oxygen with PRO1068 and the methoxy oxygen with ASP1052. There is one electrostatic interaction with ASP1023 and four weak alkyl interactions with PRO1068, MET1072, ILE1053 and LEU1067. Lastly, there are weak pi-alkyl interactions with ALA844, LEU1049, ILE1053, ALA1065, ARG1027 and PRO1068. The 3D interactions of designed compound 5 with the VEGFR-2 receptor are shown in Fig. 12.

Fig. 12

3D structure of the designed compound 5 interactions with VEGFR-2 receptor

The interactions of Sunitinib with the VEGFR-2 receptor are presented in Fig. 13. The reference drug (Sunitinib), which is also the co-crystallized ligand in the receptor, was re-docked into the active site of the 4AGD target receptor to verify the accuracy of the docking procedure and to test whether the designed inhibitors fit well in the active site of the target. It was found to have MolDock and rerank scores of −134.939 and −5.23604, respectively, and to interact with the following amino acid residues in the active site of the target receptor: SER803, GLY1048, ARG1027, PRO1068, ILE1053, LEU1067, PRO1068, MET1072, TYR1059, TRP1071, ALA844, ILE1053 and ARG1051. Vascular endothelial growth factor receptor-2 (VEGFR-2) is a well-validated target for breast cancer treatment, and many researchers have used it as a target in breast cancer therapy, including Liu et al. (2018), Luo et al. (2018) and Tahia et al. (2019). Virtually all of these amino acid residues are common to the designed compounds, the template and the reference drug (Sunitinib).

Fig. 13

3D structure of the Sunitinib interactions with VEGFR-2 receptor
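The rerank scores quoted in the docking paragraphs above can be compared side by side with a few lines of Python. This is only an illustrative sketch: the scores are copied from the text, the function names are hypothetical, and it assumes, as the comparison in the text implies, that a more negative score corresponds to stronger predicted binding.

```python
# Rerank scores as reported in the docking results above.
# Assumption: more negative = stronger predicted binding.
rerank_scores = {
    "Designed compound 1": -70.669,
    "Designed compound 2": -97.0301,
    "Designed compound 3": -86.399,
    "Designed compound 4": -52.7342,
    "Designed compound 5": -84.8435,
    "Sunitinib": -5.23604,
}


def rank_by_score(scores):
    """Return (name, score) pairs sorted from most to least negative."""
    return sorted(scores.items(), key=lambda item: item[1])


def better_than(scores, reference):
    """Names of ligands whose score is more negative than the reference ligand."""
    ref_score = scores[reference]
    return [
        name
        for name, score in rank_by_score(scores)
        if name != reference and score < ref_score
    ]


ranking = rank_by_score(rerank_scores)
for name, score in ranking:
    print(f"{name:22s} {score:10.4f}")

print("Better than Sunitinib:", better_than(rerank_scores, "Sunitinib"))
```

Sorting on the raw score keeps the comparison transparent; any other criterion (for example the MolDock score) could be substituted by swapping the dictionary.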

ADMET and drug-likeness properties of the designed compounds

Tables 7 and 8 present the ADMET and drug-likeness properties of the designed molecules. According to Lipinski's rule of five, these molecules can be expected to be pharmacologically effective, because each of them violates at most one (1) of the criteria. Additionally, the designed molecules have predicted absorption values between 80.522 and 88.437%, well above the minimum acceptable value of 30%, and consequently show promising human intestinal absorption. For blood–brain barrier (BBB) permeability, Log BB values above 0.3 indicate that a compound readily crosses the barrier and values below −1 that it is poorly distributed to the brain; for central nervous system (CNS) permeability, Log PS values above −2 indicate penetration and values below −3 an inability to penetrate. The designed compounds have Log BB > −1 and Log PS > −2, which indicates that they are uniformly distributed to the brain and are assumed to permeate the central nervous system. Enzymatic metabolism describes the biotransformation of a drug in the body, so it is necessary to consider the metabolism of the designed compounds. Cytochrome P450 (CYP) is a superfamily of enzymes that plays a vital role in drug metabolism; 1A2, 2C9, 2C19, 2D6 and 3A4 are the CYP isoforms responsible for drug metabolism. The results presented in Table 7 suggest that most of the designed compounds are inhibitors of CYP2C19, 2C9 and 3A4. Total clearance is an indicator that describes the relationship between the rate of elimination of a drug and its concentration in the body. The designed compounds show high total clearance values, which are within the acceptable limit for a drug molecule in the body. Furthermore, the five designed compounds are predicted to be non-toxic. Overall, the ADMET properties of these compounds indicate good pharmacokinetic profiles.
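The screening logic described in this paragraph can be sketched in Python as follows. The sketch assumes the usual Lipinski cut-offs (molecular weight ≤ 500 Da, log P ≤ 5, at most 5 hydrogen-bond donors and at most 10 acceptors) and the Log BB and Log PS thresholds quoted above. The class, the function names and the example values are hypothetical and are not taken from Tables 7 and 8.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for the four Lipinski descriptors of one molecule."""
    mol_weight: float   # molecular weight in Da
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """A molecule passes if it violates at most `max_violations` criteria."""
    return lipinski_violations(d) <= max_violations


def brain_distribution(log_bb: float, log_ps: float) -> tuple:
    """Label BBB and CNS permeability using the thresholds quoted in the text."""
    if log_bb > 0.3:
        bbb = "readily crosses BBB"
    elif log_bb < -1:
        bbb = "poorly distributed to brain"
    else:
        bbb = "intermediate"
    if log_ps > -2:
        cns = "penetrates CNS"
    elif log_ps < -3:
        cns = "unable to penetrate CNS"
    else:
        cns = "intermediate"
    return bbb, cns


# Illustrative values only, not data from this study.
example = Descriptors(mol_weight=520.6, log_p=4.1, h_donors=2, h_acceptors=8)
print(lipinski_violations(example), passes_rule_of_five(example))
print(brain_distribution(log_bb=-0.5, log_ps=-1.8))
```

Keeping the violation count separate from the pass/fail decision makes the "at most one violation" tolerance explicit and easy to change.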

Table 7 Predicted ADMET properties of the designed compounds
Table 8 Predicted Drug-likeness properties of the designed compounds


Four models were generated, of which the first was selected as the best because of its statistical fitness, with the following assessment parameters: R2train = 0.832, R2adj = 0.79, R2ext = 0.62, Q2 = 0.68, and LOF = 0.14509, which satisfy the criteria of a standard QSAR model. New potentially active compounds on the HUVEC cell line were then designed using an in-silico screening approach, and their pIC50 values were predicted with the generated QSAR model. The computed activities of most of the designed compounds were found to be greater than that of the lead reference molecule (11) employed in the design. Additionally, molecular docking simulation was conducted to gain insight into the binding mode of the designed compounds against the VEGFR-2 receptor. Designed compounds N2, N3 and N5 (rerank scores −97.0301, −86.3997 and −84.8435) were found to have better docking scores than the template (compound 11, rerank score −70.669) and the reference drug (Sunitinib, rerank score −5.23604). The high docking scores of the designed compounds are attributed to the substituents introduced into the structure of the template during the design.
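As an illustration of what satisfying "the criteria of a standard QSAR model" can mean in practice, the sketch below checks the reported statistics against thresholds that are commonly cited in the QSAR validation literature (R2 > 0.6, Q2 > 0.5, R2ext > 0.6 and R2 − Q2 ≤ 0.3). These thresholds are an assumption here, since this section does not list them, and the function name is hypothetical.

```python
# Statistics reported for the selected model in the text above.
reported = {"r2_train": 0.832, "r2_adj": 0.79, "r2_ext": 0.62, "q2": 0.68}


def check_qsar_model(stats):
    """Compare model statistics with commonly cited acceptance thresholds.

    The thresholds are assumptions for illustration, not values stated
    in this section.
    """
    return {
        "R2train > 0.6": stats["r2_train"] > 0.6,
        "Q2 > 0.5": stats["q2"] > 0.5,
        "R2ext > 0.6": stats["r2_ext"] > 0.6,
        "R2train - Q2 <= 0.3": stats["r2_train"] - stats["q2"] <= 0.3,
    }


results = check_qsar_model(reported)
for criterion, ok in results.items():
    print(f"{criterion:22s} {'pass' if ok else 'fail'}")
```

Under these assumed thresholds every check passes for the reported values, with the external R2 (0.62) closest to its cut-off.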

According to Lipinski's rule of five, these molecules can be expected to be pharmacologically effective, because each of them violates at most one (1) of the criteria. Consequently, the molecules are assumed to have good absorption, to be non-toxic and to be orally bioavailable. The developed model was therefore found to be efficient in predicting the pIC50 of anti-breast cancer agents that are yet to be synthesized, and it also helps to reduce the cost and duration of synthesizing the compounds. Finally, synthesis and in vivo and in vitro evaluation of these ligands are recommended in order to confirm them as novel VEGFR-2 inhibitors for breast cancer treatment.

Availability of data and material

Not applicable.



Abbreviations

QSAR: Quantitative structure activity relationship

HUVEC: Human umbilical vein endothelial cell

VEGFR-2: Vascular endothelial growth factor receptor-2

ER+: Estrogen receptor positive

DFT: Density functional theory

PaDEL: Pharmaceutical Data Exploration Laboratory

B3LYP: Becke's three-parameter Lee-Yang-Parr hybrid functional

GFA-MLR: Genetic function algorithm-multi linear regression

VIF: Variance inflation factor

ME: Mean effect

ADME: Absorption, distribution, metabolism and excretion


  1. Abdulfatai U, Uzairu A, Uba S (2017) Quantitative structure-activity relationship and molecular docking studies of a series of quinazolinonyl analogues as inhibitors of gamma amino butyric acid aminotransferase. J Adv Res 8:33–43

  2. Adedirin O, Uzairu A, Shallangwa GA, Abechi SE (2018) QSAR and molecular docking based design of some n-benzylacetamide as γ-aminobutyrate-aminotransferase inhibitors. J Eng Exact Sci 4(1):0065–0084

  3. Al-Suwaidan IA, Abdel-Aziz AA-M, Shawer TZ, Ayyad RR, Alanazi AM, El-Morsy AM, Mohamed MA, Abdel-Aziz NI, El-Sayed MA-A, El-Azab AS (2016) Synthesis, antitumor activity and molecular docking study of some novel 3-benzyl-4 (3H) quinazolinone analogues. J Enzyme Inhib Med Chem 31(1):78–89

  4. Arthur DE, Uzairu A, Mamza P, Abechi SE, Shallangwa G (2018) Insilico modelling of quantitative structure–activity relationship of pGI50 anticancer compounds on K-562 cell line. Cogent Chem 4:1432520

  5. Chandrappa S, Kavitha CV, Shahabuddin MS, Vinaya K, Ananda Kumar CS, Ranganatha SR (2009) Synthesis of 2-(5-((5-(4-chlorophenyl) furan-2-yl) methylene)-4-oxo-2 thioxothiazolidin-3-yl) acetic acid derivatives and evaluation of their cytotoxicity and induction of apoptosis in human leukemia cells. Bioorg Med Chem 17:2576–2584

  6. Clarke R, Tyson JJ, Dixon JM (2015) Endocrine resistance in breast cancer—an overview and update. Mol Cell Endocrinol 418(3):220–234

  7. Daina A, Michielin O, Zoete V (2017) SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci Rep 7:42717

  8. Golbraikh A, Tropsha A (2002) Beware of q2! J Mol Graph Model 20(4):269–276

  9. Hansch C, Leo A, Hoekman DH (1995) Exploring QSAR. Fundamentals and application in chemistry and biology. Am Chem Soc, Washington, DC

  10. Hicklin DJ, Ellis LM (2005) Role of the vascular endothelial growth factor pathway in tumor growth and angiogenesis. J Clin Oncol 23(5):1011–1027

  11. Huang D, Ding Y, Luo WM, Bender S, Qian CN, Kort E, Kristin Z (2008) Inhibition of MAPK pathways suppressed renal cell carcinoma growth and angiogenesis in vivo. Can Res 68(1):81–88

  12. Ibrahim MT, Uzairu A, Shallangwa GA, Uba S (2020) In-silico activity prediction and docking studies of some 2, 9-disubstituted 8 phenylthio/phenylsulfinyl-9h-purine derivatives as anti-proliferative agents. Heliyon 6:e03158

  13. Ismail SY, Uzairu A, Sagagi B, Sabiu M (2018) In-silico molecular docking and pharmacokinetic study of selected phytochemicals with estrogen and progesterone receptors as anticancer agent for breast cancer. JOTCSA 5(3):1337–1350

  14. Jaworska J, Nikolova-Jeliazkova N, Aldenberg T (2005) QSAR applicability domain estimation by projection of the training set descriptor space: a review. Atla-Nottingham 33:445

  15. Kennard RW, Stone LA (1969) Computer aided design of experiments. Technometrics 11(1):137–148

  16. Khaled KF (2011) Modeling corrosion inhibition of iron in acid medium by genetic function approximation method: a QSAR model. Corros Sci 53(11):3457–3465

  17. Liu L, Tang Z, Wu C, Li X, Huang A, Lu X, You Q, Xiang H (2018) Synthesis and biological evaluation of 4, 6-diaryl-2-pyrimidinamine derivatives as anti-breast cancer agents. Bioorg Med Chem Lett 28:1138–1142

  18. Luo G, Tang Z, Lao K, Li X, You Q, Xiang H (2018) Structure-activity relationships of 2, 4-disubstituted pyrimidines as dual ERα/VEGFR-2 ligands with anti-breast cancer activity. Eur J Med Chem.

  19. Myers RH (1990) Classical and modern regression application, 2nd edn. Duxbury Press, CA

  20. Patel RR, Sengupta S, Kim HR, Klein-Szanto AJ, Pyle JR, Zhu F (2010) Experimental treatment of oestrogen receptor (ER) positive breast cancer with tamoxifen and brivanib alalinate, a VEGFR-2/fgfr-1 kinase inhibitor. Eur J Cancer 46(9):1537–1553

  21. Rajer-Kanduč K, Zupan J, Majcen N (2003) Separation of data on the training and test set for modelling: a case study for modelling of five colour properties of a white pigment. Chemom Intell Lab Syst 65(2):221–229

  22. Roy K, Kar S, Ambure P (2015) On a simple approach for determining applicability domain of QSAR models. Chemom Intell Lab Syst 145:22–29

  23. Solomon VR, Hua C, Lee H (2009) Hybrid pharmacophore design and synthesis of isatin–benzothiazole analogs for their anti-breast cancer activity. Bioorg Med Chem 17:7585–7592

  24. Tahia KM, Rasha ZB, Samia AE, Al MM, Abeer EM (2019) Synthesis, anticancer effect and molecular modeling of new thiazolylpyrazolyl coumarin derivatives targeting VEGFR-2 kinase and inducing cell cycle arrest and apoptosis. Bioorg Chem 85:253–273

  25. Troyer JR (2001) The multiple discoveries of the first hormone herbicides. Weed Sci 49:290–297

  26. Umar BA, Uzairu A, Shallangwa GA, Sani U (2019) QSAR modeling for the prediction of pGI50 activity of compounds on LOX IMVI cell line and ligand-based design of potent compounds using in silico virtual screening. Netw Model Anal Health Inform Bioinform 8:22

  27. van Leeuwen FE, Benraadt J, Coebergh JW, Kiemeney LA, Gimbrère CH, Otter R, Schouten LJ, Damhuis RA, Bontenbal M, Diepenhorst FW (1994) Risk of endometrial cancer after tamoxifen treatment of breast cancer. Lancet 343:448

  28. Veerasamy R, Rajak H, Jain A, Sivadasan S, Varghese CP, Agrawal RK (2011) Validation of QSAR models-strategies and importance. Int J Drug Des Discov 2:511–519

  29. Wang T, You Q, Huang FS, Xiang H (2009) Recent advances in selective estrogen receptor modulators for breast cancer. Mini-Rev Med Chem 9(10):1191–1201



The authors sincerely acknowledge all members of the research group for their guidance and motivation at various stages of this research work, and Ahmadu Bello University for providing the software used in this analysis.


The authors did not receive any funding for this research.

Author information




All authors contributed to this work. SHA performed all the computational analysis and wrote the manuscript. AU provided all the software for this research and edited the manuscript to minimize errors before final submission. MT analyzed the results obtained from the software. AB checked the manuscript with a plagiarism checker and arranged it according to the journal format. All authors have read and approved the manuscript.

Corresponding author

Correspondence to Sagiru Hamza Abdullahi.

Ethics declarations

Ethics approval and consent to participate

Not applicable, because this article does not contain any studies with human or animal subjects.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Abdullahi, S.H., Uzairu, A., Ibrahim, M.T. et al. Chemo-informatics activity prediction, ligand based drug design, Molecular docking and pharmacokinetics studies of some series of 4, 6-diaryl-2-pyrimidinamine derivatives as anti-cancer agents. Bull Natl Res Cent 45, 167 (2021).


  • HUVEC cell line
  • DFT
  • QSAR
  • Williams plot
  • VIF
  • ME
  • ADME