
Table 6 Validation parameters for each model using multilinear regression (MLR)

From: Computational investigation, virtual docking simulation of 1, 2, 4-Triazole analogues and insillico design of new proposed agents against protein target (3IFZ) binding domain

S/NO

Validation parameters

Formula

Threshold

Model

Internal validation

 1

Friedman lack of fit (LOF)

\( \frac{\mathrm{SEE}}{{\left(1-\frac{w+q\times j}{N}\right)}^2} \)

Significantly low

0.1802

 2

R-squared

\( 1-\left[\frac{\sum {\left({Y}_{\mathrm{obs}}-{Y}_{\mathrm{pred}}\right)}^2}{\sum {\left({Y}_{\mathrm{obs}}-{\overline{Y}}_{\mathrm{training}}\right)}^2}\right] \)

\( {R}^2>0.6 \)

0.7759

 3

Adjusted R-squared

\( \frac{R^2-p\left(N-1\right)}{N-p+1} \)

\( {R}_{\mathrm{adj}}^2>0.6 \)

0.7381

 4

Cross-validated R-squared (\( {Q}_{cv}^2 \))

\( 1-\left[\frac{\sum {\left({Y}_{\mathrm{pred}}-{Y}_{\mathrm{obs}}\right)}^2}{\sum {\left({Y}_{\mathrm{obs}}-{\overline{Y}}_{\mathrm{training}}\right)}^2}\right] \)

\( {Q}^2>0.6 \)

0.6954

 5

Significant regression

  

Yes

 6

Significance-of-regression F value

  

13.42

 7

Critical SOR F value (95%)

\( \frac{\sum {\left({Y}_{\mathrm{pred}}-{Y}_{\mathrm{obs}}\right)}^2}{p}/\frac{\sum {\left({Y}_{\mathrm{pred}}-{Y}_{\mathrm{obs}}\right)}^2}{N-p-1} \)

F(test) > 2.09

2.7294

 8

Replicate points

  

0

 9

Computed observed error

  

0

 10

Min expt. error for non-significant LOF (95%)

  

0.4120

Model randomization

 11

Average of the correlation coefficient for randomized data (\( {\overline{\boldsymbol{R}}}_{\boldsymbol{r}} \))

 

\( {\overline{R}}_r<0.5 \)

0.3642

 12

Average of determination coefficient for randomized data (\( {\overline{\boldsymbol{R}}}_{\boldsymbol{r}}^{\mathbf{2}} \))

 

\( {\overline{R}}_r^2<0.5 \)

0.1823

 13

Average of leave-one-out cross-validated determination coefficient for randomized data (\( {\overline{\boldsymbol{Q}}}_{\boldsymbol{r}}^{\mathbf{2}} \))

 

\( {\overline{Q}}_r^2<0.5 \)

−0.3915

 14

Coefficient for Y-randomization (\( c{R}_p^2 \))

\( {R}^2\times \left(1-\sqrt{\left|{R}^2-{\overline{R}}_{\mathrm{r}}^2\right|}\ \right) \)

\( c{R}_p^2>0.6 \)

0.9229

External validation

 15

\( \left|{\boldsymbol{r}}_{\mathbf{0}}^{\mathbf{2}}-{{\boldsymbol{r}}^{\prime}}_{\mathbf{0}}^{\mathbf{2}}\right| \)

 

< 0.3

0.1591

 16

\( \frac{{\boldsymbol{r}}^{\mathbf{2}}-{\boldsymbol{r}}_{\mathbf{0}}^{\mathbf{2}}}{{\boldsymbol{r}}^{\mathbf{2}}} \)

 

< 0.1

0.0023

 17

\( \frac{{\boldsymbol{r}}^{\mathbf{2}}-{{\boldsymbol{r}}^{\prime}}_{\mathbf{0}}^{\mathbf{2}}}{{\boldsymbol{r}}^{\mathbf{2}}} \)

 

< 0.1

0.0136

 18

\( {\boldsymbol{R}}_{\mathbf{test}}^{\mathbf{2}} \)

\( {R}_{\mathrm{test}}^2=1-\frac{\sum {\left({Y}_{{\mathrm{pred}}_{\mathrm{test}}}-{Y}_{{\mathrm{obs}}_{\mathrm{test}}}\right)}^2}{\sum {\left({Y}_{{\mathrm{obs}}_{\mathrm{test}}}-{\overline{Y}}_{\mathrm{training}}\right)}^2} \)

> 0.6

0.6550

  1. SEE is the standard error of estimation, w is the total number of terms in the model excluding the constant term, j is the number of descriptors in the model, q is a user-defined factor, and N is the number of compounds in the training set. \( {Y}_{\mathrm{obs}} \), \( {\overline{Y}}_{\mathrm{training}} \), and \( {Y}_{\mathrm{pred}} \) are the observed activity, the mean observed activity of the training compounds, and the predicted activity, respectively. \( {r}^2 \) is the squared correlation coefficient of the plot of observed against predicted activity, \( {r}_0^2 \) is the squared correlation coefficient of the plot of observed against predicted activity at zero intercept, and \( {r^{\prime}}_0^2 \) is the squared correlation coefficient of the plot of predicted against observed activity at zero intercept (Adeniji et al. 2020a; Roy et al. 2011; Adeniji et al. 2020d).
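The R-squared-type statistics in rows 2, 4, and 18 share one form: one minus a residual sum of squares divided by a total sum of squares about the training-set mean. The sketch below illustrates that form on synthetic data; it is not the authors' workflow or software. The function names (`r_squared`, `fit_mlr`, `predict_mlr`, `q2_loo`), the random data, and the use of a plain least-squares fit are all assumptions made for illustration. The external statistic is computed with the observed test activities in the denominator, which is the usual convention and is also assumed here.

```python
import numpy as np


def r_squared(y_obs, y_pred, y_ref_mean):
    """Return 1 - SS_res / SS_tot, with SS_tot taken about y_ref_mean."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_ref_mean) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_mlr(X, y):
    """Ordinary least squares with an intercept column (hypothetical helper)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict activities from descriptors using fitted coefficients."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def q2_loo(X, y):
    """Leave-one-out cross-validated R^2: refit without each compound in turn."""
    n = len(y)
    y_cv = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        y_cv[i] = predict_mlr(coef, X[i:i + 1])[0]
    return r_squared(y, y_cv, y.mean())


# Synthetic descriptors and activities (illustrative only, not the paper's data).
rng = np.random.default_rng(0)
beta = np.array([1.5, -2.0, 0.5])
X_train = rng.normal(size=(30, 3))
y_train = X_train @ beta + 0.1 * rng.normal(size=30)
X_test = rng.normal(size=(10, 3))
y_test = X_test @ beta + 0.1 * rng.normal(size=10)

coef = fit_mlr(X_train, y_train)
y_train_mean = y_train.mean()

r2 = r_squared(y_train, predict_mlr(coef, X_train), y_train_mean)    # row 2
q2 = q2_loo(X_train, y_train)                                        # row 4
r2_test = r_squared(y_test, predict_mlr(coef, X_test), y_train_mean) # row 18

print(f"R2 = {r2:.4f}, Q2(LOO) = {q2:.4f}, R2_test = {r2_test:.4f}")
```

For a least-squares fit, the leave-one-out residuals are never smaller in magnitude than the fitted residuals, so `q2` cannot exceed `r2`. This matches the ordering of the reported values (0.6954 against 0.7759).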