Table 6 Validation parameters for each model using multilinear regression (MLR)

From: Computational investigation, virtual docking simulation of 1, 2, 4-Triazole analogues and insillico design of new proposed agents against protein target (3IFZ) binding domain

S/NO Validation parameters Formula Threshold Model
Internal validation
 1 Friedman lack of fit (LOF) \( \frac{\mathrm{SEE}}{{\left(1-\frac{w+q\times j}{N}\right)}^2} \) Significantly low 0.1802
 2 R-squared \( 1-\left[\frac{\sum {\left({Y}_{\mathrm{obs}}-{Y}_{\mathrm{pred}}\right)}^2}{\sum {\left({Y}_{\mathrm{obs}}-{\overline{Y}}_{\mathrm{training}}\right)}^2}\right] \) \( {R}^2>0.6 \) 0.7759
 3 Adjusted R-squared \( \frac{\left(N-1\right){R}^2-p}{N-p-1} \) \( {R}_{\mathrm{adj}}^2>0.6 \) 0.7381
 4 Cross-validated R-squared (\( {Q}_{cv}^2 \)) \( 1-\left[\frac{\sum {\left({Y}_{\mathrm{pred}}-{Y}_{\mathrm{obs}}\right)}^2}{\sum {\left({Y}_{\mathrm{obs}}-{\overline{Y}}_{\mathrm{training}}\right)}^2}\right] \) \( {Q}^2>0.6 \) 0.6954
 5 Significant regression    Yes
 6 Significance-of-regression F value    13.42
 7 Critical SOR F value (95%) \( \frac{\sum {\left({Y}_{\mathrm{pred}}-{\overline{Y}}_{\mathrm{training}}\right)}^2}{p}/\frac{\sum {\left({Y}_{\mathrm{pred}}-{Y}_{\mathrm{obs}}\right)}^2}{N-p-1} \) F(test) > 2.09 2.7294
 8 Replicate points    0
 9 Computed observed error    0
 10 Min expt. error for non-significant LOF (95%)    0.4120
Model randomization
 11 Average of the correlation coefficient for randomized data (\( {\overline{\boldsymbol{R}}}_{\boldsymbol{r}} \))   \( {\overline{R}}_r<0.5 \) 0.3642
 12 Average of the determination coefficient for randomized data (\( {\overline{\boldsymbol{R}}}_{\boldsymbol{r}}^{\mathbf{2}} \))   \( {\overline{R}}_r^2<0.5 \) 0.1823
 13 Average of the leave-one-out cross-validated determination coefficient for randomized data (\( {\overline{\boldsymbol{Q}}}_{\boldsymbol{r}}^{\mathbf{2}} \))   \( {\overline{Q}}_r^2<0.5 \) −0.3915
 14 Coefficient for Y-randomization (\( c{R}_p^2 \)) \( {R}^2\times \left(1-\sqrt{\left|{R}^2-{\overline{R}}_r^2\right|}\ \right) \) \( c{R}_p^2>0.6 \) 0.9229
External validation
 15 \( \left|{\boldsymbol{r}}_{\mathbf{0}}^{\mathbf{2}}-{{\boldsymbol{r}}^{\prime}}_{\mathbf{0}}^{\mathbf{2}}\right| \)   < 0.3 0.1591
 16 \( \frac{{\boldsymbol{r}}^{\mathbf{2}}-{\boldsymbol{r}}_{\mathbf{0}}^{\mathbf{2}}}{{\boldsymbol{r}}^{\mathbf{2}}} \)   < 0.1 0.0023
 17 \( \frac{{\boldsymbol{r}}^{\mathbf{2}}-{{\boldsymbol{r}}^{\prime}}_{\mathbf{0}}^{\mathbf{2}}}{{\boldsymbol{r}}^{\mathbf{2}}} \)   < 0.1 0.0136
 18 \( {\boldsymbol{R}}_{\mathbf{test}}^{\mathbf{2}} \) \( {R}_{\mathrm{test}}^2=1-\frac{\sum {\left({Y}_{{\mathrm{pred}}_{\mathrm{test}}}-{Y}_{{\mathrm{obs}}_{\mathrm{test}}}\right)}^2}{\sum {\left({Y}_{{\mathrm{obs}}_{\mathrm{test}}}-{\overline{Y}}_{\mathrm{training}}\right)}^2} \) > 0.6 0.6550
  1. SEE is the standard error of estimation, w is the total number of terms in the built model excluding the constant term, j is the number of descriptors contained in the built model, q is a user-defined factor, and N is the number of compounds in the training set. \( {Y}_{\mathrm{obs}} \), \( {\overline{Y}}_{\mathrm{training}} \), and \( {Y}_{\mathrm{pred}} \) are the observed activity, the mean observed activity of the training compounds, and the predicted activity, respectively. \( {r}^2 \) is the squared correlation coefficient of the plot of observed activity against predicted activity, \( {r}_0^2 \) is the squared correlation coefficient of the same plot at zero intercept, and \( {{r}^{\prime}}_0^2 \) is the squared correlation coefficient of the plot of predicted activity against observed activity at zero intercept (Adeniji et al. 2020a; Roy et al. 2011; Adeniji et al. 2020d).
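
As a minimal sketch of how the fit statistics in Table 6 can be computed from observed and predicted activities, the Python fragment below evaluates R-squared, adjusted R-squared, and R-squared for a test set. Several things here are assumptions and not taken from the study: the activity values are hypothetical toy numbers, the function names are illustrative, p is assumed to be the number of descriptors in the model, and adjusted R-squared is written in its standard textbook form ((N - 1)R^2 - p)/(N - p - 1).

```python
def r_squared(y_obs, y_pred, y_mean_training):
    """Return 1 - SS_res / SS_tot, with SS_tot taken about the training-set mean."""
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y_obs, y_pred))
    ss_tot = sum((obs - y_mean_training) ** 2 for obs in y_obs)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Penalise R-squared for model size: n compounds, p descriptors (assumed meaning of p)."""
    return ((n - 1) * r2 - p) / (n - p - 1)


def passes_threshold(value, threshold=0.6):
    """Check a statistic against the '> 0.6' threshold quoted in Table 6."""
    return value > threshold


# Hypothetical training-set activities (illustration only, not study data).
y_obs_train = [5.0, 6.0, 7.0, 8.0]
y_pred_train = [5.5, 5.5, 7.5, 7.5]
y_mean_train = sum(y_obs_train) / len(y_obs_train)

r2 = r_squared(y_obs_train, y_pred_train, y_mean_train)
r2_adj = adjusted_r_squared(r2, n=len(y_obs_train), p=1)

# Hypothetical test-set activities; the reference mean is still the training mean.
y_obs_test = [5.5, 7.5]
y_pred_test = [5.0, 7.0]
r2_test = r_squared(y_obs_test, y_pred_test, y_mean_train)

for name, value in [("R2", r2), ("R2_adj", r2_adj), ("R2_test", r2_test)]:
    print(f"{name} = {value:.4f}, passes > 0.6: {passes_threshold(value)}")
```

The same `r_squared` function serves for both the training-set and the test-set statistic because both are referenced to the mean observed activity of the training compounds; only the set of compounds being summed over changes.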