Parameter | Equation | Eq. no. | Significance | Threshold value |
---|---|---|---|---|
Internal validation | ||||
Friedman lack of fit (LOF) | \(\mathrm{LOF}=\frac{\mathrm{SEE}}{{\left(1-\frac{c+d\times p}{M}\right)}^{2}}\) | (3) | Allows the best fitness score to be obtained | – |
 | \(\mathrm{SEE}=\sqrt{\frac{\sum {({Y}_{exp}-{Y}_{pred})}^{2}}{N-P-1}}\) |  |  |  |
Correlation coefficient (\(R^{2}\)) | \(R^{2} = 1 - \left[ \frac{\sum (Y_{exp} - Y_{pred})^{2}}{\sum (Y_{exp} - \bar{Y}_{training})^{2}} \right]\) | (4) | Measures the degree of fitness of the regression equation |  ≥ 0.6 |
Adjusted \(R^{2}\) | \(R_{adj}^{2} = 1 - \frac{(1 - R^{2})(n-1)}{n-p-1}\) | (5) | Ensures the model’s stability and reliability |  ≥ 0.5 |
Cross-validation regression coefficient (\(Q_{cv}^{2}\)) | \(Q_{cv}^{2} = 1 - \left[ \frac{\sum (Y_{pred} - Y_{exp})^{2}}{\sum (Y_{exp} - \bar{Y}_{training})^{2}} \right]\) | (6) | Measures the internal predictive power of the model |  ≥ 0.5 |
Coefficient of determination (\({cR}_{p}^{2}\)) of Y-randomization | \({cR}_{p}^{2}=R \times \sqrt{R^{2}-R_{r}^{2}}\) | (7) | Confirms that the QSAR model is robust and not the result of chance correlation | \({cR}_{p}^{2}\) > 0.50 |
External validation | ||||
Predicted \(R^{2}\) (\(R_{test}^{2}\)) | \(R_{test}^{2} = 1 - \frac{\sum (Y_{pred,test} - Y_{exp,test})^{2}}{\sum (Y_{exp,test} - \bar{Y}_{training})^{2}}\) | (8) | Measures the ability of the model to predict activity values of an external set of compounds |  ≥ 0.6 |
Golbraikh and Tropsha’s acceptable model criteria | \(|r_{o}^{2} - r_{o}^{\prime 2}|\) | – | Assesses the robustness and stability of the model |  < 0.3 |
 | \((r^{2}-r_{o}^{2})/r^{2}\) |  |  |  < 0.1 |
 | k |  |  | 0.85 ≤ k ≤ 1.15 |
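
The regression statistics in the table can be illustrated with a minimal Python sketch. It is not the software used in the study. It assumes the standard textbook forms \(R_{adj}^{2} = 1 - (1 - R^{2})(n-1)/(n-p-1)\) and \({cR}_{p}^{2} = R\sqrt{R^{2} - R_{r}^{2}}\), takes \(n\) as the number of training compounds and \(p\) as the number of descriptors, and treats \(R_{r}^{2}\) as the mean \(R^{2}\) of the Y-randomized models. All function names are hypothetical.

```python
import math


def r_squared(y_exp, y_pred, y_ref_mean=None):
    """Return 1 - SS_res / SS_tot.

    y_ref_mean is the reference mean used in SS_tot. It defaults to the
    mean of y_exp; pass the training-set mean to obtain R^2_test for an
    external set (assumed convention).
    """
    if y_ref_mean is None:
        y_ref_mean = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    ss_tot = sum((e - y_ref_mean) ** 2 for e in y_exp)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjust R^2 for n compounds and p descriptors (assumed meanings)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def standard_error_of_estimate(y_exp, y_pred, p):
    """SEE = sqrt(sum of squared residuals / (N - P - 1))."""
    n = len(y_exp)
    ss_res = sum((e - q) ** 2 for e, q in zip(y_exp, y_pred))
    return math.sqrt(ss_res / (n - p - 1))


def c_rp_squared(r2, r2_random_mean):
    """cR_p^2 = R * sqrt(R^2 - R_r^2).

    The difference is clipped at zero (an assumption of this sketch) so
    that a model no better than its randomized versions scores 0.
    """
    return math.sqrt(r2) * math.sqrt(max(r2 - r2_random_mean, 0.0))


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    y_exp = [1.0, 2.0, 3.0, 4.0, 5.0]
    y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
    r2 = r_squared(y_exp, y_pred)
    print("R2:", r2)
    print("Adjusted R2:", adjusted_r_squared(r2, n=len(y_exp), p=1))
    print("SEE:", standard_error_of_estimate(y_exp, y_pred, p=1))
    print("cRp2:", c_rp_squared(r2, r2_random_mean=0.10))
```

Each returned value can then be compared with the corresponding threshold in the table, for example `r_squared(...) >= 0.6`. The same `r_squared` function gives \(Q_{cv}^{2}\) when `y_pred` holds cross-validated predictions.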