Heavy metal concentrations
The numerical summary of the metal concentration data obtained from the ten monitoring sites, including the reference sites, during the August break study period is presented in Tables 2 and 3. The mean Zn concentration was 0.030 µg/m3 at the reference site ARR during the August break but was below the detection limit at the reference site ERR during the same period (Table 2), exhibiting no significant spatial variation. The mean Pb concentrations were 0.21 µg/m3 at ARR and 0.18 µg/m3 at ERR and therefore do not differ significantly.
The highest levels of Pb and Zn were 0.91 µg/m3 (AMN) and 0.050 µg/m3 (ECM), respectively. The highest Pb level at ARR and ERR is close to the approved FMEnv 24 h limit of 1 µg/m3, the NAAQS 1 h limit of 1 µg/m3, and the 1.4 µg/m3 limit of the National Environmental Standards and Regulations Enforcement Agency (NESREA), but it is high when compared with the revised EPA lead standard of 0.15 µg/m3, as shown in Fig. 1 (Obioh et al. 2005; Aloha et al. 2016; Frank et al. 2019).
This toxic level of Pb puts the vulnerable human populace, especially children, and the terrestrial ecosystem at risk. Research has shown that Pb exposure reduces the oxygen-carrying capacity of the blood and also decreases growth and reproduction among plants and animals in ecosystems (Frank et al. 2019; Benibo et al. 2020; Ghosh et al. 2021). The main sources of Pb in the area are lead–zinc mining and the heavy-duty trucks that transport goods and raw materials from one town to another. The heavy metal concentrations in SPM for the measuring period ranged from 0.023 to 0.040 µg/m3 for Zn and from 0.11 to 0.91 µg/m3 for Pb. The Pb values reached up to about six times the EPA reference value of 0.15 µg/m3 but remained slightly below the NAAQS and NESREA recommendations, as depicted in Fig. 2.
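The comparison with the reference limits can be sketched as a simple ratio of measured concentration to limit. The snippet below is a minimal illustration only: it uses the limits and the lowest and highest Pb values quoted in the text, and the names `LIMITS` and `exceedance_ratios` are hypothetical helpers introduced here, not part of the study's workflow.

```python
# Reference limits for airborne Pb in µg/m3, as quoted in the text.
LIMITS = {"EPA": 0.15, "FMEnv": 1.0, "NAAQS": 1.0, "NESREA": 1.4}


def exceedance_ratios(concentration, limits=LIMITS):
    """Return concentration / limit for each named limit (> 1 means exceeded)."""
    return {name: concentration / limit for name, limit in limits.items()}


# Lowest and highest Pb values reported for the measuring period.
for label, value in (("min Pb", 0.11), ("max Pb", 0.91)):
    ratios = exceedance_ratios(value)
    print(label, {name: round(r, 2) for name, r in ratios.items()})
```

A ratio above 1 marks an exceedance; with these inputs only the EPA limit is exceeded, and only by the highest Pb value.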
Zinc metal is neither listed as an occupational hazard nor carcinogenic; hence there is no established national ambient air quality standard for zinc, the only concern being prolonged exposure to zinc metal fumes. The graph in Fig. 4 also shows that the zinc concentration is fairly constant and unaffected by relative humidity and wind speed in Enugu, as similarly observed in Abakaliki. The mean concentrations of the metals derived from the values at each site are shown in Tables 2 and 3. In comparison, the reference sites ERR and ARR had lower mean Pb concentrations (0.21 µg/m3) than the other sites. The mean concentration of Zn particulate matter was 0.030 µg/m3 at both reference sites (ARR and ERR), as seen in Fig. 4.
The mean concentration of Pb was found to be higher than the EPA standard, by up to a factor of about six at AMN. Thus, in this study, the particulate metal concentration levels can be ranked as follows: Zn at Abakaliki (ARR < AIN < ACM < ARS < AMN) and Zn at Enugu (ERR < ERS < EIN < EMN < ECM); Pb at Abakaliki (ARS < ARR < AIN < ACM < AMN) and Pb at Enugu (ERS < ERR < EIN < EMN < ECM). The metal levels can therefore be divided into two bands: ≤ 0.025 µg/m3 and 0.025 to 0.05 µg/m3 for Zn, and ≤ 0.05 µg/m3 and 0.05 to 0.1 µg/m3 for Pb. About 50% of the zinc concentration levels were lower than 0.025 µg/m3, while the remaining 50% were between 0.025 and 0.05 µg/m3. The Pb grouping also shows that all sites exceeded 0.05 µg/m3. Thus, the relative arrangement shows no specific ordering for the metals but does show the dominance of Pb within the study sites. This result shows that Pb is the highest contributor to the particulate matter load in the environment within the measured locations (Offor et al. 2016; Aloha et al. 2017; Ojekunle et al. 2018; Ichu et al. 2021).
Pearson’s correlation
Pearson’s correlation (PC) measures the strength and direction of the linear relationship between two variables; its square gives the proportion of the total variance in the obtained data that is explained by a linear model of the variables under consideration. Pearson’s correlation ranges from −1 to 1, and higher absolute values indicate stronger linear dependency among the variables. The formula for Pearson’s correlation between two random variables x and y is given in Eq. 1 as:
$$P_{xy} = \frac{\sigma_{xy}}{\sqrt{\sigma_{x}^{2} \sigma_{y}^{2}}}$$
(1)
where \(\sigma_{x}\) is the standard deviation of the variable x, \(\sigma_{x}^{2}\) is the variance of x, \(\sigma_{y}\) is the standard deviation of the variable y, \(\sigma_{y}^{2}\) is the variance of y, and \(\sigma_{xy}\) is the covariance of the variables x and y.
When two variables are independent, Pearson’s correlation coefficient is 0, but the converse is not always true: there are instances in which the variables are dependent and Pearson’s correlation is unable to detect the dependency. This is because Pearson’s correlation recognizes only linear relationships, and it characterizes dependency fully only for variables that are jointly normally distributed. Two variables can therefore be related in a non-linear way that Pearson’s correlation fails to detect. One very useful quantity accompanying Pearson’s correlation as a measure of dependency is the p value. It gives the probability of obtaining a correlation at least as strong as the one observed if the variables were in fact uncorrelated (the null hypothesis); a large p value therefore means that the determined Pearson’s correlation is not significant, whatever the value of \(P_{xy}^{2}\). It is generally accepted that a p value ≤ 0.05 indicates a significant correlation and a p value > 0.05 indicates a correlation that is not significant. The coefficient of determination, \(P_{xy}^{2}\), gives the proportion of the variance of one variable that is explained by the other. For instance, a \(P_{xy}\) value of 0.5 corresponds to a \(P_{xy}^{2}\) value of 0.25, which means that 25% of the variance in y is explained by x (Bermudez-Edo et al. 2018; Hůnová 2020).
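As a hedged illustration of Eq. 1 and of the role of the p value, the sketch below computes the correlation coefficient, its square, and a p value for a small data set. Two points are assumptions made here and not taken from the study: the concentrations are hypothetical, and the p value is obtained with an exact permutation test (shuffling one variable) rather than the parametric t-based test that statistical packages normally report. The function names are illustrative.

```python
import math
from itertools import permutations


def pearson_r(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # The 1/n factors of covariance and variances cancel in the ratio.
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y):
    """Exact two-sided permutation p value (only feasible for small n)."""
    observed = abs(pearson_r(x, y))
    count = total = 0
    for perm in permutations(y):
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(x, perm)) >= observed - 1e-12:
            count += 1
    return count / total


# Hypothetical concentrations (µg/m3), NOT values from Tables 2 to 4.
zn = [0.023, 0.027, 0.030, 0.034, 0.040]
pb = [0.11, 0.25, 0.21, 0.48, 0.91]
r = pearson_r(zn, pb)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, p = {permutation_p_value(zn, pb):.3f}")
```

The permutation approach is used here only because it needs no distribution tables; for larger samples the number of permutations grows too quickly and a parametric or Monte Carlo test would be the practical choice.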
The results of the Pearson’s correlation analysis are presented in Table 4 above. The p value is greater than 0.05 at Abakaliki and Abakaliki/Enugu (Zn/Zn), which means that the correlation is not significant; hence the data are inconsistent with the hypothesis. At all the sites in Enugu and Enugu/Abakaliki (Pb/Pb), however, the p value was less than 0.05, which means that the correlation is significant and the data are consistent with the hypothesis. The Pearson’s correlation (r) values for Abakaliki and Abakaliki/Enugu are also negative, which indicates that the levels of particulate heavy metals in Abakaliki differ from each other and that it is reasonable to study each as an independent variable. They also indicate a weak link for air pollutants moving from Abakaliki to Enugu. At the Enugu and Enugu/Abakaliki sites, however, the correlation is positive and appears to contribute significantly to the recorded concentrations. This also confirms that a strong link exists between the two pollutants at Enugu and points to the potential movement of pollutants from Enugu to Abakaliki. Hence, selecting a single variable that incorporates two or more particulate matter species is justifiable.
Principal component analysis (PCA)
Principal Component Analysis (PCA) is a powerful tool commonly used to reduce the dimension of data. The identified principal components are expected to account for most of the variability of the datasets acquired from the different samples. PCA therefore uses the identified principal components to differentiate the samples or subsets of samples that are responsible for the variations between the different groups. Moreover, the contribution from each data set can be visualized in a scatter plot of any two principal components to reveal the relative variation of the unrelated data sets (Yu et al. 2016). In this study, PCA was conducted by dividing the datasets into two subsets, one for Zn and the other for Pb. Fig. 5 depicts the biplot of the two principal components of the datasets. Using the varimax rotation criterion, the obtained PCs were rotated to form new variables, the vari-factors (VFs) (Al-Anzi et al. 2016).
The eigenvalues of the correlation matrix indicate how many principal components should be considered, because the principal components with the largest eigenvalues account for most of the variance in the observed variables. As shown in Fig. 3, the first eigenvalue covers only 54.42% of the data, while the second eigenvalue covers the remaining 45.58%. The scree plot gives a visual indication of how many principal components are to be considered; it shows that only 2 principal components are needed, and these were adopted in the plot (Simeonov et al. 2004). From the extracted eigenvalues, the coefficients of principal component 1 mainly reflect the presence of both Zn and Pb particulate metals, while the coefficients of principal component 2 mainly reflect Pb particulate matter. The biplot displays both the loadings and the scores for the two selected components. The scatter plots show the different sample locations within Enugu and Abakaliki. Sampling site ACM 3 relies on Zn as the principal component, which indicates that the primary pollutant at ACM 3 is Zn particulate matter, while ACM 1–2, ACM 4, AMN 2–4, EMN 2, and EMN 4 are clustered around the Pb particulate matter, which indicates that the primary pollutant at those sampling sites is lead (Pb) particulate matter. For locations such as AIN 2–4, ARS 2–4, EIN 2, ERS 1–4, ERR 1–4, ERS 2, and ARR 2, the primary pollutant source cannot be established from the PCA. These observations correlate with the simple plot shown below in Fig. 6, where the Zn particulate matter at ACM 1–4 was not statistically indicative and was identified as the primary pollutant (principal component 1) at ACM, while the Pb particulate matter at ACM 1–4 and ECM 1–4 was significantly obvious and hence the primary pollutant (principal component) at these sampling sites (Jassim et al. 2018).
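The relation between the eigenvalues and the explained variance can be illustrated for the simplest case. The sketch below assumes, which the text does not state explicitly, that the PCA was run on the correlation matrix of just two standardized variables (Zn and Pb); the two reported shares summing to 100% are consistent with that reading. Under this assumption the eigenvalues are 1 + r and 1 − r, so the 54.42% share can be inverted to an implied absolute correlation. The function names are hypothetical.

```python
def explained_variance_two_variables(r):
    """Variance shares of PC1 and PC2 for the correlation matrix [[1, r], [r, 1]].

    The eigenvalues of that matrix are 1 + r and 1 - r. They sum to 2,
    so each share is the eigenvalue divided by 2 (largest first).
    """
    first, second = 1 + abs(r), 1 - abs(r)
    return first / 2, second / 2


def implied_correlation(pc1_share):
    """Invert the relation above: |r| = 2 * share - 1."""
    return 2 * pc1_share - 1


# 0.5442 is the PC1 share quoted in the text; the two-variable setup is an assumption.
r_abs = implied_correlation(0.5442)
pc1, pc2 = explained_variance_two_variables(r_abs)
print(f"|r| = {r_abs:.4f}, PC1 = {pc1:.2%}, PC2 = {pc2:.2%}")
```

Read this way, a near-even split between the two components corresponds to a weak linear correlation between the two variables, which is why neither component dominates.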
Cluster analysis
Hierarchical Cluster Analysis (HCA) is a useful multivariate tool for finding patterns and groupings within a given data set, here the concentrations of the particulate metals (Zn and Pb) (Núñez-Alonso et al. 2019). Cluster analysis involves splitting a given set of data into several groups of observations that share common values or attributes. Hierarchical cluster analysis therefore aims to maximize the between-group variance and minimize the within-group variance. A major advantage is that any number of variables can be used to group the members of a given sample (Saksena et al. 2003).
Hierarchical cluster analysis shows how the particulate heavy metals and sampling sites relate to each other; the results were plotted as a dendrogram using Origin 2.0 software, as shown in Fig. 6. A two-cluster solution was selected to perform the analysis for each particulate matter. The dendrogram shows that there are five primary clusters (cluster 1, cluster 2, cluster 3, cluster 4, and cluster 5) built into two other clusters when an imaginary straight line is drawn across at the 0.2 scaling. The clustering also shows that the particulate metals relate to each other beyond their sampling sites. For instance, Fig. 7 shows that sampling points 1, 3, and 7 (cluster 1) are closely related to sampling points 5 and 21 in the same cluster. Similarly, on the far right-hand side (cluster 4), sampling points 2 and 30 are more closely related than points 27 and 31 in cluster 5. In cluster 2, sampling points 11, 13, and 15 are more closely related to points 10, 14, and 16 than to any data in clusters 1, 3, and 5. Hence there are multiple levels of clustering, showing similarity of measurement and relationship in terms of concentration levels among the data sets regardless of the measured particulate matter. Thus cluster 2 is closer in similarity to clusters 1 and 3 than to clusters 4 and 5, because the distance to the latter is almost twice as large.
However, it took half of the dendrogram height to join cluster 1 and cluster 2, which means that they are quite different but still more closely related to each other in terms of concentration levels than to cluster 3, which joined at 3/4 of the dendrogram height. Finally, clusters 4 and 5 joined clusters 1, 2, and 3 only over the other half of the dendrogram, which means that the concentration levels in clusters 4 and 5 are entirely dissimilar from those at the other sampling points.
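The idea of building clusters bottom-up and then cutting the dendrogram with a horizontal line can be sketched in a few lines. The example below is not a reproduction of the analysis: the linkage rule and distance metric used in the study are not stated, so single linkage with Euclidean distance is assumed here, the (Zn, Pb) pairs are hypothetical, and the function name is illustrative. In practice the variables would usually be standardized first.

```python
import math


def single_linkage_clusters(points, cut):
    """Agglomerative clustering with single linkage, stopped at distance `cut`.

    points: list of (zn, pb) tuples; returns clusters as lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: smallest pairwise distance between two clusters.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cut:
            break  # the horizontal cut line has been reached
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical (Zn, Pb) concentrations in µg/m3, NOT the study's data.
samples = [(0.023, 0.11), (0.025, 0.13), (0.030, 0.21), (0.031, 0.18), (0.040, 0.91)]
print(single_linkage_clusters(samples, cut=0.04))
```

Raising `cut` merges more clusters, which mirrors moving the imaginary line up the dendrogram; points that join only at a large distance, like the last sample here, are the dissimilar ones.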
Enrichment factor (EF)
The enrichment factor (EF) is applied in the statistical analysis of data to identify anthropogenic sources of metallic pollutants. It is useful in determining the degree of enrichment of a particular element compared with the relative abundance of that element in crustal material (Kothai et al. 2011). In our study, the crustal EF is calculated using iron (Fe) as the reference element. The elemental compositions of soil in Enugu and Abakaliki are taken from Chibuike et al. (2019) and Okolo et al. (2013), whereas the aerosol levels are calculated from Offor et al. (2016). The formula used for the enrichment calculation is \(EF_{i} = (i/j)_{air}/(i/j)_{crust}\), where \(EF_{i}\) is the enrichment factor of species i (Zn or Pb), \(i_{air}\) and \(j_{air}\) are the contents of the measured species and of the reference element in the examined environment, and \(i_{crust}\) and \(j_{crust}\) are the contents of the measured species and of the reference element in the reference environment (Shaari et al. 2015). From our results, the EF for Pb at all stations in Enugu was below 9, suggesting that the crustal source is dominant, while the EF at Abakaliki was between 10 and 11, meaning that the non-crustal source is dominant. The calculated EF values for Zn were between 3 and 5 in both towns, which indicates minor to moderate enrichment from a crustal source.
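A minimal sketch of the enrichment calculation follows. The input values are hypothetical and chosen only to make the arithmetic easy to follow, and the threshold of 10 separating crustal from non-crustal sources is a commonly used convention assumed here; it is compatible with, but not stated by, the interpretation given above. The function names are illustrative.

```python
def enrichment_factor(i_air, j_air, i_crust, j_crust):
    """EF_i = (i/j)_air / (i/j)_crust, with j the reference element (Fe here)."""
    return (i_air / j_air) / (i_crust / j_crust)


def dominant_source(ef, threshold=10.0):
    """Label the dominant source; the threshold of 10 is an assumed convention."""
    return "crustal" if ef < threshold else "non-crustal"


# Hypothetical contents in consistent units, NOT the study's data.
ef_pb = enrichment_factor(i_air=0.5, j_air=2.0, i_crust=20.0, j_crust=1000.0)
print(round(ef_pb, 2), dominant_source(ef_pb))
```

Because the EF is a ratio of ratios, the air and crust values may be in different units as long as each pair (species and reference element) shares the same unit.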
Pollution indices
The overall suspended particulate matter pollution indices and their severity were determined using the Contamination Factor (CF) and the Pollution Load Index (PLI). The CF and PLI were calculated using the straightforward mathematical expressions of Muller (1969) and Tomlinson (1980). Detailed explanations can be found in recent publications by Nagarajan et al. (2019) and Kowalska et al. (2018). The CF identifies the contamination level of a trace/heavy metal in the environmental matrix, while the PLI takes into consideration the potential contribution of all elements to indicate the extent of pollution in a particular location. In this analysis, the soil metal concentrations and soil background concentrations were obtained using the methods described by Basha et al. (2010) and Ercilla-Montserrat et al. (2018).
The results showed that the CF for Zn at each of the Abakaliki sampling sites was in the range 1 ≤ CF < 3, which implies moderate contamination, whereas the CF for Zn at each of the Enugu sampling sites was < 1, which indicates low contamination. The calculated PLI (Pb) values for the entire Abakaliki and Enugu locations were 1.318 and 1.603 (> 1), respectively, hence an indication of Pb pollution. On the other hand, the PLI (Zn) value for the entire Abakaliki location was > 1 (1.145), while the PLI (Zn) value for the entire Enugu location (0.0381) was < 1, which indicates no pollution. The overall effect is that Abakaliki has Zn particulate matter enrichment, while the entire sampling location is at risk of Pb particulate matter pollution. The findings depict the relationship that exists between polluted soil and atmospheric suspended particulate matter in the study location.
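The two indices can be sketched as follows, assuming the usual definitions: the CF is the ratio of the measured concentration to the background concentration, and the PLI is the n-th root of the product of n contamination factors (here taken across the sampling sites of one town for a single metal, which is how the PLI values above are reported). The input numbers are hypothetical, the function names are illustrative, and only the two CF classes quoted in the text are named explicitly.

```python
import math


def contamination_factor(concentration, background):
    """CF = measured metal concentration / background concentration."""
    return concentration / background


def contamination_class(cf):
    """Classes quoted in the text: CF < 1 low, 1 <= CF < 3 moderate."""
    if cf < 1:
        return "low"
    if cf < 3:
        return "moderate"
    return "above moderate"


def pollution_load_index(cfs):
    """PLI = n-th root of the product of the n contamination factors."""
    return math.prod(cfs) ** (1 / len(cfs))


# Hypothetical site concentrations and background value, NOT the study's data.
site_values = [1.2, 1.5, 0.9, 2.0, 1.1]
cfs = [contamination_factor(v, background=1.0) for v in site_values]
pli = pollution_load_index(cfs)
print([contamination_class(cf) for cf in cfs], round(pli, 3), "polluted" if pli > 1 else "not polluted")
```

Because the PLI is a geometric mean, a single site with a very low CF can pull the index below 1 even when the other sites are moderately contaminated.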