Developed channel propagation models and path loss measurements for wireless communication systems using regression analysis techniques

The development of powerful and flexible wireless communication services is due to the improvement of highly reliable devices and the advancement of the cellular concept. The cellular principle was a major step toward solving the problems of spectrum congestion and user capacity: it offered high capacity within a limited spectrum allocation and without major technological changes. Wireless communication is a telecommunication technology that allows transmission between portable devices in all types of terrain. Propagation models that predict the mean signal strength for an arbitrary transmitter-receiver separation distance are crucial for estimating a transmitter's radio coverage area; they are called large-scale propagation models because they describe the average signal strength over long periods of time and large distances from the transmitter. In this work, propagation models are developed from path loss values measured in representative urban and suburban areas at an operating frequency of 3.5 GHz by using regression analysis. The measurements were carried out with an FSH6 spectrum analyzer to obtain the channel response, as shown in the tables and graphs. The results show that the path loss can be calculated as a function of distance, and during the practical measurements 32 m proved to be an accurate assumption for the break point distance. The values of the path loss exponent (n) are defined and calculated for both the urban and suburban regions. The measurement results are analyzed and compared in order to study their influence in each specific environment. Any radio signal suffers attenuation as it travels from the transmitter to the receiver, and a variety of phenomena give rise to this radio path loss.
The interaction between the electromagnetic radiation and the environment degrades the quality of the signal sent from the transmitter to the receiver, which causes the path loss. Propagation models are the basis for channel estimation, as they attempt to describe how a radio transmission changes along its path from the transmitter to the receiver. The results of this research will support network planners and researchers as evidence and reference material before future sites are established.


Background
Zakaria et al. Bull Natl Res Cent (2021) 45:54
Poor signal quality and path loss, caused by the decrease in the power density of an electromagnetic wave as it passes obstacles and a multipath propagation environment, have been a major challenge for wireless communication systems for many years. The phenomenon is especially evident in urban centers with many obstacles and a high population density (Emagbetere and Edeko 2009). Path loss may result from several effects, such as free-space loss, diffraction, reflection and absorption. Because of differences in city structure, local terrain profile and climate, the predicted path loss can vary from one propagation model to another. An accurate estimate of the path loss is extremely useful for frequency assignment, estimation of base station coverage areas, proper determination of electric field strength, interference analysis and adjustment of power levels (Obota et al. 2011; Zakaria 2018). By tuning existing channel prediction models with practical measured data, one can determine the most accurate path loss prediction model for the mobile propagation channel. Moreover, the power received in free space ($P_r$) is given by the well-known Friis transmission equation (Failed 2005):
$$P_r = \frac{P_t G_t G_r \lambda^2}{(4\pi)^2 d^2 L} \tag{1}$$
where d is the distance between transmitter and receiver in meters; $G_r$ is the receiver antenna's gain; $G_t$ is the transmitter antenna's gain; $P_t$ is the transmitted power; L is the system loss factor, considered to be equal to 1; and $\lambda$ is the wavelength in meters. In 1865, James Clerk Maxwell suggested that electromagnetic signals could travel over large distances at the speed of light (Maxwell 1865).
By adding the displacement current term to the equations governing electromagnetism, now known as Maxwell's equations, he showed that linear wave motion is among their possible solutions. Hence, electromagnetic waves are able to propagate over considerable distances. In 1886, Heinrich Rudolf Hertz demonstrated the transmission of electromagnetic waves (Schwab and Fischer 1998).
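As an illustration of the Friis transmission equation introduced above, the following Python sketch computes the free-space received power. It is a minimal example and not part of the measurement procedure: the function names, the unity antenna gains and the 1 W transmit power are assumptions chosen for illustration, and only the 3.5 GHz carrier is taken from this study.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def friis_received_power(p_t_w, g_t, g_r, freq_hz, d_m, system_loss=1.0):
    """Free-space received power in watts (gains and loss are linear ratios)."""
    if d_m <= 0 or freq_hz <= 0:
        raise ValueError("distance and frequency must be positive")
    wavelength = SPEED_OF_LIGHT / freq_hz  # meters
    return (p_t_w * g_t * g_r * wavelength ** 2) / (
        (4.0 * math.pi) ** 2 * d_m ** 2 * system_loss
    )


def watts_to_dbm(p_w):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(p_w / 1e-3)


# Assumed example: 1 W transmitter, unity-gain antennas, L = 1, 100 m separation.
p_r = friis_received_power(p_t_w=1.0, g_t=1.0, g_r=1.0, freq_hz=3.5e9, d_m=100.0)
print(f"Received power at 100 m: {watts_to_dbm(p_r):.2f} dBm")
```

Doubling the distance in this sketch reduces the received power by a factor of four, which is the inverse-square behavior that free-space propagation implies.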
The transmission path between the transmitter and the receiver may vary from a simple line of sight (LOS) to a non-line-of-sight (NLOS) path severely obstructed by buildings or vegetation. In contrast to large-scale models, propagation models that describe the rapid variations of the received signal strength over very short travel distances or short time durations are called small-scale or fading models. In the lognormal path loss propagation model, the average path loss for an arbitrary transmitter-receiver separation, $L_p(d)$, is expressed in dB as a function of the distance d by using a path loss exponent (n), independently of the presence of a direct LOS between the transmitter and the receiver units, as (Failed 2003):
$$L_p(d) = L_p(d_0) + 10\,n\,\log_{10}\left(\frac{d}{d_0}\right) \tag{2}$$
where n is the path loss exponent, which indicates the rate at which the path loss increases with distance, and $d_0$ is the free-space close-in reference distance.
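The role of the path loss exponent in the model above can be made concrete with a short Python sketch. The function name and the numerical inputs (80 dB at a 100 m reference distance, exponents of 2.0 and 3.5) are hypothetical values chosen for illustration and are not measured results.

```python
import math


def log_distance_path_loss_db(d_m, n, loss_at_d0_db, d0_m=100.0):
    """Average path loss in dB at distance d_m for path loss exponent n."""
    if d_m <= 0 or d0_m <= 0:
        raise ValueError("distances must be positive")
    return loss_at_d0_db + 10.0 * n * math.log10(d_m / d0_m)


# Hypothetical inputs: 80 dB at the 100 m reference distance.
for n in (2.0, 3.5):
    loss = log_distance_path_loss_db(d_m=1000.0, n=n, loss_at_d0_db=80.0)
    print(f"n = {n}: {loss:.1f} dB at 1000 m")
```

Each decade of distance adds 10n dB in this model, so the two assumed exponents differ by 15 dB at ten times the reference distance.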
At present, wireless communication systems are developing exceptionally fast owing to technologies that allow widespread deployment (Rappaport 2002). The ability to provide cellular service to an entire population was first envisioned when the cellular principle was developed in the 1960s. To cover the cell completely with good-quality transmission, the signal received within the cell must be strong. Once the signal passes the cell boundary, it becomes interference. Thus, a channel frequency is generally not reused in adjacent cells. If it is reused, co-channel interference could affect signal reception in the adjacent cells, and the quality of service may be severely degraded (Goodman 1988; Wilson 1985).
Propagation models are therefore widely used to estimate path loss in wireless communication across various types of environment. It is very important to gather information about the electromagnetic wave environment, such as the positions of the transmitter and receiver and the frequency at which the system operates. A radio coverage prediction model based on building parameters has also been formulated; the parameters of that model were extracted from measured data. The model offers a fast and easy way to predict path loss in an indoor environment and was therefore useful for predicting the primary coverage (Keenan and Motley 1990).

Measurement environments
The measurement campaign concentrated on two distinct propagation environments: urban and suburban. The first case study was performed in the Dokki district, which represents a typical densely built urban area with high buildings, very little vegetation, medium-sized houses and wide streets. In this area, roads are predominantly wide and fairly straight. The average building height in this environment is less than seventeen meters, and heavy concrete and brick are the most commonly used building materials. Around 53% of the area is covered by buildings and 4% by vegetation. The statistics of the measurement locations are shown in Table 1. The second case study was conducted in the Faysal district, which represents a typical suburban region with medium-height buildings, low vegetation and narrow streets. The average building height in this environment is less than fourteen meters, with four to seven floors, and concrete and brick are again the most commonly used building materials. About 40% of the area is covered by buildings and 7% by vegetation. The measurement locations in the urban and suburban regions are indicated in Figs. 1 and 2, respectively, where the blue line marks the measurement route.
An FSH6 spectrum analyzer, covering 100 kHz to 6 GHz, was placed in a moving vehicle to record the channel response. The measured path loss was obtained for the relaying system under outdoor conditions. The coordinates of the measured sites are given in Table 2. The routes led steadily and directly away from the transmitter, since the purpose of the measurements was to inspect the behavior of the signal attenuation and the propagation slope, while the receiver recorded multiple samples along the travelled distance so that large-scale average values could be obtained. The measurement survey was conducted in consultation and cooperation with the National Research Centre (NRC) and Aswan University.

Measurement setup and apparatus
In this part, measurements were performed in order to consider their impact on the sensitivity response of the models and on signal propagation. The measurements were analyzed in urban and suburban environments, and the measurement configurations are shown in Table 3. The 3.5 GHz frequency band was chosen for the measurement setup because the properties of waves at this frequency match well the coverage range required in such urban and suburban areas, which makes the band suitable for exploring the behavior of the signal attenuation and the propagation slope.
A signal can diffract around an object in its path, but losses occur. The more rounded the object, the higher the loss, since radio signals tend to diffract better around sharp edges. Path loss due to these propagation mechanisms can occur whenever an object appears in the path.

Results of measured data
Each location was examined in order to investigate the behavior of the signal attenuation and the propagation slope. The measurements for both the urban and suburban environments are listed in Table 4, and the measured path loss for the Dokki and Faysal districts is shown in Fig. 3.
From the above results, it can be observed that the measurements may be sensitive to the guided-wave propagation phenomenon, since the orientation and structure of the streets can help the radio waves to propagate smoothly. The cable loss and insertion losses can sometimes be removed so that the path attenuation is calculated from the propagation environment alone, which gives a more precise estimate. This finding agrees with the description in (Zakaria et al. 2015; Abhayawardhana et al. 2005).
Attention should be directed to the measured path loss values of the Dokki region, which are consistent with an urban environment. During parameter adjustment and path loss analysis, some measurements taken farther away from the transmitter were excluded because their samples lay very far from the majority of the values; this was done to remove measurement noise caused by obstacles. The mean of all samples was computed to validate the collected data for each measurement. Some losses may also be due to poorly connected cables or broken connectors.
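The exclusion of samples that lie far from the majority of the values, followed by averaging, can be sketched as follows. The text does not state the exclusion criterion that was applied, so the median absolute deviation (MAD) rule, the threshold of 3 and the sample values below are all assumptions made for illustration only.

```python
import statistics


def exclude_outliers(samples_db, threshold=3.0):
    """Drop samples whose distance from the median exceeds threshold * scaled MAD."""
    med = statistics.median(samples_db)
    mad = statistics.median([abs(s - med) for s in samples_db])
    if mad == 0:
        return list(samples_db)  # no spread to judge against, keep everything
    scale = 1.4826 * mad  # MAD scaled to match the standard deviation of normal data
    return [s for s in samples_db if abs(s - med) <= threshold * scale]


# Hypothetical path loss samples in dB recorded at one location.
samples = [101.2, 100.8, 101.5, 100.9, 118.3, 101.1]
kept = exclude_outliers(samples)
mean_loss = statistics.mean(kept)
print(f"kept {len(kept)} of {len(samples)} samples, mean = {mean_loss:.2f} dB")
```

A median-based rule is used in this sketch because a single extreme sample shifts the median far less than it shifts the mean; other criteria would serve equally well.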

Regression analysis based on measured data
In statistical modeling, regression analysis is a collection of statistical processes for assessing the relationship between variables. The path loss exponent (n) is an important factor in the design of radio signal propagation. Using linear regression analysis, the path loss exponent (n) can be determined by minimizing the difference between the measured values ($P_m$) and the predicted values ($P_r$) of Eq. (2).

Fig. 3 Path loss in Dokki and Faysal territories
Here, $P_m(d_i)$ represents the measured path loss at distance $d_i$; $P_r(d_0)$ represents the predicted path loss at the reference distance $d_0$ = 100 m; and k is the number of measured data or sample points. The standard deviation of the error between the predicted and measured samples is considered a good measure of shadow fading when the values are properly collected so as to average out the fast fading component. Attention should be directed to the difference $P_L(d_i) - P_L(d_0)$, which is reflected in the difference ($P_m - P_r$), an error term with respect to n, and in the sum of the squared errors, $\varepsilon(n)$. This error is given by
$$\varepsilon(n) = \sum_{i=1}^{k}\left[P_m(d_i) - P_r(d_i)\right]^2$$
where $P_r(d_i)$ is the path loss predicted by Eq. (2) at distance $d_i$. On this basis, the measurements are compared with the predictions provided by the regression analysis. The predicted path loss results for the Dokki and Faysal districts are shown in Tables 5 and 6, respectively. The expressions for the error $\varepsilon(n)$ are then written out for each district. Minimizing the mean square error (MSE) by setting the derivative of these expressions with respect to n to zero gives the values of the coefficient n, which describes how quickly the signal attenuates as a function of distance: n = 10.719 for the urban area and n = 8.546 for the suburban area. The mean of all samples was computed to validate the collected data for each measurement. In the following graphs, the distance d [m] is plotted along the horizontal axis and the path loss PL [dB] along the vertical axis. Since the horizontal axis uses a logarithmic scale, the logarithmic trend line appears as a straight line and the linear trend line as a curve.
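The minimization described above has a closed-form solution when the predicted path loss follows the log-distance model, and the Python sketch below shows that textbook solution. It is not claimed to be the exact computation used for the Dokki and Faysal data: the function name, the 85 dB reference loss, the exponent of 3.0 and the scatter values are synthetic assumptions.

```python
import math


def fit_path_loss_exponent(distances_m, measured_db, loss_at_d0_db, d0_m=100.0):
    """Return (n, sse, sigma) for the model loss_at_d0_db + 10*n*log10(d/d0)."""
    x = [10.0 * math.log10(d / d0_m) for d in distances_m]
    y = [p - loss_at_d0_db for p in measured_db]
    sxx = sum(xi * xi for xi in x)
    if sxx == 0.0:
        raise ValueError("need at least one distance different from d0")
    # Setting d(eps)/dn = 0 for eps(n) = sum((y_i - n*x_i)^2) gives this ratio.
    n = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    sse = sum((yi - n * xi) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(sse / len(x))  # RMS error in dB, a rough shadowing estimate
    return n, sse, sigma


# Synthetic data for illustration only (not the Dokki or Faysal measurements).
distances = [100.0 * i for i in range(1, 9)]            # 100 m steps
offsets = [0.0, 1.5, -1.0, 0.5, -0.5, 1.0, -1.5, 0.0]   # assumed scatter in dB
measured = [85.0 + 10.0 * 3.0 * math.log10(d / 100.0) + e
            for d, e in zip(distances, offsets)]
n_hat, sse, sigma = fit_path_loss_exponent(distances, measured, loss_at_d0_db=85.0)
print(f"n = {n_hat:.3f}, sigma = {sigma:.2f} dB")
```

With scatter-free input the routine returns the exponent used to generate the data, which is a quick sanity check before applying it to field measurements.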

Linear regression
Linear regression is a basic and commonly used form of predictive analysis. It is used to test the relation between two variables by fitting a linear equation to the observed data (Malakooti 2013). One variable is considered to be an explanatory variable, and the other a dependent variable. A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. For the path loss data, with the distance d as the explanatory variable, the linear trend line is
$$P_L = a d + b \tag{9}$$
Before trying to fit a linear model to observed data, a modeler must first determine whether there is a relationship between the variables of interest. This does not necessarily imply that one variable causes the other, but that there is some significant association between the two variables. Linear trend analysis for the urban and suburban environments is shown in Figs. 4 and 5, respectively. A scatter plot can be a useful tool for determining the strength of the relationship between the two variables. If no association appears between the proposed explanatory and dependent variables, fitting a linear regression model to the data is unlikely to provide a useful model. To evaluate the fit of the model to the observed data, one may plot the computed regression line over the actual data points. Although most data points are grouped toward the lower left corner of the plot, a few points lie far from the main cluster of the data. These points are called outliers and, depending on their location, may have a significant impact on the regression line (Good and Hardin 2009).
After a regression line has been computed for a group of data, a point which lies far from the line and therefore has a large residual value is known as an outlier. Such points may represent erroneous data or may indicate a poorly fitting regression line. If a point lies far from the other data in the horizontal direction, it is known as an influential observation. The reason for this distinction is that such points can have a massive effect on the slope of the regression line. Once a regression model has been fitted to a group of data, examination of the residuals (the deviations of the observed values from the fitted line) allows the validity of the assumption that a linear relationship exists to be investigated. Plotting the residuals on the y-axis against the explanatory variable on the x-axis reveals any possible nonlinear relationship between the variables, or might alert the modeler to investigate lurking variables, which exist when the relationship between two variables is significantly affected by the presence of a third variable that was not included in the modeling effort. Therefore, whenever a linear regression model is fitted to a group of data, the range of the data should be carefully observed (Malakooti 2013; Good and Hardin 2009; Freedman 2005).
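The fitting of a regression line and the examination of its residuals can be sketched in Python as follows. The data are made up (a straight line with one deliberately corrupted sample), and the rule of flagging residuals larger than twice the residual standard error is an assumed convention, not a criterion stated in this paper.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    k = len(x)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residuals(x, y, a, b):
    """Observed value minus fitted value for every point."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def flag_outliers(res, factor=2.0):
    """Indices whose residual exceeds factor times the residual standard error."""
    std_err = math.sqrt(sum(r * r for r in res) / (len(res) - 2))
    return [i for i, r in enumerate(res) if abs(r) > factor * std_err]


# Made-up data: a straight line with one corrupted sample at 500 m.
d = [100.0 * i for i in range(1, 11)]
pl = [90.0 + 0.02 * di for di in d]
pl[4] += 15.0
a, b = fit_line(d, pl)
res = residuals(d, pl, a, b)
print(f"intercept = {a:.3f} dB, slope = {b:.5f} dB/m, outliers at {flag_outliers(res)}")
```

In this example the corrupted sample shifts the fitted intercept from 90 dB to 92 dB and lowers the slope, which illustrates how a single outlier can pull the regression line away from the remaining points.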

Exponential regression
Exponential regression is the process of finding the exponential function that best fits a data set. Exponential trend analysis for the urban and suburban environments is shown in Figs. 6 and 7, respectively. The result is an equation of the form $y = ab^x$, where a ≠ 0.

Logarithmic regression
Logarithmically transforming variables in a regression model is a very common way of handling situations in which the independent and dependent variables have a nonlinear relationship. Logarithmic trend analysis for the urban and suburban environments is shown in Figs. 8 and 9, respectively. Using the logarithm of one or more variables instead of the unlogged form makes the effective relationship nonlinear while still preserving the linear model. Logarithmic transformations are also a convenient way of converting a highly skewed variable into an approximately normal one (Armstrong 2012; Aldrich 2005).
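Both the exponential and the logarithmic trend lines can be obtained with an ordinary straight-line fit after transforming one of the variables, as the following Python sketch shows. The helper names and the synthetic coefficients are assumptions for illustration; the exponential fit here minimizes the error in ln(PL), which is a common simplification and may differ slightly from a direct nonlinear fit.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope*x."""
    k = len(x)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_exponential(d, pl):
    """Fit pl = a * exp(b*d) through a straight-line fit of ln(pl) against d."""
    ln_a, b = fit_line(d, [math.log(p) for p in pl])
    return math.exp(ln_a), b


def fit_logarithmic(d, pl):
    """Fit pl = a * ln(d) + b through a straight-line fit of pl against ln(d)."""
    b, a = fit_line([math.log(di) for di in d], pl)
    return a, b


# Synthetic, noise-free data generated from assumed coefficients.
d = [100.0 * i for i in range(1, 11)]
exp_data = [80.0 * math.exp(0.0005 * di) for di in d]
log_data = [12.0 * math.log(di) + 30.0 for di in d]
a_exp, b_exp = fit_exponential(d, exp_data)
a_log, b_log = fit_logarithmic(d, log_data)
print(f"exponential: a = {a_exp:.3f}, b = {b_exp:.6f}")
print(f"logarithmic: a = {a_log:.3f}, b = {b_log:.3f}")
```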
As the results analysis shows, the path loss is dominated by the free space loss, and the building parameters act as additional independent parameters that increase the overall model accuracy. Also, for both the building outlines and the foliage boundaries, the obtained value of the building attenuation decreases when the analyzed data are reduced to paths for each data set, as described in (Fotheringham 2002; Zakaria and Urban 2016; Anderson 2003). In the cases examined in this research, the value of the path loss exponent (n) is higher in the urban environment than in the suburban environment. The propagation of radio waves is thus a complex phenomenon, generally characterized by reflections from smooth surfaces, diffraction at the edges of buildings, scattering from rough surfaces and refraction between two propagation media.

Developed channel propagation models
In this section, we present the developed channel propagation models, which are derived from experimental data collected in the 3.5 GHz frequency band and which distinguish among terrain categories for outdoor conditions. Regression analysis is a set of statistical processes used to estimate the relationships between variables. It includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables.
For the path loss data, the exponential and logarithmic trend lines take the forms
$$P_L = a e^{bd} \tag{10}$$
$$P_L = a \ln d + b \tag{11}$$
Regression analysis can also be used to infer causal relations between the independent and dependent variables (Aldrich 2005). A simple power law path loss model, in which the path loss is the ratio of the transmitted power to the received power, was selected to predict the range of a reliable communication system. The model is expressed by the following equation (Zakaria June 2016):
$$P_L(d_i) = P_L(d_0) + 10\,n\,\log_{10}\left(\frac{d_i}{d_0}\right) \tag{12}$$
where the first term is the path loss at the reference distance $d_0$ and the second term depends on the distance $d_i$ together with the path loss exponent (n). Thus, in order to determine the exponent (n), it was necessary to have a table of the path loss as a function of the distance d, sampled with a constant step of 100 m.
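The use of the power law model to tabulate path loss in 100 m steps and to predict a communication range can be sketched as follows. The exponent of 3.0, the 85 dB reference loss and the 120 dB maximum tolerable loss are hypothetical link budget values, and the function names are illustrative.

```python
import math


def path_loss_db(d_m, n, loss_at_d0_db, d0_m=100.0):
    """Power law path loss in dB at distance d_m."""
    return loss_at_d0_db + 10.0 * n * math.log10(d_m / d0_m)


def max_range_m(max_loss_db, n, loss_at_d0_db, d0_m=100.0):
    """Distance at which the modelled path loss reaches max_loss_db."""
    if n <= 0:
        raise ValueError("path loss exponent must be positive")
    return d0_m * 10.0 ** ((max_loss_db - loss_at_d0_db) / (10.0 * n))


# Hypothetical parameters: n = 3.0, 85 dB at d0 = 100 m, 120 dB tolerable loss.
table = [(d, path_loss_db(d, 3.0, 85.0)) for d in range(100, 1100, 100)]
for d, loss in table:
    print(f"{d:5d} m  {loss:6.2f} dB")
reach = max_range_m(120.0, 3.0, 85.0)
print(f"predicted range: {reach:.0f} m")
```

The range function is simply the model solved for distance, so evaluating the path loss at the returned distance reproduces the tolerable loss that was supplied.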
The simple power law path loss model was chosen for predicting the range of reliable communication for the following reasons. The model is built on the concept of path loss as a measure of the median radio frequency attenuation. It should also be borne in mind that free space path loss is the loss in signal strength of an electromagnetic wave that would result from a straight path through free space without any nearby obstacles. It is helpful to express the transmission path in terms of a path loss, because this makes it possible to calculate quantities such as the expected signal level.
From the above calculations, we found that the Dokki region closely matches the value n = 10.719 and the Faysal region matches the value n = 8.546.

Fig. 10 Comparison of results of trend lines in Dokki
It is evident from the above results that they could be useful for improving existing propagation models and for developing new, more accurate models.
In a narrower sense, regression may refer specifically to the prediction of continuous response variables, as opposed to the discrete response variables used in classification. To distinguish it from related problems, the case of a continuous dependent variable may be more specifically referred to as metric regression. To assess the distance between transmitter and receiver, a simple power law path loss model was selected. The model is built on the path loss concept, as a measure of the median radio frequency attenuation experienced by the signals by the time they reach the receiver (Failed 2005; Stuber 2017). Moreover, the path loss or free-space loss ($L_{fs}$) is defined as the ratio between the effective transmitted power and the received power; it is calculated from the frequency f in MHz and the distance d between transmitter and receiver in meters (Failed 2005).
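The free-space loss can be evaluated from first principles as the ratio $(4\pi d f / c)^2$ expressed in dB, as in the Python sketch below. With f in MHz and d in meters the constant term works out to about -27.55 dB; this constant is derived in the code and is not quoted from the source equation, and the function name is an assumption.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

# 20*log10(4*pi/c) with the frequency expressed in MHz instead of Hz.
FSPL_CONSTANT_DB = 20.0 * math.log10(4.0 * math.pi * 1e6 / SPEED_OF_LIGHT)


def free_space_loss_db(freq_mhz, d_m):
    """Free-space loss in dB for a frequency in MHz and a distance in meters."""
    if freq_mhz <= 0 or d_m <= 0:
        raise ValueError("frequency and distance must be positive")
    return 20.0 * math.log10(d_m) + 20.0 * math.log10(freq_mhz) + FSPL_CONSTANT_DB


print(f"constant term: {FSPL_CONSTANT_DB:.2f} dB")
print(f"free-space loss at 3500 MHz and 100 m: {free_space_loss_db(3500.0, 100.0):.2f} dB")
```

If the distance is expressed in kilometers instead of meters, the constant increases by 60 dB, so the units of d should be checked before comparing values with other published forms of the equation.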
To carry out a regression analysis, we need to define a dependent variable which is influenced by one or more independent variables. Many techniques have been developed for carrying out regression analysis. Familiar methods such as linear regression and ordinary least squares regression are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data. Nonparametric regression, by contrast, refers to techniques that allow the regression function to lie within a given set of functions that may be infinite dimensional (Bishop 2006).
The developed channel propagation models for dense urban and suburban environments, represented by Dokki and Faysal, are presented below. Taking into consideration the regression analysis of the measured values together with the above analysis, propagation models of linear, exponential and logarithmic form are obtained to evaluate the path loss ($P_L$) in the Dokki and Faysal regions. The logarithmic model for Dokki is
$$P_L = 11.763 \ln d + 33.403 \tag{18}$$
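The logarithmic models reported in this paper, Eqs. (18) and (19), can be evaluated directly, as in the Python sketch below. The coefficients are those stated in the paper, while the function names and the choice of evaluation distances are assumptions; the distance d is taken to be in meters, consistent with the axes of the plots.

```python
import math

# (a, b) of P_L = a*ln(d) + b, as reported in Eqs. (18) and (19).
LOG_MODELS = {
    "Dokki": (11.763, 33.403),
    "Faysal": (10.11, 32.312),
}


def log_model_path_loss_db(region, d_m):
    """Path loss in dB predicted by the reported logarithmic model."""
    if d_m <= 0:
        raise ValueError("distance must be positive")
    a, b = LOG_MODELS[region]
    return a * math.log(d_m) + b


def slope_db_per_decade(region):
    """Increase in predicted loss per tenfold increase in distance."""
    a, _ = LOG_MODELS[region]
    return a * math.log(10.0)  # since ln(d) = ln(10) * log10(d)


for region in LOG_MODELS:
    values = ", ".join(f"{log_model_path_loss_db(region, d):.1f}" for d in (100, 500, 1000))
    print(f"{region}: {values} dB at 100, 500, 1000 m; "
          f"{slope_db_per_decade(region):.1f} dB per decade")
```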

Fig. 11 Comparison of results of trend lines in Faysal
These models distinguish between terrain classes. We found these distinctions to be very significant, as the experimental data showed tens of decibels of variation in path loss between classes. The measurement results furnish practical values for the standard deviation of shadowing and for the path loss exponent (n) in urban and suburban environments. These parameters can be used to predict reliable communication ranges with a simple power law path loss model for future communication systems. The signal path loss is thus fundamental for designers of wireless communication systems and for verifying whether a designed system is able to meet its requirements. The specifications of the proposed models agree with some propagation models reported in Zakaria (2017), Rahul and Bansal (2019) and Zakaria and Glesk (2017). Radio path loss calculations are used in several radio and wireless survey tools to evaluate the signal strength at different locations. These wireless survey tools are increasingly used to determine what the radio signal strengths will be before equipment is installed. Radio coverage studies are essential for telecommunication networks because the investment in a macrocell base station is high. Furthermore, wireless survey tools provide a useful service for applications such as installing wireless systems in large offices and other centers, as they allow problems to be resolved before installation and significantly reduce costs. Thus, increasing importance is being placed on wireless survey tools.
The logarithmic model for Faysal is
$$P_L = 10.11 \ln d + 32.312 \tag{19}$$
In conclusion, regression analysis estimates the conditional expectation of the dependent variable given the independent variables, that is, the mean value of the dependent variable when the independent variables are fixed.

Conclusion
To design a wireless system, it is necessary to understand the causes of radio path loss and to be able to determine the signal loss levels for a given radio path. Path loss calculation is one of the major elements that must be determined. Finding a precise propagation model for propagation losses is therefore a key matter, since it gives the best path loss prediction when designing wireless communication networks. It is necessary to calculate the path loss exponent (n), as it is a fundamental factor for the design of radio signal propagation in wireless communication systems and helps to predict radio propagation behavior accurately. Regression analysis is a powerful statistical technique which enables us to examine the relationship between two or more variables of interest. Although there are many types of regression analysis, at their core they all examine the influence of one or more independent variables on a dependent variable. Usually, the formulation of statistical models requires a balance between providing as much detail as possible and obtaining useful quantities with minimal complexity. Finally, the proposed models could be beneficial to researchers in site-specific planning during the analysis and design of the link budget of wireless telecommunication systems.
Abbreviations
LOS: Line of sight; NLOS: Non-line-of-sight; $P_m$: Measured values; $P_r$: Predicted values; $P_L$: Path loss; $L_{fs}$: Free space loss.