MOTORIST YIELD RATE MODEL AT UNSIGNALIZED CROSSINGS

Preliminary notes

Research was conducted in two countries in the South-Eastern Europe region, at multiple pedestrian crossing locations, in order to determine the value of the motorist yield rate (MYR). The data gathered at 32 locations were used to define a mathematical model for MYR that depends on location characteristics and on vehicular and pedestrian traffic flow characteristics. Using regression analysis, six versions of the model were created. The chosen model was tested at six locations, and the results showed that the model matches the empirical data very well. The mathematical model for the calculation of MYR could be applied in operational and planning analyses of level of service (LOS) at pedestrian crossings.


Introduction
One of the riskiest actions in traffic is a pedestrian crossing the road. For this reason, at intersections and other parts of the street network, pedestrian crossings are defined as special areas designated for pedestrians to cross the road. Pedestrian crossings regulated by traffic signals are safer than unsignalized ones because the signal plan defines the periods when pedestrians are allowed to cross the shared road surface. At unsignalized pedestrian crossings, by contrast, the crossing of pedestrians is not as strictly defined. In all countries traffic rules oblige drivers to yield to pedestrians at marked pedestrian crossings. However, even though this obligation exists, the behaviour of drivers and pedestrians depends on a large number of factors, such as roadway geometry, vehicle speed, traffic regulations at pedestrian crossings, local traffic culture, law enforcement, etc.
Pedestrian crossings are traffic areas for which the quality of the conditions for pedestrian traffic is defined, i.e. the level of service (hereinafter: LOS), which is determined from the average delay a pedestrian experiences when crossing the street at a pedestrian crossing. The HCM 2010 method uses the motorist yield rate (hereinafter: MYR, denoted My) as a parameter in the calculation of pedestrian delay and of LOS at pedestrian crossings. MYR is calculated as the ratio of the number of vehicles that stopped or slowed down before a pedestrian crossing to the total number of vehicles that could have stopped or slowed down in order for pedestrians to cross the road [1]. HCM 2010 gives recommendations for MYR values based on engineers' research: a value of 0 should be adopted for unmarked pedestrian crossings, and 0,5 for clearly marked ones. Experience with the recommended values of other traffic flow parameters in LOS procedures has shown that local conditions can significantly influence their values. The position of a pedestrian crossing in the street network, the characteristics and structure of vehicle and pedestrian traffic flows, pedestrian crossing geometry, speed limit, etc. can all affect the value of MYR. Adopting MYR values different from the recommended ones can significantly change the results of an LOS analysis of pedestrian crossings.
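The ratio that defines MYR can be illustrated with a minimal Python sketch. The function name, the argument names and the counts are hypothetical and serve only as an illustration; they are not data from the study.

```python
def motorist_yield_rate(yielding_vehicles: int, vehicles_that_could_yield: int) -> float:
    """Share of vehicles that stopped or slowed down for a pedestrian."""
    if vehicles_that_could_yield <= 0:
        raise ValueError("at least one vehicle must have had the chance to yield")
    if not 0 <= yielding_vehicles <= vehicles_that_could_yield:
        raise ValueError("yielding vehicles must lie between 0 and the total")
    return yielding_vehicles / vehicles_that_could_yield


# Hypothetical counts for one observation period (assumed values).
my = motorist_yield_rate(yielding_vehicles=45, vehicles_that_could_yield=120)
print(f"My = {my:.3f}")
```

A value of 0 then corresponds to no driver yielding and a value of 1 to every driver yielding, which matches the 0 to 0,5 range of the recommended values mentioned above.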
For the needs of this paper, research was conducted in two countries in the South-Eastern Europe region at multiple pedestrian crossing locations in order to determine MYR values, with the goal of defining a mathematical model that depends on location characteristics and on the characteristics of vehicle and pedestrian traffic flows. The mathematical model for the calculation of MYR could be applied in operational and planning analyses of LOS at pedestrian crossings.

Overview of previous research
Research done in the United States of America (USA) showed that factors such as road width, number of traffic lanes, speed limit and built street environment affect MYR [1,2,3]. Most studies examined MYR depending on the type of pedestrian crossing, in the sense of it being equipped with certain devices that can rarely be found in a city street network in South-Eastern Europe. In a comprehensive study on improving pedestrian safety at unsignalized pedestrian crossings, published by the Transportation Research Board [1], MYR was analyzed at locations grouped into three categories, depending on how each pedestrian crossing was technologically equipped: HAWK signals (High-intensity Activated cross WalK beacon); overhead flashing beacons or pedestrian crossing flags; and high-visibility signs and markings. Pedestrian crossings with red light signals gave extremely good results, with over 94 % of vehicles stopping. The research team concluded that crossings regulated in this way are efficient because they send a clear message to drivers (a red signal means STOP), so drivers have to stop and yield to pedestrians, which has been confirmed by other authors [4,3]. Crossings with flags and crossings with traffic signs next to them also gave good results: the percentage of vehicles that yielded to pedestrians was 65 % and 87 %, respectively. If pedestrians are crossing two traffic lanes, MYR expressed as a percentage can go as high as 75 %, and if they cross four traffic lanes the percentage is between 30 % and 100 % [1]. If the pedestrian crossing is clearly and visibly marked and also has a pedestrian refuge, the analyses show that such an environment increases MYR, especially on parts of the street network with lower speed limits. Namely, MYR on a part of the street network with a 40 km/h speed limit was over 60 %, while for a 55 km/h speed limit it was only 17 % [1].
In the South-Eastern Europe region no research has been done to determine LOS of pedestrian flows at unsignalized intersections, nor are there recommendations for the values of the parameters that appear in the calculations. Worldwide research in this area is also scarce: only HCM 2010 presents a method to determine LOS at pedestrian crossings in which MYR is one of the parameters. Determining this parameter when calculating pedestrian delay and LOS relies on recommendations from previous research, mostly from the USA [1,5,6]. Tab. 1 gives an overview of calculated MYR values depending on the type of pedestrian crossing and the type of crossing by pedestrians [6]. The results of these studies, and the recommendations that came out of them, should be applied with reservations to calculations made for different local traffic conditions. Previous research [7,8] showed that MYR values in the South-Eastern Europe region differ significantly from the value recommended by the HCM 2010 method. When calculations use the MYR values from HCM 2010, unrealistic values of pedestrian delay, and therefore different LOS classes, are produced. As for model design, existing MYR research has been directed towards probabilistic models, i.e. determining the probability of vehicles stopping at a pedestrian crossing depending on variables connected to vehicle characteristics, pedestrians and traffic conditions [4,9,2]. In the available references there are no studies that dealt with the problem of model design for MYR and the selection of influential variables. Even if such models existed, they would have to be calibrated to local conditions in the South-Eastern Europe region.
These were the reasons for the research, whose goal was to create a single mathematical model for the calculation of MYR that is based on the results of local measurements, so that the influence of specific characteristics of the local environment and of the traffic flow is taken into account.

Method and research locations
With the goal of gathering relevant data to build an appropriate database, research was conducted at 38 locations, all of them unsignalized pedestrian crossings. The locations are in five cities (Novi Sad, Zrenjanin, Subotica, Vrbas, Bijeljina) in two countries (Republic of Serbia and Bosnia and Herzegovina). The chosen locations have different characteristics, such as: type of pedestrian crossing, pedestrian crossing geometry, built environment, traffic conditions, traffic regulations and traffic flow structure (Figure 1).
Measuring traffic flow parameters by processing video recordings is one of the oldest [10], but also one of the safest methods, and it has proven to be an efficient way of gathering data for analysis in a large number of studies [11,12,13]. For that purpose, the vehicle and pedestrian traffic flows at the chosen unsignalized pedestrian crossings were recorded on video. The locations were recorded during February and March 2015 in the afternoon rush hour, i.e. between 13:00 and 14:00 or between 14:00 and 15:00, depending on the location. A SAMSUNG SMX-F33BP/ECD camera with 42x optical zoom was used for the recording.
The collected videos provided the traffic flow parameters needed for further analysis: pedestrian flow rate, vehicular flow rate and traffic flow structure (share of freight vehicles and buses). Across all locations a total of 12 786 pedestrians and 29 253 vehicles were recorded, while the share of freight vehicles and buses was up to 26,4 % and 8,7 %, respectively. By analyzing the videos, we obtained the number of vehicles that stopped or slowed down to yield to pedestrians at a pedestrian crossing, which was used to calculate MYR. For all pedestrian crossings, data on location characteristics were gathered as well, such as: number and width of traffic lanes, width and length of the pedestrian crossing, existence of a pedestrian refuge, vehicle direction, and the influence of parked vehicles and bicycle traffic, together with information on the measures applied at the locations, such as speed limit, traffic calming, and whether the pedestrian crossing is in a school zone. The data gathered at 32 locations were used to create a database as the basis for designing a mathematical model for MYR, while the other 6 locations were used to test (validate) the model. For processing of the videos and data analysis the following programs were used: KM Player 3. All gathered location characteristics, twenty in total, were defined as starting values for the design of the mathematical model, i.e. as independent variables (x1, ..., x20), and the value of MYR (My) was defined as the dependent variable (y) (Tab. 2). Of the 20 independent variables, 11 are binary variables that take the value 1 or 0, depending on whether a location has a certain characteristic or whether a certain statement is true. Since the research was done in two countries, the Republic of Serbia and Bosnia and Herzegovina, the variable x1 (Country) was introduced to determine whether there are statistically significant differences in the measured values of MYR between the countries. This restricts the application of the model in other countries, such as the Czech Republic, Poland, Hungary and Slovakia, unless further research at more locations determines whether the measured values of MYR differ. Similarly, the variable x2 (Total resident population) was defined as a binary variable (the number of residents is less or more than 100 000). In further research, which would include more towns, the precise number of residents might be used as an independent variable.
The first phase in the development of the MYR model applied statistical regression techniques to find the equation that best describes the relationship between the dependent and independent variables, i.e. the equation that minimizes the difference between real (empirical) and model values. Multiple linear regression was used to design the MYR model. This approach is based on the assumption that there is a linear relationship between the dependent variable (y) and one or more independent variables (x1, x2, ..., xn) that can be represented as:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon$$

where $\beta_0$ is the constant term, $\beta_1, \dots, \beta_n$ are the regression coefficients and $\varepsilon$ is the error term.
Several measures were used for the statistical evaluation of the regression model (t-test, correlation coefficient (r), coefficient of determination (R2), standard error (S), etc.), and the overall assessment of model validity was based on the results of the individual tests together with a check of the logical connections between the variables. Namely, when applying regression analysis, logical connections among the variables are extremely important, because without them even a statistically correct model does not necessarily give good results in further application.
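As an illustration of the multiple linear regression described above, the following sketch fits a linear model by ordinary least squares with NumPy and reports the coefficient of determination and the standard error of the regression. It is not the Minitab procedure used in the paper; the function name and the synthetic data are assumptions made only for this example.

```python
import numpy as np


def fit_linear_model(X, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + ... + bn*xn."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # intercept column first
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r_squared = 1.0 - ss_res / ss_tot
    dof = len(y) - design.shape[1]                  # observations minus parameters
    std_error = (ss_res / dof) ** 0.5               # S, standard error of the regression
    return coeffs, r_squared, std_error


# Synthetic data (assumed, not from the study): 32 locations, 3 predictors.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(32, 3))
y = 0.5 + 0.2 * X[:, 0] - 0.3 * X[:, 1] - 0.1 * X[:, 2] + rng.normal(0, 0.02, 32)

coeffs, r2, s = fit_linear_model(X, y)
print("coefficients:", np.round(coeffs, 3))
print(f"R2 = {100 * r2:.2f} %, S = {s:.4f}")
```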
After applying the statistical methods in the Minitab 17.0 program package and varying several models, the best model was chosen according to the criteria of statistical and logical evaluation, in accordance with the initial assumptions. The model was then tested with the data gathered at the 6 remaining locations, where the values obtained with the model (Mymod) were compared with the actual values of My recorded at the pedestrian crossings in real traffic conditions.

Selection of variables for model design and analysis results
Based on the research results, a correlation matrix was created that contains the correlation coefficients (r) for all combinations of variables at the 0,95 confidence level. Interpreting the correlation matrix is important for determining the connections among the data, as well as for selecting the variables to be included in the model. Different authors interpret the value of the correlation coefficient, i.e. the strength of the link between parameters, in different ways.
According to Petz [14], for a rough approximation of the level of connection between two variables we can use the following classification:
- r from 0,00 to ±0,20 means no or remote connection
- r from ±0,20 to ±0,40 means slight connection
- r from ±0,40 to ±0,70 means real significant connection
- r from ±0,70 to ±1,00 means high or very high connection.
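The classification above can be expressed as a small helper, sketched here in Python. Because the quoted ranges share their end points, assigning each boundary value to the stronger class is an assumption of this sketch, and the sample observations are hypothetical.

```python
import numpy as np


def petz_class(r: float) -> str:
    """Map a correlation coefficient to the verbal classes quoted from Petz [14]."""
    strength = abs(r)
    if strength > 1.0 + 1e-9:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if strength < 0.20:
        return "no or remote connection"
    if strength < 0.40:      # boundary 0,20 assigned to this class (assumption)
        return "slight connection"
    if strength < 0.70:
        return "real significant connection"
    return "high or very high connection"


# Hypothetical observations (not data from the study).
pedestrian_flow = np.array([120, 340, 560, 210, 800, 450])
measured_my = np.array([0.21, 0.35, 0.48, 0.30, 0.62, 0.40])

r = np.corrcoef(pedestrian_flow, measured_my)[0, 1]
print(f"r = {r:+.2f}: {petz_class(r)}")
```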
After analyzing the calculated correlation matrix, the independent variables with the strongest influence on the dependent variable My (Tab. 3) were selected: x1 (country), x9 (moving direction), x16 (pedestrian flow rate), x18 (number of vehicles/number of pedestrians), x19 (mode share of buses) and x20 (mode share of freight vehicles). Besides these, further analysis of the correlation matrix selected three other variables for inclusion in the model design. Variable x17 (vehicular flow rate) was selected as a potential variable in the model because of its relatively low p-value (p<0,1). None of the variables from the group of geometrical characteristics of the pedestrian crossing proved to be statistically significant in its influence on the dependent variable. However, earlier research on the development and application of methods for increasing MYR found that the most influential factors were the application of certain technical measures and pedestrian crossing geometry, which directly influenced the traffic pattern. Assuming that geometrical and technical characteristics still influence MYR, we chose from these variables the ones that were most influential in the analysis: x15 (length of pedestrian crossing) and x8 (pedestrian refuge island). In this way, nine of the twenty independent variables were chosen and used for model design; the part of the correlation matrix covering these variables is shown in Tab. 3.
The correlation matrix also indicates the direction of the influence of each independent variable on the value of the dependent variable, so the following rules were noted:
- Of all the selected independent variables, x16 (pedestrian flow rate) and x8 (pedestrian refuge island) have a positive correlation coefficient in relation to My. That means that as the value of x16 increases, i.e. as the pedestrian flow rate increases, or when a pedestrian refuge exists at a location (x8 = 1), the value of MYR also increases.
- All other independent variables have a negative correlation, i.e. MYR decreases as their value increases. That refers in the first place to the variables x15 (length of pedestrian crossing), x17 (vehicular flow rate), x18 (number of vehicles/number of pedestrians), x19 (mode share of buses) and x20 (mode share of freight vehicles), because all of them are continuous variables that can take any value. The independent variables x1 (country) and x9 (moving direction) are discrete variables that can have the value 0 or 1. That means, for example, that two-way movement of vehicles at the pedestrian crossing location (x9 = 1) decreases MYR compared with the case when vehicles move in one direction only (x9 = 0).
Model design was done by applying regression analysis with the 'Best subsets' and 'Stepwise' methods, and a version of the model that included all the selected variables was also created (full model). The 'Best subsets' method is an automatic procedure that identifies and selects the best regression models with a chosen number of variables, while the 'Stepwise' method, i.e. regression analysis in steps, selectively chooses the independent variables that significantly explain the dependent variable.
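The idea behind a best-subsets search can be sketched as an exhaustive comparison of all variable combinations of a given size. The sketch below ranks the candidates by R2, which is an assumption about the selection criterion and not a reproduction of the Minitab procedure; the function names and the synthetic data are likewise hypothetical.

```python
from itertools import combinations

import numpy as np


def r_squared(X, y):
    """Coefficient of determination of a least squares fit with intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs
    return 1.0 - float(residuals @ residuals) / float(((y - y.mean()) ** 2).sum())


def best_subset(X, y, size):
    """Try every combination of `size` columns and keep the one with the highest R2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    best_cols, best_r2 = None, -np.inf
    for cols in combinations(range(X.shape[1]), size):
        r2 = r_squared(X[:, list(cols)], y)
        if r2 > best_r2:
            best_cols, best_r2 = cols, r2
    return best_cols, best_r2


# Synthetic data (assumed): 32 locations, 9 candidate variables,
# of which only columns 2, 5 and 7 actually drive the response.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(32, 9))
y = 0.5 + 0.3 * X[:, 2] - 0.4 * X[:, 5] - 0.2 * X[:, 7] + rng.normal(0, 0.01, 32)

for size in (3, 4, 5):
    cols, r2 = best_subset(X, y, size)
    print(size, cols, f"R2 = {100 * r2:.2f} %")
```

With nine candidate variables the exhaustive search stays small (at most 126 fits per subset size), so no pruning is needed in this setting.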
The full model (Version 1) was created with all nine variables that were selected for model design. The 'Best subsets' method was used for Versions 2, 3, 4 and 5, with three, four, five and six independent variables, respectively. In Version 6 the 'Stepwise' method was used, and in seven steps it produced a model that separated five independent variables important for the model from the initial nine.
Tab. 4 shows the most important statistical values of all model versions, as well as the independent variables that were included in the procedure of model design in accordance with the applied statistical methods.The dependent variable in all versions was My.
Even though the model in Version 1 has an extremely high coefficient of determination R2 (89,78 %) and a low p-value (p<0,001), which is to be expected for a model that includes all variables, a certain number of variables did not show statistical significance in the created model.
In Version 2 and Version 3 the models include variables that represent the characteristics of the traffic flow structure and of pedestrians, but no variable that represents the geometrical-technical characteristics of a location, which previous research found to be extremely important for MYR. Version 4, besides all the variables from the previous versions obtained through the 'Best subsets' method, also includes the variable that describes the movement direction of vehicles at a pedestrian crossing. Even though Version 4 satisfies all the criteria of statistical evaluation and the criteria for selecting independent variables in a model, other versions were created as well. Version 5 additionally included an independent variable that did not satisfy the conditions of statistical evaluation (p-value and t-test). Version 6 was obtained through the 'Stepwise' method in seven steps; Tab. 5 shows the obtained model, the most important results of the regression analysis, and the order in which individual independent variables were introduced and removed by the regression method. In the last step (7) the best model was created, which contains five independent variables, and the results of the 'Stepwise' analysis completely match Version 4, because the same variables were chosen: x9 (moving direction), x16 (pedestrian flow rate), x17 (vehicular flow rate), x19 (mode share of buses) and x20 (mode share of freight vehicles). Creating the model in Version 6 by stepwise regression thus confirmed the model created in Version 4 by the 'Best subsets' method as the best model for modelling MYR. Namely, the statistical results for Version 4 (and for Version 6, which gave the same selection of variables) show that the model, with a very high coefficient of determination (R2 = 89,56 %), represents the empirical data extremely well, with the lowest standard error of all the candidate models (S = 0,0569512). Apart from the statistical evaluation, the logical connections among the variables in the model were also analyzed, as well as the selection of variables used in the selected model, which is in accordance with the starting assumptions of the paper.
The formal structure of the chosen model is a linear function of the five selected variables:

$$M_y = \beta_0 + \beta_9 x_9 + \beta_{16} x_{16} + \beta_{17} x_{17} + \beta_{19} x_{19} + \beta_{20} x_{20}$$

where $\beta_0$ is the constant term and $\beta_9, \dots, \beta_{20}$ are the regression coefficients of the obtained model (Tab. 5).
Diagrams were created for the chosen model (Fig. 2). They show MYR as a function of one of the independent variables in the model (assuming that the other variables are constant). After the model was selected, the regression equation was applied to calculate MYR at the 32 locations that were the data sources for model design. In that way the values obtained through the model were compared with the real values determined at the locations. The mean absolute error (MAE) for all locations is 0,044. The mean absolute percentage error (MAPE), i.e. the relative error expressed as a percentage, is 12,6 %, which is a low deviation and acceptable for traffic flow parameters modelled at micro locations [15,16].
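The two error measures used to compare modelled and measured values can be computed as in the following sketch. The function names and the sample values are hypothetical and are not the values measured in the study.

```python
def mean_absolute_error(measured, modelled):
    """Average absolute difference between measured and modelled values."""
    return sum(abs(m - p) for m, p in zip(measured, modelled)) / len(measured)


def mean_absolute_percentage_error(measured, modelled):
    """Average absolute difference relative to the measured value, in percent."""
    return 100.0 * sum(abs(m - p) / abs(m) for m, p in zip(measured, modelled)) / len(measured)


# Hypothetical measured and modelled MYR values for three locations.
measured = [0.40, 0.25, 0.50]
modelled = [0.44, 0.20, 0.50]

print(f"MAE  = {mean_absolute_error(measured, modelled):.3f}")
print(f"MAPE = {mean_absolute_percentage_error(measured, modelled):.1f} %")
```

Because MAPE divides by the measured value, it is undefined for a location where the measured MYR is zero; such locations would need separate treatment.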

Model testing
After the model was selected and analysed, it was also tested at six locations. The locations are in two countries (Republic of Serbia and Bosnia and Herzegovina) and in four cities (Novi Sad, Subotica, Zrenjanin, Bijeljina). The criterion for choosing the test locations was that they differ in traffic flow structure, number of pedestrians, geometrical characteristics of the pedestrian crossing, and the type of traffic and regulations. Tab. 6 shows the names of the locations where the testing was done, as well as the values of the independent variables that were used to calculate the dependent variable My.
Using the selected model version and the values of the independent variables obtained at the six locations, the modelled values Mymod (Tab. 7) were calculated and compared with the real, i.e. measured, values of My at the locations. The mean absolute error for all locations is 0,045, and the mean absolute percentage error is 8,65 %.
The test results at 6 different locations, with different vehicular and pedestrian flow characteristics, showed a relatively small error between the measured values of MYR and those calculated with the model proposed in this paper.
Knowing the exact value of MYR for pedestrian crossings that are analyzed for LOS is necessary, since its value directly influences the assessed traffic quality in a street network. Until now, the values of this parameter in the calculation of pedestrian delay were adopted according to recommendations from earlier research, conducted mostly in the USA. The research conducted for this paper served to define a mathematical model that can be applied in determining LOS for pedestrian crossings. The model was created from the results of local measurements at over 30 locations and a sample of 12 786 pedestrians and 29 253 vehicles. Considering the statistical analyses of the selected model, as well as the test results at six locations, it can be concluded that the selected model matches the empirical data, i.e. the directly measured MYR values, very well. At almost 90 % of the 32 locations that were used to collect the data and design the model, the percentage error between the modelled and real values is lower than 20 %. At the other six locations, which were used to test the created mathematical model, the percentage error was also lower than 20 %, which is in accordance with recommendations for modelled values of traffic flow parameters. Accordingly, the model could be used in all situations when the MYR value is not known for the locations where LOS is being determined, i.e. the existing method for the calculation of pedestrian delay at pedestrian crossings could be supplemented by the created model for the calculation of MYR. In that way the influences and specific qualities of the local environment and traffic flow characteristics would be taken into account, which was not the case until now; that would contribute to a more precise determination of LOS at pedestrian crossings.
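The share of locations whose percentage error stays below the 20 % limit can be checked as in the sketch below. The function name, the default tolerance argument and the sample values are assumptions made for illustration only.

```python
def share_within_tolerance(measured, modelled, tolerance_pct=20.0):
    """Fraction of locations whose absolute percentage error is below the tolerance."""
    errors = [100.0 * abs(m - p) / abs(m) for m, p in zip(measured, modelled)]
    return sum(e < tolerance_pct for e in errors) / len(errors)


# Hypothetical measured and modelled MYR values for four locations.
measured = [0.40, 0.25, 0.50, 0.10]
modelled = [0.44, 0.19, 0.50, 0.13]

share = share_within_tolerance(measured, modelled)
print(f"{100 * share:.0f} % of locations are within the 20 % limit")
```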
The research could also be continued in the field of nonlinear and dynamic modelling; previous research in this field could be used for choosing potential variables to be included in the model [2,4,9]. Further research might also include the application of the class of positive systems [17] for creating macroscopic traffic models for large-scale dynamic road networks.

Figure 2 Relation of My and independent variables included in the model

Table 1
Effect of pedestrian crossing treatment on motorist yield rates

Table 2
List of initial independent variables in the analysis of the dependent variable My

Table 3
Part of correlation matrix with independent variables selected for model design

Table 5
Basic output results of 'Stepwise' analysis for Version 6

Table 6
Locations and values of independent variables necessary for model testing

Table 7
Testing the model: Measured values (My) and modeled values (Mymod)