Many governments worldwide officially designate hazardous road locations (HRLs) as “black spots” and devote dedicated funding to address them. The AusLink Black Spot Projects of the Australian Government are among the most elaborate in methodology and well-funded by the government (Australian Government 2008, 2009). Similar schemes like “blacksites” and the Priority Investigation Locations (PILs) exist in Hong Kong, New York, and many other administrations (Loo 2009).
However, what are HRLs? The fundamental concept is that these locations have abnormally high incidences of traffic collisions involving death and injury compared with other locations. Taken together, HRLs constitute a small portion of the total network in terms of length but account for a much higher share of the traffic injury burden.
In the case of China, it was estimated that HRLs accounted for about 15%–25% of the road network but 40%–69% of all traffic collisions (Guo et al. 2003).
In light of the above, the identification, analysis, and treatment of HRLs are considered one of the most effective approaches to improving road safety (Deacon et al. 1975; Transportation Research Board 1982; Hoque and Andreassen 1986; Nicholson 1989; Ogden 1996; Elvik 1997). In this chapter, we shall focus on the identification process. The investigation/analysis and treatment of HRLs are dealt with in Chapters 10 through 16. The key methodological aspects of the identification process, in turn, include defining the sites, setting the criteria of “hazardous,” considering exposure and other factors as appropriate, and ranking the HRLs.
9.2.1 On the Definition of Sites
Early studies of the identification of HRLs did not follow a spatial algorithm. Typical examples take all junctions, alone or together with their nearby roads, as the unit of analysis. For instance, in the blacksite definition of the Hong Kong SAR Government until 2011, road junctions together with the 70 m of roads nearby were considered as the unit of analysis for identifying “blacksites” (Loo 2009). While road junctions can be plotted on maps and have spatial coordinates, junctions of the road network are essentially treated as a nonspatial list in the entire process of HRL identification.
Scientific studies of defining sites generally follow the link-attribute and event-based approaches outlined in the previous chapter. With reference to the link-attribute approach, sites may be considered by dividing the whole road network into BSUs.
Then, each BSU is either taken independently or considered together with its contiguous BSUs as “sites.” For the former, subsequent HRLs identified are often called hot spots or black spots. For the latter, the HRLs identified are called hot zones or black zones (Thomas 1996; Flahaut et al. 2003; Geurts and Wets 2003; Brijs et al. 2006; Loo 2009; Yao et al. 2015). A noteworthy point is that the distinction between hot spots and hot zones in spatial analysis is not based on the length of HRLs. The difference lies in the methodology. Using the link-attribute approach to illustrate, a hot zone consists of two or more contiguous hazardous BSUs. If each BSU is 100 m long, a hot zone will have a minimum length of 200 m. Depending on the spatial collision pattern, there is no theoretical maximum number of BSUs in a hot zone. A hot spot, however, always consists of one BSU only. Its length depends on the length of the standard BSU, which may be much longer (say 500 m or 1 km). Moreover, some hot spots may be clustered or contiguous, but network contiguity is not considered in the process of identification. Similarly, using the event-based approach, hot spots are identified without explicit consideration of network contiguity among reference points. The opposite is true for hot zones. Essentially, a hot zone is only found when there are spatially interdependent HRLs at contiguous reference points. Road segments are considered as spatially interdependent objects in the hot zone methodology.
This definition is in contrast to the more traditional and nonspatial analysis of using hot spots to refer to short road segments (0.15 mile/0.24 km for intersection spots, and 0.3 mile/0.48 km for nonintersection spots) and sections or hot zones to refer to longer road segments (typically 3 miles/4.8 km) (Deacon et al. 1975).
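The distinction between hot spots and hot zones under the link-attribute approach can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a single unbranched route whose BSUs are listed in order, so that neighbouring list entries are contiguous; a real road network branches at junctions and would need an explicit contiguity structure. The function name and the input flags are hypothetical.

```python
from typing import List, Tuple


def find_hot_spots_and_zones(
    hazardous: List[bool],
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Return hot spots and hot zones along one unbranched route.

    hazardous[k] is True when BSU k meets the chosen criterion of
    "hazardous". BSU k and BSU k + 1 are assumed to be contiguous.
    """
    # Hot spots: every hazardous BSU, identified without regard to neighbours.
    hot_spots = [k for k, flag in enumerate(hazardous) if flag]

    # Hot zones: runs of two or more contiguous hazardous BSUs.
    hot_zones = []
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last BSU.
    for k, flag in enumerate(list(hazardous) + [False]):
        if flag and start is None:
            start = k
        elif not flag and start is not None:
            if k - start >= 2:
                hot_zones.append((start, k - 1))
            start = None
    return hot_spots, hot_zones


flags = [False, True, True, True, False, True, False, True, True]
spots, zones = find_hot_spots_and_zones(flags)
print(spots)  # [1, 2, 3, 5, 7, 8]
print(zones)  # [(1, 3), (7, 8)]
```

With 100 m BSUs, the zone (1, 3) is 300 m long and the zone (7, 8) has the minimum length of 200 m, while BSU 5 is a hot spot that belongs to no hot zone.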
Cluster Identifications in Networks 163
Theoretically, sites may also be defined as areas. However, areas are essentially planar space, including all land occupied by buildings or open space with no or little vehicular traffic. The smaller the spatial scale, the larger the intervening space.
Moreover, the smaller the spatial scale, the more heterogeneous the land uses and other socioeconomic environment, and the concentration of traffic collisions may not be attributable to one or several common causes applicable to that whole area. Hence, though the identification of HRLs can be conducted at the regional zonal level (e.g., Erdogan 2009), the most fruitful analysis is always at the local network level.
9.2.2 Setting the Criteria
What is hazardous? Following Elvik (2007), there are three common groups of definitions, that is, simple numerical, statistical, and model based.
9.2.2.1 Magic Figures
Simple numerical definitions are overwhelmingly popular among road safety administrations worldwide (Elvik 2006). In Norway, any site with a maximum length of 100 m where at least four injury collisions have been recorded during the last 5 years is considered an HRL (Statens vegvesen 2006). A similar conclusion was reached by Elvik (2008) after systematically surveying how HRLs were identified in eight European countries: Austria, Denmark, Flanders, Germany, Hungary, Norway, Portugal, and Switzerland. In Kentucky, United States, HRLs were defined as road segments of 0.1 mile (0.16 km) having three or more accidents in a 12-month period (Deacon et al. 1975). A similar method was adopted in the state of Arizona to identify HRLs by the Arizona Local Government Safety Project (ALGSP) Model (Carey 2001).
Using numerical definitions, “hazardous” is defined by collision frequency or count, sometimes taking into account the injury severity of the traffic collision victims, rather than by the collision potential based on risk and exposure. The most common practice is to compare the observed collision frequency (Oi) with a predetermined critical number (CN), which may be the observed average count at comparison sites. Location i is identified as unsafe if Oi exceeds the magic figure of CN. To illustrate, the magic figure approach identifies a road location i as an HRL in the following manner:
$$
\mathrm{HRL}_i =
\begin{cases}
1, & \text{if } O_i > CN \\
0, & \text{otherwise}
\end{cases}
\tag{9.1}
$$
The value of CN, in turn, depends a lot on the absolute “tolerable” levels of traffic collisions in the society and the resources available to the road safety administrations, because administrations have to have sufficient resources for tackling the HRLs identified.
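The magic figure rule of Equation 9.1 amounts to a single comparison per location, as the following minimal Python sketch shows. The function name and the collision counts are hypothetical, and the value of CN is chosen only for illustration.

```python
from typing import List, Sequence


def magic_figure_hrl(observed: Sequence[float], critical_number: float) -> List[int]:
    """Flag location i as an HRL (1) when O_i > CN, else 0 (Equation 9.1)."""
    return [1 if o > critical_number else 0 for o in observed]


# Hypothetical collision counts for five locations over the study period.
counts = [0, 2, 5, 4, 7]
flags = magic_figure_hrl(counts, critical_number=4)
print(flags)  # [0, 0, 1, 0, 1]
```

Because Equation 9.1 uses a strict inequality, a rule worded as “at least four collisions” corresponds to CN = 3 when counts are integers.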
9.2.2.2 Statistical Definitions
Statistical definitions recognize that traffic collisions are random events. In the 1970s, Hakkert and Mahalel (1978) proposed that hot spots should be defined as those sites whose collision frequency is significantly higher than expected at some prescribed level of significance. In other words, CN is no longer a magic figure; it involves generating the descriptive statistics of the empirical collision pattern, specifying the statistical significance level, and calculating the confidence interval at the specified significance level accordingly. Moreover, the rate or quality control method can be applied, with collisions per unit of some exposure measure assessed by statistical methods (Deacon et al. 1975). Often, the observed collision frequency Oi is first divided by some exposure factor, say the traffic volume, AADTi, to calculate the observed collision rate (Ri = Oi/AADTi), before comparing it with a predetermined critical collision rate (CR) for identifying the HRLs. CR is defined statistically and refers to the “normal level of safety” expected for a road location (Elvik 2008).
The statistical approach will identify road location i with Oi (or Ri) outside the confidence interval at a specified significance level. The essential idea is to ascertain whether an HRL’s poor collision record is or is not due to chance. The null hypothesis (H0) is therefore that the difference Di = Oi − Ei between the observed and expected collision frequencies is due to chance. At the 95% confidence level, H0 is rejected when Oi lies outside the confidence interval of x̄ ± 1.96 SEx̄, where x̄ is the mean of all Oi and SEx̄ is the standard error of the mean (Elvik 1988). It is worthwhile to highlight that road safety researchers are not so much interested in cases where Di is negative, that is, Oi being lower than Ei, or in testing whether the good road safety record of i is due to chance. However, when Di is positive, one is interested to know whether the high record of i is or is not due to chance.
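Hence, a one-tail test is appropriate. When H0 is rejected at the 5% significance level (one-tailed), an HRL is identified:

$$
\mathrm{HRL}_i =
\begin{cases}
1, & \text{if } O_i > \bar{x} + 1.645\,SE_{\bar{x}} \\
0, & \text{otherwise}
\end{cases}
\tag{9.2}
$$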
Typically, x̄ is the global mean of all Oi (count data) and is the same for all i within the road network. SEx̄ is the standard error of the mean, that is, the standard deviation of all Oi divided by the square root of the sample size N, where N is the total number of BSUs of that road network. When the rate-quality method is used, x̄ is the global mean of all Ri (ratio data), and SEx̄ is the standard error of the rate data, that is, the standard deviation of all Ri divided by the square root of N. Conceptually, the statistical approach shifts the rationale of identifying HRLs from screening for sites with high collision records to screening for sites with collision intensity records that are high at a specified level of statistical significance.
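The one-tail screening rule can be sketched in a few lines of Python. The sketch below rests on stated assumptions rather than on a prescribed implementation: the critical value z is left as a parameter (1.645 is the conventional one-tailed value of the standard normal distribution at the 5% level), the sample standard deviation with N − 1 in the denominator is used, and the function name and data are hypothetical.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def statistical_hrl(values: Sequence[float], z: float = 1.645) -> Tuple[float, List[int]]:
    """Flag locations whose count (or rate) exceeds mean + z * SE.

    values holds O_i (counts) or R_i (rates) for all N BSUs.
    The standard error is the standard deviation divided by sqrt(N).
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation (N - 1 in the denominator).
    se = statistics.stdev(values) / math.sqrt(n)
    threshold = mean + z * se
    flags = [1 if v > threshold else 0 for v in values]
    return threshold, flags


# Hypothetical collision counts O_i for eight BSUs.
counts = [1, 2, 0, 3, 1, 9, 2, 1]
threshold, flags = statistical_hrl(counts)
print(round(threshold, 3))  # 4.018
print(flags)                # [0, 0, 0, 0, 0, 1, 0, 0]
```

For the rate-quality variant, the same function can be applied to rates computed beforehand, for example `[o / a for o, a in zip(counts, aadt)]` for a hypothetical list `aadt` of traffic volumes.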
9.2.2.3 Model-Based Definitions
The final major group of model-based definitions defines CR based on some form of collision prediction model. Moreover, the values of CR for different road segments (CRi) in the same road network need not be the same. CRi is defined by models based on the risk levels of other sites. These other sites can be further defined as comparable sites. With model-based definitions, the major aim of identifying HRLs further changes from screening road locations for high collision frequency or rate to screening road locations for high potential of collision reduction. As a result, some of the HRL identification literature using model-based definitions prefers terms like “sites with promise” (Hauer 1996; Hauer et al. 2002).
McGuigan (1981, 1982) was among the earliest to propose the use of potential for collision reduction (PCR), defined as the difference between the observed and expected number of collisions at a site given exposure. In comparison with statistical definitions, the expected collision level is more complicated than taking Oi over some exposure factor like AADTi; it is based on a more sophisticated understanding of the many relevant factors that pose road hazards and on the presence of those factors at the specific road location i.
$$
E_i = \sum_{j=1}^{F} \left( K_j \cdot PO_{i,j} \right)
\tag{9.3}
$$

where
Ei is the expected collision frequency at the ith road segment
F is the total number of risk factors considered
Kj is the increased collision frequency per unit increase of the jth risk factor
POi,j is the level of exposure of the jth risk factor at the ith road segment
Many more risk factors (such as the junction type or the presence of a steep gradient) beyond road length and traffic volume are often considered. In situations where a local risk factor is not applicable, the risk exposure level is zero (POi,j = 0). The difference between the observed and expected collision frequency (Di) is no longer simply used in the hypothesis testing of statistical significance. With statistical definitions, Di is mainly analyzed to reject or not reject the null hypothesis that the difference is due to chance. With model-based definitions, the difference becomes a direct measure of PCR at the specific road location (PCRi):
$$
\mathrm{PCR}_i = O_i - E_i
\tag{9.4}
$$
PCRi is treated as an indicator of collision risk reduction potential and/or used in ranking HRLs (Mahalel et al. 1982; Maher and Mountain 1988).
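Equations 9.3 and 9.4 can be combined in a short Python sketch that computes Ei as a weighted sum of risk-factor exposures and then ranks locations by PCRi. All coefficients, exposure levels, and counts below are hypothetical values chosen for illustration; in practice Kj would come from a fitted collision prediction model.

```python
from typing import List, Sequence


def expected_collisions(k: Sequence[float], po_i: Sequence[float]) -> float:
    """E_i = sum over j of K_j * PO_{i,j} (Equation 9.3)."""
    return sum(kj * poij for kj, poij in zip(k, po_i))


def potential_for_collision_reduction(
    observed: Sequence[float],
    k: Sequence[float],
    po: Sequence[Sequence[float]],
) -> List[float]:
    """PCR_i = O_i - E_i for every road segment i (Equation 9.4)."""
    return [o - expected_collisions(k, po_i) for o, po_i in zip(observed, po)]


# Hypothetical coefficients K_j for F = 3 risk factors.
k = [0.5, 2.0, 1.0]
# Hypothetical exposure levels PO_{i,j}; 0.0 means the factor is not applicable.
po = [
    [4.0, 1.0, 0.0],
    [2.0, 0.0, 1.0],
    [6.0, 1.0, 1.0],
]
observed = [6, 2, 5]

pcr = potential_for_collision_reduction(observed, k, po)
print(pcr)  # [2.0, 0.0, -1.0]

# Rank segments from highest to lowest PCR.
ranking = sorted(range(len(pcr)), key=lambda i: pcr[i], reverse=True)
print(ranking)  # [0, 1, 2]
```

Segment 0 records two more collisions than its risk factors would predict and is ranked first, whereas segment 2 has the second-highest observed count but a negative PCR.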
More recently, the Empirical Bayes (EB) approach has become more popular.
It takes Ei as a variable that depends not directly on “theoretical” risk factors and exposure but on the “empirical” collision records of road locations belonging to the same relatively homogeneous road type. Roadway elements are generally classified into different types (g = 1, 2, … , G). Each roadway element type will then have its own expected collision count (Eg), which may simply be the average of all collision counts of that roadway type, or be modeled to depend on specific relevant risk factors, or be modeled to depend on previous collision records, termed prior information. Among all model-based definitions, the EB methods are considered one of the most promising and are preferred by statisticians and road safety researchers. Cheng and Washington (2005) used experimentally derived data to compare three hot spot identification methods: simple ranking, confidence interval, and EB. They considered EB to be much better but also much more complicated.
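To give a flavor of the EB idea, the following Python sketch uses one common method-of-moments form, which is an assumption introduced here and is not taken from the studies cited above. Each location's estimate is a weighted average of its own count Oi and the mean count Eg of its roadway type, with the weight on Eg set to the ratio of the type's sample mean to its sample variance whenever the counts are overdispersed. The function name, the fallback rule for types without overdispersion, and the data are all hypothetical.

```python
import statistics
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def eb_estimates(counts: Sequence[float], road_types: Sequence[Hashable]) -> List[float]:
    """Shrink each observed count toward the mean of its roadway type."""
    groups: Dict[Hashable, List[float]] = defaultdict(list)
    for c, g in zip(counts, road_types):
        groups[g].append(c)

    params: Dict[Hashable, Tuple[float, float]] = {}
    for g, vals in groups.items():
        e_g = statistics.fmean(vals)  # E_g: average count of type g
        s2 = statistics.variance(vals) if len(vals) > 1 else 0.0
        # Assumption: weight on E_g is mean / variance when variance > mean;
        # otherwise rely entirely on the type mean.
        w = e_g / s2 if s2 > e_g else 1.0
        params[g] = (e_g, w)

    return [
        params[g][1] * params[g][0] + (1.0 - params[g][1]) * c
        for c, g in zip(counts, road_types)
    ]


# Hypothetical counts for two roadway types, "A" and "B".
counts = [0, 1, 2, 9, 2, 2, 2]
types = ["A", "A", "A", "A", "B", "B", "B"]
estimates = eb_estimates(counts, types)
print([round(e, 2) for e in estimates])
```

In this example, the type A location with nine collisions receives an estimate between its own count and the type mean of three, which is the shrinkage behavior that distinguishes EB from simple ranking.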