An Assessment of Water Supply Outlook Forecasts in the Colorado River Basin

Jean C. Morrill1, Holly C. Hartmann1 and Roger C. Bales2,a
1 Department of Hydrology and Water Resources, University of Arizona, Tucson, AZ, USA
2 School of Engineering, University of California, Merced, CA, USA
a Corresponding author
10/20/2022

Abstract

A variety of forecast skill measures of interest to stakeholders were used to assess the strengths and weaknesses of seasonal water supply outlooks (WSOs) at 55 sites in the Colorado River basin, and to provide a baseline against which alternative and experimental forecast methods can be compared. These included traditional scalar measures (linear correlation, linear root-mean-square error and bias), categorical measures (false alarm rate, threat score), probabilistic measures (Brier score, rank probability score) and distribution-oriented measures (resolution, reliability and discrimination). Despite their shortcomings, the WSOs are generally an improvement over climatology. The majority of forecast points have very conservative predictions of seasonal flow, with below-average flows often overpredicted and above-average flows underpredicted. Late-season forecasts at most locations are generally better than those issued in January. There is a low false alarm rate for both low and high flows at most sites; however, these flows are not forecast nearly as often as they are observed. Moderate flows have a very high probability of detection, but are forecast more often than they occur. There is also good discrimination between high and low flows, i.e., when high flows are forecast, low flows are not observed, and vice versa. The diversity of forecast performance metrics reflects the multi-attribute nature of forecasts and ensembles.

Introduction

Seasonal water supply outlooks, or forecasts of the total volume of seasonal runoff, are routinely used by decision makers in the southwestern United States for making commitments for water deliveries, determining
industrial and agricultural water allocation, and carrying out reservoir operations. These forecasts are based primarily on statistical regression equations developed from monthly precipitation, recent snow-water equivalent, and a subset of past streamflow observations (Day, 1985). In the Colorado River Basin the National Weather Service's Colorado Basin River Forecast Center (CBRFC) and the Natural Resources Conservation Service (NRCS) jointly issue seasonal water supply outlook (WSO) forecasts of naturalized, or unimpaired, flow, i.e., the flow that would most likely occur in the absence of diversions. These forecasts were not always issued jointly (Hartmann et al., 200X?).

Currently WSOs are issued once each month from January to June. Until the mid-1990s, however, the forecasts were only issued until May. Each forecast contains: the most probable value for the forecast period, a comparison to a historical, climatological mean value (usually a 10- to 30-year mean), a reasonable maximum (usually the 10% exceedance value), and a reasonable minimum (usually the 90% exceedance value). In some locations with strongly skewed flow distributions, the comparison is to a historical median rather than the mean.

The forecast period is the period of time over which the forecasted flow is predicted to occur. It is not the same for all sites, for all years at one location, or even for all months in a single year. In the past decade, the most common forecast period has been April-July for most sites in the upper Colorado River basin and January-May for the lower Colorado, for each month in which a forecast was issued. Previously, however, many sites used April-September forecast periods, and prior to that the forecast period for the January forecast was January-September, for the February forecast it was February-September, etc.

Most of the sites at which forecasts are issued are impaired, i.e., have diversions above the forecast and gauging location. Therefore the CBRFC combines measured discharges with
historical estimates of diversion to reconstruct the unimpeded observed flow (Ref bulletins). Despite the shortcomings of this approach, it provides the best estimate against which to assess the skill of WSOs.

Forecast verification is important for assessing forecast quality and performance, improving forecasting procedures, and providing users with information helpful in applying the forecasts (Murphy and Winkler, 1987). Decision makers take account of forecast skill in using forecast information and are interested in having access to a variety of skill measures (Bales et al., 2004; Franz et al., 2003). [Any additional result about skill??]

The work reported here assesses the skill of forecasts relative to naturalized streamflow across the Colorado River basin, using a variety of methods of interest to stakeholders: traditional scalar measures (linear correlation, linear root-mean-square error and bias), categorical measures (false alarm rate, threat score), probabilistic measures (Brier score, rank probability score) and distributive measures (resolution, reliability and discrimination). The purpose was to assess the strengths and weaknesses of the current water supply forecasts, and to provide a baseline against which alternative and experimental forecast methods can be compared.

1.1 Data and Methods

Data

WSO records from 136 forecast points on 84 water bodies were assembled, including some forecast locations that are no longer active. NEED TO APPEND DATA. Reconstructed flows were made available by the CBRFC and NOAA (T. Tolsdorf and S. Shumate, personal communication); however, data were not available for all forecast locations. Many current forecast points were established in 1993, and so do not yet have good long-term records. For this study we chose 54 sites having at least 10 years of both forecast and observed data (Figure 1). Another 33 sites have fewer than 10 years of data, but most are still active, and so should be more useful for statistical analysis in a few
years' time. The earliest water supply forecasts used in this study were issued in 1953 at 22 of the 54 locations. These 54 forecasting sites were divided into smaller basins (or in the case of Lake Powell, a single location), compatible with the divisions used by CBRFC in the tables and graphs accompanying the WSO forecasts (Table 1). The maximum number of years in the combined forecast and observation record was 48 (1953–2000), the minimum used was 21, and the median and average number of years were 46 and 41.5, respectively.

Each forecast includes the most likely value, a reasonable maximum (usually the 10% exceedance value), and a reasonable minimum (usually the 90% exceedance value). These were used to calculate the 30% and 70% exceedance values associated with each forecast. Five forecast flow categories were calculated for each forecast, based on exceedance probability: 0-10%, >10-30%, >30-70%, >70-90%, and >90%. The probability of the flow falling within each of these categories is 0.1, 0.2, 0.4, 0.2 and 0.1, respectively.

1.2 Summary and correlation measures

Summary measures are scalar measures of accuracy for forecasts of continuous variables, and include the mean absolute error (MAE) and mean square error (MSE):

$$ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| f_i - o_i \right| \tag{1} $$

$$ \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( f_i - o_i \right)^2 \tag{2} $$

where, for a given location, $f_i$ is the forecast seasonal runoff for period $i$ and $o_i$ the naturalized observed flow for the same period. Since MSE is computed by squaring the forecast errors, it is more sensitive to larger errors than is MAE. It increases from zero for perfect forecasts to large positive values as the discrepancies between the forecasts and observations become larger. RMSE is the square root of the MSE.

Often an accuracy measure is not meaningful by itself, and is compared to a reference value, usually based on the historical record. In order for a forecast technique to be worthwhile, it must generate better results than simply using the cumulative distribution of the
climatological record, i.e., assuming that the most likely flow next year is the average flow in the climatological record. In order to judge this, skill scores are calculated for the accuracy measures:

$$ SS_A = \frac{A - A_{ref}}{A_{perf} - A_{ref}} \times 100\% \tag{3} $$

where $SS_A$ is a generic skill score for the accuracy measure $A$, $A_{ref}$ is the accuracy of a reference set of values (e.g., the climatological record) and $A_{perf}$ is the value of $A$ given by perfect forecasts. If $A = A_{perf}$, $SS_A$ will be at its maximum, 100%. If $A = A_{ref}$, then $SS_A = 0\%$, indicating no improvement over the reference forecast. If $SS_A < 0$, the forecasts are less accurate than the reference. [...] If $NSC > 0$, the forecast is a better predictor of flow than is the observed mean, but if $NSC < 0$, the observed mean is a better predictor and there is a lack of correlation between the forecast and observed values.

Discussion of correlation is often combined with that of the percent bias, which measures the difference between the average forecasted and observed values (Wilks, 1995):

$$ P_{bias} = \frac{\bar{f} - \bar{o}}{\bar{o}} \times 100\% \tag{6} $$

which can assume positive (overforecasting), negative (underforecasting) or zero values.

1.3 Categorical Measures

A categorical forecast states that one and only one of a set of possible events will occur. Contingency tables are used to display the possible combinations of forecast and event pairs, and the count of each pair. An event (e.g., seasonal flow in the upper 30% of the observed distribution) that is successfully forecast (both forecast and observed) occurs $a$ times. An event that is forecast but not observed occurs $b$ times, and an event that is observed but not forecast occurs $c$ times. An event that is not forecast and not observed for the same period occurs $d$ times. The total number of forecasts in the data set is $n = a + b + c + d$. A perfectly accurate binary (2×2) categorical forecast will have $b = c = 0$ and $a + d = n$. However, few forecasts are perfect. Several measures can be used to examine the accuracy of the forecast, including hit rate, threat score, probability of detection and false alarm rate (Wilks, 1995).

The hit rate is the
proportion correct:

$$ HR = \frac{a + d}{n} \tag{7} $$

and ranges from one (perfect) to zero (worst).

The threat score, also known as the critical success index, is the proportion of correctly forecast events out of the total number of times the event was either forecast or observed, and does not take into account the accurate non-occurrence of events:

$$ TS = \frac{a}{a + b + c} \tag{8} $$

It also ranges from one (perfect) to zero (worst).

The probability of detection is the ratio of the number of times the event was correctly forecast to the number of times it actually occurred, or the probability of the forecast given the observation:

$$ POD = \frac{a}{a + c} \tag{9} $$

A perfect POD is 1 and the worst is 0.

A related statistic is the false alarm rate, FAR, which is the fraction of forecasted events that do not happen. In terms of conditional probability, it is the probability of not observing an event given the forecast:

$$ FAR = \frac{b}{a + b} \tag{10} $$

Unlike the other categorical measures described, the FAR has a negative orientation, with the best possible FAR being 0 and the worst being 1.

The bias of the categorical forecasts compares the average forecast with the average observation, and is represented by the ratio of "yes" forecasts to "yes" observations:

$$ bias = \frac{a + b}{a + c} \tag{11} $$

An unbiased forecast has a value of 1, showing that the event occurred the same number of times that it was forecast. If the bias is greater than 1, the event is overforecast (forecast more often than observed); if the bias is less than 1, the event is underforecast. Since the bias does not actually show anything about whether the forecasts matched the observations, it is not an accuracy measure.

1.4 Probabilistic Measures

Whereas categorical forecasts contain no expression of uncertainty, probabilistic forecasts do. Linear error in probability space (LEPS) assesses forecast errors with respect to their difference in probability, rather than their overall magnitude:

$$ LEPS_i = \left| F_c(f_i) - F_c(o_i) \right| \tag{12} $$

$F_c(o)$ refers to the climatological cumulative distribution function of the
observations, and $F_c(f)$ to the corresponding distribution for the forecasts. The corresponding skill score is:

$$ SS_{LEPS} = 1 - \frac{\sum_{i=1}^{n} \left| F_c(f_i) - F_c(o_i) \right|}{\sum_{i=1}^{n} \left| 0.5 - F_c(o_i) \right|} \tag{13} $$

using the climatological median as the reference forecast.

The Brier score is analogous to MSE:

$$ BS = \frac{1}{n}\sum_{i=1}^{n} \left( f_i - o_i \right)^2 \tag{14} $$

However, it compares the probability associated with a forecast event with whether or not that event occurred, instead of comparing the actual forecast and observation. Therefore $f_i$ ranges from 0 to 1, $o_i = 1$ if the event occurred or $o_i = 0$ if the event did not occur, and $BS = 0$ for perfect forecasts. The corresponding skill score is:

$$ SS_{BS} = 1 - \frac{BS}{BS_{ref}} \tag{15} $$

where the reference forecast is generally the climatological relative frequency.

The ranked probability score (RPS) is essentially an extension of the Brier score to multi-event situations. Instead of just looking at the probability associated with one event or condition, it looks simultaneously at the cumulative probability of multiple events occurring. RPS uses the forecast cumulative probability:

$$ F_m = \sum_{j=1}^{m} f_j, \quad m = 1, \ldots, J \tag{16} $$

where $f_j$ is the forecast probability at each of the $J$ non-exceedance categories. In this paper, $f_j$ = {0.1, 0.2, 0.4, 0.2, 0.1} for the five non-exceedance intervals {0-10%, >10-30%, >30-70%, >70-90%, and >90%}, so $F_m$ = {0.1, 0.3, 0.7, 0.9, 1.0} and $J = 5$. The observation occurs in only one of the flow categories, which is given a value of 1; all the others are given a value of zero:

$$ O_m = \sum_{j=1}^{m} o_j, \quad m = 1, \ldots, J \tag{17} $$

The RPS for a single forecast/observation pair is calculated from:

$$ RPS_i = \sum_{m=1}^{J} \left( F_m - O_m \right)^2 \tag{18} $$

and the average RPS over a number of forecasts is calculated from:

$$ \overline{RPS} = \frac{1}{n}\sum_{i=1}^{n} RPS_i \tag{19} $$

A perfect forecast will assign all the probability to the category in which the event occurs, which will result in RPS = 0. The RPS has a lower bound of 0 and an upper bound of $J - 1$. The RPS rewards forecasts in which the observation falls closer to the highest-probability category.
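As a worked illustration of Eqs. (16) to (19), the following is a minimal Python sketch, not the code used in the study. The function names `rps_single` and `rps_mean`, the constant `CATEGORY_PROBS`, and the use of a zero-based category index for the observation are assumptions made here for illustration only.

```python
from itertools import accumulate

# Forecast probabilities f_j for the five flow categories given in the text.
CATEGORY_PROBS = [0.1, 0.2, 0.4, 0.2, 0.1]


def rps_single(forecast_probs, observed_category):
    """RPS for one forecast/observation pair, following Eq. (18).

    forecast_probs: probability assigned to each of the J categories.
    observed_category: zero-based index of the category that occurred
    (an illustrative convention, not taken from the paper).
    """
    n_cat = len(forecast_probs)
    if not 0 <= observed_category < n_cat:
        raise ValueError("observed_category is out of range")
    # F_m: cumulative forecast probability, Eq. (16).
    cum_forecast = list(accumulate(forecast_probs))
    # O_m: 0 below the observed category and 1 from it onward, Eq. (17).
    cum_observed = [1.0 if m >= observed_category else 0.0 for m in range(n_cat)]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))


def rps_mean(forecasts, observed_categories):
    """Average RPS over a set of forecast/observation pairs, Eq. (19)."""
    scores = [rps_single(f, k) for f, k in zip(forecasts, observed_categories)]
    return sum(scores) / len(scores)
```

With the fixed probabilities above, `rps_single(CATEGORY_PROBS, 2)` gives 0.20 when the observation falls in the middle (>30-70%) category, 0.60 for the two adjacent categories, and 1.40 for either extreme category, which shows how the score grows as the observation moves away from the highest-probability category.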
The RPS skill score is defined as:

$$ SS_{RPS} = 1 - \frac{\overline{RPS}}{RPS_{ref}} \tag{25} $$

where $RPS_{ref}$ is the reference value.

The Brier score focuses on how well the forecasts perform in a single flow category; RPS is a measure of overall forecast quality.

1.5 Distributive Measures

We used two distributive measures, reliability and discrimination, to assess the forecasts in various categories (i.e., low, medium, high). The same five forecast probabilities used for RPS were used to represent the probability given to each of the three flow categories. Our application of these measures follows that outlined by Franz et al. (2003).

Reliability uses the conditional distribution $p(o|f)$ and describes how often an observation occurred given a particular forecast. Ideally, $p(o = 1 | f) = f$ (Murphy and Winkler, 1987). That is, for a set of forecasts where a forecast probability value $f$ was given to a particular observation $o$, the forecasts are considered perfectly reliable if the relative frequency of the observation equals the forecast probability (Murphy et al., 1992). For example, given all the times in which high flows were forecasted with a 50% probability, the forecasts would be considered perfectly reliable if the actual flows turned out to be high in 50% of the cases. On a reliability diagram (Figure 2) the conditional distribution $p(o|f)$ of a set of perfectly reliable forecasts will fall along the 1:1 line. Forecasts that fall to the left of the line are underforecasting, or not assigning enough probability to the subsequent observation. Those that fall to the right of the line are overforecasting. Forecasts that fall on the no-resolution line are unable to identify occasions when the event is more or less

[...]

of high and low flows is not predicted as well, or as early, or both. For example, at the Blue River Inflow to Dillon Reservoir and Williams Fork near Parshall, even in May the low flows are given over a 60% probability of not occurring during the times they were observed. Most of the
forecasts do not exhibit much reliability, or even show much improvement in reliability over time. Hit rates are generally better for the lowest 30% of flows (0.6-0.95) than for the upper 30% (0.2-0.8).

3.3 Gunnison / Dolores

These five sites show a higher overall hit rate for high flows than the main stem of the Upper Colorado. The best reliability occurs for the lowest 30% of flows at the Gunnison River inflow to Blue Mesa Reservoir and the East River at Almont (8 – Reliability 9128400). Discrimination of non-occurrence of extreme events is very good, but the events that occur are not being forecast, although four of the five sites do a good (50 to 70%) job of predicting high flows in the April and May forecasts. The occurrences of low flows are seldom accurately forecast.

3.4 Upper Green

At four of the five sites (all except Henrys Fork near Manila), the forecasts are usually within a factor of 2 (50% to 200%) of the observed value, with Green River near Warren Bridge and Pine Creek above Fremont Lake having forecast values closest to the naturalized streamflow. During two of the low-flow years (1979 and 1989), forecasts at Henrys Fork near Manila (#9229500) were as much as [...] times the naturalized streamflow. However, the flows at this site were generally less than 100 cfs, lower than at any of the other sites in this basin, and even small differences in the forecast can lead to large apparent discrepancies. Despite these low-flow problems, high-flow forecasts at this site were extremely reliable, the best of any site in this basin. Hit rates for high flows were overall better than for low flows, and the probability of detection was zero for most months and sites. Pine Creek above Fremont Lake has the best discrimination of the occurrence of high flows (40-50%, March-May) and low flows (80-100%, March-May) of these five sites. (Note to self: probably use this instead of Green R at Warren in Results portion)

3.5 Yampa / White

1982-1984 was an extended period of above
average flows in the Yampa/White River basin, characterized by low forecast/observed values at the six sites, while the years of lowest flows (1966, 1976-7, 1989-90, 1994) had the highest forecast/observed values, consistent with the pattern seen elsewhere. Extreme flows are the least well forecast. All the sites except the Little Snake River near Dixon (which has the shortest record) have very good reliability for predicted low flows during all the forecast months. [If another reliability figure is needed – 9260000, Little Snake River Near Lily, is good] High flows are less reliable, but still better than climatology, for the most part. Discrimination of high and low flows is similar to that observed in other basins, with the accurate occurrence of low flows being forecast with strong certainty about 50% of the time in April and May. High flows are forecast with much less certainty.

3.6 Lower Green

The Lower Green River basin has 11 sites, the largest number of any of the basins. The poorest forecasts occurred at Strawberry River near Duchesne (USGS #9288180) and Duchesne River at Myton (#9295000). Both had at least four months with negative LEPS and RPS skill scores, indicating that the forecasts were not an improvement over climatology. For the Strawberry River near Duchesne, there was a high hit rate for low flows (0.8–0.9), a poor hit rate for high flows (0.3-0.5), and a low POD for either. There was effectively no reliability and no discrimination (certainty of prediction) for any of the flow classes. For example, high flows were only given a 50% probability of occurring about 20% of the time they were observed, and a 0% probability of occurring the other 80% of the time.

Other sites had better-than-climatology, although still imperfect, forecasts. Rock Creek near Mountain Home and Duchesne River above Knight Diversion had excellent April-May low-flow discrimination. For Green River at Green River, low flows generally had a 40-50% chance of not occurring when they were
observed. High flows always were given some possibility of occurring, although sometimes only 10-50%, when high flows were observed. Huntington Creek near Huntington had low but nonzero POD for low flows, but some false alarms.

[Still trying to solve the Strawberry River problem – I have an idea to check out. Will write this section then. Lower Green has the largest number of sites and needs careful attention. Am talking with Holly. Otherwise I am going to drop it and rerun my other analysis without it.]

3.7 San Juan River Basin

The forecasted low flows in the San Juan River basin have a high hit rate (generally 0.7–0.9), while the forecasted high flows generally have a hit rate of only 0.4-0.6. However, the probability of detection of high and low flows is still poor at all the sites. Discrimination of the non-occurrence of high flows during low-flow periods is excellent as early as February at six of the seven sites (every site except the Florida River Inflow to Lemon Reservoir), although discrimination of the non-occurrence of low flows during high-flow periods is not as good (it is best at Piedra River near Arboles, Animas River near Durango, and the Florida River Inflow to Lemon Reservoir). However, low flows and high flows in the basin………

3.8 Virgin River Basin

One disadvantage of the two Virgin River sites is that both have fairly significant gaps in time. However, this is an important watershed in the southwest and these sites should not be excluded from the study. Extreme flows are not predicted well at either site, particularly early in the forecast season. Neither site shows any discrimination of the occurrence or non-occurrence of low flows, but the high flows have decent discrimination from March to May. Low flows tend to be severely overestimated because of the tendency to forecast towards moderate flows. In this area, which is currently suffering from a prolonged drought, the ability to accurately forecast low flows would be welcomed.

3.9 Gila River Basin

Low-flow bias is very close to
1 for many sites in the Gila River Basin (Salt River near Roosevelt, San Francisco River at Clifton, San Francisco River near Glenwood, Tonto Creek above Gun Creek near Roosevelt, Gila River below Blue Creek near Virden, Verde River below Tangle Creek); HR, TS, and POD also tend to be higher than 0.5, indicating that low flows in the basin are often predicted accurately (include Figure ?). However, the TS and POD are near 0 for high flows in all months, suggesting a consistent inability to accurately predict high-flow seasons in this basin, which would contribute to the large negative values observed in the Shafer and Huddleston (1984) skew coefficient. Generally negative skill scores for the probabilistic measures ($SS_{BS}$ and $SS_{RPS}$) also indicate that flows in this basin are not well forecast.

The Gila River sites tend to show good discrimination of the non-occurrence of high events during times of low flow. The Salt River near Roosevelt (#9498500) and Gila River near Gila (#9430500) show good reliability of forecasts of low flows February through April. The reliability of forecasts of moderate and high flows is still poor. The other sites (San Francisco River, Tonto Creek, Verde River) show no pattern of reliability at all.

3.10 Comparison with other measures

Shafer and Huddleston (1984) examined average forecast error at over 500 forecast points in 10 western states. They used summary statistical measures and found that forecast errors tended to be approximately normally distributed, but with a slightly negative skew that resulted from a few large negative errors (under-forecasts) with no corresponding large positive errors. High errors were not always associated with poor skill scores, however.

Shafer and Huddleston (1984) used a similar calculation to examine forecast error and the distribution of forecast error in
the analysis of seasonal streamflow forecasts. Forecast error for a particular forecast/observation pair was defined as

$$ E = \frac{f - o}{o_{ref}} \times 100 \tag{20} $$

where $o_{ref}$ is the published seasonal average runoff at the time of the forecast (also called the climatological average or reference value). They also defined a skew coefficient associated with the distribution of a set of errors:

$$ G = \frac{n \sum_{i=1}^{n} \left( E_i - \bar{E} \right)^3}{(n - 1)(n - 2)\,\sigma_E^{3}} \times 100 \tag{21} $$

where $\sigma_E$ is the standard deviation of errors. Shafer and Huddleston (1984) noted that the presence of a few large negative errors not offset by correspondingly large positive errors results in a negatively skewed distribution at a forecast point, i.e., large underforecasts rather than large overforecasts. They also defined a forecast skill coefficient:

$$ C_{SH} = \frac{\frac{1}{n}\sum_{i=1}^{n} \left| o_{ref} - o_i \right|}{\frac{1}{n}\sum_{i=1}^{n} \left| f_i - o_i \right|} \tag{22} $$

A $C_{SH}$ value of 1.0 indicates that a prediction of the average streamflow would produce the same results as using the forecast. A value of 2.0 implies that the forecasts are twice as accurate as a constant prediction of the average. A value less than 1 implies that the forecast is less accurate than climatology.

Shafer and Huddleston (1984) compared forecasts for two 15-year periods, 1951–65 and 1966–80, and concluded that a slight relative improvement (about 10%) in forecast ability occurred about the time computers became widely used in developing forecasts. They attributed the gradual improvement in forecast skill to a combination of greater data processing capacity and the inclusion of data from additional hydrologically important sites. They suggested that "modest improvement might be expected with the addition of satellite derived snow covered area or mean areal water equivalent data", which were not readily available for most operational applications at the time. Although satellite data on snow-covered area are now available, they are still not being used in operational models (Hartmann et al., 2002).

Direct comparison with Shafer
and Huddleston (1984) is problematic, as their results were divided into states, rather than basins, and Colorado was paired with New Mexico. According to their study, Arizona had the highest error (more than 55% for April streamflow forecasts), but also the highest skill (See Eqs and 11). Of the Colorado basin states, Wyoming (only part of which is in the basin) had the lowest forecast error (~20%) paired with the highest skill.

Following Shafer and Huddleston (1984), we applied Eq. (XX) to our data. We found similar trends. The Gunnison / Dolores watershed, Upper Green watershed, and Lake Powell site consistently had absolute values of percent forecast errors less than 10%, the Gila watershed had percent errors ranging from 24 to 52%, and the five other watersheds mostly had errors between [...] and 20%. The largest improvement in forecast error occurred between January and April in the Virgin River Basin (May was less good, but still under 10%) and between January and March in the Gila River Basin (but April was extremely poor), although the March error in the Gila is still higher than at any other site. Skill coefficients generally improved from January to May (except for April in the Gila watershed), from a Colorado basin-wide average of 1.31 to an average of 2.05. Despite the problems seen with some of the other forecast skill methods for Virgin River data, the Virgin River combined low forecast errors with high skill coefficients in April and May.

Conclusions

Despite their shortcomings, the Water Supply Outlooks are generally an improvement over climatology. The majority of forecast points have very conservative predictions of seasonal flow. Below-average flows are often overpredicted (forecast values are too high) and above-average flows are underpredicted (forecast values are too low). This problem is most severe for early forecasts (e.g., January) at many locations, and improves somewhat with later forecasts (e.g., May). For the low and high flows there is
a low false alarm rate, which means that when low and high flows are forecast, these forecasts are generally accurate. However, for low and high flows there is also a low probability of detection at most sites, which indicates that these flows are not forecast nearly as often as they are observed. Moderate flows have a very high probability of detection, but also a very high false alarm rate, indicating that the likelihood of moderate flows is overforecast.

There is also good discrimination between high and low flows, particularly with forecasts issued later in the year. This means that when high flows are forecast, low flows are not observed, and vice versa. However, the probability that high and low flows will be accurately predicted, particularly early in the year, is not as good. The accuracy of forecasts tends to improve with each month, so that forecasts issued in May tend to be much more reliable than those issued in January.

Not all streams or areas show the same patterns and trends, but there is a lot of similarity in the relationship between forecast and observation, particularly in the Upper Colorado. The changes in forecasting periods (most recently to April-July in the Upper Basin and forecasting month-May in the Lower Basin) did not affect the accuracy of the forecasts. (NEED MORE – HH has page of notes for this)

Acknowledgements

Support for this research was provided by the NOAA-OGP supported Climate Assessment for the Southwest Project.

References

(Ref bulletins) Hartmann et al., 2002a – AMS, 2002b Climate Research?)
(Bales et al., 2004; Franz et al., 2003) (T. Tolsdorf and S. Shumate, personal communication) (Legates and McCabe, 1999) (Nash and Sutcliffe, 1970)
Legates, D.R., and McCabe, G.J., Jr., 1999: Evaluating the use of "goodness-of-fit" measures in hydrologic and hydroclimatic model validation.
Day, G.N., 1985: Extended streamflow forecasting using NWSRFS. Journal of Water Resources Planning and Management, 111(2), 157-170.
Day, G.N., Brazil, L.E., McCarthy, C.S., and Laurine, D.P., 1992: Verification of the National Weather Service extended streamflow prediction procedure. Proceedings, AWRA 28th Annual Conference and Symposium, Reno, NV, 163-172.
Murphy, A.H., and Winkler, R.L., 1992: Diagnostic verification of probability forecasts. International Journal of Forecasting, 7, 435-455.
Murphy, A.H., Brown, B.G., and Chen, Y., 1989: Diagnostic verification of temperature forecasts. Weather and Forecasting, 4, 485-501.
Murphy, A.H., and Winkler, R.L., 1987: A general framework for forecast verification. Monthly Weather Review, 115, 1330-1338.
Palmer, P.L., 1988: The SCS snow survey water supply forecasting program: current operations and future directions. Proceedings, Western Snow Conference, Kalispell, MT, 43-51.
Riverside Technology, Inc., 1999: National Weather Service Extended Streamflow Prediction Verification System (ESPVS). U.S. National Weather Service.
Shafer, B.A., and Huddleston, J.M., 1984: Analysis of seasonal volume streamflow forecast errors in the western United States. Proceedings, A Critical Assessment of Forecasting in Water Quality Goals in Western Water Resources Management, Bethesda, MD, American Water Resources Association, 117-126.
Wilks, D.S., 2001: Diagnostic Verification of the Climate Prediction Center Long-Lead Outlooks, 1995-98. Journal of Climate, 13, 2389-2403.
Wilks, D.S., 1995: Forecast verification. Statistical Methods in the Atmospheric Sciences, Academic Press, 467 p.

Table 1. The 54 sites used in this study (USGS station number, name, elevation in feet, basin size in square miles)
USGS # | NAME | ELEV (feet) | BASIN SIZE (sq miles) | YEARS USED
MAIN STEM UPPER COLORADO
9019000 | Colorado River Inflow to Lake Granby, CO | 8050 | 312 | 1953-00
9037500 | Williams Fork near Parshall, CO | 7809 | 184 | 1956-96
9050700 | Blue River Inflow to Dillon Reservoir, CO | 8760 | 335 | 1972-00
9057500 | Blue River Inflow to Green Mountain Reservoir, CO | 7683 | 599 | 1953-00
9070000 | Eagle River below Gypsum, CO | 6275 | 944 | 1974-00
9070500 | Colorado River near Dotsero, CO | 6130 | 4394 | 1972-00
9085000 | Roaring Fork at Glenwood Springs, CO | 5721 | 1451 | 1953-00
9095500 | Colorado River near Cameo, CO | 4814 | 8050 | 1956-00
9180500 | Colorado River near Cisco, UT | 4090 | 24100 | 1956-00
GUNNISON / DOLORES
9112500 | East River at Almont, CO | 8006 | 289 | 1956-00
9124800 | Gunnison River Inflow to Blue Mesa Reservoir, CO | 7149 | 3453 | 1971-00
9147500 | Uncompahgre River at Colona, CO | 6319 | 448 | 1953-00
9152500 | Gunnison River near Grand Junction, CO | 4628 | 7928 | 1953-00
9166500 | Dolores River at Dolores, CO | 6919 | 504 | 1953-00
UPPER GREEN
9188500 | Green River at Warren Bridge, WY | 7468 | 468 | 1956-00
9196500 | Pine Creek above Fremont Lake, WY | 7450 | 76 | 1969-00
9205000 | New Fork River near Big Piney, WY | 6800 | 1230 | 1974-00
9211150 | Fontenelle Reservoir Inflow, WY | 6506 | 4280 | 1971-00
9229500 | Henrys Fork near Manila, UT | 6060 | 520 | 1971-94
YAMPA / WHITE
9239500 | Yampa River at Steamboat Springs, CO | 6695 | 604 | 1953-00
9241000 | Elk River at Clark, CO | 7268 | 216 | 1953-93
9251000 | Yampa River near Maybell, CO | 5900 | 3410 | 1956-00
9257000 | Little Snake River near Dixon, WY | 6331 | 988 | 1980-00
9260000 | Little Snake River near Lily, CO | 5685 | 3730 | 1953-00
9304500 | White River near Meeker, CO | 6300 | 755 | 1953-00
LOWER GREEN
9266500 | Ashley Creek near Vernal, UT | 6231 | 101 | 1953-00
9275500 | West Fork Duchesne River near Hanna, UT, unimpaired | 7218 | 62 | 1974-00
9277500 | Duchesne River near Tabiona, UT, unimpaired | 6190 | 353 | 1953-00
9279000 | Rock Creek near Mountain Home, UT | 7250 | 147 | 1964-00
9279150 | Duchesne River above Knight Diversion, UT | 5840 | 623 | 1964-00
9288180 | Strawberry River near Duchesne, UT | 5722 | 917 | 1953-00
9291000 | Lake Fork River below Moon Lake near Mountain Home, UT | 7970 | 112 | 1953-00
9295000 | Duchesne River at Myton, UT, unimpaired | 5061 | 2643 | 1956-00
9299500 | Whiterocks River near Whiterocks, UT | 7200 | 109 | 1953-00
9315000 | Green River at Green River, UT | 4040 | 44850 | 1956-00
9317997 | Huntington Creek near Huntington, UT | 6450 | 178 | 1953-00
SAN JUAN R BASIN
9349800 | Piedra River near Arboles, CO | 6148 | 629 | 1971-00
9353500 | Los Pinos River near Bayfield, CO | 7583 | 270 | 1953-00
9355200 | San Juan River Inflow to Navajo Reservoir, NM | 5655 | 3260 | 1963-00
9361500 | Animas River at Durango, CO | 6502 | 692 | 1953-00
9363100 | Florida River Inflow to Lemon Reservoir, CO | 6470 | 18 | 1953-00
9365500 | La Plata River at Hesperus, CO | 8105 | 37 | 1954-00
9379500 | San Juan River near Bluff, UT | 4048 | 23000 | 1956-00
LAKE POWELL
9379900 | Lake Powell at Glen Canyon Dam, AZ | 3100 | 107700 | 1963-00
VIRGIN RIVER
9406000 | Virgin River near Virgin, UT | 3500 | 956 | 1957-00
9408150 | Virgin River near Hurricane, UT | 2780 | 1499 | 1972-00
GILA RIVER BASIN
9430500 | Gila River near Gila, NM | 4655 | 1864 | 1964-00
9432000 | Gila River below Blue Creek near Virden, NM | 4090 | 2829 | 1954-00
9444000 | San Francisco River near Glenwood, NM | 4560 | 1653 | 1964-00
9444500 | San Francisco River at Clifton, AZ | 3436 | 2766 | 1953-00
9466500 | Gila River at Calva, AZ | 2517 | 11470 | 1963-98
9498500 | Salt River near Roosevelt, AZ | 2177 | 4306 | 1953-00
9499000 | Tonto Creek above Gun Creek, near Roosevelt, AZ | 2523 | 675 | 1955-00
9508500 | Verde River below Tangle Creek, above Horseshoe Dam, AZ | 2029 | 5859 | 1953-00

Table BASIN MAIN STEM UPPER CO GUNNISON / DOLORES UPPER GREEN YAMPA / WHITE LOWER GREEN SAN JUAN R BASIN LAKE POWELL VIRGIN RIVER GILA RIVER BASIN ALL MAIN STEM UPPER CO GUNNISON / DOLORES UPPER GREEN YAMPA / WHITE LOWER GREEN SAN JUAN R BASIN LAKE POWELL VIRGIN RIVER GILA RIVER BASIN ALL MAIN STEM UPPER CO GUNNISON / DOLORES UPPER GREEN YAMPA / WHITE LOWER GREEN SAN JUAN R BASIN LAKE POWELL VIRGIN RIVER GILA RIVER BASIN ALL JAN FEB MAR LOW FLOWS APR MAY 0.4 0.5 0.6 0.5 0.6 0.7 0.3 0.5 0.5 0.5 0.6 0.6 0.6 0.7 0.7
0.5 0.6 0.7 0.8 0.8 0.8 0.4 0.4 0.5 0.6 0.7 0.6 0.5 0.6 0.6 MODERATE FLOWS 0.6 0.8 0.6 0.7 0.8 0.7 0.8 0.7 0.6 0.7 0.7 0.8 0.7 0.8 0.9 0.8 0.8 0.5 NaN 0.8 0.2 0.2 0.3 0.3 0.1 0.2 0.3 0.2 0.4 0.4 0.2 0.2 0.5 0.5 0.2 0.3 0.3 0.3 0.3 0.3 HIGH FLOWS 0.3 0.3 0.2 0.3 0.5 0.3 0.5 0.4 0.4 0.4 0.4 0.5 0.3 0.4 0.6 0.4 0.5 0.5 0.4 0.4 0.5 0.6 0.5 0.5 0.7 0.6 0.5 0.4 NaN 0.6 0.4 0.5 0.4 0.6 0.6 0.4 0.6 0.2 0.4 0.5 0.5 0.6 0.6 0.6 0.7 0.5 0.6 0.6 0.6 0.6 0.6 0.7 0.6 0.6 0.8 0.7 0.6 0.6 0.6 0.7 0.7 0.8 0.7 0.7 0.8 0.8 0.6 0.7 NaN 0.8 0.5 0.5 0.6 0.5 0.6 0.4 0.6 0.4 0.5 0.5 29 Table (omit ??) Site Basin 9277500 9295000 9285000 9288180 9317997 9466500 9379900 9353500 9315000 9241000 LG LG LG LG LG Gila LP SJ LG Y/W 9070500 9037500 9070000 9095500 9188500 9019000 9211150 9205000 9363100 9406000 MS-UC MS-UC MS-UC MS-UC UG MS-UC UG UG SJ Vi 9288180 9257000 9285000 9211150 9277500 9317997 9363100 9299500 9498500 9295000 LG Y/W LG UG LG LG SJ LG Gi LG 9508500 9365500 9070000 9406000 9408150 9147500 9349800 9304500 9466500 9229500 Gi SJ MS-UC Vi Vi G/D SJ Y/W Gi UG 10/20/2022 Resolution* Rank Low Flows, Best Resolution DUCHESNE R NR TABIONA, UT UNIMPAIRED DUCHESNE R AT MYTON, UT UNIMPAIRED STRAWBERRY R NR SOLDIER SPRINGS UNIMP STRAWBERRY R NR DUCHESNE, UT HUNTINGTON CK NR HUNTINGTON, UT GILA RIVER AT CALVA LAKE POWELL AT GLEN CANYON DAM, AZ LOS PINOS RIVER NEAR BAYFIELD GREEN R AT GREEN RIVER, UT ELK RIVER AT CLARK Low Flows, Worst Resolution COLORADO RIVER NEAR DOTSERO WILLIAMS FORK NEAR PARSHALL EAGLE RIVER BELOW GYPSUM COLORADO RIVER NEAR CAMEO GREEN R AT WARREN BRIDGE COLORADO RIVER INFLOW TO LAKE GRANBY FONTENELLE RESERVOIR INFLOW NEW FORK RIVER NR BIG PINEY FLORIDA RIVER INFLOW TO LEMON RESERVOIR VIRGIN R NR VIRGIN, UT High Flows, Best Resolution STRAWBERRY R NR DUCHESNE, UT LITTLE SNAKE R NR DIXON STRAWBERRY R NR SOLDIER SPRINGS UNIMP FONTENELLE RESERVOIR INFLOW DUCHESNE R NR TABIONA, UT UNIMPAIRED HUNTINGTON CK NR HUNTINGTON, UT FLORIDA RIVER INFLOW TO LEMON RESERVOIR WHITEROCKS 
R NR WHITEROCKS, UT SALT RIVER NEAR ROOSEVELT DUCHESNE R AT MYTON, UT UNIMPAIRED High Flows, Worst Resolution VERDE R BLW TANGLE CK, ABV HORSESHOE DAM LA PLATA RIVER AT HESPERUS EAGLE RIVER BELOW GYPSUM VIRGIN R NR VIRGIN, UT VIRGIN R NR HURRICANE, UT UNCOMPAHGRE RIVER AT COLONA PIEDRA RIVER NEAR ARBOLES WHITE RIVER NEAR MEEKER GILA RIVER AT CALVA HENRYS FK NR MANILA, UT 30 *sum of high and low probabilities 0.95 0.94 0.94 0.81 0.79 0.76 0.76 0.74 0.74 0.73 10 0.54 0.54 0.54 0.52 0.49 0.47 0.44 0.43 0.43 0.41 46 47 48 49 50 51 52 53 54 55 0.95 0.91 0.89 0.85 0.81 0.77 0.74 0.72 0.70 0.68 10 0.50 0.49 0.49 0.49 0.49 0.47 0.44 0.43 0.34 0.25 46 47 48 49 50 51 52 53 54 55 List of figures Location of 136 water supply outlook forecast points in the Colorado River Basin The 54 points used in this study are shown by (distinguishing characteristics) Example reliability diagram Example discrimination diagram For each year at New Fork River Near Big Piney: (a)Forecast/observed values with each circle representing a different month , (b) observed/average observed values,(c) years used in computing the climatological average on which the forecast is based , and (d) forecast period associated with each month, with the top hatch representing the first month and the lower hatch marking the last month of the forecast period The left column show forecast versus observed values, for each month Each point represents a single year, with a different symbol for each forecast period The 1:1 line is provided for reference The right column shows f i / oi against oi / o The horizontal lines at 0.8 and 1.2 are provided for reference R2 (left column) and NSC (right column) associated with forecasts issued in January (top) through May (bottom) for the N sites used each month There were 55 sites used in January-April and only 47 used in May, because the Gila River Basin sites not issue May forecasts x NSC in April for entire area and sub regions Skew coefficient G (equation 9) in April for 
entire CRB and each sub region Frequency histograms of Hit Rate for observations in the lowest 30% of flows, the middle 40% of flows, and the upper 30% of flows, plus the cumulative frequency Frequency histograms of Threat Score Rate for observations in the lowest 30% of flows, the middle 40% of flows, and the upper 30% of flows, plus the cumulative frequency 10 Frequency histograms of False Alarm Rate for observations in the lowest 30% of flows, the middle 40% of flows, and the upper 30% of flows, plus the cumulative frequency 11 Frequency histograms of Probability of Detection for observations in the lowest 30% of flows, the middle 40% of flows, and the upper 30% of flows, plus the cumulative frequency 10/20/2022 31 12 The LEPS value (top left), Brier Score and Ranked Probability score (bottom left), and the associated skill scores (right column) for the New Fork River Near Big Piney (Site #9205000) The circles are the climatological scores, the crosses the forecast scores 13 Monthly Average a) LEPS skill scores b) Brier skill scores and c) Rank probability skill scores for each basin 14 Flow histograms (resolution) for Green River Near Warren Bridge (Site #9188500) 15 Reliability diagrams for Green River Near Warren Bridge (Site #9188500) 16 Discrimination diagrams for Green River Near Warren Bridge (Site #9188500) Discrimination diagrams for Colorado River Near Dotsero (Site #9070500) Reliability diagrams for Gunnison River Inflow to Blue Mesa Reservoir (Site #9124800) 10/20/2022 32 ... example, the February skill scores in the lower Green River and the San Juan River basins are lower than those in January Five of the sites in the Gila River basin have negative SSBS values in March,... usually averaging less than 0.5, with values less than or equal to 0.3 in January and March at many of the basins Low and high flows have the poorest resolution in the Virgin River basin The best... 
Green, and the Yampa and White River basins (Henry’s Fork near Manila, Duchesne River at Myton, and Little Snake River near Dixon, respectively). However, four of the remaining San Juan River basin