Meteorol Atmos Phys (2013) 122:55–64
DOI 10.1007/s00703-013-0278-0

ORIGINAL PAPER

A study of the connection between tropical cyclone track and intensity errors in the WRF model

Du Duc Tien · Thanh Ngo-Duc · Hoang Thi Mai · Chanh Kieu

Received: 27 December 2012 / Accepted: 20 August 2013 / Published online: September 2013
© Springer-Verlag Wien 2013

Responsible editor: M. Kaplan

D. D. Tien: Research and Development Division, National Center for Hydro-Meteorological Forecasting, Hanoi, Vietnam
T. Ngo-Duc: Department of Meteorology, Hanoi College of Science, Vietnam National University, Hanoi 10000, Vietnam
H. T. Mai · C. Kieu: Laboratory for Weather and Climate Forecasting, Hanoi College of Science, Vietnam National University, Hanoi 10000, Vietnam
C. Kieu (✉): I. M. Systems Group at NOAA/NWS/NCEP/EMC, College Park, MD 20740, USA; e-mail: chanhkq@vnu.edu.vn; chanh.kieu@noaa.gov

Abstract This study examines the dependence of the tropical cyclone (TC) intensity errors on the track errors in the Weather Research and Forecasting (WRF-ARW) model. By using the National Centers for Environmental Prediction global final analysis as the initial and boundary conditions for cloud-resolving simulations of TC cases that have small track errors, it is found that the 2- and 3-day intensity errors in the North Atlantic basin can be reduced by 15 and 19 % when the track errors decrease by 55 and 76 %, respectively, whereas the 1-day intensity error shows no significant reduction despite more than 30 % decrease of the 1-day track error. For the North-Western Pacific basin, the percentage of intensity reduction is somewhat similar, with the 2- and 3-day intensity errors improved by about 15 and 19 %, respectively. This suggests that future improvement of the TC track forecast skill in the WRF-ARW model will be beneficial to the intensity forecast. However, the substantially smaller percentages of intensity improvement than those of the track error improvement indicate that ambient environment tends to play
a less important role in determining the TC intensity as compared to other factors related to the vortex initialization or physics representations in the WRF-ARW model.

1 Introduction

Despite a steady improvement in the tropical cyclone (TC) track forecast skill over the last few decades, progress in TC intensity forecast skill has been slow to date, with marginal improvements mostly observed at the 48 and 72 h forecast times (DeMaria et al. 2007; National Hurricane Center (NHC), see footnote 1). Such a stagnation of the intensity forecast skill is intriguing, as various observational analyses as well as theoretical and modeling studies have shown that TC development is rather sensitive to several environmental factors such as sea surface temperature (SST), vertical wind shear, topography, cold air intrusion, or tropical wave activity (Gray 1968; George and Gray 1976; Emanuel 1986; Holland 1997; Chen and Yau 2003; Mandal et al. 2007; Hill and Lackmann 2009; Wang 2009; Zhan et al. 2012). With the continuous decrease of the track errors at all forecast lead times, one may expect some advance in the intensity forecast skill. According to the NHC official report, the mean 3-day maximum surface wind (VMAX) forecast error for the North Atlantic (NATL) basin has nevertheless been constant at around 9.5 m s⁻¹ since 1990. The question of the linkage between the TC track and intensity forecast skill thus remains open. This question is of significance, as it is desirable to know if a future 50 % improvement of the track forecast skill in a TC dynamical model (see footnote 2) could inherently help reduce its intensity forecast errors.

Footnote 1: The NHC official reports of the track and intensity errors can be found at: http://www.nhc.noaa.gov/verification/verify5.shtml

Generally speaking, TC intensity forecast errors in a dynamical model can be attributed to three main factors: (1) a poor initial vortex representation of TCs; (2) inadequate representations of the TC physical processes; and (3) wrong ambient environment
due to erroneous track forecasts (see, e.g., Knaff et al. 2003; Bender et al. 2007; DeMaria et al. 2007; Gall et al. 2012). Of the three, the first two factors appear to be the most dominant, as previous studies have demonstrated that a poorly initialized vortex often undergoes some unrealistic spin-up or sometimes dissipates quickly before it could develop more consistent dynamics (see, e.g., Bender et al. 1993; Kurihara et al. 1993; Davidson and Weber 2000; Kwon et al. 2002; Kieu and Zhang 2009; Nguyen and Chen 2011). Likewise, the TC vortex can readily drift away from the true development if deficient physical parameterizations are utilized. As shown in many studies, model errors associated with such inadequate parameterizations of various physical processes could lead to very different TC strength, even with the same initial condition (see, e.g., Liu et al. 1997; Braun and Tao 2000; Shen et al. 2000; Davis and Bosart 2002; Kieu and Zhang 2010; Pattnaik and Krishnamurti 2007; Osuri et al. 2011).

While much of the recent effort in improving the TC intensity forecast has focused on the first two factors, the influence of the ambient environment on the intensity forecast errors in TC dynamical models has so far received the least attention, as it is difficult to isolate the intensity errors related specifically to erroneous track forecasts from the other factors under realistic environment conditions. By investigating different combinations of the best track datasets and the Global Forecast System (GFS) forecasts/analyses during the 2002–2009 hurricane seasons with a statistical-dynamical model, DeMaria (2010) demonstrated that forecast errors in the Logistic Growth Equation Model (LGEM; DeMaria et al. 2007) could be improved by more than 15 % at days and 35 % at days by reducing forecast track error to zero. Assuming a linear relationship of the track and intensity errors, DeMaria suggested that a 50 % reduction in track errors could correspond to a 17 % reduction in the intensity
errors. This result is significant, as it indicates that future improvement of the track forecast skill can be beneficial for the intensity forecast skill. However, DeMaria's direct use of the post-analysis best track as an input for the statistical model is a highly idealized assumption that is not achievable with TC dynamical models. This is because the real atmosphere always exhibits some degree of uncertainty that prevents the TC models from forecasting TC tracks perfectly.

Footnote 2: In this study, a TC dynamical model is understood as a numerical forecasting model that is based on a set of the full physics primitive equations. These dynamical models are different from statistical (or statistical-dynamical) forecasting models that rely on statistical regression or empirical relationships.

In this study, we wish to address the connection between the track and intensity errors in the Weather Research and Forecasting (WRF-ARW, V3.2, Skamarock et al. 2005) model, using the National Centers for Environmental Prediction (NCEP) global final analysis (FNL) dataset (US National Centers for Environmental Prediction). The objective is to examine how much intensity error reduction the WRF-ARW model could achieve if its TC track errors are reduced as much as possible. This question is tackled by analyzing the intensity errors for a set of TCs whose simulated tracks fit most closely with the best track analyses. Separate experiments are conducted for the NATL basin and the Western North Pacific (WPAC) basin to explore the degree to which the intensity errors are related to the track errors for different basins.

The rest of the paper is organized as follows. Section 2 describes the input data, model configuration, experiment setups, and methodology for separating the intensity uncertainties caused by the erroneous tracks. Section 3 presents the main results, and some discussions are given in the final concluding section.

2 Experiment description

2.1 Model

The model chosen in this
study is a non-hydrostatic version of the Weather Research and Forecasting (WRF-ARW) model (V3.2), which is configured with a two-way interactive, movable, multi-nested (36/12/4 km) grid. Due to the large demand of computational resources and data storage, the nested-grid domains are limited to 31 σ levels in the vertical direction, and the (x, y) dimensions of 155 × 155, 151 × 151, and 151 × 151 grid points for the 36-, 12-, and 4-km domains, respectively. The outermost domain covers an area of ~5,600 km × 5,600 km, and the 4-km innermost domain spans an area of 600 km × 600 km around storm centers and is configured to follow storm centers automatically, using the tracking algorithm provided in the WRF model (see Fig. 1). Note that both the intermediate and the innermost domains move with storm centers, and so the model domains change with storms and model integration. The model lateral boundary conditions are updated every h.

The set of physical parameterizations used in all experiments includes (a) the modified Kain-Fritsch and Betts-Miller-Janjic cumulus parameterization schemes for the 36- and 12-km resolution domains; (b) the Yonsei University and Mellor-Yamada-Janjic planetary boundary layer (PBL) parameterizations with the Monin–Obukhov surface layer scheme; (c) the Rapid Radiative Transfer Model (RRTM) scheme for the longwave radiation and the Goddard and Dudhia schemes for the shortwave radiation; and (d) the Lin et al., WSM3, and WSM6 schemes for the cloud microphysics. There is no cumulus parameterization for the 4-km resolution domain.

Fig. 1 Illustration of the model domain configurations for simulation of Typhoon Megi (2010) valid at 0000 UTC 17 Oct 2010. Note that the outermost domain is fixed in time, whereas the intermediate (dashed box) and innermost domains (solid box) follow Megi centers every h. Initial position of the outermost domain is configured to be at the center of Megi valid at the initial time. Solid contours denote the sea level pressure (every hPa) within the innermost domain at the initial time

2.2 Data

The model initial and boundary condition data for all simulations are taken from the FNL analysis with the 1° × 1° resolution such that the best environmental conditions are maintained for the entire simulated period. For the best track data needed for evaluating the TC track and intensity errors, the official HURDAT data archive (see footnote 3) is used for all experiments in the NATL basin. This best track dataset is well calibrated and contains necessary TC information such as latitudes and longitudes of TC centers, the maximum surface wind (VMAX), or the minimum sea level pressure (PMIN). In the WPAC basin, the best track data from the Joint Typhoon Warning Center (JTWC) are used for verification purposes (see footnote 4). While there are some inconsistencies between different best track datasets for different basins, the above two datasets are of relatively high quality and can provide both the observed TC track and intensity reliably. As a reference to examine the intensity errors, the official 10-m wind errors released by the NHC for the NATL basin in the 2010 season are used for comparisons. These official 1-, 2-, and 3-day VMAX errors are 5.5, 7.5, and 9.5 m s⁻¹, respectively.

Footnote 3: Available at: http://www.aoml.noaa.gov/hrd/hurdat/Data_Storm.html

2.3 Methodology

To isolate the part of the intensity errors associated with erroneous storm environment from those caused by inadequate TC initialization or unrealistic model parameterizations, the FNL dataset is used as initial and boundary conditions for all WRF high-resolution simulations such that good track simulations can be maximized. The necessity of using the FNL dataset should be emphasized, because it is difficult to obtain a track forecast that fits well with observation if real-time products such as the GFS forecasts are used. While the current freely accessible FNL data have fairly low resolution, this dataset to some extent represents well the true
atmosphere that is often considered adequate for providing acceptable initial and lateral boundary conditions for higher-resolution simulations. A recent study of Mohanty et al. (2010) showed that the FNL data indeed provide better TC tracks than the use of either GFS or Indian Global Forecast products.

To establish first a climatology of the TC track and intensity errors with the model configuration described in Sect. 2.1, two baseline experiments during the 2007–2010 TC seasons, one for the WPAC basin (WPAB) and the other for the North Atlantic basin (NATB), are conducted. Because the 1° × 1° FNL data do not fully reflect the true atmosphere, it is anticipated that not all TC simulations can produce good tracks; some simulations have large track errors, while others may have smaller track errors at some lead times. After the baseline experiments with the corresponding intensity errors are obtained, a subset of TC simulations whose tracks fit closely with the best track analysis is then singled out, so that the intensity errors associated with this subset can be calculated and compared to the baseline errors.

Footnote 4: Available at: http://www.usno.navy.mil/JTWC

To be more specific for selecting good track simulations, a track is classified as "a good track" if it meets all three conditions: (1) the 1-day track error is <30 km; (2) the 2-day error is <50 km; and (3) the 3-day error is <70 km. These thresholds are quite significant, but acceptable as they are much smaller than the current official track forecast errors, which are 82, 148, and 189 km for the 1-, 2-, and 3-day lead times, respectively. The good-track criteria are applied separately for the NATL basin (NATG) and the WPAC basin (WPAG). In principle, one could impose the criteria for track selection as strictly as possible, so that the simulated tracks best fit the observed tracks. However, our trials with different values of track error thresholds showed that too strict a filter would result in a very small sample
size. For example, imposing a 3-day track error in the NATL basin smaller than 10 km would give a sample with only out of 90 cases listed in Table 1. As such, the above thresholds are chosen as a compromise between the fit of the simulated tracks and the sample size. Note also that it is necessary to apply all of the above criteria for the 1-, 2-, and 3-day track errors simultaneously, such that any alteration to the near-storm environment caused by storms with potentially large track errors at a given lead time can be minimized.

The TC cycles chosen in this study are based on two criteria: (1) the life cycle (see footnote 5) of any TC has to be at least days so that the simulated 3-day track and intensity errors can be verified; and (2) two consecutive cycles for one TC must be at least day apart to reduce the serial correlation (Aberson and DeMaria 1994). For each simulation, an ensemble of 21 members with different combinations of the planetary boundary layer, microphysics, cumulus parameterization, and radiation schemes is employed such that the member with the smallest 3-day track error can be captured. Note that because TC tracks are influenced greatly by model deficiencies as well as the quality of the initial or boundary conditions, there are many storms that do not possess any good track simulation during their entire life cycle. As seen in Tables 1 and 2, the largest percentage of the good track cases for both the NATL and WPAC basins that can be obtained with our set of simulations is only ~32 % of the total number of simulations.

Since the connection between the intensity and the track forecast skill could vary among different ocean basins, two sets of TCs in the NATL and WPAC basins during the 2007–2010 hurricane seasons are examined separately. Our ensemble approach has two main advantages as compared to the use of a single specific model configuration: (1) it removes the systematic bias of model errors associated with one particular model configuration because each simulation has its own
model physics combination; and (2) it helps randomize the model characteristics so that the climatology of the track/intensity is best represented, given the limited number of storms.

Footnote 5: A life cycle of a TC is defined in this study as the beginning and the ending of its record in the best track dataset.

Although the above criteria for selecting the good track simulations could help to single out the storms that are embedded in the best ambient environment possible, it should be mentioned that a potentially wrong ambient environment may still develop within the WRF model due to its inherent model errors, even for perfect track simulations. There is no exclusive way to prevent such model bias development, unless the model is perfect. Recent studies with the use of an ensemble of multiple physics for TC forecasts appear to show that such a multi-physics ensemble could help alleviate the problem of model errors (see, e.g., Meng and Zhang 2007; Kieu et al. 2012). This type of multiple-physics ensemble approach, however, reduces the capability to capture the good tracks, as different ensemble members tend to have different storm movements. In this study, no attempt has been made to exclude such model errors, as the main focus here is on the relative improvement of the intensity errors between the good track simulations and the general reference simulations. The impacts of the model errors are assumed to be the same among all experiments. More detailed analysis of this multiple-physics ensemble approach will be presented in our upcoming study.

While the FNL dataset is considered as one of the best possible representations of the large-scale atmosphere among the reanalysis datasets, it is worth noting that the FNL dataset does not accurately represent TC structure at the mesoscale and below (cf. Fig. 3). Since the main goal of this study is to examine the relative improvement of the intensity errors between good track cases and the baseline track simulations,
we do not, however, attempt to correct the initial vortex representation in any of our experiments.

3 Results

Figure 2 shows the absolute mean 1-, 2-, and 3-day VMAX errors for the NATB baseline experiment from 2007–2010 (see Table 1 for the list of TCs and the corresponding total number of simulations for each TC). One notices first that the intensity errors in the NATB experiment are fairly large, especially the 1-day error that could reach 10 m s⁻¹. Such large intensity errors in the NATB experiment are expected due to several sources of uncertainty in NATB, including inadequate representation of the initial vortex, sub-optimal choices of model physics, erroneous simulated tracks, low-grade model configurations, and the imperfection of the NCEP FNL analysis dataset. Of these, poor vortex initialization appears to be the most dominant factor in causing the large 1-day error. This can be seen in almost all simulations, in which incipient vortices interpolated directly from the FNL analysis are typically about 20 % weaker than the observed intensity (in terms of VMAX). To show this point, Fig. 3 shows the average difference of the maximum 10-m wind and minimum sea level pressure between the FNL analysis and the best track observation at the initial time for all cases listed in Table 1. One can see that the difference is relatively small when TCs are weak, but increases rapidly for strong-intensity phases. This is especially serious in the WPAC basin, in which the difference between the FNL analysis and the best track is as large as 35 m s⁻¹ in some cases. Apparently, such a large incipient difference has significant influence on TC development, which explains the large 1-day intensity errors seen in Fig. 2.

Table 1 List of the TCs during the 2007–2010 seasons in the North Atlantic basin that are used in the NATB experiment. The criteria of the 1-, 2-, and 3-day absolute track errors for the NATG subset are, respectively, ≤30, 50, and 70 km (in the third column), and ≤20, 35, and 50 km for the sensitivity analysis with higher criteria for selecting good tracks (last column)
Columns: Tropical cyclone | Total number of simulations in the NATB experiment | Number of good track simulations in the NATG subset | Number of good track simulations in the sensitivity analysis
Dean (2007) Felix (2007) 0 Noel (2007) Erin (2007) 0 Gabrielle (2007) 0 Ingrid (2007) 1 Karen (2007) Bertha (2008) 1 1 Cristobal (2008) 0 Omar (2008) 0 Fay (2008) Gustav (2008) Hanna (2008) Ike (2008) 1 Bill (2009) Fred (2009) 0 Ida (2009) Alex (2010) 2 Igor (2010) 10 4 Colin (2010) 1 Earl (2010) Danielle (2010)
Total 90 32 20

Similar to the intensity errors, the track errors in the NATB experiment are still significant despite the use of the FNL analysis, with the 1-, 2-, and 3-day mean errors of roughly 45, 112, and 298 km. Such significant track errors are anticipated because the inherent errors of the WRF model may result in imperfect storm development, causing the storms to drift away from the real atmosphere, no matter how well the boundary conditions are represented in the FNL analysis. In addition, lack of vortex representation in the FNL dataset could also influence the simulated tracks.

As the good-track simulations are sorted out (see Table 1 for the specific good track selections), the mean VMAX errors show some noticeable improvement. Except for the 1-day error that shows no significant change, the 2- and 3-day VMAX errors are reduced from 12.7 and 12.9 m s⁻¹ in the NATB experiment to ~10.8 and 10.4 m s⁻¹ in the NATG sample at 90 % significance (see footnote 6), respectively (Fig. 2, solid lines). Comparison of the relative ratios of the track and intensity errors between the NATB and NATG samples shows that the 55 and 76 % reductions of the 2- and 3-day track errors correspond to reductions of ~15 and 19 % in the intensity errors at the 90 % significance level. This indicates that a large portion of the intensity errors is not determined simply by the storm track, but attributed more to
other factors such as initial condition or model physics. Note that the impact of the inferior vortex representation or inadequate model physics exists in both the baseline and the good track sample (see Fig. 3), because there is no simple way to isolate these factors in the two samples. The fact that both the insufficient vortex initialization and potential model errors associated with the physics representation are included in both the NATB and NATG samples indicates that any intensity error reduction in the NATG sample should therefore be considered as a direct consequence of the smaller track errors in the NATG sample.

Footnote 6: Statistical significance is evaluated by using the non-parametric hypothesis test.

Table 2 List of the TCs during the 2007–2010 seasons in the North-Western Pacific basin that are used in the WPAB experiment. The criteria of the 1-, 2-, and 3-day absolute track errors for the good track selection in the WPAG subset and in the sensitivity analysis are similar to those in Table 1
Columns: Tropical cyclone | Total number of simulations in the WPAB experiment | Number of good track simulations in the WPAG subset | Number of good track simulations in the sensitivity analysis
1 1 Sepat (2007) Fitow (2007) 0 Hagibis (2007) Man-Yi (2007) Mitag (2007) Krosa (2007) Lekima (2007) 2 1 Nari (2007) 1 Wipha (2007) 1 Fengshen (2008) 0 Fung-Wong (2008) Halong (2008) 0 Jangmi (2008) 1 Higos (2008) Hagupit (2008) 0 Nakri (2008) 1 Maysak (2008) 1 14 Morakot (2009) 0 Vamco (2009) 2 Ketsana (2009) Megi (2010) 0 Chaba (2010) Conson (2010) 0 Chanthu (2010) Usagi (2007) Parma (2009)
Total 92 30 16

To further examine the dependence of the intensity errors on the track errors, a stricter set of criteria for selecting good tracks is tested, in which a simulated track is now considered as a good track if its 1-, 2-, and 3-day absolute track errors are smaller than or equal to 20, 35, and 50 km. This corresponds to 56, 68, and 83 % reduction of the track errors relative to the NATB mean track errors (see Table 1). These stringent criteria reduce the number of good track cases to only 20 cases (Table 1). As seen in Fig. 2, the intensity errors corresponding to this new track filter do not, however, seem to decrease any further in spite of the better track selection. Of course, the sample size of 20 cases is too small to have any definite result for this situation, but one can notice at least that further reduction of the track errors does not help reduce intensity errors at any lead time at the 90 % level. A limit seems to exist: even the perfect tracks could not help reduce the intensity errors further in the NATL basin.

In terms of PMIN errors, a similar behavior to VMAX errors is observed; the 2- and 3-day PMIN errors decrease from 15.5 and 16.9 hPa in the NATB sample to ~13.5 and 15.1 hPa in the NATG sample. This corresponds to 15 and 11 % decrease of the 2- and 3-day errors, respectively (Fig. 2b). Note, however, that the percentage of PMIN error reduction does not seem to match that of the VMAX errors due to uncertainty in the minimum sea level pressure calculation in the WRF model. This is because the sea level pressure is a diagnostic variable that is determined by several prognostic variables including geopotential height, pressure, temperature, and water vapor mixing ratio. Thus, the resulting improvement of the PMIN errors should be degraded, as these errors are the sum of the relative errors from other prognostic variables. In addition, there is also significant uncertainty in the observations of PMIN in the best track, which may contribute further to the fluctuations in the statistics as well. These explain the slightly smaller improvement of PMIN as compared to VMAX. When a stricter set of criteria for good track selection is applied (diamond line in Fig. 2b), the percentage of reduction is also nearly unchanged, as observed for the VMAX errors. Again, the change in the 1-day PMIN error is not significant in either the NATG sample or the sensitivity sample when stronger criteria for track selection are applied.

Fig. 2 a The absolute errors of the simulated maximum 10-m wind (VMAX, columns, unit: m s⁻¹) for the NATB experiment (dark gray), the NATG subset with the 1-, 2-, and 3-day absolute track errors ≤30, 50, and 70 km (medium gray), and the sensitivity sample with the 1-, 2-, and 3-day track errors ≤20, 35, and 50 km (light gray); b similar to a but for the absolute minimum sea level pressure errors (PMIN, unit: hPa). Superimposed are the corresponding track errors (lines) for the NATB experiment (circle), the NATG sample (square), and the sensitivity test (diamond). The error bars denote the 90 % confidence intervals, and the percentages of the improvement are provided next to the x-axis

Unlike the NATL basin, VMAX errors in the WPAC basin show several features not evident in the NATL basin cases. First, the VMAX errors in the WPAB experiments are higher than those in NATB at all times (Fig. 4a). In particular, the 1-day VMAX error is considerably larger than the corresponding error in the NATL basin due to much weaker incipient vortices initialized from the FNL analysis in the WPAC basin (cf. Fig. 3). On average, WPAC initial vortices are 30–35 % weaker than the observed intensity. There are some cases for which the strength of the initial vortices is not even half of the observed storms (e.g., Typhoon Sepat initialized at 0000 UTC 16 August 2007 or Typhoon Nari initialized at 0000 UTC 14 September 2007). This insufficient vortex initialization has a significant impact on the VMAX errors for the first 24 h in the WPAC basin as compared to the NATL basin.

Fig. 3 Mean initial difference of the maximum 10-m wind for the baseline sample (dark shaded) and the good track sample (dark striped), and the minimum sea level pressure for the baseline sample (gray) and the good track sample (gray striped) between the FNL analysis and the best track data for the North Atlantic basin and North-Western Pacific (lower panel). The average difference is calculated from all storms listed in Tables 1 and 2, and is stratified according to the storm initial intensity. The numbers next to the x-axis denote the number of cases for each bin

Despite the inferior vortex initialization, it is of interest to notice that the 2- and 3-day VMAX errors show slightly more improvement after the bad track simulations in the WPAB sample are eliminated (i.e., the WPAG sample). Although, as in the NATG subset, the 1-day VMAX error does not show any convincing decrease, the 2- and 3-day VMAX errors decrease from 10.6 and 12.1 m s⁻¹ in the WPAB sample to 9.1 and 9.8 m s⁻¹ in the WPAG sample, which correspond to 15 and 19 % decrease in the VMAX errors, respectively. This is noteworthy, as it indicates that the better TC tracks in WPAC tend to help improve the intensity forecast as efficiently as in the NATL basin. A recent study by Kehoe et al. (2007) shows that most of the large track error cases in WPAC are associated to some degree with the subtropical high system that tends to steer storms into an inimical environment. As a result, it is expected that improvement in the track forecast could lead to more noteworthy changes in the intensity errors in this basin.

With the 1-, 2-, and 3-day track errors in the WPAB experiment of ~58, 131, and 315 km, it is seen that, similar to the NATL basin, the percentages of the intensity error reduction are much smaller than those for track errors. The fact that both the NATL and WPAC basins exhibit similarly smaller intensity error reduction despite more than 50 % improvement of the track accuracy indicates again that a large portion of the intensity errors is determined by other factors such as model initialization or model physics. In particular, the model physics tends to play a major role in determining the predictability of the TC intensity at the times longer than days as this
is the factor that controls not only the physics of TCs, but also the characteristics of the ambient environment that the TCs are embedded in. There appears to be growing evidence of the important role of model physics at long ranges beyond days. For example, real-time experiments with the hurricane WRF model conducted at NCEP (see footnote 7) showed that intensity errors from different models with and without vortex initialization appear to be comparable after days into integration regardless of the vortex initial strength. This implies that the impacts of vortex initialization tend to be most influential for the first 36–48 h. Of course, such initial impacts could vary from case to case, but this suggests that the role of model physics should be more important at the longer range.

Footnote 7: Reports of numerous NCEP Hurricane WRF (HWRF) model real-time performance are available at: http://www.emc.ncep.noaa.gov/HWRF/weeklies

Of further significance is that although there is virtually no additional decrease of the intensity errors in the NATL basin when higher criteria for selecting good tracks are applied in the sensitivity analysis (the diamond solid line in Fig. 2a), there seems to be some extra decrease of the VMAX errors when these higher criteria are used in the WPAC basin (Fig. 4a). This is seen most clearly in terms of the PMIN errors, for which we notice that the smaller track errors in the sensitivity analysis could help reduce the 3-day PMIN errors by up to 21 % (Fig. 4b). This result is consistent with the larger improvement of the VMAX errors in the WPAG sample as compared to the NATG sample, and demonstrates that TC intensity in the WPAC basin appears to be more sensitive to the track errors. Therefore, any significant improvement in the track forecast skill is likely more beneficial to the intensity forecasts in WPAC than in the NATL basin.

Fig. 4 Similar to Fig. 2 but for the WPAC basin (WPAB, WPAG, and sensitivity samples)

4 Discussions and conclusions

In this study, the dependence of the tropical cyclone (TC) intensity errors on the track errors in the WRF-ARW model has been investigated. Two baseline experiments during the 2007–2010 TC seasons in the North Atlantic (NATL) and North-Western Pacific (WPAC) basins were first conducted to establish the climatology of the track and intensity errors for the WRF-ARW model, using a triple-nested storm-following high-resolution configuration with the NCEP final analysis (FNL) as the initial and boundary conditions. Examination of the maximum 10-m wind (VMAX) and the minimum sea level pressure (PMIN) errors in the NATL and WPAC basins showed that the 1-day intensity error is substantial in both basins due mostly to inadequate vortex representation inherited from the FNL dataset. This issue is more apparent in the WPAC basin, where both the coverage and quality of in situ observation data are low. Despite the significant intensity errors, the overall track errors in both the WPAC and NATL basins are, however, comparable to the current official track forecast errors, indicating the importance of the FNL dataset in providing a proper steering environment for the TC movement through the lateral boundary conditions.

By using a random physics ensemble approach to remove the model errors related to bias in model physics parameterization, it was found that the 2- and 3-day intensity errors can be reduced significantly as compared to the baseline experiments for both the WPAC and NATL basins after the large track error simulations were sorted out by selecting only simulations with 1-, 2-, and 3-day track errors smaller than 30, 50, and 70 km. In terms of VMAX, the 2- and 3-day errors for the good-track simulations are improved by 15 and 19 % for the NATL basin, whereas the 1-day intensity error shows no significant reduction. Such intensity improvement is, however, much smaller than the track improvement for the corresponding times, which are 55 and 76 %, respectively. This indicates
that a large portion of TC intensity uncertainty is determined by other factors, such as vortex initialization or internal TC physics that is not well represented in the WRF model.

For the WPAC basin, it was found that the reduction of the intensity errors after eliminating bad-track simulations is somewhat similar to that in the NATL basin; the 2- and 3-day VMAX errors decrease by about 15 and 19 % when the simulations with large track errors are excluded. The VMAX errors appear to decrease even further when stricter criteria for good-track selection are applied, whereas the same criteria could not help improve the intensity errors in the NATL basin. Such intensity improvement in WPAC indicates that the ambient environment tends to play an important role in the TC intensity forecast in this area. Thus, future improvement in the track forecast skill is expected to be favorable to the intensity skill in both the WPAC and NATL basins.

Given the results obtained so far with the WRF-ARW model, a lingering question is why there has been little improvement in the intensity forecast skill over the last 30 years despite remarkable progress in the track forecast skill. Note that the official intensity forecast skill is not simply derived from dynamical models, but typically from a statistical-dynamical model or subjective guidance that involves some empirical constraints. So, it is not possible to apply the results obtained with a dynamical model directly to the official skill. However, the intimate dependence of the official forecasts on the dynamical models suggests that several issues related to the dynamical models could help explain the stagnation of the intensity forecast skill.

First, with the 3-day official track error reduced from 450 km to ~250 km during the last 30 years, our results suggest that the corresponding improvement in the intensity skill associated with such improvement of track forecast skill may have been at most a few percent. These
intensity improvements are perhaps too small and could have been blurred by the growing complexity of the operational TC models. Furthermore, various inherent uncertainties in the current TC models could have blocked the slight progress in the intensity errors associated with better track forecasts. As seen from the good-track samples in both the WPAC and NATL basins, a 70 % reduction of the track errors could only deliver about a 15 % reduction of the intensity errors, even with the help of the NCEP FNL dataset. Thus, the intensity forecast skill would not improve much unless there were some significant progress in TC model physics.

Second, our conclusions obtained with the WRF model are strictly limited to track and intensity simulations rather than true track and intensity forecasts, due to the use of the NCEP FNL analysis data. As mentioned in Sect. 2, this final analysis dataset is essential to obtain as many good-track simulations as possible within our computational resources. With about 90 TCs and 35 cases with good-track simulations in each basin, it is clear that our results may not be entirely conclusive and should therefore be considered only as an upper limit for evaluating the track-intensity connection in the WRF model. In addition, there is potentially some considerable difference in the error statistics between strong and weak storms that our study could not explain due to the small sample size. In particular, simulations with strong storms could possess larger intensity errors due to a much larger initial intensity difference. In this study, however, we have not performed any stratification of storm statistics because the total number of cases after imposing the criteria for selecting the good-track cases was too small (~35 cases in total for each basin). Our implicit assumption was that the samples of both the reference and the good-track simulations are sufficiently homogeneous across the full range of storm initial intensity, model physics, and boundary influences
such that the relative improvement between the general statistics and the good-track statistics can be realized. The roles of model vortex initialization and the assimilation of additional sources of observations to enhance the storm environment will be examined in our upcoming study.

Acknowledgments We would like to thank Buck Sampson at the Naval Research Laboratory-Monterey for his various valuable suggestions and corrections. We would also like to extend our thanks to the two anonymous reviewers for their very constructive comments and suggestions, which helped improve the manuscript greatly. This research was supported by the Vietnam Ministry of Science and Technology Foundation DT.NCCB-DHUD.2011-G10. The FNL data for this study are from the Research Data Archive (RDA), which is maintained by the Computational and Information Systems Laboratory (CISL) at the National Center for Atmospheric Research (NCAR).
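
As an illustration of the good-track selection summarized in the concluding remarks, the following minimal Python sketch filters simulations by the 30, 50, and 70 km track-error thresholds quoted in the text and computes the percentage reduction of the mean absolute VMAX error relative to the full sample. It is not the authors' code. The data layout, the function names, and the requirement that a simulation pass all three thresholds simultaneously are assumptions made here for illustration, and the toy values are invented placeholders rather than results from the study.

```python
from statistics import mean

# Track-error thresholds (km) at 24, 48, and 72 h, as quoted in the text.
GOOD_TRACK_KM = {24: 30.0, 48: 50.0, 72: 70.0}


def is_good_track(track_err_km):
    """Assumption: a simulation is 'good-track' only if it passes every threshold."""
    return all(track_err_km[h] < limit for h, limit in GOOD_TRACK_KM.items())


def percent_reduction(baseline, subset):
    """Percentage by which `subset` error is smaller than `baseline` error."""
    return 100.0 * (baseline - subset) / baseline


def good_track_improvement(cases, hour):
    """Return (number of good-track cases, % reduction of mean |VMAX error|)."""
    good = [c for c in cases if is_good_track(c["track_err_km"])]
    base_err = mean(abs(c["vmax_err"][hour]) for c in cases)
    good_err = mean(abs(c["vmax_err"][hour]) for c in good)
    return len(good), percent_reduction(base_err, good_err)


# Hypothetical toy sample (invented numbers, units: km and m/s).
TOY_CASES = [
    {"track_err_km": {24: 20, 48: 40, 72: 60}, "vmax_err": {24: 5, 48: 8, 72: 8}},
    {"track_err_km": {24: 25, 48: 45, 72: 65}, "vmax_err": {24: -6, 48: -8, 72: -8}},
    {"track_err_km": {24: 60, 48: 150, 72: 300}, "vmax_err": {24: 7, 48: 14, 72: 14}},
]

if __name__ == "__main__":
    n_good, reduction = good_track_improvement(TOY_CASES, 48)
    print(f"{n_good} good-track cases, 2-day VMAX error reduced by {reduction:.1f} %")
```

The same percentage-reduction function can be applied to the track errors themselves, which is how a 55 or 76 % track improvement would be set against a 15 or 19 % intensity improvement for the same lead time.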