
ABSTRACT

This study evaluates and compares the performances of several variants of the popular ensemble


1. Introduction

Providing accurate and timely forecasts of storm surge is a problem of critical importance. We consider the problem of improving the relative accuracy of short-range forecasts of storm surge using sophisticated models solved on numerically coarse grids to provide timely predictions of elevated water levels. While coarse discretizations of models may be used to quickly forecast storm surge, we expect large numerical errors to arise due to the discretizations. We implement various data assimilation methodologies to compare the relative performances and capabilities of these schemes in improving the accuracy of forecasts. Below, we summarize the recent history of storm surge events that has spurred the mathematical development of state-of-the-art hydrodynamic models.

The effects of storm surge from a number of extreme weather events dating back several decades have motivated efforts to accurately forecast water elevations in order to minimize both the impact on economic activities and the loss of human life. In 1953, a catastrophic storm in the

The modeling and numerical simulation of storm surge has undergone several stages of evolution since the 1953

To reduce the computational cost, a parallel architecture and advanced numerical discretization schemes were adopted in the advanced circulation (ADCIRC) storm surge model, enabling 20 min of wall clock time per day of real-time simulation on very fine grids using 16 384 cores (

Data assimilation (DA) methodologies can enrich model simulations and predictions by constraining their outputs with available observations. DA methods generally fall into one of two categories: variational methods that are essentially least squares model data fitting methods and sequential methods based on the

Different EnKF variants were developed in recent years. Depending on whether or not the observations are perturbed before assimilation, it is customary to classify these variants of the EnKF as belonging to one of two types (Tippett et al. 2003): stochastic EnKF (SEnKF; see, e.g., Burgers et al. 1998; Houtekamer and Mitchell 1998) or deterministic ensemble square root filters (SR-EnKF; see, e.g., Anderson 2001; Bishop et al. 2001; Whitaker and Hamill 2002; Hoteit et al. 2002). A SEnKF essentially updates each forecast ensemble member with perturbed observations using the

Specific to storm surge modeling resulting from hurricanes, the SEIK filter was recently applied to the short-range forecasting problem using the extensively validated ADCIRC model (

In practice, the ensemble sizes of EnKFs are significantly smaller than the numerical dimension of the system state [e.g., ensemble sizes are often O(10-100), while the dimension of state vectors can be in the millions]. Hence, the sample error covariance matrix is always singular. This presents a challenge for the EnKFs to use discrepancies between model forecasts and data to accurately update the system state. Indeed, a small ensemble size could lead to systematically underestimated variances and spuriously large cross covariances in the sample error covariance matrix (Hamill et al. 2001). These specific issues often limit the performance of an EnKF. To mitigate these undesirable effects, it is customary to introduce the auxiliary techniques of covariance inflation (Anderson and Anderson 1999) and localization (Hamill et al. 2001). Specifically, covariance inflation partly addresses the underestimation of the variances, while covariance localization tackles the problems of singularity and overestimation of the cross covariances. In this work, more attention is paid to the efficiency and usefulness of localization methods in the context of storm surge forecasting, since the amount of available data is limited by the number of observation sensors deployed over coastal regions.
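The rank deficiency described above can be illustrated with a minimal sketch. It assumes NumPy and toy dimensions (a 50-variable state and 10 members, not the ADCIRC state); the function name `sample_covariance` is hypothetical.

```python
import numpy as np

def sample_covariance(ensemble):
    """Unbiased sample covariance of an ensemble stored as (state_dim, n_members)."""
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    return anomalies @ anomalies.T / (n_members - 1)

rng = np.random.default_rng(0)
state_dim, n_members = 50, 10               # toy sizes chosen for illustration
ensemble = rng.standard_normal((state_dim, n_members))

P = sample_covariance(ensemble)
rank = np.linalg.matrix_rank(P)

# The members are drawn with independent variables, so every nonzero
# off-diagonal entry of P is a purely spurious cross covariance.
max_spurious = np.abs(P - np.diag(np.diag(P))).max()
print(rank, max_spurious)
```

Because the anomalies sum to zero across members, the rank of the sample covariance cannot exceed the ensemble size minus one, whatever the state dimension.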

The paper is organized as follows: Section 2 presents an overview of the various EnKFs implemented in this study. Section 3 describes the auxiliary techniques of inflation and localization. An overview of the storm surge model, ADCIRC, is presented in section 4. In section 5, the performances of the various filters for forecasting the storm surge of Hurricane Ike are analyzed. Concluding remarks follow in section 6.

2. Ensemble

Consider the state estimation problem for the following abstract system:

... (2.1a)

... (2.1b)

Here, $x_k \in \mathbb{R}^{m_x}$ is the $m_x$-dimensional system state at time instant $k$, $y_k \in \mathbb{R}^{m_y}$ is the corresponding measurement (observation) of $x_k$, $u_k \in \mathbb{R}^{m_x}$ is the dynamical noise, and $v_k \in \mathbb{R}^{m_y}$ is the observation noise. The transition operator $\mathcal{M}_{k,k-1}: \mathbb{R}^{m_x} \to \mathbb{R}^{m_x}$ maps $x_{k-1}$ to $x_k$, and the observation operator $\mathcal{H}_k: \mathbb{R}^{m_x} \to \mathbb{R}^{m_y}$ projects $x_k$ from the state space onto the observation space. When $\mathcal{M}_{k,k-1}$ and $\mathcal{H}_k$ are linear operators, for example, matrices, it is common to rewrite them in a different font style, as $\mathbf{M}_{k,k-1}$ and $\mathbf{H}_k$, respectively, to distinguish them from the operators in the nonlinear cases (see, e.g., the appendix). It is also assumed that $u_k$ and $v_k$ are independent white noise processes with zero mean and covariance matrices $\mathbf{Q}_k$ and $\mathbf{R}_k$, respectively.
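For orientation, one form of the system that is consistent with these definitions is sketched here. It assumes additive noise, and it is offered as an illustration rather than as a reproduction of Eqs. (2.1a) and (2.1b):

```latex
\begin{align}
x_k &= \mathcal{M}_{k,k-1}(x_{k-1}) + u_k, \\
y_k &= \mathcal{H}_k(x_k) + v_k .
\end{align}
```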

The EnKFs estimate the system state $x_k$ at time instant $k$, given the observations $Y_k = \{y_k, y_{k-1}, \ldots\}$ up to and including time $k$ and some prior knowledge of the system state $x_i$ at some instant $i \le k$. If both the dynamical and observation systems are linear, the minimum variance [and maximum a posteriori (MAP)] solution to the state estimation problem is determined by the

The literature provides many variations on the implementation of the classical EnKF. In this study, we confine ourselves to the following variants: the SEnKF and three SR-EnKFs, namely, the ETKF, EAKF, and SEIK. For conciseness, we outline the main procedures of these filters in the appendix. To avoid complicating the discussion, we have focused on introducing the "plain" forms of these ensemble filters in the appendix without covariance inflation or localization. However, these two important auxiliary techniques are adopted in all of the numerical experiments and are briefly discussed in section 3 below.
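As a rough illustration of the "plain" stochastic variant, the following sketch implements a perturbed-observation analysis step for a linear observation operator. It is a textbook-style form written under stated assumptions (NumPy, a toy three-variable state, additive Gaussian observation noise) and is not the appendix algorithm of this paper; the function name `senkf_analysis` and all numerical values are hypothetical.

```python
import numpy as np

def senkf_analysis(ensemble, y_obs, H, R, rng):
    """Perturbed-observation EnKF update (no inflation, no localization).

    ensemble: (m_x, N) forecast members, y_obs: (m_y,),
    H: (m_y, m_x) linear observation matrix, R: (m_y, m_y) noise covariance.
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    HA = H @ anomalies
    # Sample covariances projected into observation space.
    PHt = anomalies @ HA.T / (n_members - 1)
    HPHt = HA @ HA.T / (n_members - 1)
    # Kalman gain K = P H^T (H P H^T + R)^{-1}, computed with a linear solve.
    K = np.linalg.solve(HPHt + R, PHt.T).T
    # Each member assimilates its own noisy copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=n_members).T
    perturbed = y_obs[:, None] + noise
    return ensemble + K @ (perturbed - H @ ensemble)

rng = np.random.default_rng(42)
forecast = rng.standard_normal((3, 50))     # toy state with 50 members
H = np.array([[1.0, 0.0, 0.0]])             # observe the first variable only
R = np.array([[0.01]])
y_obs = np.array([2.0])
analysis = senkf_analysis(forecast, y_obs, H, R, rng)
```

The deterministic square root variants differ in that they update the ensemble mean and the anomalies separately, without perturbing the observations.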

3. Two auxiliary techniques in the EnKF

When an EnKF is used for data assimilation in large-scale models, more often than not we can only afford to implement the filter with a relatively small ensemble size. This results in some undesirable effects such as rank deficiency, underestimation of variances of the system state, and overestimation of the corresponding cross covariances (Hamill et al. 2009; Whitaker and Hamill 2002). It is customary to introduce covariance inflation (Anderson and Anderson 1999) and localization (Hamill et al. 2001) in order to mitigate these effects.

Covariance inflation addresses the problem of variance underestimation (Anderson and Anderson 1999). It is motivated by the observation that the sample variances of the system state tend to be underestimated with a relatively small ensemble size (and often neglected model errors), so we deliberately inflate the variances by a prescribed amount.¹ In many situations, proper covariance inflation not only improves the estimation accuracy of the filter (Anderson and Anderson 1999), but also enhances its robustness from the point of view of robust filtering (Luo and Hoteit 2011) or "residual nudging" in the observation space (Luo and Hoteit 2013). Various inflation methods have been proposed and studied in the literature (see, e.g., Altaf et al. 2013; Anderson and Anderson 1999; Anderson 2007, 2009; Bocquet and Sakov 2012; Hamill and Whitaker 2011; Luo and Hoteit 2011, 2013; Meng and Zhang 2007; Miyoshi 2011; Whitaker and Hamill 2012; Zhang et al. 2004). A numerical comparison of different inflation schemes is beyond the scope of the current work. In this study we adopt the conventional inflation scheme originally proposed by Anderson and Anderson (1999) in all of the numerical experiments. Specifically, we implemented this scheme in such a way that the forecast sample covariance is (in effect) multiplied by a constant factor $\lambda^2 \equiv (1 + \delta)^2$ for a positive scalar $\delta$.
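A minimal sketch of this multiplicative inflation is given below. It assumes NumPy and a toy ensemble, and the function name `inflate` is hypothetical. Scaling the anomalies about the ensemble mean by 1 + delta leaves the mean unchanged and multiplies the sample covariance by (1 + delta) squared.

```python
import numpy as np

def inflate(ensemble, delta):
    """Scale anomalies about the mean by lam = 1 + delta (ensemble is (state_dim, N))."""
    lam = 1.0 + delta
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + lam * (ensemble - mean)

rng = np.random.default_rng(1)
ens = rng.standard_normal((4, 10))          # toy ensemble: 4 variables, 10 members
inflated = inflate(ens, delta=0.2)          # lam = 1.2, so covariance scales by 1.44

cov_before = np.cov(ens)                    # rows are variables, columns are members
cov_after = np.cov(inflated)
```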

Localization is introduced into the EnKF in order to tackle the problems of rank deficiency and spuriously large cross covariances between different state variables (Hamill et al. 2001). One popular localization method is covariance localization (CL; see, e.g., Hamill et al. 2001). In this method, a tapering matrix based on the distances between the grid points of a physical model is computed. The

4. The ADCIRC model

The ADCIRC model (Luettich and Westerink 2005) solves the shallow water equations (SWEs) that describe the changes in sea surface elevation and depth-integrated horizontal flow on spatial domains such as the

Many hindcast studies of hurricanes from 1965 to 2008 have been used to verify and validate the ADCIRC model (see, e.g., Westerink et al. 2008; Bunya et al. 2010; Dietrich et al. 2010; Kennedy et al. 2011; Hope et al. 2013). The model may be run in forecast mode where data on the hurricane track and forward speed, and wind characteristics (wind speed, central pressure, and radius-to-maximum winds), are obtained every 6 h from the

5. Numerical experiments

In this section, results of the various EnKFs discussed in section 2 (all equipped with the LA and inflation techniques) are presented. We use meteorological data from Hurricane Ike, which at its peak was a category 4 hurricane and was a category 2 hurricane upon making landfall along the upper

a. Configuration

The assimilation experiments are conducted using two different configurations of ADCIRC. The first configuration uses a fine-resolution grid including the

The second configuration contains model errors (with respect to the hindcast configuration) and is used as the forecast model in the filters. The forecast model is configured using a coarser-resolution grid including only the

Since the results of the hindcast studies have been validated, the corresponding global output is considered as the truth and is compared to the solution of the coarse model to evaluate and compare the performance of the various ensemble filters. In all the experiments, we set the standard deviation of the measurement noise of the hindcast data to produce an assumed 95% confidence interval of ±0.01 m, as in

For the coarse model, after a 24-h spinup period between

To generate a representative initial ensemble with a small number of ensemble members, we apply an empirical orthogonal function (EOF) analysis by second-order exact sampling as done in earlier studies (Pham 2001; Hoteit et al. 2013). We simulated the ADCIRC model for 60 days using only tidal forcing to eliminate all transient behavior and recorded the model state every 5 h. The perturbations of these states from their mean are used to define a sample covariance matrix P from which the initial ensemble members are drawn. The ratio ... (with $\sigma_j$ being the $j$th eigenvalue of P) represents the relative error in the squared $L^2$ norm of approximations to the state in an $(n - 1)$-dimensional space and is useful in determining the ensemble size $n$ given a prescribed $L^2$ error tolerance, which is also the percentage of variance retained by the EOFs. In the experiments below, we start with an ensemble size of $n = 10$ that retains approximately 90% of the variance of this sequence of states, suggesting, as expected, that the water elevation exhibits a low-dimensional structure when forced with tidal data.
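The choice of ensemble size from the retained variance can be sketched as follows. This is a minimal illustration that assumes NumPy and synthetic low-rank snapshots in place of the tidal simulation. It takes the retained variance to be the cumulative fraction of the covariance eigenvalues, which is an assumption because the exact ratio is not reproduced above, and the function names are hypothetical.

```python
import numpy as np

def retained_variance(snapshots):
    """Cumulative variance fraction captured by the leading EOFs.

    snapshots: (state_dim, n_snapshots) array of recorded model states.
    """
    anomalies = snapshots - snapshots.mean(axis=1, keepdims=True)
    # Squared singular values of the anomalies are the covariance
    # eigenvalues up to a constant factor, which cancels in the ratio.
    s = np.linalg.svd(anomalies, compute_uv=False)
    eigvals = s ** 2
    return np.cumsum(eigvals) / eigvals.sum()

def ensemble_size_for(snapshots, target=0.90):
    """Smallest n such that n - 1 EOFs retain at least `target` of the variance."""
    frac = retained_variance(snapshots)
    n_modes = int(np.searchsorted(frac, target)) + 1
    return n_modes + 1    # n members span an (n - 1)-dimensional space

# Synthetic snapshots: three dominant modes plus weak noise (toy data).
rng = np.random.default_rng(2)
modes = rng.standard_normal((40, 3))
amplitudes = rng.standard_normal((3, 200)) * np.array([[5.0], [3.0], [2.0]])
snapshots = modes @ amplitudes + 0.01 * rng.standard_normal((40, 200))

frac = retained_variance(snapshots)
n = ensemble_size_for(snapshots, target=0.90)
```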

b. Results and discussion

To quantify and compare the various filter performances, an rms error metric is used. Figure 5 plots the average rms errors of the maximum water level forecasts for the Ike simulations using the SEnKF and the three SR-EnKFs with different values of inflation factor $\lambda$ and radii (in kilometers). The assimilation results show that the SR-EnKFs perform very well with an ensemble of 10 members, though, as expected, the results are dependent on the localization radius. The optimal size for the LA varies from 25 to 100 km for all the SR-EnKFs.
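For reference, an rms error of this kind can be computed as in the sketch below. It assumes NumPy and averages over all supplied points; how the errors are averaged in the experiments (over nodes, times, or both) is not specified here, so the function name `rms_error` and the toy values are purely illustrative.

```python
import numpy as np

def rms_error(forecast, truth):
    """Root-mean-square difference between forecast and truth over all entries."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy maximum water levels (m) at three points, not values from the study.
truth = np.array([1.0, 2.0, 3.0])
forecast = np.array([1.5, 2.0, 2.5])
err = rms_error(forecast, truth)
```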

The rms error of the SEIK filter varies from 0.58 to 0.75 m, with the smallest rms error obtained using $\lambda = 1.2$ and a radius of 100 km. Overall, the SEIK filter is able to reduce the rms error by almost 27% as compared to the forecasted average rms error when no localization is used. The ETKF and EAKF exhibit similar trends. The smallest rms error for the ETKF is obtained using $\lambda = 1.2$ and a radius of 25 km and for the EAKF using $\lambda = 1.3$ and a radius of 100 km. The SEIK and the ETKF show very similar trends, while the EAKF provides comparable results with appropriate choices of localization and inflation. The EAKF is more sensitive to (and requires larger values of) the inflation. In particular, the EAKF requires stronger localization radii than the ETKF and SEIK and fails to provide significant improvements with large radii. Such a difference in behavior can possibly be attributed to the serial assimilation of the observations in the EAKF when it is equipped with LA. For any filter using a 2000-km radius (which is a large radius compared to the size of the

By comparison, improvements are not as pronounced in the SEnKF with an ensemble of 10 members. The rms errors for the SEnKF vary between 0.66 and 0.75 m, with the smallest rms error obtained using $\lambda = 1.2$ and a radius of 500 km. Overall, no clear pattern of improvement is found with the SEnKF compared to the forecasted average rms error when LA is used for the three SR-EnKFs. It is likely that the large rms errors in the SEnKF are due to the observation sampling errors being amplified with the use of a small ensemble size in these runs, which is a documented phenomenon (Nerger et al. 2005).

Figure 6 shows the average rms errors of the maximum water level forecasts using the SEnKF and the ETKF for ensembles with $N = 10$, $N = 20$, and $N = 40$ members, respectively. Here, the SEnKF is compared only against the ETKF based on the results with an ensemble of size $N = 10$, where all three SR-EnKFs demonstrated comparable performances. As expected, the results show that the SEnKF performs better with increasing ensemble size, and a pattern becomes visible in the rms errors when the ensemble size reaches 40, as we get close to the number of assimilated observations. The rms errors for the SEnKF now vary between 0.54 and 0.75 m, with the smallest rms error obtained using $\lambda = 1.1$ and a radius of 100 km. Although the results from the ETKF remain comparatively better than those of the SEnKF, we expect that the SEnKF will converge to similar results with larger ensemble sizes. It is evident from Fig. 6 that the ETKF forecasts improve only slightly as we increase the ensemble size, and these improvements are not as pronounced as in the SEnKF.

While the averaged rms errors provide a summary statistic of the estimation errors, they fail to provide useful information about the time or location where they occur. We are also interested in certain pointwise errors of maximum water level forecasts along the coast (29°-29.8°N, 94.4°-95.25°W; see Fig. 7) and forecasts of water elevations at particular times along the coast. Specifically, the forecast errors in the times leading up to the landfall event for Hurricane Ike are of particular importance and interest. Since it is not possible to study each configuration, the figures presented below illustrate the improvements in the errors obtained using the ETKF compared to the SEnKF for 2-h forecasts of the storm surge using the best values of the inflation factor $\lambda$ and radii in the LA.

Figures 7 and 8 show plots of the errors between the true forecasts and analysis of water elevations at

Figures 9 and 10 show plots of the hydrographs of data from the hindcast at two stations close to the landfall areas. In these hydrographs, the stars denote the true measurements at the assimilation times, the plus signs denote the forecasted results with the 95% confidence intervals represented by the vertical dashed lines centered at plus signs, and the circles are the analyzed results for the ETKF with $\lambda = 1.2$ and a radius of 25 km and the SEnKF with $\lambda = 1.1$ and a radius of 100 km, respectively. We observe that forecast errors increase right before or during the surge. The analysis steps bring the model closer to the truth over the entire assimilation window. In particular, the ETKF performs very well, providing accurate forecast updates. Overall, the estimated uncertainties are quite reasonable, with the truth falling within the estimated 95% confidence intervals.

Finally, Fig. 11 compares the forecast ensemble standard deviation and rms errors between the forecast ensemble members and the truth for the three stations close to the landfall area during the landfall period. These results are again for the best choices of inflation factor and localization radii (i.e., the ETKF with $\lambda = 1.2$ and a radius of 25 km and the SEnKF with $\lambda = 1.1$ and a radius of 100 km). We observe that the ensemble variances are generally comparable to the rms error. The ETKF produces rms errors that are consistently smaller than those of the SEnKF during the storm period. By comparison, the rms error in the SEnKF is more consistent with the forecasted ensemble variances, particularly during the few hours preceding the landfall.
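The two quantities compared in this kind of diagnostic can be sketched for a single station and time as follows. The sketch assumes NumPy and toy water levels; the function name `spread_and_rmse` is hypothetical, and the use of the unbiased (N - 1) standard deviation is an assumption.

```python
import numpy as np

def spread_and_rmse(members, truth):
    """Ensemble standard deviation and rms error of the members against the truth.

    members: (n_members,) forecasts of one scalar, e.g., water level at a station.
    """
    members = np.asarray(members, dtype=float)
    spread = float(members.std(ddof=1))
    rmse = float(np.sqrt(np.mean((members - truth) ** 2)))
    return spread, rmse

# Toy four-member forecast (m); the truth lies at the ensemble mean here.
members = [0.9, 1.0, 1.1, 1.2]
spread, rmse = spread_and_rmse(members, truth=1.05)

# Same ensemble, but with the truth far outside it: rmse exceeds the spread.
spread_far, rmse_far = spread_and_rmse(members, truth=1.5)
```

When the rms error is much larger than the spread, as in the second call, the ensemble is overconfident, which is the situation that covariance inflation is meant to counteract.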

6. Conclusions

We investigated and compared the impacts of covariance inflation and localization on four ensemble

The assimilation results also suggest that the (deterministic) square root ensemble

Acknowledgments. Research reported in this publication was supported by the

1 Covariance inflation is also done through the forgetting factor in Pham et al. (1998).

REFERENCES

Altaf, M. U., T.

Anderson, J. L., 2001: An ensemble adjustment

_____, 2007: An adaptive covariance inflation error correction algorithm for ensemble filters. Tellus, 59A, 210-224, doi:10.1111/j.1600-0870.2006.00216.x.

_____, 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus, 61A, 72-83, doi:10.1111/j.1600-0870.2008.00361.x.

_____, and

Beezley, J. D., and

Bennett, A., 1992: Inverse Methods in Physical Oceanography.

Berg, R., 2009: Tropical cyclone report: Hurricane Ike. National Hurricane Center Rep., 55 pp.

Bishop, C. H.,

Blain, C. A., J. J. Westerink, and

Blake, E. S.,

Bocquet, M., and P. Sakov, 2012: Combining inflation-free and iterative ensemble

Bunya, S., and Coauthors, 2010: A high-resolution coupled riverine flow, tide, wind, wind wave, and storm surge model for southern

Burgers, G.,

Cohn, S. E., and R. Todling, 1996: Approximate data assimilation schemes for stable and unstable dynamics. J. Meteor. Soc.

_____,

Dietrich, J. C., and Coauthors, 2010: A high-resolution coupled riverine flow, tide, wind, wind wave, and storm surge model for southern

_____, and Coauthors, 2011: Hurricane Gustav (2008) waves and storm surge: Hindcast, synoptic analysis, and validation in southern

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using

_____, 2003: The ensemble

Fleming, J. G.,

Greybush, S. J.,

Hamill, T. M., and

_____, _____, and

_____, _____,

Heaps, N. S., 1983: Storm surges, 1967-1982. Geophys.

Hope, M. E., and Coauthors, 2013: Hindcast and validation of Hurricane Ike (2008) waves, forerunner, and storm surge. J. Geophys. Res. Oceans, 118, 4424-4460, doi:10.1002/jgrc.20314.

Horn, R., and

Hoteit, I.,

_____, T. Hoar, G. Gopalakrishnan,

Houtekamer, P. L., and

Hunt, B. R.,

Janjić, T.,

Kalman, R., 1960: A new approach to linear filtering and prediction problems. J. Fluids Eng., 82, 35-45, doi:10.1115/1.3662552.

Kennedy, A., and Coauthors, 2011: Origin of the Hurricane Ike forerunner surge. Geophys. Res. Lett., 38, L08608, doi:10.1029/2011GL047090.

Luettich, R., and

Luo, X., and

_____, and I. Hoteit, 2011: Robust ensemble filtering and its relation to covariance inflation in the ensemble Kalman filter. Mon. Wea. Rev., 139, 3938-3953, doi:10.1175/MWR-D-10-05068.1.

_____, and _____, 2012: Ensemble Kalman filtering with residual nudging. Tellus, 64A, 17130, doi:10.3402/tellusa.v64i0.17130.

_____, and _____, 2013: Covariance inflation in the ensemble

Meng, Z., and F. Zhang, 2007: Tests of an ensemble

Miyoshi, T., 2011: The Gaussian approach to adaptive covariance inflation and its implementation with the local ensemble transform

Nerger, L., W. Hiller, and J. Schröter, 2005: A comparison of error subspace

_____, T. Janjić, J. Schröter, and W. Hiller, 2012a: A regulated localization scheme for ensemble-based

_____, _____, _____, and _____, 2012b: A unification of ensemble square root

Pham, D. T., 2001: Stochastic methods for sequential data assimilation in strongly nonlinear systems. Mon. Wea. Rev., 129, 1194-1207, doi:10.1175/1520-0493(2001)129<1194:SMFSDA>2.0.CO;2.

_____,

Sakov, P., and

Simon, D., 2006: Optimal State Estimation:

Tippett, M. K.,

Verlaan, M., and A. W. Heemink, 1997: Tidal flow forecasting using reduced rank square root filters. Stochastic Hydrol. Hydraul., 11, 349-368, doi:10.1007/BF02427924.

Wang, X.,

Westerink, J. J., and Coauthors, 2008: A basin- to channel-scale unstructured grid hurricane storm surge model applied to southern

Whitaker, J. S., and

_____, and _____, 2012: Evaluating methods to account for system errors in ensemble data assimilation. Mon. Wea. Rev., 140, 3078-3089, doi:10.1175/MWR-D-11-00276.1.

Wolf, P., 2003: 1953 U.K. floods: 50-year retrospective. Risk Management Solutions Rep., 11 pp. [Available online at http://storage.pardot.com/15772/68008/fl_1953_uk_floods_50_retrospective.pdf.]

Zhang, F.,

Zupanski, M., 2005: Maximum likelihood ensemble filter: Theoretical aspects. Mon. Wea. Rev., 133, 1710-1726, doi:10.1175/MWR2946.1.


of Technology, Delft,


(Manuscript received

Corresponding author address: I. Hoteit, KAUST, 4700

E-mail: ibrahim.hoteit@kaust.edu.sa
