Kondrashov, D., and M. Ghil (2006), Spatio-temporal filling of missing points in geophysical data sets, Nonlinear Processes in Geophysics, 13, 151–159, doi:10.5194/npg-13-151-2006

## Abstract

The majority of data sets in the geosciences are obtained from observations and measurements of natural systems, rather than in the laboratory. These data sets are often full of gaps, due to the conditions under which the measurements are made. Missing data give rise to various problems, for example in spectral estimation or in specifying boundary conditions for numerical models. Here we use Singular Spectrum Analysis (SSA) to fill the gaps in several types of data sets. For a univariate record, our procedure uses only temporal correlations in the data to fill in the missing points. For a multivariate record, multi-channel SSA (M-SSA) takes advantage of both spatial and temporal correlations. We iteratively produce estimates of missing data points, which are then used to compute a self-consistent lag-covariance matrix; cross-validation allows us to optimize the window width and number of dominant SSA or M-SSA modes to fill the gaps. The optimal parameters of our procedure depend on the distribution in time (and space) of the missing data, as well as on the variance distribution between oscillatory modes and noise. The algorithm is demonstrated on synthetic examples, as well as on data sets from oceanography, hydrology, atmospheric sciences, and space physics: global sea-surface temperature, flood-water records of the Nile River, the Southern Oscillation Index (SOI), and satellite observations of relativistic electrons.

## Authors (sorted by name)

Ghil Kondrashov

## Journal / Conference

Nonlinear Processes in Geophysics

## Acknowledgments

It is a pleasure to thank R. Vautard for the original suggestion of using the Toeplitz form of the lag-covariance matrix in the presence of data gaps. D. Percival and T. De Putter kindly provided several sets of Nile River records in digitized form; see Kondrashov et al. (2005a) for details. We are also grateful to Y. Shprits for providing the CRRES measurements and for useful discussions. This work is supported by NSF grant ATM00-81231.

## Grants

ATM-0082131

## Bibtex

@Article{npg-13-151-2006,
AUTHOR = {Kondrashov, D. and Ghil, M.},
TITLE = {Spatio-temporal filling of missing points in geophysical data sets},
JOURNAL = {Nonlinear Processes in Geophysics},
VOLUME = {13},
YEAR = {2006},
NUMBER = {2},
PAGES = {151--159},
URL = {https://www.nonlin-processes-geophys.net/13/151/2006/},
DOI = {10.5194/npg-13-151-2006},
abstract = {The majority of data sets in the geosciences are obtained from observations and measurements of natural systems, rather than in the laboratory. These data sets are often full of gaps, due to the conditions under which the measurements are made. Missing data give rise to various problems, for example in spectral estimation or in specifying boundary conditions for numerical models. Here we use Singular Spectrum Analysis (SSA) to fill the gaps in several types of data sets. For a univariate record, our procedure uses only temporal correlations in the data to fill in the missing points. For a multivariate record, multi-channel SSA (M-SSA) takes advantage of both spatial and temporal correlations. We iteratively produce estimates of missing data points, which are then used to compute a self-consistent lag-covariance matrix; cross-validation allows us to optimize the window width and number of dominant SSA or M-SSA modes to fill the gaps. The optimal parameters of our procedure depend on the distribution in time (and space) of the missing data, as well as on the variance distribution between oscillatory modes and noise. The algorithm is demonstrated on synthetic examples, as well as on data sets from oceanography, hydrology, atmospheric sciences, and space physics: global sea-surface temperature, flood-water records of the Nile River, the Southern Oscillation Index (SOI), and satellite observations of relativistic electrons.
}
}