A brief introduction to Data Assimilation
================================================================================
+.. index:: single: Data Assimilation
+.. index:: single: true state
+.. index:: single: observation
+.. index:: single: a priori
+
**Data Assimilation** is a general framework for computing the optimal estimate
of the true state of a system, over time if necessary. It uses values obtained
both from observations and *a priori* models, including information about their
Fields reconstruction or measures interpolation
-----------------------------------------------
+.. index:: single: parameters identification
+
Fields reconstruction consists in finding, from a restricted set of real
measures, the physical field which is the most *consistent* with these measures.
This consistency is to be understood in terms of interpolation, that is to say that
-the field, we want to reconstruct using data assimilation on measures, has to
+the field we want to reconstruct, using data assimilation on measures, has to
fit the measures as closely as possible, while remaining constrained by the overall
calculation. The calculation is thus an *a priori* estimation of the field that
we seek to identify.
atmosphere, which indicates for example that the pressure at a point cannot
take any value independently of the value at this same point at the previous time.
We must therefore make the reconstruction of a field at any point in space, in
-order "consistent" with the evolution equations and measures of the previous
+a manner "consistent" with the evolution equations and measures of the previous
time steps.
Parameters identification or calibration
----------------------------------------
+.. index:: single: fields reconstruction
+
The identification of parameters by data assimilation is a form of calibration
which uses both the measurement and an *a priori* estimation (called the
"*background*") of the state that one seeks to identify, as well as a
Simple description of the data assimilation framework
-----------------------------------------------------
+.. index:: single: background
+.. index:: single: background error covariances
+.. index:: single: observation error covariances
+.. index:: single: covariances
+
We can write these features in a simple manner. By default, all variables are
vectors, as there are several parameters to readjust.
According to standard notations in data assimilation, we denote by
-:math:`\mathbf{x}^a` the optimal unknown parameters that is to be determined by
+:math:`\mathbf{x}^a` the optimal parameters that are to be determined by
calibration, :math:`\mathbf{y}^o` the observations (or experimental
-measurements) that we must compare the simulation outputs, :math:`\mathbf{x}^b`
-the background (*a priori* values, or regularization values) of searched
-parameters, :math:`\mathbf{x}^t` unknown ideals parameters that would give as
-output exactly the observations (assuming that the errors are zero and the model
-exact).
-
-In the simplest case, static, the steps of simulation and of observation can be
-combined into a single observation operator noted :math:`H` (linear or
-nonlinear), which transforms the input parameters :math:`\mathbf{x}` to results
-:math:`\mathbf{y}` to be compared to observations :math:`\mathbf{y}^o`.
-Moreover, we use the linearized operator :math:`\mathbf{H}` to represent the
-effect of the full operator :math:`H` around a linearization point (and we omit
-thereafter to mention :math:`H` even if it is possible to keep it). In reality,
-we have already indicated that the stochastic nature of variables is essential,
-coming from the fact that model, background and observations are incorrect. We
-therefore introduce errors of observations additively, in the form of a random
-vector :math:`\mathbf{\epsilon}^o` such that:
+measurements) that we must compare to the simulation outputs,
+:math:`\mathbf{x}^b` the background (*a priori* values, or regularization
+values) of the searched parameters, :math:`\mathbf{x}^t` the unknown ideal
+parameters that would give exactly the observations (assuming that the errors
+are zero and the model is exact) as output.
+
+In the simplest case, which is static, the steps of simulation and of
+observation can be combined into a single observation operator denoted :math:`H`
+(linear or nonlinear), which transforms the input parameters :math:`\mathbf{x}`
+to results :math:`\mathbf{y}` to be compared to observations
+:math:`\mathbf{y}^o`. Moreover, we use the linearized operator
+:math:`\mathbf{H}` to represent the effect of the full operator :math:`H` around
+a linearization point (and from here on we no longer mention :math:`H`, even
+though it could be kept). In reality, as already indicated, the stochastic
+nature of the variables is essential, because the model, the background and the
+observations are all affected by errors. We therefore introduce observation
+errors additively, in the form of a random vector :math:`\mathbf{\epsilon}^o` such
+that:
.. math:: \mathbf{y}^o = \mathbf{H} \mathbf{x}^t + \mathbf{\epsilon}^o
The errors represented here are not only those from observation, but also from
the simulation. We can always consider that these errors are of zero mean. We
-can then define a matrix :math:`\mathbf{R}` of the observation error covariance
+can then define a matrix :math:`\mathbf{R}` of the observation error covariances
by:
.. math:: \mathbf{R} = E[\mathbf{\epsilon}^o.{\mathbf{\epsilon}^o}^T]
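As an illustration only, the expectation defining :math:`\mathbf{R}` can be approximated by an average over many error samples. The sketch below assumes zero-mean Gaussian errors and a made-up matrix ``R_true``; all names and values are hypothetical and chosen only for the example.

```python
import numpy as np

# Assumed "true" observation error covariance (illustrative values only).
R_true = np.diag([0.25, 1.0])

# Draw many zero-mean error vectors epsilon^o with covariance R_true.
rng = np.random.default_rng(0)
eps = rng.multivariate_normal(np.zeros(2), R_true, size=20000)

# Sample estimate of E[eps . eps^T]: average of the outer products.
# No mean is subtracted because the errors are assumed to be of zero mean.
R_est = eps.T @ eps / eps.shape[0]

print(R_est)
```

With enough samples, ``R_est`` approaches ``R_true``. In practice, error samples are rarely available and :math:`\mathbf{R}` usually has to be modeled.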
error function (in variational assimilation) or from the filtering correction (in
assimilation by filtering).
-In **variational assimilation**, one classically attempts to minimize the
-following function :math:`J`:
+In **variational assimilation**, in a static case, one classically attempts to
+minimize the following function :math:`J`:
.. math:: J(\mathbf{x})=(\mathbf{x}-\mathbf{x}^b)^T.\mathbf{B}^{-1}.(\mathbf{x}-\mathbf{x}^b)+(\mathbf{y}^o-\mathbf{H}.\mathbf{x})^T.\mathbf{R}^{-1}.(\mathbf{y}^o-\mathbf{H}.\mathbf{x})
which is usually designated as the "*3D-VAR*" function. Since covariance matrices
are proportional to the variances of errors, their presence in both terms of the
function :math:`J` can effectively weight the differences by confidence in the
-background or observations. The parameters :math:`\mathbf{x}` realizing the
-minimum of this function therefore constitute the analysis :math:`\mathbf{x}^a`.
-It is at this level that we have to use the full panoply of function
-minimization methods otherwise known in optimization. Depending on the size of
-the parameters vector :math:`\mathbf{x}` to identify and ot the availability of
-gradient and Hessian of :math:`J`, it is appropriate to adapt the chosen
-optimization method (gradient, Newton, quasi-Newton ...).
+background or observations. The parameters vector :math:`\mathbf{x}` realizing
+the minimum of this function therefore constitutes the analysis
+:math:`\mathbf{x}^a`. It is at this level that we have to use the full panoply
+of function minimization methods otherwise known in optimization. Depending on
+the size of the parameters vector :math:`\mathbf{x}` to identify and on the
+availability of gradient and Hessian of :math:`J`, it is appropriate to adapt
+the chosen optimization method (gradient, Newton, quasi-Newton...).
In **assimilation by filtering**, in this simple case usually referred to as
-"*BLUE*"(for "*Best Linear Unbiased Estimator*"), the :math:`\mathbf{x}^a`
+"*BLUE*" (for "*Best Linear Unbiased Estimator*"), the :math:`\mathbf{x}^a`
analysis is given as a correction of the background :math:`\mathbf{x}^b` by a
term proportional to the difference between observations :math:`\mathbf{y}^o`
and calculations :math:`\mathbf{H}\mathbf{x}^b`:
Note that these "*3D-VAR*" and "*BLUE*" methods may be
extended to dynamic problems, called respectively "*4D-VAR*" and "*Kalman
-filter*". They can take account of the evolution operator to establish an
+filter*". They can take into account the evolution operator to establish an
analysis at the right time steps of the gap between observations and simulations,
and to have, at every moment, the propagation of the background through the
evolution model. Many other variants have been developed to improve the
calculation size and time.
Going further in the data assimilation framework
-++++++++++++++++++++++++++++++++++++++++++++++++
+------------------------------------------------
To get more information about all the data assimilation techniques, the reader
can consult introductory documents like [Argaud09], on-line training courses or
lectures like [Bouttier99] and [Bocquet04] (along with other materials coming
from geosciences applications), or general documents like [Talagrand97],
-[Tarantola87], [Kalnay03] and [WikipediaDA].
+[Tarantola87], [Kalnay03], [Ide97] and [WikipediaDA].
Note that data assimilation is not restricted to meteorology or geosciences, but
is widely used in other scientific domains. There are several fields in science
*inverse problems*, *Bayesian estimation*, *optimal interpolation*,
*mathematical regularisation*, *data smoothing*, etc. These terms can be used in
bibliographical searches.
-
-.. [Argaud09] Argaud J.-P., Bouriquet B., Hunt J., *Data Assimilation from Operational and Industrial Applications to Complex Systems*, Mathematics Today, pp.150-152, October 2009
-
-.. [Bouttier99] Bouttier B., Courtier P., *Data assimilation concepts and methods*, Meteorological Training Course Lecture Series, ECMWF, 1999, http://www.ecmwf.int/newsevents/training/rcourse_notes/pdf_files/Assim_concepts.pdf
-
-.. [Bocquet04] Bocquet M., *Introduction aux principes et méthodes de l'assimilation de données en géophysique*, Lecture Notes, 2004-2008, http://cerea.enpc.fr/HomePages/bocquet/assim.pdf
-
-.. [Tarantola87] Tarantola A., *Inverse Problem: Theory Methods for Data Fitting and Parameter Estimation*, Elsevier, 1987
-
-.. [Talagrand97] Talagrand O., *Assimilation of Observations, an Introduction*, Journal of the Meteorological Society of Japan, 75(1B), pp.191-209, 1997
-
-.. [Kalnay03] Kalnay E., *Atmospheric Modeling, Data Assimilation and Predictability*, Cambridge University Press, 2003
-
-.. [WikipediaDA] Wikipedia/Data_assimilation: http://en.wikipedia.org/wiki/Data_assimilation