doc/theory.rst

   1 .. _section_theory:
   2
   3 ================================================================================
   4 A brief introduction to Data Assimilation
   5 ================================================================================
   6
   7 .. index:: single: Data Assimilation
   8 .. index:: single: true state
   9 .. index:: single: observation
  10 .. index:: single: a priori
  11
  12
  13 **Data Assimilation** is a general framework for computing the optimal estimate
  14 of the true state of a system, over time if necessary. It uses values obtained
  15 both from observations and *a priori* models, including information about their
  16 errors.
  17
  18 In other words, data assimilation merges measurement data, the observations,
  19 with *a priori* physical and mathematical knowledge, embedded in numerical
  20 models, to obtain the best possible estimate of the true state and of its
  21 stochastic properties. Note that this true state cannot be reached, but can
  22 only be estimated. Moreover, although the information used is stochastic by
  23 nature, data assimilation provides deterministic techniques to perform the
  24 estimation.
  25
  26 Data assimilation has two main types of applications, both covered by the
  27 same formalism: **parameters identification** and **fields reconstruction**.
  28 Before introducing the `Simple description of the data assimilation framework`_
  29 in a later section, we briefly describe these two types. At the end, some
  30 references allow `Going further in the data assimilation framework`_.
  31
  32 Fields reconstruction or measures interpolation
  33 -----------------------------------------------
  34
  35 .. index:: single: fields reconstruction
  36
  37 Fields reconstruction consists of finding, from a restricted set of real
  38 measurements, the physical field that is most *consistent* with them.
  39
  40 This consistency is to be understood in terms of interpolation: the field we
  41 want to reconstruct, using data assimilation on the measurements, has to fit
  42 the measurements as well as possible while remaining constrained by the overall
  43 calculation. The calculation is thus an *a priori* estimate of the field that
  44 we seek to identify.
  45
  46 If the system evolves in time, the reconstruction has to be established at every
  47 time step, as a whole. The interpolation process is then more complicated, since
  48 it is temporal and not only concerned with the instantaneous values of the
  49 field.
  50
  51 A simple example of fields reconstruction comes from meteorology, in which we
  52 look for the values of variables such as temperature or pressure at all points
  53 of the spatial domain. We have instantaneous measurements of these quantities at
  54 certain points, but also a history of these measurements. Moreover, these
  55 variables are constrained by evolution equations for the state of the
  56 atmosphere, which indicate for example that the pressure at a point cannot
  57 take any value independently of its value at the same point at the previous time.
  58 We must therefore reconstruct the field at every point in space, in a manner
  59 "consistent" with the evolution equations and with the measurements of the
  60 previous time steps.
  61
  62 Parameters identification or calibration
  63 ----------------------------------------
  64
  65 .. index:: single: parameters identification
  66
  67 The identification of parameters by data assimilation is a form of calibration
  68 which uses both the measurements and an *a priori* estimate (called the
  69 "*background*") of the state that one seeks to identify, as well as a
  70 characterization of their errors. From this point of view, it uses all available
  71 information on the physical system (even if assumptions about errors are
  72 relatively restrictive) to find the "*optimal*" estimate of the true state.
  73 We note, in terms of optimization, that the background provides a mathematical
  74 regularization of the main identification problem.
  75
  76 In practice, the two gaps "*calculation-background*" and
  77 "*calculation-measurements*" are added to build the calibration correction of
  78 the parameters or initial conditions. The addition of these two gaps requires a
  79 relative weight, which is chosen to reflect the trust we give to each piece of
  80 information. This confidence is measured by the covariance of the errors on the
  81 background and on the observations. Thus the stochastic aspect of information,
  82 measured or *a priori*, is essential for building the calibration error
  83 function.
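
As a minimal illustration of this weighting, the sketch below assumes a single
scalar parameter that is observed directly, so that both covariances reduce to
variances. The variable names and numerical values are hypothetical and chosen
only for this example. Each piece of information is weighted by the inverse of
its error variance.

```python
# Hypothetical scalar example: one parameter, observed directly.
background = 20.0            # a priori value of the parameter
background_variance = 4.0    # low confidence in the background
measurement = 22.0           # measured value of the same parameter
measurement_variance = 1.0   # higher confidence in the measurement

# Each piece of information is weighted by the inverse of its error variance.
weight_b = 1.0 / background_variance
weight_o = 1.0 / measurement_variance

estimate = (weight_b * background + weight_o * measurement) / (weight_b + weight_o)
estimate_variance = 1.0 / (weight_b + weight_o)

print(estimate, estimate_variance)
```

The estimate lies closer to the measurement because its error variance is
smaller, and the variance of the estimate is lower than that of either source.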
  84
  85 Simple description of the data assimilation framework
  86 -----------------------------------------------------
  87
  88 .. index:: single: background
  89 .. index:: single: background error covariances
  90 .. index:: single: observation error covariances
  91 .. index:: single: covariances
  92
  93 We can write these features in a simple manner. By default, all variables are
  94 vectors, as there are several parameters to readjust.
  95
  96 According to standard notations in data assimilation, we denote by
  97 :math:`\mathbf{x}^a` the optimal parameters that are to be determined by
  98 calibration, :math:`\mathbf{y}^o` the observations (or experimental
  99 measurements) that we must compare to the simulation outputs,
 100 :math:`\mathbf{x}^b` the background (*a priori* values, or regularization
 101 values) of the sought parameters, and :math:`\mathbf{x}^t` the unknown ideal
 102 parameters that would give exactly the observations (assuming that the errors
 103 are zero and the model is exact) as output.
 104
 105 In the simplest case, which is static, the steps of simulation and of
 106 observation can be combined into a single observation operator denoted :math:`H`
 107 (linear or nonlinear), which transforms the input parameters :math:`\mathbf{x}`
 108 into results :math:`\mathbf{y}` to be compared to the observations
 109 :math:`\mathbf{y}^o`. Moreover, we use the linearized operator
 110 :math:`\mathbf{H}` to represent the effect of the full operator :math:`H` around
 111 a linearization point (and from here on we no longer mention :math:`H`, even if
 112 it is possible to keep it). As already indicated, the stochastic nature of the
 113 variables is essential, because the model, the background and the observations
 114 are all imperfect. We therefore introduce observation errors
 115 additively, in the form of a random vector :math:`\mathbf{\epsilon}^o` such
 116 that:
 117
 118 .. math:: \mathbf{y}^o = \mathbf{H} \mathbf{x}^t + \mathbf{\epsilon}^o
 119
 120 The errors represented here come not only from the observation, but also from
 121 the simulation. We can always consider that these errors have zero mean. We
 122 can then define a matrix :math:`\mathbf{R}` of the observation error covariances
 123 by:
 124
 125 .. math:: \mathbf{R} = E[\mathbf{\epsilon}^o.{\mathbf{\epsilon}^o}^T]
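
To make the observation model and the definition of :math:`\mathbf{R}` more
concrete, the following sketch simulates noisy observations in a toy case where
the "true" parameters are known, then estimates the covariance matrix from the
sampled errors. The operator ``H``, the values of ``x_true`` and ``sigma_o``,
and the assumption of independent Gaussian errors are all hypothetical choices
made for the illustration.

```python
import random

random.seed(0)  # fixed seed so that the sketch is reproducible

# Assumed toy setting: 2 parameters, 2 observations, hypothetical values.
H = [[1.0, 0.0],
     [1.0, 1.0]]          # linear(ized) observation operator
x_true = [2.0, 3.0]       # "true" parameters, known only in this toy case
sigma_o = [0.5, 1.0]      # standard deviations of independent errors


def apply_h(x):
    """Return H x for a vector x stored as a list."""
    return [sum(h_ij * x_j for h_ij, x_j in zip(row, x)) for row in H]


def observe(x):
    """Return one noisy observation y^o = H x + eps^o."""
    return [v + random.gauss(0.0, s) for v, s in zip(apply_h(x), sigma_o)]


n_samples = 20000
hx_true = apply_h(x_true)
errors = []
for _ in range(n_samples):
    yo = observe(x_true)
    errors.append([y - m for y, m in zip(yo, hx_true)])

# Sample estimate of R = E[eps^o eps^o^T], assuming zero-mean errors.
R_est = [[sum(e[i] * e[j] for e in errors) / n_samples for j in range(2)]
         for i in range(2)]
print(R_est)
```

With these assumptions the diagonal of ``R_est`` approaches the squared
standard deviations and the off-diagonal terms approach zero as the number of
samples grows. In real applications the true state is unknown, so
:math:`\mathbf{R}` has to be specified or modelled rather than measured this
way.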
 126
 127 The background can also be written as a function of the true value, by
 128 introducing the error vector :math:`\mathbf{\epsilon}^b`:
 129
 130 .. math:: \mathbf{x}^b = \mathbf{x}^t + \mathbf{\epsilon}^b
 131
 132 where the errors are also assumed to have zero mean, in the same manner as for
 133 the observations. We define the :math:`\mathbf{B}` matrix of background error
 134 covariances by:
 135
 136 .. math:: \mathbf{B} = E[\mathbf{\epsilon}^b.{\mathbf{\epsilon}^b}^T]
 137
 138 The optimal estimate of the true parameters :math:`\mathbf{x}^t`, given the
 139 background :math:`\mathbf{x}^b` and the observations :math:`\mathbf{y}^o`, is
 140 then the "*analysis*" :math:`\mathbf{x}^a`. It comes from the minimization of an
 141 error function (in variational assimilation) or from the filtering correction (in
 142 assimilation by filtering).
 143
 144 In **variational assimilation**, in a static case, one classically attempts to
 145 minimize the following function :math:`J`:
 146
 147 .. math:: J(\mathbf{x})=(\mathbf{x}-\mathbf{x}^b)^T.\mathbf{B}^{-1}.(\mathbf{x}-\mathbf{x}^b)+(\mathbf{y}^o-\mathbf{H}.\mathbf{x})^T.\mathbf{R}^{-1}.(\mathbf{y}^o-\mathbf{H}.\mathbf{x})
 148
 149 which is usually designated as the "*3D-VAR*" function. Since covariance matrices
 150 are proportional to the variances of errors, their inverses in both terms of the
 151 function :math:`J` effectively weight the differences by the confidence in the
 152 background or in the observations. The parameter vector :math:`\mathbf{x}` realizing
 153 the minimum of this function therefore constitutes the analysis
 154 :math:`\mathbf{x}^a`. It is at this level that we have to use the full range
 155 of function minimization methods known from optimization. Depending on
 156 the size of the parameter vector :math:`\mathbf{x}` to identify and on the
 157 availability of the gradient and Hessian of :math:`J`, it is appropriate to adapt
 158 the chosen optimization method (gradient, Newton, quasi-Newton...).
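
The minimization of :math:`J` can be sketched on a toy problem. The example
below assumes two parameters, a diagonal :math:`\mathbf{B}`, a single
observation of the sum of the parameters, and a plain fixed-step gradient
descent. All names and values are hypothetical, and the fixed step is only a
simple stand-in for the more efficient methods mentioned above.

```python
# Hypothetical toy problem: 2 parameters, 1 observation of their sum.
xb = [1.0, 2.0]          # background x^b
b_diag = [1.0, 4.0]      # diagonal of B (B is assumed diagonal here)
h = [1.0, 1.0]           # single row of the linear operator H
yo = 5.0                 # observation y^o
r = 1.0                  # observation error variance R


def cost(x):
    """J(x) for a diagonal B and a single observation."""
    jb = sum((xi - xbi) ** 2 / bi for xi, xbi, bi in zip(x, xb, b_diag))
    innovation = yo - sum(hi * xi for hi, xi in zip(h, x))
    return jb + innovation ** 2 / r


def gradient(x):
    """Gradient of J: 2 B^-1 (x - x^b) - 2 H^T R^-1 (y^o - H x)."""
    innovation = yo - sum(hi * xi for hi, xi in zip(h, x))
    return [2.0 * (xi - xbi) / bi - 2.0 * hi * innovation / r
            for xi, xbi, bi, hi in zip(x, xb, b_diag, h)]


# Fixed-step gradient descent, started from the background.
x = list(xb)
step = 0.1               # small enough for this particular quadratic J
for _ in range(500):
    g = gradient(x)
    x = [xi - step * gi for xi, gi in zip(x, g)]

xa = x                   # approximate analysis x^a
print(xa, cost(xa))
```

In this sketch the second parameter receives the larger correction because its
background variance is larger, which is the weighting effect described above.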
 159
 160 In **assimilation by filtering**, in this simple case usually referred to as
 161 "*BLUE*" (for "*Best Linear Unbiased Estimator*"), the :math:`\mathbf{x}^a`
 162 analysis is given as a correction of the background :math:`\mathbf{x}^b` by a
 163 term proportional to the difference between observations :math:`\mathbf{y}^o`
 164 and calculations :math:`\mathbf{H}\mathbf{x}^b`:
 165
 166 .. math:: \mathbf{x}^a = \mathbf{x}^b + \mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b)
 167
 168 where :math:`\mathbf{K}` is the Kalman gain matrix, which is expressed using
 169 covariance matrices in the following form:
 170
 171 .. math:: \mathbf{K} = \mathbf{B}\mathbf{H}^T(\mathbf{H}\mathbf{B}\mathbf{H}^T+\mathbf{R})^{-1}
 172
 173 The advantage of filtering is that the gain is calculated explicitly, which then
 174 yields the *a posteriori* analysis covariance matrix.
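
The two formulas above can be sketched on a small case. The example assumes two
parameters and a single observation, so that the matrix inverse in the gain
reduces to a division. The helper functions and all values are hypothetical.
The expression used for the analysis covariance,
:math:`\mathbf{A} = (\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{B}`, is the
standard result for the optimal gain and is not stated in the text above.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


# Hypothetical toy problem: 2 parameters, 1 observation of their sum.
xb = [[1.0], [2.0]]                  # background x^b (column vector)
B = [[1.0, 0.0], [0.0, 4.0]]         # background error covariances
H = [[1.0, 1.0]]                     # linear observation operator
yo = [[5.0]]                         # observation y^o
R = [[1.0]]                          # observation error covariances

BHt = matmul(B, transpose(H))        # B H^T
S = matmul(H, BHt)[0][0] + R[0][0]   # H B H^T + R, a scalar here

# Kalman gain K = B H^T (H B H^T + R)^-1; the inverse is a plain
# division because this sketch has a single observation.
K = [[v / S for v in row] for row in BHt]

# Analysis x^a = x^b + K (y^o - H x^b)
innovation = yo[0][0] - matmul(H, xb)[0][0]
xa = [[xb_i[0] + k_i[0] * innovation] for xb_i, k_i in zip(xb, K)]

# Analysis error covariances A = (I - K H) B
KH = matmul(K, H)
I_KH = [[(1.0 if i == j else 0.0) - KH[i][j] for j in range(2)]
        for i in range(2)]
A = matmul(I_KH, B)
print(xa, A)
```

The diagonal of ``A`` is smaller than that of ``B``, which reflects the
information gained from the observation.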
 175
 176 In this simple static case, we can show, under the assumption of Gaussian error
 177 distributions, that the two *variational* and *filtering* approaches are
 178 equivalent.
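
For a linear operator :math:`\mathbf{H}` and symmetric positive definite
matrices :math:`\mathbf{B}` and :math:`\mathbf{R}` (assumptions made for this
sketch), the equivalence can be outlined as follows. Setting the gradient of
:math:`J` to zero gives:

.. math::

   \nabla J(\mathbf{x}^a) = 2\,\mathbf{B}^{-1}(\mathbf{x}^a-\mathbf{x}^b) - 2\,\mathbf{H}^T\mathbf{R}^{-1}(\mathbf{y}^o-\mathbf{H}\mathbf{x}^a) = 0

Writing :math:`\mathbf{y}^o-\mathbf{H}\mathbf{x}^a = (\mathbf{y}^o-\mathbf{H}\mathbf{x}^b) - \mathbf{H}(\mathbf{x}^a-\mathbf{x}^b)`
and collecting the terms in :math:`\mathbf{x}^a-\mathbf{x}^b` leads to:

.. math::

   \mathbf{x}^a = \mathbf{x}^b + (\mathbf{B}^{-1}+\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})^{-1}\mathbf{H}^T\mathbf{R}^{-1}(\mathbf{y}^o-\mathbf{H}\mathbf{x}^b)

The factor in front of the innovation is the Kalman gain, since the identity
:math:`\mathbf{H}^T\mathbf{R}^{-1}(\mathbf{H}\mathbf{B}\mathbf{H}^T+\mathbf{R}) = (\mathbf{B}^{-1}+\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})\mathbf{B}\mathbf{H}^T`
implies:

.. math::

   (\mathbf{B}^{-1}+\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})^{-1}\mathbf{H}^T\mathbf{R}^{-1} = \mathbf{B}\mathbf{H}^T(\mathbf{H}\mathbf{B}\mathbf{H}^T+\mathbf{R})^{-1} = \mathbf{K}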
 179
 180 Note that these "*3D-VAR*" and "*BLUE*" methods may be extended to dynamic
 181 problems, where they are called "*4D-VAR*" and "*Kalman filter*" respectively.
 182 They can take into account the evolution operator to establish an
 183 analysis at the right time steps of the gap between observations and simulations,
 184 and to have, at every moment, the propagation of the background through the
 185 evolution model. Many other variants have been developed to improve the
 186 numerical quality or to take into account computing requirements such as
 187 calculation size and time.
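
As a rough illustration of the dynamic extension, the sketch below writes one
cycle of a scalar Kalman filter in its usual textbook form, which is not
detailed in the text above. The function name ``kalman_step``, the linear
evolution model ``m``, the model error variance ``q`` and the data are all
hypothetical. The forecast step propagates the previous analysis through the
evolution model, and the analysis step is the BLUE correction with the forecast
used as background.

```python
def kalman_step(x, p, y, m=1.0, q=0.0, h=1.0, r=1.0):
    """One forecast/analysis cycle of a scalar Kalman filter.

    x, p : previous analysis and its error variance
    y    : new observation
    m, q : linear evolution model and model error variance (assumed)
    h, r : linear observation operator and observation error variance
    """
    # Forecast: propagate the state and its variance through the model.
    x_f = m * x
    p_f = m * p * m + q
    # Analysis: BLUE correction using the forecast as background.
    k = p_f * h / (h * p_f * h + r)
    x_a = x_f + k * (y - h * x_f)
    p_a = (1.0 - k * h) * p_f
    return x_a, p_a


# Hypothetical data: a constant quantity observed three times.
x, p = 20.0, 4.0
for y in [22.0, 21.0, 21.5]:
    x, p = kalman_step(x, p, y)

print(x, p)
```

With a constant model and no model error, the error variance decreases at each
new observation, as expected when information accumulates over time.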
 188
 189 Going further in the data assimilation framework
 190 ------------------------------------------------
 191
 192 To get more information about all the data assimilation techniques, the reader
 193 can consult introductory documents like [Argaud09], on-line training courses or
 194 lectures like [Bouttier99] and [Bocquet04] (along with other materials coming
 195 from geosciences applications), or general documents like [Talagrand97],
 196 [Tarantola87], [Kalnay03], [Ide97] and [WikipediaDA].
 197
 198 Note that data assimilation is not restricted to meteorology or geosciences, but
 199 is widely used in other scientific domains. There are several fields in science
 200 and technology where the effective use of observed but incomplete data is
 201 crucial.
 202
 203 Some aspects of data assimilation are also known as *parameter estimation*,
 204 *inverse problems*, *Bayesian estimation*, *optimal interpolation*,
 205 *mathematical regularization*, *data smoothing*, etc. These terms can be used in
 206 bibliographical searches.