================================================================================
Reference description of the ADAO commands and keywords
================================================================================

This section presents the reference description of the ADAO commands and
keywords available through the GUI or through scripts.

Each command or keyword to be defined through the ADAO GUI has some properties.
The first property is to be *required*, *optional* or only factual, describing
a type of input. The second property is to be an "open" variable, with a fixed
type but with any value allowed by the type, or a "restricted" variable,
limited to some specified values. Since the EFICAS editor GUI has built-in
validation capacities, the properties of the commands or keywords given
through this GUI are automatically correct.

The mathematical notations used afterward are explained in the section
:ref:`section_theory`.

Examples of using these commands are available in the section
:ref:`section_examples` and in the example files installed with the ADAO
module.
List of possible input types
----------------------------

.. index:: single: Dict
.. index:: single: Function
.. index:: single: Matrix
.. index:: single: ScalarSparseMatrix
.. index:: single: DiagonalSparseMatrix
.. index:: single: String
.. index:: single: Script
.. index:: single: Vector

Each ADAO variable has a pseudo-type that helps in filling it and validating
it. The different pseudo-types are:
**Dict**
    This indicates a variable that has to be filled by a dictionary, usually
    given either as a string or as a script.

**Function**
    This indicates a variable that has to be filled by a function, usually
    given as a script or a component method.

**Matrix**
    This indicates a variable that has to be filled by a matrix, usually
    given either as a string or as a script.

**ScalarSparseMatrix**
    This indicates a variable that has to be filled by a unique number, which
    will be used to multiply an identity matrix, usually given either as a
    string or as a script.

**DiagonalSparseMatrix**
    This indicates a variable that has to be filled by a vector, which will
    be used to replace the diagonal of an identity matrix, usually given
    either as a string or as a script.

**Script**
    This indicates a script given as an external file. It can be described by
    a full absolute path name or only by the file name without path. If the
    file is given only by a file name without path, and if a study directory
    is also indicated, the file is searched in the given directory.

**String**
    This indicates a string giving a literal representation of a matrix, a
    vector or a vector series, such as "1 2 ; 3 4" for a square 2x2 matrix.

**Vector**
    This indicates a variable that has to be filled by a vector, usually
    given either as a string or as a script.

**VectorSerie**
    This indicates a variable that has to be filled by a list of vectors,
    usually given either as a string or as a script.
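As an illustration, the "*String*" literal syntax shown above for matrices,
such as "1 2 ; 3 4", is the same one accepted by NumPy; a minimal sketch (the
variable names are only illustrative):

```python
import numpy

# "String" representation of a square 2x2 matrix, as in "1 2 ; 3 4" above
BackgroundError = numpy.matrix("1 2 ; 3 4")

# "String" representation of a vector, flattened to a plain 1D array
Background = numpy.matrix("0 1 2").A1
```
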
When a command or keyword can be filled by a script file name, the script has
to contain a variable or a method that has the same name as the one to be
filled. In other words, when importing the script in a YACS Python node, it
must create a variable of the expected name in the current namespace.
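For example, a script file given for the "*Background*" keyword could simply
contain (the numerical values here are purely illustrative):

```python
# User script given for the "Background" keyword: importing it must create
# a variable named exactly "Background" in the current namespace.
import numpy

Background = numpy.zeros(3)  # an illustrative null initial state of size 3
```
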
Reference description for ADAO calculation cases
------------------------------------------------

List of commands and keywords for an ADAO calculation case
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ASSIMILATION_STUDY
.. index:: single: Algorithm
.. index:: single: AlgorithmParameters
.. index:: single: Background
.. index:: single: BackgroundError
.. index:: single: ControlInput
.. index:: single: Debug
.. index:: single: EvolutionError
.. index:: single: EvolutionModel
.. index:: single: InputVariables
.. index:: single: Observation
.. index:: single: ObservationError
.. index:: single: ObservationOperator
.. index:: single: Observers
.. index:: single: OutputVariables
.. index:: single: Study_name
.. index:: single: Study_repertory
.. index:: single: UserDataInit
.. index:: single: UserPostAnalysis
The first set of commands is related to the description of a calculation
case, that is, a *Data Assimilation* procedure or an *Optimization* procedure.
The terms are ordered alphabetically, except the first, which describes the
choice between calculation or checking. The different commands are the
following:
**ASSIMILATION_STUDY**
    *Required command*. This is the general command describing the data
    assimilation or optimization case. It hierarchically contains all the
    other commands.

**Algorithm**
    *Required command*. This is a string indicating the chosen data
    assimilation or optimization algorithm. The choices are limited and
    available through the GUI. There exist for example "3DVAR", "Blue"... See
    below the list of algorithms and associated parameters in the following
    subsection `Options and required commands for calculation algorithms`_.

**AlgorithmParameters**
    *Optional command*. This command allows adding some optional parameters
    to control the data assimilation or optimization algorithm. It is defined
    as a "*Dict*" type object, that is, given as a script. See below the list
    of algorithms and associated parameters in the following subsection
    `Options and required commands for calculation algorithms`_.

**Background**
    *Required command*. This indicates the background or initial vector used,
    previously noted as :math:`\mathbf{x}^b`. It is defined as a "*Vector*"
    type object, that is, given either as a string or as a script.

**BackgroundError**
    *Required command*. This indicates the background error covariance
    matrix, previously noted as :math:`\mathbf{B}`. It is defined as a
    "*Matrix*" type object, a "*ScalarSparseMatrix*" type object, or a
    "*DiagonalSparseMatrix*" type object, that is, given either as a string
    or as a script.

**ControlInput**
    *Optional command*. This indicates the control vector used to force the
    evolution model at each step, usually noted as :math:`\mathbf{U}`. It is
    defined as a "*Vector*" or a "*VectorSerie*" type object, that is, given
    either as a string or as a script. When there is no control, it has to be
    a void string ''.

**Debug**
    *Required command*. This defines the level of trace and intermediary
    debug information. The choices are limited between 0 (for False) and 1
    (for True).

**EvolutionError**
    *Optional command*. This indicates the evolution error covariance matrix,
    usually noted as :math:`\mathbf{Q}`. It is defined as a "*Matrix*" type
    object, a "*ScalarSparseMatrix*" type object, or a
    "*DiagonalSparseMatrix*" type object, that is, given either as a string
    or as a script.

**EvolutionModel**
    *Optional command*. This indicates the evolution model operator, usually
    noted :math:`M`, which describes a step of evolution. It is defined as a
    "*Function*" type object, that is, given as a script. Different
    functional forms can be used, as described in the following subsection
    `Requirements for functions describing an operator`_. If there is some
    control :math:`U` included in the evolution model, the operator has to be
    applied to a pair :math:`(X,U)`.

**InputVariables**
    *Optional command*. This command allows indicating the name and size of
    the physical variables that are bundled together in the control vector.
    This information is dedicated to data processed inside an algorithm.

**Observation**
    *Required command*. This indicates the observation vector used for data
    assimilation or optimization, previously noted as :math:`\mathbf{y}^o`.
    It is defined as a "*Vector*" or a "*VectorSerie*" type object, that is,
    given either as a string or as a script.

**ObservationError**
    *Required command*. This indicates the observation error covariance
    matrix, previously noted as :math:`\mathbf{R}`. It is defined as a
    "*Matrix*" type object, a "*ScalarSparseMatrix*" type object, or a
    "*DiagonalSparseMatrix*" type object, that is, given either as a string
    or as a script.

**ObservationOperator**
    *Required command*. This indicates the observation operator, previously
    noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}`
    into results :math:`\mathbf{y}` to be compared to observations
    :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object, that
    is, given as a script. Different functional forms can be used, as
    described in the following subsection `Requirements for functions
    describing an operator`_. If there is some control :math:`U` included in
    the observation, the operator has to be applied to a pair :math:`(X,U)`.
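To fix ideas, one common functional form is a direct operator written as a
plain Python function taking the state and returning the simulated
observations. The sketch below is only a hypothetical linear example, with an
illustrative observation matrix; the exact required forms are specified in the
subsection `Requirements for functions describing an operator`_:

```python
import numpy

# Hypothetical linear direct operator H : R^3 -> R^2, mapping a state X
# to the simulated observations comparable to the measured ones y^o.
def DirectOperator(X):
    X = numpy.ravel(X)                  # accept list, array or matrix input
    H = numpy.array([[1., 0., 0.],      # illustrative observation matrix
                     [0., 1., 1.]])
    return H.dot(X)
```

Since this particular operator is linear, the same matrix would also serve for
the tangent and adjoint forms required by some algorithms.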
**Observers**
    *Optional command*. This command allows setting internal observers, that
    is, functions linked with a particular variable, which will be executed
    each time this variable is modified. It is a convenient way to monitor
    variables of interest during the data assimilation or optimization
    process, by printing or plotting them, etc. Common templates are provided
    to help the user start or quickly build a case.

**OutputVariables**
    *Optional command*. This command allows indicating the name and size of
    the physical variables that are bundled together in the output
    observation vector. This information is dedicated to data processed
    inside an algorithm.

**Study_name**
    *Required command*. This is an open string to describe the study by a
    name or a sentence.

**Study_repertory**
    *Optional command*. If available, this directory is used as a base name
    for the calculation, and is used to find all the script files, given by
    name without path, that can be used to define some other commands by
    scripts.

**UserDataInit**
    *Optional command*. This command allows initializing some parameters or
    data automatically before data assimilation algorithm processing.

**UserPostAnalysis**
    *Optional command*. This command allows processing some parameters or
    data automatically after data assimilation algorithm processing. It is
    defined as a script or a string, allowing to put post-processing code
    directly inside the ADAO case. Common templates are provided to help the
    user start or quickly build a case.
Options and required commands for calculation algorithms
++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: 3DVAR
.. index:: single: Blue
.. index:: single: ExtendedBlue
.. index:: single: EnsembleBlue
.. index:: single: KalmanFilter
.. index:: single: ExtendedKalmanFilter
.. index:: single: UnscentedKalmanFilter
.. index:: single: LinearLeastSquares
.. index:: single: NonLinearLeastSquares
.. index:: single: ParticleSwarmOptimization
.. index:: single: QuantileRegression

.. index:: single: AlgorithmParameters
.. index:: single: Bounds
.. index:: single: CostDecrementTolerance
.. index:: single: GradientNormTolerance
.. index:: single: GroupRecallRate
.. index:: single: MaximumNumberOfSteps
.. index:: single: Minimizer
.. index:: single: NumberOfInsects
.. index:: single: ProjectedGradientTolerance
.. index:: single: QualityCriterion
.. index:: single: Quantile
.. index:: single: SetSeed
.. index:: single: StoreInternalVariables
.. index:: single: StoreSupplementaryCalculations
.. index:: single: SwarmVelocity
Each algorithm can be controlled using some generic or specific options given
through the "*AlgorithmParameters*" optional command, as follows for
example::

    AlgorithmParameters = {
        "Minimizer" : "LBFGSB",
        "MaximumNumberOfSteps" : 25,
        "StoreSupplementaryCalculations" : ["APosterioriCovariance","OMA"],
        }

This section describes the available options algorithm by algorithm. If an
option is specified for an algorithm that doesn't support it, the option is
simply left unused. The meaning of the acronyms or particular names can be
found in the :ref:`genindex` or the :ref:`section_glossary`. In addition, for
each algorithm, the required commands/keywords are given, being described in
`List of commands and keywords for an ADAO calculation case`_.
**"Blue"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
    "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].

**"ExtendedBlue"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
    "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].
**"LinearLeastSquares"**

  *Required commands*
    *"Observation", "ObservationError",
    "ObservationOperator"*

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default.
**"3DVAR"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  Minimizer
    This key allows choosing the optimization minimizer. The default choice
    is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
    minimizer, see [Byrd95]_ and [Zhu97]_), "TNC" (nonlinear constrained
    minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS" (nonlinear
    unconstrained minimizer), "NCG" (Newton CG minimizer).

  Bounds
    This key allows defining upper and lower bounds for every control
    variable being optimized. Bounds can be given as a list of lists of pairs
    of lower/upper bounds for each variable, with possibly ``None`` every
    time there is no bound. The bounds can always be specified, but they are
    taken into account only by the constrained minimizers.

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for the
    iterative optimization. The default is 15000, which is very similar to
    no limit on iterations. It is then recommended to adapt this parameter
    to the needs of real problems. For some minimizers, the effective
    stopping step can be slightly different due to algorithm internal
    control requirements.

  CostDecrementTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the cost function decreases less
    than this tolerance at the last step. The default is 1.e-7, and it is
    recommended to adapt it to the needs of real problems.

  ProjectedGradientTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when all the components of the projected
    gradient are under this limit. It is only used for constrained
    minimizers. The default is -1, that is, the internal default of each
    minimizer (generally 1.e-5), and it is not recommended to change it.

  GradientNormTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the norm of the gradient is under
    this limit. It is only used for non-constrained minimizers. The default
    is 1.e-5 and it is not recommended to change it.

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
    "SigmaObs2", "MahalanobisConsistency"].
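As an illustration of the "*Minimizer*" and "*Bounds*" keys above, a
hypothetical "*AlgorithmParameters*" script for a three-variable problem could
look as follows (all values are illustrative):

```python
# Hypothetical AlgorithmParameters for a variational algorithm with bounds:
# one [lower, upper] pair per control variable, None meaning "no bound".
AlgorithmParameters = {
    "Minimizer"            : "LBFGSB",          # a constrained minimizer
    "Bounds"               : [[0., 10.],        # first variable in [0, 10]
                              [None, None],     # second variable unbounded
                              [-1., None]],     # third bounded below only
    "MaximumNumberOfSteps" : 100,
}
```

Since "LBFGSB" is a constrained minimizer, the bounds are actually taken into
account here; with "CG" or "BFGS" they would be ignored.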
**"NonLinearLeastSquares"**

  *Required commands*
    *"Background",
    "Observation", "ObservationError",
    "ObservationOperator"*

  Minimizer
    This key allows choosing the optimization minimizer. The default choice
    is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
    minimizer, see [Byrd95]_ and [Zhu97]_), "TNC" (nonlinear constrained
    minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS" (nonlinear
    unconstrained minimizer), "NCG" (Newton CG minimizer).

  Bounds
    This key allows defining upper and lower bounds for every control
    variable being optimized. Bounds can be given as a list of lists of pairs
    of lower/upper bounds for each variable, with possibly ``None`` every
    time there is no bound. The bounds can always be specified, but they are
    taken into account only by the constrained minimizers.

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for the
    iterative optimization. The default is 15000, which is very similar to
    no limit on iterations. It is then recommended to adapt this parameter
    to the needs of real problems. For some minimizers, the effective
    stopping step can be slightly different due to algorithm internal
    control requirements.

  CostDecrementTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the cost function decreases less
    than this tolerance at the last step. The default is 1.e-7, and it is
    recommended to adapt it to the needs of real problems.

  ProjectedGradientTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when all the components of the projected
    gradient are under this limit. It is only used for constrained
    minimizers. The default is -1, that is, the internal default of each
    minimizer (generally 1.e-5), and it is not recommended to change it.

  GradientNormTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the norm of the gradient is under
    this limit. It is only used for non-constrained minimizers. The default
    is 1.e-5 and it is not recommended to change it.

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["BMA", "OMA", "OMB", "Innovation"].
**"EnsembleBlue"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  SetSeed
    This key allows giving an integer in order to fix the seed of the random
    generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so uses the
    default initialization from the computer.
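For reproducible ensemble draws, the "*SetSeed*" key is simply added to the
parameters dictionary. The sketch below illustrates the principle with NumPy's
random generator (the seed value is illustrative, and the draws only mimic
what an ensemble-based algorithm would do):

```python
import numpy

# Hypothetical sketch: fixing the seed makes ensemble draws reproducible.
AlgorithmParameters = {"SetSeed" : 1000}  # value is illustrative

numpy.random.seed(AlgorithmParameters["SetSeed"])
draw1 = numpy.random.standard_normal(3)
numpy.random.seed(AlgorithmParameters["SetSeed"])
draw2 = numpy.random.standard_normal(3)
# the two draws are identical, so the study can be repeated exactly
```
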
**"KalmanFilter"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  EstimationOf
    This key allows choosing the type of estimation to be performed. It can
    be either state estimation, named "State", or parameter estimation,
    named "Parameters". The default choice is "State".

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "Innovation"].
**"ExtendedKalmanFilter"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  Bounds
    This key allows defining upper and lower bounds for every control
    variable being optimized. Bounds can be given as a list of lists of
    pairs of lower/upper bounds for each variable, with extreme values every
    time there is no bound. The bounds can always be specified, but they are
    taken into account only by the constrained minimizers.

  ConstrainedBy
    This key allows defining the method used to take bounds into account.
    The possible methods are in the following list: ["EstimateProjection"].

  EstimationOf
    This key allows choosing the type of estimation to be performed. It can
    be either state estimation, named "State", or parameter estimation,
    named "Parameters". The default choice is "State".

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "Innovation"].
**"UnscentedKalmanFilter"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  Bounds
    This key allows defining upper and lower bounds for every control
    variable being optimized. Bounds can be given as a list of lists of
    pairs of lower/upper bounds for each variable, with extreme values every
    time there is no bound. The bounds can always be specified, but they are
    taken into account only by the constrained minimizers.

  ConstrainedBy
    This key allows defining the method used to take bounds into account.
    The possible methods are in the following list: ["EstimateProjection"].

  EstimationOf
    This key allows choosing the type of estimation to be performed. It can
    be either state estimation, named "State", or parameter estimation,
    named "Parameters". The default choice is "State".

  Alpha, Beta, Kappa, Reconditioner
    These keys are internal scaling parameters. "Alpha" requires a value
    between 1.e-4 and 1. "Beta" has an optimal value of 2 for a Gaussian
    prior distribution. "Kappa" requires an integer value, and the right
    default is obtained by setting it to 0. "Reconditioner" requires a value
    between 1.e-3 and 10; it defaults to 1.

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "Innovation"].
**"ParticleSwarmOptimization"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for the
    iterative optimization. The default is 50, which is an arbitrary limit.
    It is then recommended to adapt this parameter to the needs of real
    problems.

  NumberOfInsects
    This key indicates the number of insects or particles in the swarm. The
    default is 100, which is a usual default for this algorithm.

  SwarmVelocity
    This key indicates the part of the insect velocity which is imposed by
    the swarm. It is a positive floating point value. The default value is 1.

  GroupRecallRate
    This key indicates the recall rate towards the best insect of the swarm.
    It is a floating point value between 0 and 1. The default value is 0.5.

  QualityCriterion
    This key indicates the quality criterion, minimized to find the optimal
    state estimate. The default is the usual data assimilation criterion
    named "DA", the augmented weighted least squares. The possible criteria
    have to be in the following list, where the equivalent names are
    indicated by "=": ["AugmentedPonderatedLeastSquares"="APLS"="DA",
    "PonderatedLeastSquares"="PLS", "LeastSquares"="LS"="L2",
    "AbsoluteValue"="L1", "MaximumError"="ME"].

  SetSeed
    This key allows giving an integer in order to fix the seed of the random
    generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so uses the
    default initialization from the computer.

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["BMA", "OMA", "OMB", "Innovation"].
**"QuantileRegression"**

  *Required commands*
    *"Background", "Observation",
    "ObservationOperator"*

  Quantile
    This key allows defining the real value of the desired quantile, between
    0 and 1. The default is 0.5, corresponding to the median.

  Minimizer
    This key allows choosing the optimization minimizer. The default choice
    and only available one is "MMQR" (Majorize-Minimize for Quantile
    Regression).

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for the
    iterative optimization. The default is 15000, which is very similar to
    no limit on iterations. It is then recommended to adapt this parameter
    to the needs of real problems.

  CostDecrementTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the cost function or the surrogate
    decreases less than this tolerance at the last step. The default is
    1.e-6, and it is recommended to adapt it to the needs of real problems.

  StoreInternalVariables
    This boolean key allows storing default internal variables, mainly the
    current state during an iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    available at the end of the algorithm. It involves potentially costly
    calculations. The default is a void list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["BMA", "OMA", "OMB", "Innovation"].
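For instance, to estimate the first decile rather than the median with this
algorithm, the "*Quantile*" key is set accordingly; a hypothetical sketch
(all values are illustrative):

```python
# Hypothetical AlgorithmParameters for a quantile-regression case:
# estimate the 0.1 quantile (first decile) instead of the default median.
AlgorithmParameters = {
    "Quantile"               : 0.1,
    "Minimizer"              : "MMQR",   # the only available choice
    "MaximumNumberOfSteps"   : 100,
    "CostDecrementTolerance" : 1.e-6,
}
```
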
Reference description for ADAO checking cases
---------------------------------------------

List of commands and keywords for an ADAO checking case
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: CHECKING_STUDY
.. index:: single: Algorithm
.. index:: single: AlgorithmParameters
.. index:: single: CheckingPoint
.. index:: single: Debug
.. index:: single: ObservationOperator
.. index:: single: Study_name
.. index:: single: Study_repertory
.. index:: single: UserDataInit
The second set of commands is related to the description of a checking case,
that is, a procedure to check required properties on information used
somewhere else by a calculation case. The terms are ordered alphabetically,
except the first, which describes the choice between calculation or checking.
The different commands are the following:
**CHECKING_STUDY**
    *Required command*. This is the general command describing the checking
    case. It hierarchically contains all the other commands.

**Algorithm**
    *Required command*. This is a string indicating the chosen checking
    algorithm. The choices are limited and available through the GUI. There
    exist for example "FunctionTest", "AdjointTest"... See below the list of
    algorithms and associated parameters in the following subsection
    `Options and required commands for checking algorithms`_.

**AlgorithmParameters**
    *Optional command*. This command allows adding some optional parameters
    to control the checking algorithm. It is defined as a "*Dict*" type
    object, that is, given as a script. See below the list of algorithms and
    associated parameters in the following subsection `Options and required
    commands for checking algorithms`_.

**CheckingPoint**
    *Required command*. This indicates the vector used, previously noted as
    :math:`\mathbf{x}^b`. It is defined as a "*Vector*" type object, that
    is, given either as a string or as a script.

**Debug**
    *Required command*. This defines the level of trace and intermediary
    debug information. The choices are limited between 0 (for False) and 1
    (for True).

**ObservationOperator**
    *Required command*. This indicates the observation operator, previously
    noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}`
    into results :math:`\mathbf{y}` to be compared to observations
    :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object, that
    is, given as a script. Different functional forms can be used, as
    described in the following subsection `Requirements for functions
    describing an operator`_.

**Study_name**
    *Required command*. This is an open string to describe the study by a
    name or a sentence.

**Study_repertory**
    *Optional command*. If available, this directory is used to find all the
    script files that can be used to define some other commands by scripts.

**UserDataInit**
    *Optional command*. This command allows initializing some parameters or
    data automatically before checking algorithm processing.
Options and required commands for checking algorithms
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: AdjointTest
.. index:: single: FunctionTest
.. index:: single: GradientTest
.. index:: single: LinearityTest

.. index:: single: AlgorithmParameters
.. index:: single: AmplitudeOfInitialDirection
.. index:: single: EpsilonMinimumExponent
.. index:: single: InitialDirection
.. index:: single: ResiduFormula
.. index:: single: SetSeed
We recall that each algorithm can be controlled using some generic or specific
options, given through the "*AlgorithmParameters*" optional command, as
follows::

    AlgorithmParameters = {
        "AmplitudeOfInitialDirection" : 1,
        "EpsilonMinimumExponent" : -8,
        }
If an option is specified for an algorithm that does not support it, the
option is simply left unused. The meaning of the acronyms or particular names
can be found in the :ref:`genindex` or the :ref:`section_glossary`. In
addition, for each algorithm, the required commands/keywords are given, as
described in `List of commands and keywords for an ADAO checking case`_.
**"AdjointTest"**

  *Required commands:*
  *"CheckingPoint",
  "ObservationOperator"*
AmplitudeOfInitialDirection
  This key indicates the scaling of the initial perturbation, built as a
  vector used for the directional derivative around the nominal checking
  point. The default is 1, which means no scaling.
EpsilonMinimumExponent
  This key indicates the minimal exponent value of the power of 10
  coefficient to be used to decrease the increment multiplier. The default is
  -8, and it has to be between 0 and -20. For example, its default value
  leads to calculating the residue of the scalar product formula with a fixed
  increment multiplied from 1.e0 to 1.e-8.
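As a purely illustrative sketch (the variable names below are ours, not part
of the ADAO API), the sequence of increment multipliers scanned with the
default value can be generated as:

```python
# Illustrative sketch: the increment multipliers scanned by the test,
# going from 1.e0 down to 10**EpsilonMinimumExponent (default -8).
EpsilonMinimumExponent = -8
multipliers = [10.0**(-i) for i in range(-EpsilonMinimumExponent + 1)]
```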
InitialDirection
  This key indicates the vector direction used for the directional derivative
  around the nominal checking point. It has to be a vector. If not specified,
  this direction defaults to a random perturbation around zero, of the same
  vector size as the checking point.
SetSeed
  This key allows to give an integer in order to fix the seed of the random
  generator used to generate the ensemble. A convenient value is for example
  1000. By default, the seed is left uninitialized, and so the default
  initialization from the computer is used.
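The effect of fixing the seed can be sketched as follows (using numpy's
default generator as an assumption about the underlying mechanism):

```python
import numpy

# With the same seed, the random generator reproduces exactly the same
# perturbations from one run to the next (1000 is the convenient value
# suggested above).
numpy.random.seed(1000)
first_draw = numpy.random.normal(0., 1., size=3)
numpy.random.seed(1000)
second_draw = numpy.random.normal(0., 1., size=3)
```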
**"FunctionTest"**

  *Required commands:*
  *"CheckingPoint",
  "ObservationOperator"*
NumberOfPrintedDigits
  This key indicates the number of digits of precision for floating point
  printed output. The default is 5, with a minimum of 0.
NumberOfRepetition
  This key indicates the number of times to repeat the function evaluation.
  The default is 1.
SetDebug
  This key requires the activation, or not, of the debug mode during the
  function evaluation. The default is True, and the choices are True or
  False.
**"GradientTest"**

  *Required commands:*
  *"CheckingPoint",
  "ObservationOperator"*
AmplitudeOfInitialDirection
  This key indicates the scaling of the initial perturbation, built as a
  vector used for the directional derivative around the nominal checking
  point. The default is 1, which means no scaling.
EpsilonMinimumExponent
  This key indicates the minimal exponent value of the power of 10
  coefficient to be used to decrease the increment multiplier. The default is
  -8, and it has to be between 0 and -20. For example, its default value
  leads to calculating the residue of the scalar product formula with a fixed
  increment multiplied from 1.e0 to 1.e-8.
InitialDirection
  This key indicates the vector direction used for the directional derivative
  around the nominal checking point. It has to be a vector. If not specified,
  this direction defaults to a random perturbation around zero, of the same
  vector size as the checking point.
ResiduFormula
  This key indicates the residue formula that has to be used for the test.
  The default choice is "Taylor", and the possible ones are "Taylor" (residue
  of the Taylor development of the operator, which has to decrease with the
  power of 2 of the perturbation) and "Norm" (residue obtained by taking the
  norm of the Taylor development at zero order approximation, which
  approximates the gradient, and which has to remain constant).
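The "Taylor" behaviour can be sketched on a toy operator (this illustrates
the formula only, it is not the ADAO implementation): halving the
perturbation must divide the residue by about 4.

```python
# Toy check: for O(x) = x**2, the Taylor residue
#   R(alpha) = | O(x + alpha*dx) - O(x) - alpha * O'(x) * dx |
# decreases with the power of 2 of the perturbation.
def O(x):
    return x**2
def gradient_of_O(x):
    return 2.0 * x

x0, dx = 2.0, 1.0
residues = [abs(O(x0 + a*dx) - O(x0) - a*gradient_of_O(x0)*dx)
            for a in (1.0, 0.5, 0.25, 0.125)]
# Each halving of alpha divides the residue by 4 (second order behaviour).
ratios = [residues[i] / residues[i+1] for i in range(3)]
```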
SetSeed
  This key allows to give an integer in order to fix the seed of the random
  generator used to generate the ensemble. A convenient value is for example
  1000. By default, the seed is left uninitialized, and so the default
  initialization from the computer is used.
**"LinearityTest"**

  *Required commands:*
  *"CheckingPoint",
  "ObservationOperator"*
AmplitudeOfInitialDirection
  This key indicates the scaling of the initial perturbation, built as a
  vector used for the directional derivative around the nominal checking
  point. The default is 1, which means no scaling.
EpsilonMinimumExponent
  This key indicates the minimal exponent value of the power of 10
  coefficient to be used to decrease the increment multiplier. The default is
  -8, and it has to be between 0 and -20. For example, its default value
  leads to calculating the residue of the scalar product formula with a fixed
  increment multiplied from 1.e0 to 1.e-8.
InitialDirection
  This key indicates the vector direction used for the directional derivative
  around the nominal checking point. It has to be a vector. If not specified,
  this direction defaults to a random perturbation around zero, of the same
  vector size as the checking point.
ResiduFormula
  This key indicates the residue formula that has to be used for the test.
  The default choice is "CenteredDL", and the possible ones are "CenteredDL"
  (residue of the difference between the function at the nominal point and
  its values with positive and negative increments, which has to stay very
  small), "Taylor" (residue of the Taylor development of the operator
  normalized by the nominal value, which has to stay very small),
  "NominalTaylor" (residue of the order 1 approximations of the operator,
  normalized to the nominal point, which has to stay close to 1), and
  "NominalTaylorRMS" (residue of the order 1 approximations of the operator,
  normalized by RMS to the nominal point, which has to stay close to 0).
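The "CenteredDL" behaviour can be sketched on a toy linear operator (an
illustration of the formula, not the ADAO implementation): the centered
combination below vanishes when the operator is linear.

```python
import numpy

# Toy check: for a linear operator O(x) = A.x, the centered difference
#   O(x + alpha*dx) + O(x - alpha*dx) - 2*O(x)
# stays (numerically) very small.
A = numpy.array([[2., 1.],
                 [0., 3.]])
def O(x):
    return A.dot(x)

x0    = numpy.array([1., 2.])
dx    = numpy.array([0.5, -0.5])
alpha = 0.1
residue = numpy.linalg.norm(O(x0 + alpha*dx) + O(x0 - alpha*dx) - 2.0*O(x0))
```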
SetSeed
  This key allows to give an integer in order to fix the seed of the random
  generator used to generate the ensemble. A convenient value is for example
  1000. By default, the seed is left uninitialized, and so the default
  initialization from the computer is used.
Requirements for functions describing an operator
-------------------------------------------------
The operators for observation and evolution are required to implement the
data assimilation or optimization procedures. They include the physical
simulation by numerical computations, but also the filtering and restriction
needed to compare the simulation to the observations. The evolution operator
is considered here in its incremental form, representing the transition
between two successive states, and is then similar to the observation
operator.
Schematically, an operator has to give an output solution for given input
parameters. Part of the input parameters can be modified during the
optimization procedure. So the mathematical representation of such a process
is a function. It was briefly described in the section :ref:`section_theory`
and is generalized here by the relation:
.. math:: \mathbf{y} = O( \mathbf{x} )
between the pseudo-observations :math:`\mathbf{y}` and the parameters
:math:`\mathbf{x}` using the observation or evolution operator :math:`O`. The
same functional representation can be used for the linear tangent model
:math:`\mathbf{O}` of :math:`O` and its adjoint :math:`\mathbf{O}^*`, also
required by some data assimilation or optimization algorithms.
Then, **to describe completely an operator, the user has only to provide a
function that fully and solely realizes the functional operation**.
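For instance, a minimal operator mapping a parameter vector to
pseudo-observations can be written as follows (a toy example of ours, not
taken from ADAO):

```python
import numpy

# Toy functional operator O: parameters X -> pseudo-observations Y
def DirectOperator(X):
    X = numpy.ravel(X)                      # accept list, array or 1D-matrix
    return numpy.array([X[0]**2 + X[1],     # first pseudo-observation
                        X[0] - X[1]])       # second pseudo-observation

Y = DirectOperator([2., 3.])
```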
This function is usually given as a script that can be executed in a YACS
node. This script can indifferently launch external codes or use internal
SALOME calls and methods. If the algorithm requires the 3 aspects of the
operator (direct form, tangent form and adjoint form), the user has to give
the 3 functions or to approximate them.
There are 3 practical methods for the user to provide an operator functional
representation. These methods are chosen in the "*FROM*" field of each
operator having a "*Function*" value as "*INPUT_TYPE*", as shown by the
following figure:
.. _eficas_operator_function:
.. image:: images/eficas_operator_function.png
   :align: center

.. centered::
   **Choosing an operator functional representation**
First functional form: using "*ScriptWithOneFunction*"
++++++++++++++++++++++++++++++++++++++++++++++++++++++
.. index:: single: ScriptWithOneFunction
.. index:: single: DirectOperator
.. index:: single: DifferentialIncrement
.. index:: single: CenteredFiniteDifference
The first one consists in providing only one potentially non-linear function,
and in approximating the tangent and the adjoint operators. This is done by
using the keyword "*ScriptWithOneFunction*" for the description of the chosen
operator in the ADAO GUI. The user has to provide the function in a script,
with a mandatory name "*DirectOperator*". For example, the script can follow
the template::

    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return something like Y
In this case, the user can also provide a value for the differential
increment, using through the GUI the keyword "*DifferentialIncrement*", which
has a default value of 1%. This coefficient will be used in the finite
difference approximation to build the tangent and adjoint operators. The
finite difference approximation order can also be chosen through the GUI,
using the keyword "*CenteredFiniteDifference*", with 0 for an uncentered
scheme of first order, and with 1 for a centered scheme of second order (at
twice the first order computational cost). The keyword has a default value
of 0.
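The two schemes can be sketched on a scalar toy function (an illustration
only; ADAO applies the analogous formulas component-wise to operators):

```python
# Uncentered (first order) vs centered (second order) finite differences
# used to approximate the tangent operator, for O(x) = x**3 at x = 2.
def O(x):
    return x**3

x0 = 2.0
dX = 0.01 * x0                 # DifferentialIncrement default: 1% of x
uncentered = (O(x0 + dX) - O(x0)) / dX           # CenteredFiniteDifference = 0
centered   = (O(x0 + dX) - O(x0 - dX)) / (2*dX)  # CenteredFiniteDifference = 1
exact = 3.0 * x0**2            # true derivative, for comparison
```

The centered scheme needs two evaluations instead of one, which is the
doubled computational cost mentioned above.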
This first operator definition allows to easily test the functional form
before its use in an ADAO case, greatly reducing the complexity of
implementation.
**Important warning:** the name "*DirectOperator*" is mandatory, and the type
of the X argument can be either a Python list, a numpy array or a numpy
1D-matrix. The user has to treat these cases in the provided script.
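A simple way to handle the three possible input types (our suggestion, not an
ADAO requirement) is to flatten the argument first:

```python
import numpy

# numpy.ravel turns a list, an array or a 1D-matrix into a flat array,
# so the body of the function can ignore the original input type.
def DirectOperator(X):
    X = numpy.ravel(X)
    return float(X.sum())

r_list   = DirectOperator([1., 2., 3.])
r_array  = DirectOperator(numpy.array([1., 2., 3.]))
r_matrix = DirectOperator(numpy.matrix([[1., 2., 3.]]).T)
```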
Second functional form: using "*ScriptWithFunctions*"
+++++++++++++++++++++++++++++++++++++++++++++++++++++
.. index:: single: ScriptWithFunctions
.. index:: single: DirectOperator
.. index:: single: TangentOperator
.. index:: single: AdjointOperator
The second one consists in providing directly the three associated operators
:math:`O`, :math:`\mathbf{O}` and :math:`\mathbf{O}^*`. This is done by using
the keyword "*ScriptWithFunctions*" for the description of the chosen
operator in the ADAO GUI. The user has to provide three functions in one
script, with three mandatory names "*DirectOperator*", "*TangentOperator*"
and "*AdjointOperator*". For example, the script can follow the template::
    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return something like Y

    def TangentOperator( (X, dX) ):
        """ Tangent linear operator, around X, applied to dX """
        ...
        ...
        ...
        return something like Y

    def AdjointOperator( (X, Y) ):
        """ Adjoint operator, around X, applied to Y """
        ...
        ...
        ...
        return something like X
Again, this second operator definition allows to easily test the functional
forms before their use in an ADAO case, reducing the complexity of operator
implementation.
**Important warning:** the names "*DirectOperator*", "*TangentOperator*" and
"*AdjointOperator*" are mandatory, and the type of the X, Y, dX arguments can
be either a Python list, a numpy array or a numpy 1D-matrix. The user has to
treat these cases in the provided script.
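The mutual consistency of the tangent and adjoint functions can be tested
with the classical scalar-product identity, which is what the "AdjointTest"
algorithm automates. A sketch with a toy linear operator (signatures are
simplified here with respect to the template above):

```python
import numpy

# Toy operator given by matrix A: the tangent applies A, the adjoint A^T.
# For consistency, < O dX , Y > must equal < dX , O* Y >.
A = numpy.array([[1., 2.],
                 [3., 4.],
                 [5., 6.]])
def TangentOperator(X, dX):
    return A.dot(dX)
def AdjointOperator(X, Y):
    return A.T.dot(Y)

dX  = numpy.array([1., -1.])
Y   = numpy.array([0.5, 1.0, -2.0])
lhs = numpy.dot(TangentOperator(None, dX), Y)   # < O dX , Y >
rhs = numpy.dot(dX, AdjointOperator(None, Y))   # < dX , O* Y >
```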
Third functional form: using "*ScriptWithSwitch*"
+++++++++++++++++++++++++++++++++++++++++++++++++
.. index:: single: ScriptWithSwitch
.. index:: single: DirectOperator
.. index:: single: TangentOperator
.. index:: single: AdjointOperator
This third form gives more possibilities to control the execution of the
three functions representing the operator, allowing advanced usage and
control over each execution of the simulation code. This is done by using the
keyword "*ScriptWithSwitch*" for the description of the chosen operator in
the ADAO GUI. The user has to provide a switch in one script to control the
execution of the direct, tangent and adjoint forms of the simulation code.
The user can then, for example, use other approximations for the tangent and
adjoint codes, or introduce more complexity in the argument treatment of the
functions. But it will be far more complicated to implement and debug.
**It is recommended not to use this third functional form without a solid
numerical or physical reason.**
If, however, you want to use this third form, we recommend using the
following template for the switch. It requires an external script or code
named "*Physical_simulation_functions.py*", containing three functions named
"*DirectOperator*", "*TangentOperator*" and "*AdjointOperator*" as
previously. Here is the switch template::
    import Physical_simulation_functions
    import numpy, logging
    #
    method = ""
    for param in computation["specificParameters"]:
        if param["name"] == "method":
            method = param["value"]
    if method not in ["Direct", "Tangent", "Adjoint"]:
        raise ValueError("No valid computation method is given")
    logging.info("Found method is \'%s\'"%method)
    #
    logging.info("Loading operator functions")
    Function = Physical_simulation_functions.DirectOperator
    Tangent  = Physical_simulation_functions.TangentOperator
    Adjoint  = Physical_simulation_functions.AdjointOperator
    #
    logging.info("Executing the possible computations")
    data = []
    if method == "Direct":
        logging.info("Direct computation")
        Xcurrent = computation["inputValues"][0][0][0]
        data = Function(numpy.matrix( Xcurrent ).T)
    if method == "Tangent":
        logging.info("Tangent computation")
        Xcurrent  = computation["inputValues"][0][0][0]
        dXcurrent = computation["inputValues"][0][0][1]
        data = Tangent(numpy.matrix(Xcurrent).T, numpy.matrix(dXcurrent).T)
    if method == "Adjoint":
        logging.info("Adjoint computation")
        Xcurrent = computation["inputValues"][0][0][0]
        Ycurrent = computation["inputValues"][0][0][1]
        data = Adjoint((numpy.matrix(Xcurrent).T, numpy.matrix(Ycurrent).T))
    #
    logging.info("Formatting the output")
    it = numpy.ravel(data)
    outputValues = [[[[]]]]
    for val in it:
        outputValues[0][0][0].append(val)
    #
    result = {}
    result["outputValues"]        = outputValues
    result["specificOutputInfos"] = []
    result["returnCode"]          = 0
    result["errorMessage"]        = ""
Various modifications can be made starting from this template hypothesis.
Special case of controlled evolution operator
+++++++++++++++++++++++++++++++++++++++++++++
In some cases, the evolution or the observation operator is required to be
controlled by an external input control, given a priori. In this case, the
generic form of the incremental evolution model is slightly modified as
follows:
.. math:: \mathbf{y} = O( \mathbf{x}, \mathbf{u})
where :math:`\mathbf{u}` is the control over one state increment. In this
case, the direct operator has to be applied to a pair of variables
:math:`(X,U)`. Schematically, the operator has to be set as::
    def DirectOperator( (X, U) ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return something like X(n+1) or Y(n+1)
The tangent and adjoint operators have the same signature as previously,
noting that the derivatives have to be taken only partially with respect to
:math:`\mathbf{x}`. In such a case with explicit control, only the second
functional form (using "*ScriptWithFunctions*") and the third functional form
(using "*ScriptWithSwitch*") can be used.
Requirements to describe covariance matrices
--------------------------------------------
Multiple covariance matrices are required to implement the data assimilation
or optimization procedures. The main ones are the background error covariance
matrix, noted as :math:`\mathbf{B}`, and the observation error covariance
matrix, noted as :math:`\mathbf{R}`. Such a matrix is required to be a square
symmetric positive semi-definite matrix.
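These requirements can be checked numerically, for example as follows (a
sketch of ours, not an ADAO utility):

```python
import numpy

# Check that a candidate covariance matrix M is square, symmetric and
# positive semi-definite (all eigenvalues >= 0).
M = numpy.array([[2., 1.],
                 [1., 2.]])
is_square    = (M.ndim == 2) and (M.shape[0] == M.shape[1])
is_symmetric = numpy.allclose(M, M.T)
eigenvalues  = numpy.linalg.eigvalsh(M)     # eigvalsh assumes symmetry
is_psd       = bool((eigenvalues >= 0.).all())
```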
There are 3 practical methods for the user to provide a covariance matrix.
These methods are chosen as the "*INPUT_TYPE*" of each defined covariance
matrix, as shown by the following figure:
.. _eficas_covariance_matrix:
.. image:: images/eficas_covariance_matrix.png
   :align: center

.. centered::
   **Choosing covariance matrix representation**
First matrix form: using "*Matrix*" representation
++++++++++++++++++++++++++++++++++++++++++++++++++
.. index:: single: Matrix
.. index:: single: BackgroundError
.. index:: single: EvolutionError
.. index:: single: ObservationError
This first form is the default and most general one. The covariance matrix
:math:`\mathbf{M}` has to be fully specified. Even if the matrix is symmetric
by nature, the entire :math:`\mathbf{M}` matrix has to be given.
.. math:: \mathbf{M} = \begin{pmatrix}
   m_{11} & m_{12} & \cdots & m_{1n} \\
   m_{21} & m_{22} & \cdots & m_{2n} \\
   \vdots & \vdots & \vdots & \vdots \\
   m_{n1} & \cdots & m_{nn-1} & m_{nn}
   \end{pmatrix}
It can be either a Python Numpy array or a matrix, or a list of lists of
values (that is, a list of rows). For example, a simple diagonal unitary
background error covariance matrix :math:`\mathbf{B}` can be described in a
Python script as::

    BackgroundError = numpy.eye(...)

or::

    BackgroundError = [[1, 0 ... 0], [0, 1 ... 0] ... [0, 0 ... 1]]
Second matrix form: using "*ScalarSparseMatrix*" representation
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
.. index:: single: ScalarSparseMatrix
.. index:: single: BackgroundError
.. index:: single: EvolutionError
.. index:: single: ObservationError
On the contrary, this second form is a very simplified way to provide a
matrix. The covariance matrix :math:`\mathbf{M}` is supposed to be a positive
multiple of the identity matrix. The matrix can then be specified only by
this multiplier:
.. math:: \mathbf{M} = m \times \begin{pmatrix}
   1 & 0 & \cdots & 0 \\
   0 & 1 & \cdots & 0 \\
   \vdots & \vdots & \vdots & \vdots \\
   0 & \cdots & 0 & 1
   \end{pmatrix}
The multiplier :math:`m` has to be a positive floating point or integer value
(if a negative value is given, which is meaningless, it is converted to its
positive counterpart). For example, a simple diagonal unitary background
error covariance matrix :math:`\mathbf{B}` can be described in a Python
script as::
    BackgroundError = 1.
or, better, by a "*String*" directly in the ADAO case.
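The shorthand simply stands for a multiple of the identity; for example (the
size ``n`` below is chosen for illustration only, it is imposed by the
surrounding ADAO case in practice):

```python
import numpy

# ScalarSparseMatrix semantics: the scalar m expands to m * identity(n).
m, n = 1., 3
BackgroundError_full = m * numpy.eye(n)
```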
Third matrix form: using "*DiagonalSparseMatrix*" representation
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
.. index:: single: DiagonalSparseMatrix
.. index:: single: BackgroundError
.. index:: single: EvolutionError
.. index:: single: ObservationError
This third form is also a simplified way to provide a matrix, but a little
more powerful. The covariance matrix :math:`\mathbf{M}` is already supposed
to be diagonal, but the user has to specify all the positive diagonal values.
The matrix can then be specified only by a vector :math:`\mathbf{V}`, which
will be set on a diagonal matrix:
.. math:: \mathbf{M} = \begin{pmatrix}
   v_{1} & 0 & \cdots & 0 \\
   0 & v_{2} & \cdots & 0 \\
   \vdots & \vdots & \vdots & \vdots \\
   0 & \cdots & 0 & v_{n}
   \end{pmatrix}
It can be either a Python Numpy array or a matrix, or a list, or a list of
lists, of positive values (if some are negative, which is meaningless, they
are converted to positive values). For example, a simple diagonal unitary
background error covariance matrix :math:`\mathbf{B}` can be described in a
Python script as::
    BackgroundError = [1, 1 ... 1]

or::

    BackgroundError = numpy.ones(...)
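The shorthand stands for a diagonal matrix built from the vector; for example
(the size is chosen here for illustration only):

```python
import numpy

# DiagonalSparseMatrix semantics: the vector V is set on the diagonal of
# an otherwise zero matrix.
V = [1., 2., 3.]
BackgroundError_full = numpy.diag(V)
```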