================================================================================
Reference description of the ADAO commands and keywords
================================================================================

This section presents the reference description of the ADAO commands and
keywords available through the GUI or through scripts.

Each command or keyword to be defined through the ADAO GUI has some properties.
The first property is to be *required*, *optional* or only factual, describing
a type of input. The second property is to be an "open" variable with a fixed
type but with any value allowed by the type, or a "restricted" variable,
limited to some specified values. Since the EFICAS editor GUI has built-in
validation capabilities, the properties of the commands or keywords given
through this GUI are automatically correct.

The mathematical notations used below are explained in the section
:ref:`section_theory`.

Examples of using these commands are available in the section
:ref:`section_examples` and in the example files installed with the ADAO
module.

List of possible input types
----------------------------

.. index:: single: Dict
.. index:: single: Function
.. index:: single: Matrix
.. index:: single: ScalarSparseMatrix
.. index:: single: DiagonalSparseMatrix
.. index:: single: String
.. index:: single: Script
.. index:: single: Vector

Each ADAO variable has a pseudo-type that helps with filling it in and with
validation. The different pseudo-types are:

**Dict**
  This indicates a variable that has to be filled by a Python dictionary
  ``{"key":"value", ...}``, usually given either as a string or as a script
  file.

**Function**
  This indicates a variable that has to be filled by a Python function,
  usually given as a script file or a component method.

**Matrix**
  This indicates a variable that has to be filled by a matrix, usually given
  either as a string or as a script file.

**ScalarSparseMatrix**
  This indicates a variable that has to be filled by a unique number (which
  will be used to multiply an identity matrix), usually given either as a
  string or as a script file.

**DiagonalSparseMatrix**
  This indicates a variable that has to be filled by a vector (which will be
  used to replace the diagonal of an identity matrix), usually given either
  as a string or as a script file.

**Script**
  This indicates a script given as an external file. It can be described by a
  full absolute path name or only by the file name without path. If the file
  is given only by a file name without path, and if a study directory is also
  indicated, the file is searched for in the given directory.

**String**
  This indicates a string giving a literal representation of a matrix, a
  vector or a vector series, such as "1 2 ; 3 4" or "[[1,2],[3,4]]" for a
  square 2x2 matrix.

**Vector**
  This indicates a variable that has to be filled by a vector, usually given
  either as a string or as a script file.

**VectorSerie**
  This indicates a variable that has to be filled by a list of vectors,
  usually given either as a string or as a script file.

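As an illustration of the "*String*" literal forms, the two example notations
above denote the same square 2x2 matrix. The sketch below only uses NumPy to
show this equivalence; it is not the parser that ADAO itself applies:

```python
import numpy as np

# Two equivalent literal notations for the same square 2x2 matrix,
# mirroring the "1 2 ; 3 4" and "[[1,2],[3,4]]" string forms above.
m1 = np.matrix("1 2 ; 3 4")      # Matlab-like "rows separated by ;" form
m2 = np.array([[1, 2], [3, 4]])  # nested-list form
```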
When a command or keyword can be filled by a script file name, the script has
to contain a variable or a method that has the same name as the one to be
filled. In other words, when importing the script in a YACS Python node, it
must create a variable of the correct name in the current namespace of the
node.

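For example, if the "*Background*" keyword is filled through a script file,
the script must create a variable named exactly ``Background``. A minimal,
purely hypothetical script could be:

```python
# Hypothetical content of a script file used to fill the "Background"
# keyword: the variable name must match the keyword being filled.
Background = [0., 1., 2.]
```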
Reference description for ADAO calculation cases
------------------------------------------------

List of commands and keywords for an ADAO calculation case
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ASSIMILATION_STUDY
.. index:: single: Algorithm
.. index:: single: AlgorithmParameters
.. index:: single: Background
.. index:: single: BackgroundError
.. index:: single: ControlInput
.. index:: single: Debug
.. index:: single: EvolutionError
.. index:: single: EvolutionModel
.. index:: single: InputVariables
.. index:: single: Observation
.. index:: single: ObservationError
.. index:: single: ObservationOperator
.. index:: single: Observer
.. index:: single: Observers
.. index:: single: Observer Template
.. index:: single: OutputVariables
.. index:: single: Study_name
.. index:: single: Study_repertory
.. index:: single: UserDataInit
.. index:: single: UserPostAnalysis
.. index:: single: UserPostAnalysis Template

The first set of commands is related to the description of a calculation case,
that is, a *Data Assimilation* procedure or an *Optimization* procedure. The
terms are ordered alphabetically, except the first, which describes the choice
between calculation or checking. The different commands are the following:

**ASSIMILATION_STUDY**
  *Required command*. This is the general command describing the data
  assimilation or optimization case. It hierarchically contains all the other
  commands.

**Algorithm**
  *Required command*. This is a string indicating the data assimilation or
  optimization algorithm chosen. The choices are limited and available
  through the GUI. There exists for example "3DVAR", "Blue"... See the list
  of algorithms and associated parameters in the following subsection
  `Optional and required commands for calculation algorithms`_.

**AlgorithmParameters**
  *Optional command*. This command allows adding some optional parameters to
  control the data assimilation or optimization algorithm. Its value is
  defined as a "*Dict*" type object. See the list of algorithms and
  associated parameters in the following subsection `Optional and required
  commands for calculation algorithms`_.

**Background**
  *Required command*. This indicates the background or initial vector used,
  previously noted as :math:`\mathbf{x}^b`. Its value is defined as a
  "*Vector*" type object.

**BackgroundError**
  *Required command*. This indicates the background error covariance matrix,
  previously noted as :math:`\mathbf{B}`. Its value is defined as a
  "*Matrix*" type object, a "*ScalarSparseMatrix*" type object, or a
  "*DiagonalSparseMatrix*" type object.

**ControlInput**
  *Optional command*. This indicates the control vector used to force the
  evolution model at each step, usually noted as :math:`\mathbf{U}`. Its
  value is defined as a "*Vector*" or a "*VectorSerie*" type object. When
  there is no control, it has to be a void string ''.

**Debug**
  *Optional command*. This defines the level of trace and intermediary debug
  information. The choices are limited to 0 (for False) and 1 (for True).

**EvolutionError**
  *Optional command*. This indicates the evolution error covariance matrix,
  usually noted as :math:`\mathbf{Q}`. It is defined as a "*Matrix*" type
  object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
  type object.

**EvolutionModel**
  *Optional command*. This indicates the evolution model operator, usually
  noted :math:`M`, which describes an elementary step of evolution. Its value
  is defined as a "*Function*" type object or a "*Matrix*" type one. In the
  case of the "*Function*" type, different functional forms can be used, as
  described in the following subsection `Requirements for functions
  describing an operator`_. If there is some control :math:`U` included in
  the evolution model, the operator has to be applied to a pair
  :math:`(X,U)`.

**InputVariables**
  *Optional command*. This command allows indicating the name and size of the
  physical variables that are bundled together in the state vector. This
  information is dedicated to data processed inside an algorithm.

**Observation**
  *Required command*. This indicates the observation vector used for data
  assimilation or optimization, previously noted as :math:`\mathbf{y}^o`. It
  is defined as a "*Vector*" or a "*VectorSerie*" type object.

**ObservationError**
  *Required command*. This indicates the observation error covariance matrix,
  previously noted as :math:`\mathbf{R}`. It is defined as a "*Matrix*" type
  object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
  type object.

**ObservationOperator**
  *Required command*. This indicates the observation operator, previously
  noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}`
  into results :math:`\mathbf{y}` to be compared to the observations
  :math:`\mathbf{y}^o`. Its value is defined as a "*Function*" type object or
  a "*Matrix*" type one. In the case of the "*Function*" type, different
  functional forms can be used, as described in the following subsection
  `Requirements for functions describing an operator`_. If there is some
  control :math:`U` included in the observation, the operator has to be
  applied to a pair :math:`(X,U)`.

**Observers**
  *Optional command*. This command allows setting internal observers, that
  is, functions linked with a particular variable, which will be executed
  each time this variable is modified. It is a convenient way to monitor
  variables of interest during the data assimilation or optimization process,
  by printing or plotting them, etc. Common templates are provided to help
  the user get started or quickly build a case.

**OutputVariables**
  *Optional command*. This command allows indicating the name and size of the
  physical variables that are bundled together in the output observation
  vector. This information is dedicated to data processed inside an
  algorithm.

**Study_name**
  *Required command*. This is an open string to describe the ADAO study by a
  name or a sentence.

**Study_repertory**
  *Optional command*. If available, this directory is used as the base name
  for the calculation, and is used to find all the script files, given by
  name without path, that can be used to define some other commands by
  scripts.

**UserDataInit**
  *Optional command*. This command allows initializing some parameters or
  data automatically before the input processing of the data assimilation or
  optimization algorithm. It indicates a script file name to be executed
  before entering the initialization phase of the chosen variables.

**UserPostAnalysis**
  *Optional command*. This command allows processing some parameters or data
  automatically after the data assimilation or optimization algorithm
  processing. Its value is defined as a script file or a string, allowing
  post-processing code to be put directly inside the ADAO case. Common
  templates are provided to help the user get started or quickly build a
  case.

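To fix ideas about the "*Function*" form used by "*ObservationOperator*" or
"*EvolutionModel*", here is a purely illustrative sketch of a linear
observation operator written as a Python function. The names and the exact
calling conventions are assumptions here; the authoritative forms are those
of the subsection `Requirements for functions describing an operator`_:

```python
import numpy as np

# Hypothetical linear observation operator H : R^3 -> R^2, selecting the
# first and third components of the state (illustration only).
H_matrix = np.array([[1., 0., 0.],
                     [0., 0., 1.]])

def DirectOperator(X):
    """Return the simulated observation for the state X."""
    return H_matrix @ np.ravel(X)
```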
Optional and required commands for calculation algorithms
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: 3DVAR
.. index:: single: Blue
.. index:: single: ExtendedBlue
.. index:: single: EnsembleBlue
.. index:: single: KalmanFilter
.. index:: single: ExtendedKalmanFilter
.. index:: single: UnscentedKalmanFilter
.. index:: single: LinearLeastSquares
.. index:: single: NonLinearLeastSquares
.. index:: single: ParticleSwarmOptimization
.. index:: single: QuantileRegression

.. index:: single: AlgorithmParameters
.. index:: single: Bounds
.. index:: single: CostDecrementTolerance
.. index:: single: GradientNormTolerance
.. index:: single: GroupRecallRate
.. index:: single: MaximumNumberOfSteps
.. index:: single: Minimizer
.. index:: single: NumberOfInsects
.. index:: single: ProjectedGradientTolerance
.. index:: single: QualityCriterion
.. index:: single: Quantile
.. index:: single: SetSeed
.. index:: single: StoreInternalVariables
.. index:: single: StoreSupplementaryCalculations
.. index:: single: SwarmVelocity

Each algorithm can be controlled using some generic or specific options, given
through the "*AlgorithmParameters*" optional command in a script file or a
string, for example as follows in a file::

    AlgorithmParameters = {
        "Minimizer" : "LBFGSB",
        "MaximumNumberOfSteps" : 25,
        "StoreSupplementaryCalculations" : ["APosterioriCovariance","OMA"],
        }

To give the "*AlgorithmParameters*" values by string, one must enclose a
standard dictionary definition between single quotes, as for example::

    '{"Minimizer":"LBFGSB","MaximumNumberOfSteps":25}'

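Such a string is just a Python dictionary literal, so a convenient way to
check by hand that it is well formed (a side check, not something ADAO
requires) is to evaluate it with the standard library:

```python
import ast

# The string form of "AlgorithmParameters" is a Python dict literal;
# ast.literal_eval parses it safely, without executing any code.
params = ast.literal_eval('{"Minimizer":"LBFGSB","MaximumNumberOfSteps":25}')
```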
This section describes the available options algorithm by algorithm. In
addition, for each algorithm, the required commands/keywords are given; they
are described in `List of commands and keywords for an ADAO calculation
case`_. If an option is specified by the user for an algorithm that does not
support it, the option is simply left unused and does not stop the
processing. The meaning of the acronyms or particular names can be found in
the :ref:`genindex` or the :ref:`section_glossary`.

**"Blue"**

  *Required commands*
  *"Background", "BackgroundError",
  "Observation", "ObservationError",
  "ObservationOperator"*

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["APosterioriCovariance", "BMA", "OMA", "OMB",
    "Innovation", "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].

**"ExtendedBlue"**

  *Required commands*
  *"Background", "BackgroundError",
  "Observation", "ObservationError",
  "ObservationOperator"*

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["APosterioriCovariance", "BMA", "OMA", "OMB",
    "Innovation", "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].

**"LinearLeastSquares"**

  *Required commands*
  *"Observation", "ObservationError",
  "ObservationOperator"*

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["OMA", "OMB", "Innovation"].

**"3DVAR"**

  *Required commands*
  *"Background", "BackgroundError",
  "Observation", "ObservationError",
  "ObservationOperator"*

  Minimizer
    This key allows choosing the optimization minimizer. The default choice
    is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
    minimizer, see [Byrd95]_, [Morales11]_ and [Zhu97]_), "TNC" (nonlinear
    constrained minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS"
    (nonlinear unconstrained minimizer) and "NCG" (Newton CG minimizer). It
    is strongly recommended to keep the default.

  Bounds
    This key allows defining upper and lower bounds for every state variable
    being optimized. Bounds have to be given as a list of lists of
    lower/upper bound pairs for each variable, with ``None`` every time there
    is no bound. The bounds can always be specified, but they are taken into
    account only by the constrained optimizers.

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for the
    iterative optimization. The default is 15000, which is very similar to no
    limit on iterations. It is then recommended to adapt this parameter to
    the needs of real problems. For some optimizers, the effective stopping
    step can be slightly different from this limit due to algorithm internal
    control requirements.

  CostDecrementTolerance
    This key indicates a limit value, leading to a successful stop of the
    iterative optimization process when the cost function decreases by less
    than this tolerance at the last step. The default is 1.e-7, and it is
    recommended to adapt it to the needs of real problems.

  ProjectedGradientTolerance
    This key indicates a limit value, leading to a successful stop of the
    iterative optimization process when all the components of the projected
    gradient are under this limit. It is only used for constrained
    optimizers. The default is -1, that is, the internal default of each
    minimizer (generally 1.e-5), and it is not recommended to change it.

  GradientNormTolerance
    This key indicates a limit value, leading to a successful stop of the
    iterative optimization process when the norm of the gradient is under
    this limit. It is only used for unconstrained optimizers. The default is
    1.e-5 and it is not recommended to change it.

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["APosterioriCovariance", "BMA", "OMA", "OMB",
    "Innovation", "SigmaObs2", "MahalanobisConsistency"].

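Putting some of the "*3DVAR*" keys above together, a hypothetical
"*AlgorithmParameters*" value for a two-variable state, where only the first
variable is bounded, could read:

```python
# Hypothetical 3DVAR settings: constrained minimizer, bounds on the first
# state variable only, a reduced iteration budget, and two supplementary
# results requested.
AlgorithmParameters = {
    "Minimizer"            : "LBFGSB",
    "MaximumNumberOfSteps" : 100,
    "Bounds"               : [[0., 10.], [None, None]],
    "StoreSupplementaryCalculations" : ["APosterioriCovariance", "OMA"],
}
```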
**"NonLinearLeastSquares"**

  *Required commands*
  *"Background",
  "Observation", "ObservationError",
  "ObservationOperator"*

  Minimizer
    This key allows choosing the optimization minimizer. The default choice
    is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
    minimizer, see [Byrd95]_, [Morales11]_ and [Zhu97]_), "TNC" (nonlinear
    constrained minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS"
    (nonlinear unconstrained minimizer) and "NCG" (Newton CG minimizer). It
    is strongly recommended to keep the default.

  Bounds
    This key allows defining upper and lower bounds for every state variable
    being optimized. Bounds have to be given as a list of lists of
    lower/upper bound pairs for each variable, with ``None`` every time there
    is no bound. The bounds can always be specified, but they are taken into
    account only by the constrained optimizers.

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for the
    iterative optimization. The default is 15000, which is very similar to no
    limit on iterations. It is then recommended to adapt this parameter to
    the needs of real problems. For some optimizers, the effective stopping
    step can be slightly different due to algorithm internal control
    requirements.

  CostDecrementTolerance
    This key indicates a limit value, leading to a successful stop of the
    iterative optimization process when the cost function decreases by less
    than this tolerance at the last step. The default is 1.e-7, and it is
    recommended to adapt it to the needs of real problems.

  ProjectedGradientTolerance
    This key indicates a limit value, leading to a successful stop of the
    iterative optimization process when all the components of the projected
    gradient are under this limit. It is only used for constrained
    optimizers. The default is -1, that is, the internal default of each
    minimizer (generally 1.e-5), and it is not recommended to change it.

  GradientNormTolerance
    This key indicates a limit value, leading to a successful stop of the
    iterative optimization process when the norm of the gradient is under
    this limit. It is only used for unconstrained optimizers. The default is
    1.e-5 and it is not recommended to change it.

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["BMA", "OMA", "OMB", "Innovation"].

**"EnsembleBlue"**

  *Required commands*
  *"Background", "BackgroundError",
  "Observation", "ObservationError",
  "ObservationOperator"*

  SetSeed
    This key allows giving an integer in order to fix the seed of the random
    generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so the
    default initialization from the computer is used.

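For example, to make an "*EnsembleBlue*" run reproducible from one execution
to the next (the value 1000 is just the convenient example mentioned above):

```python
# Fixing the seed of the random generator makes the generated ensemble,
# and hence the run, reproducible.
AlgorithmParameters = {"SetSeed": 1000}
```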
**"KalmanFilter"**

  *Required commands*
  *"Background", "BackgroundError",
  "Observation", "ObservationError",
  "ObservationOperator"*

  EstimationOf
    This key allows choosing the type of estimation to be performed. It can
    be either state estimation, with a value of "State", or parameter
    estimation, with a value of "Parameters". The default choice is "State".

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["APosterioriCovariance", "BMA", "Innovation"].

**"ExtendedKalmanFilter"**

  *Required commands*
  *"Background", "BackgroundError",
  "Observation", "ObservationError",
  "ObservationOperator"*

  Bounds
    This key allows defining upper and lower bounds for every state variable
    being optimized. Bounds have to be given as a list of lists of
    lower/upper bound pairs for each variable, with extreme values every time
    there is no bound (``None`` is not allowed when there is no bound).

  ConstrainedBy
    This key allows defining the method used to take bounds into account. The
    possible methods are in the following list: ["EstimateProjection"].

  EstimationOf
    This key allows choosing the type of estimation to be performed. It can
    be either state estimation, with a value of "State", or parameter
    estimation, with a value of "Parameters". The default choice is "State".

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["APosterioriCovariance", "BMA", "Innovation"].

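Since ``None`` is not allowed in the "*Bounds*" key here, a component without
a real bound has to be written with extreme values. A hypothetical
two-variable illustration:

```python
# Hypothetical bounds for a 2-component state: the second component is
# effectively unbounded, written with extreme values instead of None.
AlgorithmParameters = {
    "Bounds"        : [[0., 10.], [-1.e300, 1.e300]],
    "ConstrainedBy" : "EstimateProjection",
}
```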
**"UnscentedKalmanFilter"**

  *Required commands*
  *"Background", "BackgroundError",
  "Observation", "ObservationError",
  "ObservationOperator"*

  Bounds
    This key allows defining upper and lower bounds for every state variable
    being optimized. Bounds have to be given as a list of lists of
    lower/upper bound pairs for each variable, with extreme values every time
    there is no bound (``None`` is not allowed when there is no bound).

  ConstrainedBy
    This key allows defining the method used to take bounds into account. The
    possible methods are in the following list: ["EstimateProjection"].

  EstimationOf
    This key allows choosing the type of estimation to be performed. It can
    be either state estimation, with a value of "State", or parameter
    estimation, with a value of "Parameters". The default choice is "State".

  Alpha, Beta, Kappa, Reconditioner
    These keys are internal scaling parameters. "Alpha" requires a value
    between 1.e-4 and 1. "Beta" has an optimal value of 2 for a Gaussian *a
    priori* distribution. "Kappa" requires an integer value, and the right
    default is obtained by setting it to 0. "Reconditioner" requires a value
    between 1.e-3 and 10; it defaults to 1.

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["APosterioriCovariance", "BMA", "Innovation"].

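A hypothetical "*AlgorithmParameters*" fragment using the recommended-looking
values described above for the scaling keys:

```python
# Scaling parameters for the unscented transform, with the values
# suggested by the descriptions above: Alpha in [1.e-4, 1], Beta = 2 for
# a Gaussian a priori, Kappa = 0 for the right default, Reconditioner = 1.
AlgorithmParameters = {
    "Alpha"         : 0.1,
    "Beta"          : 2.,
    "Kappa"         : 0,
    "Reconditioner" : 1.,
}
```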
**"ParticleSwarmOptimization"**

  *Required commands*
  *"Background", "BackgroundError",
  "Observation", "ObservationError",
  "ObservationOperator"*

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for the
    iterative optimization. The default is 50, which is an arbitrary limit.
    It is then recommended to adapt this parameter to the needs of real
    problems.

  NumberOfInsects
    This key indicates the number of insects or particles in the swarm. The
    default is 100, which is a usual default for this algorithm.

  SwarmVelocity
    This key indicates the part of the insect velocity which is imposed by
    the swarm. It is a positive floating point value. The default value is 1.

  GroupRecallRate
    This key indicates the recall rate towards the best insect of the swarm.
    It is a floating point value between 0 and 1. The default value is 0.5.

  QualityCriterion
    This key indicates the quality criterion, minimized to find the optimal
    state estimate. The default is the usual data assimilation criterion
    named "DA", the augmented weighted least squares. The possible criteria
    have to be in the following list, where the equivalent names are
    indicated by the sign "=": ["AugmentedWeightedLeastSquares"="AWLS"="DA",
    "WeightedLeastSquares"="WLS", "LeastSquares"="LS"="L2",
    "AbsoluteValue"="L1", "MaximumError"="ME"].

  BoxBounds
    This key allows defining upper and lower bounds for *increments* of every
    state variable being optimized (and not for the state variables
    themselves). Bounds have to be given as a list of lists of lower/upper
    bound pairs for each increment of a variable, with extreme values every
    time there is no bound (``None`` is not allowed when there is no bound).
    This key is required and there are no default values.

  SetSeed
    This key allows giving an integer in order to fix the seed of the random
    generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so the
    default initialization from the computer is used.

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["BMA", "OMA", "OMB", "Innovation"].

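Gathering the swarm keys above, a hypothetical "*ParticleSwarmOptimization*"
configuration could be the following (the required bounds on increments are
omitted here for brevity, but must be supplied in a real case):

```python
# Hypothetical swarm settings: default-like sizes and rates, plus a fixed
# seed for reproducibility. The (required) bounds key is not shown here.
AlgorithmParameters = {
    "MaximumNumberOfSteps" : 50,
    "NumberOfInsects"      : 100,
    "SwarmVelocity"        : 1.,
    "GroupRecallRate"      : 0.5,
    "QualityCriterion"     : "DA",
    "SetSeed"              : 1000,
}
```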
**"QuantileRegression"**

  *Required commands*
  *"Background", "Observation",
  "ObservationOperator"*

  Quantile
    This key allows defining the real value of the desired quantile, between
    0 and 1. The default is 0.5, corresponding to the median.

  Minimizer
    This key allows choosing the optimization minimizer. The default, and
    only available, choice is "MMQR" (Majorize-Minimize for Quantile
    Regression).

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for the
    iterative optimization. The default is 15000, which is very similar to no
    limit on iterations. It is then recommended to adapt this parameter to
    the needs of real problems.

  CostDecrementTolerance
    This key indicates a limit value, leading to a successful stop of the
    iterative optimization process when the cost function or the surrogate
    decreases by less than this tolerance at the last step. The default is
    1.e-6, and it is recommended to adapt it to the needs of real problems.

  StoreInternalVariables
    This Boolean key allows storing the default internal variables, mainly
    the current state during the iterative optimization process. Be careful,
    this can be a numerically costly choice in certain calculation cases. The
    default is False.

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially
    costly calculations. The default is a void list, none of these variables
    being calculated and stored by default. The possible names are in the
    following list: ["BMA", "OMA", "OMB", "Innovation"].

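For instance, a hypothetical "*QuantileRegression*" setting estimating the
third quartile instead of the default median:

```python
# Hypothetical quantile regression settings: target the 0.75 quantile
# with a tightened iteration budget.
AlgorithmParameters = {
    "Quantile"             : 0.75,
    "MaximumNumberOfSteps" : 100,
}
```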
Reference description for ADAO checking cases
---------------------------------------------

List of commands and keywords for an ADAO checking case
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: CHECKING_STUDY
.. index:: single: Algorithm
.. index:: single: AlgorithmParameters
.. index:: single: CheckingPoint
.. index:: single: Debug
.. index:: single: ObservationOperator
.. index:: single: Study_name
.. index:: single: Study_repertory
.. index:: single: UserDataInit

The second set of commands is related to the description of a checking case,
that is, a procedure to check the required properties of information used
elsewhere by a calculation case. The terms are ordered alphabetically, except
the first, which describes the choice between calculation or checking. The
different commands are the following:

716 *Required command*. This is the general command describing the checking
717 case. It hierarchically contains all the other commands.
720 *Required command*. This is a string to indicate the test algorithm chosen.
721 The choices are limited and available through the GUI. There exists for
722 example "FunctionTest", "AdjointTest"... See below the list of algorithms
723 and associated parameters in the following subsection `Optional and required
724 commands for checking algorithms`_.
726 **AlgorithmParameters** *Optional command*. This command allows to add some
727 optional parameters to control the data assimilation or optimization
728 algorithm. It is defined as a "*Dict*" type object, that is, given as a
729 script. See below the list of algorithms and associated parameters in the
730 following subsection `Optional and required commands for checking
734 *Required command*. This indicates the vector used as the state around which
735 to perform the required check, noted :math:`\mathbf{x}` and similar to the
736 background :math:`\mathbf{x}^b`. It is defined as a "*Vector*" type object.
739 *Optional command*. This define the level of trace and intermediary debug
740 information. The choices are limited between 0 (for False) and 1 (for
**ObservationOperator**
    *Required command*. This indicates the observation operator, previously
    noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}`
    into results :math:`\mathbf{y}` to be compared to observations
    :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object.
    Different functional forms can be used, as described in the subsection
    `Requirements for functions describing an operator`_. If some control
    :math:`U` is included in the observation, the operator has to be applied
    to a pair :math:`(X,U)`.
**Observers**
    *Optional command*. This command allows one to set internal observers,
    that is, functions linked with a particular variable, which will be
    executed each time this variable is modified. It is a convenient way to
    monitor variables of interest during the data assimilation or optimization
    process, by printing or plotting them, etc. Common templates are provided
    to help the user start or quickly build a case.
**Study_name**
    *Required command*. This is an open string to describe the study by a name
    or a sentence.
**Study_repertory**
    *Optional command*. If available, this directory is used as the base name
    for calculation, and used to find all the script files, given by name
    without path, that can be used to define some other commands by scripts.
**UserDataInit**
    *Optional command*. This command allows one to initialize some parameters
    or data automatically before data assimilation algorithm processing.
Optional and required commands for checking algorithms
++++++++++++++++++++++++++++++++++++++++++++++++++++++
.. index:: single: AdjointTest
.. index:: single: FunctionTest
.. index:: single: GradientTest
.. index:: single: LinearityTest
.. index:: single: ObserverTest
.. index:: single: TangentTest

.. index:: single: AlgorithmParameters
.. index:: single: AmplitudeOfInitialDirection
.. index:: single: EpsilonMinimumExponent
.. index:: single: InitialDirection
.. index:: single: ResiduFormula
.. index:: single: SetSeed
We recall that each algorithm can be controlled using some generic or specific
options, given through the "*AlgorithmParameters*" optional command, as
follows::

    AlgorithmParameters = {
        "AmplitudeOfInitialDirection" : 1,
        "EpsilonMinimumExponent" : -8,
        }
To give the "*AlgorithmParameters*" values by string, one must enclose a
standard dictionary definition between simple quotes, as for example::

    '{"AmplitudeOfInitialDirection" : 1, "EpsilonMinimumExponent" : -8}'
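Such a quoted string is an ordinary Python dictionary literal. As a standalone
illustration (plain Python, independent of ADAO), it can be parsed safely with
the standard library before being handed to the algorithm:

```python
import ast

# The string form of "AlgorithmParameters", as it would be typed in the GUI
s = '{"AmplitudeOfInitialDirection" : 1, "EpsilonMinimumExponent" : -8}'

# ast.literal_eval safely evaluates the literal into a Python dictionary
d = ast.literal_eval(s)
print(d["AmplitudeOfInitialDirection"], d["EpsilonMinimumExponent"])
```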
If an option is specified by the user for an algorithm that does not support
it, the option is simply ignored and does not stop the processing. The meaning
of the acronyms or particular names can be found in the :ref:`genindex` or the
:ref:`section_glossary`. In addition, for each algorithm, the required
commands/keywords are given, as described in `List of commands and keywords
for an ADAO checking case`_.
**"AdjointTest"**

  *Required commands:*
    *"CheckingPoint",
    "ObservationOperator"*

  AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation, built as a
    vector used for the directional derivative around the nominal checking
    point. The default is 1, which means no scaling.

  EpsilonMinimumExponent
    This key indicates the minimal exponent value of the power of 10
    coefficient to be used to decrease the increment multiplier. The default
    is -8, and it has to be between 0 and -20. For example, its default value
    leads to calculating the residue of the formula with a fixed increment
    multiplied from 1.e0 to 1.e-8.

  InitialDirection
    This key indicates the vector direction used for the directional
    derivative around the nominal checking point. It has to be a vector. If
    not specified, this direction defaults to a random perturbation around
    zero of the same vector size as the checking point.

  SetSeed
    This key allows one to give an integer in order to fix the seed of the
    random generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so uses the
    default initialization of the computer.
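To fix ideas, the identity verified by an adjoint test can be sketched in a few
lines of plain Python, here with a hypothetical linear operator given by a
small matrix (the names ``A``, ``dX`` and ``Y`` are illustrative, not ADAO
keywords; only the roles of "SetSeed" and "InitialDirection" are mirrored):

```python
import numpy

numpy.random.seed(1000)                 # role of "SetSeed": reproducible runs
A  = numpy.array([[1., 2.], [3., 4.]])  # hypothetical linear(ized) operator
dX = numpy.random.normal(0., 1., 2)     # role of "InitialDirection" (random)
Y  = numpy.array([1., -1.])

# The adjoint identity checked by the test: < A dX, Y > = < dX, A* Y >
residue = abs(numpy.dot(A.dot(dX), Y) - numpy.dot(dX, A.T.dot(Y)))
print(residue)  # has to stay at the level of machine precision
```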
**"FunctionTest"**

  *Required commands:*
    *"CheckingPoint",
    "ObservationOperator"*

  NumberOfPrintedDigits
    This key indicates the number of digits of precision for floating point
    printed output. The default is 5, with a minimum of 0.

  NumberOfRepetition
    This key indicates the number of times to repeat the function evaluation.
    The default is 1.

  SetDebug
    This key indicates whether to activate, or not, the debug mode during the
    function evaluation. The default is "True", and the choices are "True" or
    "False".
**"GradientTest"**

  *Required commands:*
    *"CheckingPoint",
    "ObservationOperator"*

  AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation, built as a
    vector used for the directional derivative around the nominal checking
    point. The default is 1, which means no scaling.

  EpsilonMinimumExponent
    This key indicates the minimal exponent value of the power of 10
    coefficient to be used to decrease the increment multiplier. The default
    is -8, and it has to be between 0 and -20. For example, its default value
    leads to calculating the residue of the scalar product formula with a
    fixed increment multiplied from 1.e0 to 1.e-8.

  InitialDirection
    This key indicates the vector direction used for the directional
    derivative around the nominal checking point. It has to be a vector. If
    not specified, this direction defaults to a random perturbation around
    zero of the same vector size as the checking point.

  ResiduFormula
    This key indicates the residue formula that has to be used for the test.
    The default choice is "Taylor", and the possible ones are "Taylor"
    (residue of the Taylor development of the operator, which has to decrease
    with the square power of the perturbation) and "Norm" (residue obtained by
    taking the norm of the Taylor development at zero order approximation,
    which approximates the gradient, and which has to remain constant).

  SetSeed
    This key allows one to give an integer in order to fix the seed of the
    random generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so uses the
    default initialization of the computer.
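The "Taylor" residue above can be illustrated outside ADAO with a small
hand-made operator and its exact tangent (all names here are illustrative,
not ADAO keywords):

```python
import numpy

def O(x):
    """ Hypothetical non-linear operator """
    return numpy.array([x[0]**2, x[0]*x[1]])

def TgtO(x, dx):
    """ Its exact tangent linear operator at x, applied to dx """
    J = numpy.array([[2.*x[0], 0.], [x[1], x[0]]])
    return J.dot(dx)

x, dx = numpy.array([1., 2.]), numpy.array([1., 1.])
residues = []
for p in range(4):
    a = 10.**(-p)  # decreasing increment multiplier, as with EpsilonMinimumExponent
    residues.append(numpy.linalg.norm(O(x + a*dx) - O(x) - a*TgtO(x, dx)))
# The "Taylor" residue decreases with the square power of the perturbation:
# each division of the increment by 10 divides the residue by about 100
print(residues)
```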
**"LinearityTest"**

  *Required commands:*
    *"CheckingPoint",
    "ObservationOperator"*

  AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation, built as a
    vector used for the directional derivative around the nominal checking
    point. The default is 1, which means no scaling.

  EpsilonMinimumExponent
    This key indicates the minimal exponent value of the power of 10
    coefficient to be used to decrease the increment multiplier. The default
    is -8, and it has to be between 0 and -20. For example, its default value
    leads to calculating the residue of the scalar product formula with a
    fixed increment multiplied from 1.e0 to 1.e-8.

  InitialDirection
    This key indicates the vector direction used for the directional
    derivative around the nominal checking point. It has to be a vector. If
    not specified, this direction defaults to a random perturbation around
    zero of the same vector size as the checking point.

  ResiduFormula
    This key indicates the residue formula that has to be used for the test.
    The default choice is "CenteredDL", and the possible ones are "CenteredDL"
    (residue of the difference between the function at the nominal point and
    its values with positive and negative increments, which has to stay very
    small), "Taylor" (residue of the Taylor development of the operator
    normalized by the nominal value, which has to stay very small),
    "NominalTaylor" (residue of the order 1 approximations of the operator,
    normalized to the nominal point, which has to stay close to 1), and
    "NominalTaylorRMS" (residue of the order 1 approximations of the operator,
    normalized by RMS to the nominal point, which has to stay close to 0).

  SetSeed
    This key allows one to give an integer in order to fix the seed of the
    random generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so uses the
    default initialization of the computer.
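For a truly linear operator, the "CenteredDL" residue is zero up to round-off,
as this small standalone sketch shows (the matrix ``A`` is an illustrative
stand-in for the operator, not an ADAO object):

```python
import numpy

A = numpy.array([[1., 2.], [3., 4.]])   # hypothetical linear operator
O = lambda x: A.dot(x)

x, dx = numpy.array([1., 2.]), numpy.array([1., 1.])
for a in (1., 1.e-2, 1.e-4):
    # Residue of the difference between the values with positive and
    # negative increments and twice the nominal value: zero if O is linear
    residue = numpy.linalg.norm(O(x + a*dx) + O(x - a*dx) - 2.*O(x))
    print(residue)
```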
**"ObserverTest"**

  *Required commands:*
    *"Observers"*

  *Tip for this command:*
    Because *"CheckingPoint"* and *"ObservationOperator"* are required
    commands for ALL checking algorithms in the interface, you have to provide
    a value for them, despite the fact that these commands are not required
    for *"ObserverTest"* and will not be used. The easiest way is to give "1"
    as a STRING for both, *"ObservationOperator"* having to be of type
    *Matrix*.
**"TangentTest"**

  *Required commands:*
    *"CheckingPoint",
    "ObservationOperator"*

  AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation, built as a
    vector used for the directional derivative around the nominal checking
    point. The default is 1, which means no scaling.

  EpsilonMinimumExponent
    This key indicates the minimal exponent value of the power of 10
    coefficient to be used to decrease the increment multiplier. The default
    is -8, and it has to be between 0 and -20. For example, its default value
    leads to calculating the residue of the scalar product formula with a
    fixed increment multiplied from 1.e0 to 1.e-8.

  InitialDirection
    This key indicates the vector direction used for the directional
    derivative around the nominal checking point. It has to be a vector. If
    not specified, this direction defaults to a random perturbation around
    zero of the same vector size as the checking point.

  SetSeed
    This key allows one to give an integer in order to fix the seed of the
    random generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so uses the
    default initialization of the computer.
Requirements for functions describing an operator
-------------------------------------------------
The operators for observation and evolution are required to implement the data
assimilation or optimization procedures. They include the physical simulation
by numerical calculations, but also the filtering and restriction needed to
compare the simulation to the observation. The evolution operator is
considered here in its incremental form, representing the transition between
two successive states, and is then similar to the observation operator.
Schematically, an operator has to give an output solution for given input
parameters. Part of the input parameters can be modified during the
optimization procedure, so the mathematical representation of such a process
is a function. It was briefly described in the section :ref:`section_theory`
and is generalized here by the relation:
.. math:: \mathbf{y} = O( \mathbf{x} )
between the pseudo-observations :math:`\mathbf{y}` and the parameters
:math:`\mathbf{x}` using the observation or evolution operator :math:`O`. The
same functional representation can be used for the linear tangent model
:math:`\mathbf{O}` of :math:`O` and its adjoint :math:`\mathbf{O}^*`, also
required by some data assimilation or optimization algorithms.
On input and output of these operators, the :math:`\mathbf{x}` and
:math:`\mathbf{y}` variables, or their increments, are mathematically vectors,
and they are given as non-oriented vectors (of type list or Numpy array) or
oriented ones (of type Numpy matrix).
Then, **to describe completely an operator, the user has only to provide a
function that fully and solely realizes the functional operation**.
This function is usually given as a script that can be executed in a YACS
node. This script can equally launch external codes or use internal SALOME
calls and methods. If the algorithm requires the 3 aspects of the operator
(direct form, tangent form and adjoint form), the user has to give the 3
functions or to approximate them.
There are 3 practical methods for the user to provide an operator functional
representation. These methods are chosen in the "*FROM*" field of each
operator having a "*Function*" value as "*INPUT_TYPE*", as shown by the
following figure:
.. _eficas_operator_function:
.. image:: images/eficas_operator_function.png

**Choosing an operator functional representation**
First functional form: using "*ScriptWithOneFunction*"
++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ScriptWithOneFunction
.. index:: single: DirectOperator
.. index:: single: DifferentialIncrement
.. index:: single: CenteredFiniteDifference
The first one consists in providing only one potentially non-linear function,
and in approximating the tangent and the adjoint operators. This is done by
using the keyword "*ScriptWithOneFunction*" for the description of the chosen
operator in the ADAO GUI. The user has to provide the function in a script,
with the mandatory name "*DirectOperator*". For example, the script can follow
the template::

    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return something like Y
In this case, the user has also to provide a value for the differential
increment (or keep the default value), using through the GUI the keyword
"*DifferentialIncrement*", which has a default value of 1%. This coefficient
will be used in the finite differences approximation to build the tangent and
adjoint operators. The finite differences approximation order can also be
chosen through the GUI, using the keyword "*CenteredFiniteDifference*", with 0
for an uncentered schema of first order (which is the default value), and with
1 for a centered schema of second order (of twice the first order
computational cost).
This first operator definition form makes it easy to test the functional form
before its use in an ADAO case, greatly reducing the complexity of operator
implementation.
**Important warning:** the name "*DirectOperator*" is mandatory, and the type
of the ``X`` argument can be either a list, a Numpy array or a Numpy
1D-matrix. The user's function has to handle all these cases.
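A minimal, self-contained sketch of such a script is given below; the
component-wise cubic "simulation" is purely illustrative, the only fixed part
being the mandatory function name and the tolerance for the three possible
argument types mentioned above:

```python
import numpy

def DirectOperator( X ):
    """ Direct non-linear simulation operator (illustrative model) """
    # Accept indifferently a list, a Numpy array or a Numpy 1D-matrix
    x = numpy.ravel( numpy.array( X, dtype=float ) )
    # Hypothetical physics: a component-wise cubic response
    return x**3

# The same call works for every supported input type
print( DirectOperator( [1., 2.] ) )
print( DirectOperator( numpy.array([1., 2.]) ) )
```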
Second functional form: using "*ScriptWithFunctions*"
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ScriptWithFunctions
.. index:: single: DirectOperator
.. index:: single: TangentOperator
.. index:: single: AdjointOperator
**In general, it is recommended to use the first functional form rather than
the second one. A small performance improvement is not a good reason to use a
detailed implementation such as this second functional form.**
The second one consists in providing directly the three associated operators
:math:`O`, :math:`\mathbf{O}` and :math:`\mathbf{O}^*`. This is done by using
the keyword "*ScriptWithFunctions*" for the description of the chosen operator
in the ADAO GUI. The user has to provide three functions in one script, with
the three mandatory names "*DirectOperator*", "*TangentOperator*" and
"*AdjointOperator*". For example, the script can follow the template::

    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return something like Y

    def TangentOperator( pair ):
        """ Tangent linear operator, around X, applied to dX """
        X, dX = pair
        ...
        ...
        ...
        return something like Y

    def AdjointOperator( pair ):
        """ Adjoint operator, around X, applied to Y """
        X, Y = pair
        ...
        ...
        ...
        return something like X
Again, this second operator definition form makes it easy to test the
functional forms before their use in an ADAO case, reducing the complexity of
operator implementation.
For some algorithms, it is required that the tangent and adjoint functions be
able to return the matrix equivalent to the linear operator. In this case,
when respectively the ``dX`` or the ``Y`` argument is ``None``, the user has
to return the associated matrix.
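This convention can be sketched as follows, with a hypothetical constant
Jacobian ``A`` standing in for the linearized operator (a real case would
linearize around ``X``; the model itself is illustrative):

```python
import numpy

A = numpy.array([[1., 2.], [3., 4.]])   # hypothetical linearized model

def TangentOperator( pair ):
    """ Tangent linear operator, around X, applied to dX """
    X, dX = pair
    if dX is None:
        return A                        # matrix form asked by the algorithm
    return A.dot( numpy.ravel( numpy.array( dX, dtype=float ) ) )

def AdjointOperator( pair ):
    """ Adjoint operator, around X, applied to Y """
    X, Y = pair
    if Y is None:
        return A.T                      # matrix form asked by the algorithm
    return A.T.dot( numpy.ravel( numpy.array( Y, dtype=float ) ) )
```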
**Important warning:** the names "*DirectOperator*", "*TangentOperator*" and
"*AdjointOperator*" are mandatory, and the type of the ``X``, ``Y`` and ``dX``
arguments can be either a Python list, a Numpy array or a Numpy 1D-matrix.
The user's script has to handle all these cases.
Third functional form: using "*ScriptWithSwitch*"
+++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ScriptWithSwitch
.. index:: single: DirectOperator
.. index:: single: TangentOperator
.. index:: single: AdjointOperator
**It is recommended not to use this third functional form without a solid
numerical or physical reason. A performance improvement is not a good reason
to use the implementation complexity of this third functional form. Only an
inability to use the first or second forms justifies the use of the third.**
This third form gives more possibilities to control the execution of the three
functions representing the operator, allowing advanced usage and control over
each execution of the simulation code. This is done by using the keyword
"*ScriptWithSwitch*" for the description of the chosen operator in the ADAO
GUI. The user has to provide a switch in one script to control the execution
of the direct, tangent and adjoint forms of the simulation code. The user can
then, for example, use other approximations for the tangent and adjoint codes,
or introduce more complexity in the argument treatment of the functions. But
it will be far more complicated to implement and debug.
If, however, you want to use this third form, we recommend using the following
template for the switch. It requires an external script or code named here
"*Physical_simulation_functions.py*", containing three functions named
"*DirectOperator*", "*TangentOperator*" and "*AdjointOperator*" as previously.
Here is the switch template::
    import Physical_simulation_functions
    import numpy, logging
    #
    method = ""
    for param in computation["specificParameters"]:
        if param["name"] == "method":
            method = param["value"]
    if method not in ["Direct", "Tangent", "Adjoint"]:
        raise ValueError("No valid computation method is given")
    logging.info("Found method is \'%s\'"%method)
    #
    logging.info("Loading operator functions")
    Function = Physical_simulation_functions.DirectOperator
    Tangent  = Physical_simulation_functions.TangentOperator
    Adjoint  = Physical_simulation_functions.AdjointOperator
    #
    logging.info("Executing the possible computations")
    data = []
    if method == "Direct":
        logging.info("Direct computation")
        Xcurrent = computation["inputValues"][0][0][0]
        data = Function(numpy.matrix( Xcurrent ).T)
    if method == "Tangent":
        logging.info("Tangent computation")
        Xcurrent  = computation["inputValues"][0][0][0]
        dXcurrent = computation["inputValues"][0][0][1]
        data = Tangent((numpy.matrix(Xcurrent).T, numpy.matrix(dXcurrent).T))
    if method == "Adjoint":
        logging.info("Adjoint computation")
        Xcurrent = computation["inputValues"][0][0][0]
        Ycurrent = computation["inputValues"][0][0][1]
        data = Adjoint((numpy.matrix(Xcurrent).T, numpy.matrix(Ycurrent).T))
    #
    logging.info("Formatting the output")
    it = numpy.ravel(data)
    outputValues = [[[[]]]]
    for val in it:
        outputValues[0][0][0].append(val)
    #
    result = {}
    result["outputValues"]        = outputValues
    result["specificOutputInfos"] = []
    result["returnCode"]          = 0
    result["errorMessage"]        = ""
All kinds of modifications can be made starting from this template.
Special case of controlled evolution or observation operator
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
In some cases, the evolution or the observation operator is required to be
controlled by an external input control, given *a priori*. In this case, the
generic form of the incremental model is slightly modified as follows:
.. math:: \mathbf{y} = O( \mathbf{x}, \mathbf{u} )
where :math:`\mathbf{u}` is the control over one state increment. In this
case, the direct operator has to be applied to a pair of variables
:math:`(X,U)`. Schematically, the operator has to be set as::

    def DirectOperator( pair ):
        """ Direct non-linear simulation operator """
        X, U = pair
        ...
        ...
        ...
        return something like X(n+1) (evolution) or Y(n+1) (observation)
The tangent and adjoint operators have the same signature as previously,
noting that the derivatives have to be taken only partially with respect to
:math:`\mathbf{x}`. In such a case with explicit control, only the second
functional form (using "*ScriptWithFunctions*") and the third functional form
(using "*ScriptWithSwitch*") can be used.
Requirements to describe covariance matrices
--------------------------------------------
Multiple covariance matrices are required to implement the data assimilation
or optimization procedures. The main ones are the background error covariance
matrix, noted :math:`\mathbf{B}`, and the observation error covariance matrix,
noted :math:`\mathbf{R}`. Such a matrix is required to be a square symmetric
positive semi-definite matrix.
There are 3 practical methods for the user to provide a covariance matrix.
These methods are chosen by the "*INPUT_TYPE*" keyword of each defined
covariance matrix, as shown by the following figure:
.. _eficas_covariance_matrix:
.. image:: images/eficas_covariance_matrix.png

**Choosing covariance matrix representation**
First matrix form: using "*Matrix*" representation
++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: Matrix
.. index:: single: BackgroundError
.. index:: single: EvolutionError
.. index:: single: ObservationError
This first form is the default and most general one. The covariance matrix
:math:`\mathbf{M}` has to be fully specified. Even if the matrix is symmetric
by nature, the entire :math:`\mathbf{M}` matrix has to be given.
.. math:: \mathbf{M} = \begin{pmatrix}
    m_{11} & m_{12} & \cdots & m_{1n} \\
    m_{21} & m_{22} & \cdots & m_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    m_{n1} & m_{n2} & \cdots & m_{nn}
    \end{pmatrix}
It can be either a Python Numpy array or a matrix, or a list of lists of
values (that is, a list of rows). For example, a simple diagonal unitary
background error covariance matrix :math:`\mathbf{B}` can be described in a
Python script file as::

    BackgroundError = [[1, 0 ... 0], [0, 1 ... 0] ... [0, 0 ... 1]]

or::

    BackgroundError = numpy.eye(...)
Second matrix form: using "*ScalarSparseMatrix*" representation
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ScalarSparseMatrix
.. index:: single: BackgroundError
.. index:: single: EvolutionError
.. index:: single: ObservationError
On the opposite, this second form is a very simplified way to provide a
matrix. The covariance matrix :math:`\mathbf{M}` is supposed to be a positive
multiple of the identity matrix. This matrix can then be specified uniquely by
the multiplier :math:`m`:
.. math:: \mathbf{M} = m \times \begin{pmatrix}
    1 & 0 & \cdots & 0 \\
    0 & 1 & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & 1
    \end{pmatrix}
1298 (if it is negative, which is impossible for a positive covariance matrix, it is
1299 converted to positive value). For example, a simple diagonal unitary background
1300 error covariance matrix :math:`\mathbf{B}` can be described in a python script
1303 BackgroundError = 1.
1305 or, better, by a "*String*" directly in the ADAO case.
Third matrix form: using "*DiagonalSparseMatrix*" representation
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: DiagonalSparseMatrix
.. index:: single: BackgroundError
.. index:: single: EvolutionError
.. index:: single: ObservationError
This third form is also a simplified way to provide a matrix, but a little
more powerful than the second one. The covariance matrix :math:`\mathbf{M}` is
already supposed to be diagonal, and the user has only to specify all the
positive diagonal values. The matrix can then be specified only by a vector
:math:`\mathbf{V}`, which will be turned into a diagonal matrix:
.. math:: \mathbf{M} = \begin{pmatrix}
    v_{1} & 0 & \cdots & 0 \\
    0 & v_{2} & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & \cdots & 0 & v_{n}
    \end{pmatrix}
It can be either a Python Numpy array or a matrix, or a list, or a list of
lists of positive values (in all cases, if some are negative, which is
impossible, they are converted to positive values). For example, a simple
diagonal unitary background error covariance matrix :math:`\mathbf{B}` can be
described in a Python script file as::

    BackgroundError = [1, 1 ... 1]

or::

    BackgroundError = numpy.ones(...)
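As a closing illustration, the three representations can describe the same
unit covariance matrix; in plain Numpy terms, for a hypothetical size
``n = 3`` (the variable names are illustrative, not ADAO keywords):

```python
import numpy

n = 3
# "Matrix" form: the full n x n matrix is given
B_full     = numpy.eye(n)
# "ScalarSparseMatrix" form: a single multiplier m of the identity
m          = 1.
B_scalar   = m * numpy.eye(n)
# "DiagonalSparseMatrix" form: the vector of diagonal values
v          = numpy.ones(n)
B_diagonal = numpy.diag(v)

# All three forms build the same covariance matrix
print((B_full == B_scalar).all() and (B_full == B_diagonal).all())
```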