3 ================================================================================
4 Reference description of the ADAO commands and keywords
5 ================================================================================
7 This section presents the reference description of the ADAO commands and
8 keywords available through the GUI or through scripts.
10 Each command or keyword to be defined through the ADAO GUI has some properties.
11 The first property is to be *required*, *optional* or only factual, describing a
12 type of input. The second property is to be an "open" variable with a fixed type
13 but with any value allowed by the type, or a "restricted" variable, limited to
14 some specified values. The EFICAS editor GUI having build-in validating
15 capacities, the properties of the commands or keywords given through this GUI
16 are automatically correct.
18 The mathematical notations used afterward are explained in the section
19 :ref:`section_theory`.
21 Examples of using these commands are available in the section
22 :ref:`section_examples` and in example files installed with ADAO module.
24 List of possible input types
25 ----------------------------
27 .. index:: single: Dict
28 .. index:: single: Function
29 .. index:: single: Matrix
30 .. index:: single: ScalarSparseMatrix
31 .. index:: single: DiagonalSparseMatrix
32 .. index:: single: String
33 .. index:: single: Script
34 .. index:: single: Vector
36 Each ADAO variable has a pseudo-type to help filling it and validation. The
37 different pseudo-types are:
40 This indicates a variable that has to be filled by a Python dictionary
41 ``{"key":"value...}``, usually given either as a string or as a script file.
44 This indicates a variable that has to be filled by a Python function,
45 usually given as a script file or a component method.
48 This indicates a variable that has to be filled by a matrix, usually given
49 either as a string or as a script file.
51 **ScalarSparseMatrix**
52 This indicates a variable that has to be filled by a unique number (which
53 will be used to multiply an identity matrix), usually given either as a
54 string or as a script file.
56 **DiagonalSparseMatrix**
57 This indicates a variable that has to be filled by a vector (which will be
58 used to replace the diagonal of an identity matrix), usually given either as
59 a string or as a script file.
62 This indicates a script given as an external file. It can be described by a
63 full absolute path name or only by the file name without path. If the file
64 is given only by a file name without path, and if a study directory is also
65 indicated, the file is searched in the given directory.
68 This indicates a string giving a literal representation of a matrix, a
69 vector or a vector series, such as "1 2 ; 3 4" or "[[1,2],[3,4]]" for a
73 This indicates a variable that has to be filled by a vector, usually given
74 either as a string or as a script file.
77 This indicates a variable that has to be filled by a list of
78 vectors, usually given either as a string or as a script file.
80 When a command or keyword can be filled by a script file name, the script has to
81 contain a variable or a method that has the same name as the one to be filled.
82 In other words, when importing the script in a YACS Python node, it must create
83 a variable of the good name in the current namespace of the node.
85 Reference description for ADAO calculation cases
86 ------------------------------------------------
88 List of commands and keywords for an ADAO calculation case
89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
91 .. index:: single: ASSIMILATION_STUDY
92 .. index:: single: Algorithm
93 .. index:: single: AlgorithmParameters
94 .. index:: single: Background
95 .. index:: single: BackgroundError
96 .. index:: single: ControlInput
97 .. index:: single: Debug
98 .. index:: single: EvolutionError
99 .. index:: single: EvolutionModel
100 .. index:: single: InputVariables
101 .. index:: single: Observation
102 .. index:: single: ObservationError
103 .. index:: single: ObservationOperator
104 .. index:: single: Observer
105 .. index:: single: Observers
106 .. index:: single: Observer Template
107 .. index:: single: OutputVariables
108 .. index:: single: Study_name
109 .. index:: single: Study_repertory
110 .. index:: single: UserDataInit
111 .. index:: single: UserPostAnalysis
112 .. index:: single: UserPostAnalysis Template
114 The first set of commands is related to the description of a calculation case,
115 that is a *Data Assimilation* procedure or an *Optimization* procedure. The
116 terms are ordered in alphabetical order, except the first, which describes
117 choice between calculation or checking. The different commands are the
120 **ASSIMILATION_STUDY**
121 *Required command*. This is the general command describing the data
122 assimilation or optimization case. It hierarchically contains all the other
126 *Required command*. This is a string to indicate the data assimilation or
127 optimization algorithm chosen. The choices are limited and available through
128 the GUI. There exists for example "3DVAR", "Blue"... See below the list of
129 algorithms and associated parameters in the following subsection `Optional
130 and required commands for calculation algorithms`_.
132 **AlgorithmParameters**
133 *Optional command*. This command allows to add some optional parameters to
134 control the data assimilation or optimization algorithm. Its value is
135 defined as a "*Dict*" type object. See below the list of algorithms and
136 associated parameters in the following subsection `Optional and required
137 commands for calculation algorithms`_.
140 *Required command*. This indicates the background or initial vector used,
141 previously noted as :math:`\mathbf{x}^b`. Its value is defined as a
142 "*Vector*" type object.
145 *Required command*. This indicates the background error covariance matrix,
146 previously noted as :math:`\mathbf{B}`. Its value is defined as a "*Matrix*"
147 type object, a "*ScalarSparseMatrix*" type object, or a
148 "*DiagonalSparseMatrix*" type object.
151 *Optional command*. This indicates the control vector used to force the
152 evolution model at each step, usually noted as :math:`\mathbf{U}`. Its value
153 is defined as a "*Vector*" or a *VectorSerie* type object. When there is no
154 control, it has to be a void string ''.
157 *Optional command*. This define the level of trace and intermediary debug
158 information. The choices are limited between 0 (for False) and 1 (for
162 *Optional command*. This indicates the evolution error covariance matrix,
163 usually noted as :math:`\mathbf{Q}`. It is defined as a "*Matrix*" type
164 object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
168 *Optional command*. This indicates the evolution model operator, usually
169 noted :math:`M`, which describes an elementary step of evolution. Its value
170 is defined as a "*Function*" type object or a "*Matrix*" type one. In the
171 case of "*Function*" type, different functional forms can be used, as
172 described in the following subsection `Requirements for functions describing
173 an operator`_. If there is some control :math:`U` included in the evolution
174 model, the operator has to be applied to a pair :math:`(X,U)`.
177 *Optional command*. This command allows to indicates the name and size of
178 physical variables that are bundled together in the state vector. This
179 information is dedicated to data processed inside an algorithm.
182 *Required command*. This indicates the observation vector used for data
183 assimilation or optimization, previously noted as :math:`\mathbf{y}^o`. It
184 is defined as a "*Vector*" or a *VectorSerie* type object.
187 *Required command*. This indicates the observation error covariance matrix,
188 previously noted as :math:`\mathbf{R}`. It is defined as a "*Matrix*" type
189 object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
192 **ObservationOperator**
193 *Required command*. This indicates the observation operator, previously
194 noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}` to
195 results :math:`\mathbf{y}` to be compared to observations
196 :math:`\mathbf{y}^o`. Its value is defined as a "*Function*" type object or
197 a "*Matrix*" type one. In the case of "*Function*" type, different
198 functional forms can be used, as described in the following subsection
199 `Requirements for functions describing an operator`_. If there is some
200 control :math:`U` included in the observation, the operator has to be
201 applied to a pair :math:`(X,U)`.
204 *Optional command*. This command allows to set internal observers, that are
205 functions linked with a particular variable, which will be executed each
206 time this variable is modified. It is a convenient way to monitor variables
207 of interest during the data assimilation or optimization process, by
208 printing or plotting it, etc. Common templates are provided to help the user
209 to start or to quickly make his case.
212 *Optional command*. This command allows to indicates the name and size of
213 physical variables that are bundled together in the output observation
214 vector. This information is dedicated to data processed inside an algorithm.
217 *Required command*. This is an open string to describe the ADAO study by a
221 *Optional command*. If available, this directory is used as base name for
222 calculation, and used to find all the script files, given by name without
223 path, that can be used to define some other commands by scripts.
226 *Optional command*. This commands allows to initialize some parameters or
227 data automatically before data assimilation or optimization algorithm input
228 processing. It indicates a script file name to be executed before entering
229 in initialization phase of chosen variables.
232 *Optional command*. This commands allows to process some parameters or data
233 automatically after data assimilation or optimization algorithm processing.
234 Its value is defined as a script file or a string, allowing to put
235 post-processing code directly inside the ADAO case. Common templates are
236 provided to help the user to start or to quickly make his case.
238 Optional and required commands for calculation algorithms
239 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
241 .. index:: single: 3DVAR
242 .. index:: single: Blue
243 .. index:: single: ExtendedBlue
244 .. index:: single: EnsembleBlue
245 .. index:: single: KalmanFilter
246 .. index:: single: ExtendedKalmanFilter
247 .. index:: single: UnscentedKalmanFilter
248 .. index:: single: LinearLeastSquares
249 .. index:: single: NonLinearLeastSquares
250 .. index:: single: ParticleSwarmOptimization
251 .. index:: single: QuantileRegression
253 .. index:: single: AlgorithmParameters
254 .. index:: single: Bounds
255 .. index:: single: CostDecrementTolerance
256 .. index:: single: GradientNormTolerance
257 .. index:: single: GroupRecallRate
258 .. index:: single: MaximumNumberOfSteps
259 .. index:: single: Minimizer
260 .. index:: single: NumberOfInsects
261 .. index:: single: ProjectedGradientTolerance
262 .. index:: single: QualityCriterion
263 .. index:: single: Quantile
264 .. index:: single: SetSeed
265 .. index:: single: StoreInternalVariables
266 .. index:: single: StoreSupplementaryCalculations
267 .. index:: single: SwarmVelocity
269 Each algorithm can be controlled using some generic or specific options, given
270 through the "*AlgorithmParameters*" optional command in a script file or a
271 string, as follows for example in a file::
273 AlgorithmParameters = {
274 "Minimizer" : "LBFGSB",
275 "MaximumNumberOfSteps" : 25,
276 "StoreSupplementaryCalculations" : ["APosterioriCovariance","OMA"],
279 To give the "*AlgorithmParameters*" values by string, one must enclose a
280 standard dictionary definition between simple quotes, as for example::
282 '{"Minimizer":"LBFGSB","MaximumNumberOfSteps":25}'
284 This section describes the available options algorithm by algorithm. In
285 addition, for each algorithm, the required commands/keywords are given, being
286 described in `List of commands and keywords for an ADAO calculation case`_. If
287 an option is specified by the user for an algorithm that doesn't support it, the
288 option is simply left unused and don't stop the treatment. The meaning of the
289 acronyms or particular names can be found in the :ref:`genindex` or the
290 :ref:`section_glossary`.
295 *"Background", "BackgroundError",
296 "Observation", "ObservationError",
297 "ObservationOperator"*
299 StoreInternalVariables
300 This Boolean key allows to store default internal variables, mainly the
301 current state during iterative optimization process. Be careful, this can be
302 a numerically costly choice in certain calculation cases. The default is
305 StoreSupplementaryCalculations
306 This list indicates the names of the supplementary variables that can be
307 available at the end of the algorithm. It involves potentially costly
308 calculations. The default is a void list, none of these variables being
309 calculated and stored by default. The possible names are in the following
310 list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
311 "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].
316 *"Background", "BackgroundError",
317 "Observation", "ObservationError",
318 "ObservationOperator"*
320 StoreInternalVariables
321 This Boolean key allows to store default internal variables, mainly the
322 current state during iterative optimization process. Be careful, this can be
323 a numerically costly choice in certain calculation cases. The default is
326 StoreSupplementaryCalculations
327 This list indicates the names of the supplementary variables that can be
328 available at the end of the algorithm. It involves potentially costly
329 calculations. The default is a void list, none of these variables being
330 calculated and stored by default. The possible names are in the following
331 list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
332 "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].
334 **"LinearLeastSquares"**
337 *"Observation", "ObservationError",
338 "ObservationOperator"*
340 StoreInternalVariables
341 This Boolean key allows to store default internal variables, mainly the
342 current state during iterative optimization process. Be careful, this can be
343 a numerically costly choice in certain calculation cases. The default is
346 StoreSupplementaryCalculations
347 This list indicates the names of the supplementary variables that can be
348 available at the end of the algorithm. It involves potentially costly
349 calculations. The default is a void list, none of these variables being
350 calculated and stored by default. The possible names are in the following
356 *"Background", "BackgroundError",
357 "Observation", "ObservationError",
358 "ObservationOperator"*
361 This key allows to choose the optimization minimizer. The default choice is
362 "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
363 minimizer, see [Byrd95]_, [Morales11]_ and [Zhu97]_), "TNC" (nonlinear
364 constrained minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS"
365 (nonlinear unconstrained minimizer), "NCG" (Newton CG minimizer). It is
366 strongly recommended to stay with the default.
369 This key allows to define upper and lower bounds for every state variable
370 being optimized. Bounds can be given by a list of list of pairs of
371 lower/upper bounds for each variable, with possibly ``None`` every time
372 there is no bound. The bounds can always be specified, but they are taken
373 into account only by the constrained optimizers.
376 This key indicates the maximum number of iterations allowed for iterative
377 optimization. The default is 15000, which is very similar to no limit on
378 iterations. It is then recommended to adapt this parameter to the needs on
379 real problems. For some optimizers, the effective stopping step can be
380 slightly different of the limit due to algorithm internal control
383 CostDecrementTolerance
384 This key indicates a limit value, leading to stop successfully the
385 iterative optimization process when the cost function decreases less than
386 this tolerance at the last step. The default is 1.e-7, and it is
387 recommended to adapt it to the needs on real problems.
389 ProjectedGradientTolerance
390 This key indicates a limit value, leading to stop successfully the iterative
391 optimization process when all the components of the projected gradient are
392 under this limit. It is only used for constrained optimizers. The default is
393 -1, that is the internal default of each minimizer (generally 1.e-5), and it
394 is not recommended to change it.
396 GradientNormTolerance
397 This key indicates a limit value, leading to stop successfully the
398 iterative optimization process when the norm of the gradient is under this
399 limit. It is only used for non-constrained optimizers. The default is
400 1.e-5 and it is not recommended to change it.
402 StoreInternalVariables
403 This Boolean key allows to store default internal variables, mainly the
404 current state during iterative optimization process. Be careful, this can be
405 a numerically costly choice in certain calculation cases. The default is
408 StoreSupplementaryCalculations
409 This list indicates the names of the supplementary variables that can be
410 available at the end of the algorithm. It involves potentially costly
411 calculations. The default is a void list, none of these variables being
412 calculated and stored by default. The possible names are in the following
413 list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
414 "SigmaObs2", "MahalanobisConsistency"].
416 **"NonLinearLeastSquares"**
420 "Observation", "ObservationError",
421 "ObservationOperator"*
424 This key allows to choose the optimization minimizer. The default choice is
425 "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
426 minimizer, see [Byrd95]_, [Morales11]_ and [Zhu97]_), "TNC" (nonlinear
427 constrained minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS"
428 (nonlinear unconstrained minimizer), "NCG" (Newton CG minimizer). It is
429 strongly recommended to stay with the default.
432 This key allows to define upper and lower bounds for every state variable
433 being optimized. Bounds can be given by a list of list of pairs of
434 lower/upper bounds for each variable, with possibly ``None`` every time
435 there is no bound. The bounds can always be specified, but they are taken
436 into account only by the constrained optimizers.
439 This key indicates the maximum number of iterations allowed for iterative
440 optimization. The default is 15000, which is very similar to no limit on
441 iterations. It is then recommended to adapt this parameter to the needs on
442 real problems. For some optimizers, the effective stopping step can be
443 slightly different due to algorithm internal control requirements.
445 CostDecrementTolerance
446 This key indicates a limit value, leading to stop successfully the
447 iterative optimization process when the cost function decreases less than
448 this tolerance at the last step. The default is 1.e-7, and it is
449 recommended to adapt it to the needs on real problems.
451 ProjectedGradientTolerance
452 This key indicates a limit value, leading to stop successfully the iterative
453 optimization process when all the components of the projected gradient are
454 under this limit. It is only used for constrained optimizers. The default is
455 -1, that is the internal default of each minimizer (generally 1.e-5), and it
456 is not recommended to change it.
458 GradientNormTolerance
459 This key indicates a limit value, leading to stop successfully the
460 iterative optimization process when the norm of the gradient is under this
461 limit. It is only used for non-constrained optimizers. The default is
462 1.e-5 and it is not recommended to change it.
464 StoreInternalVariables
465 This Boolean key allows to store default internal variables, mainly the
466 current state during iterative optimization process. Be careful, this can be
467 a numerically costly choice in certain calculation cases. The default is
470 StoreSupplementaryCalculations
471 This list indicates the names of the supplementary variables that can be
472 available at the end of the algorithm. It involves potentially costly
473 calculations. The default is a void list, none of these variables being
474 calculated and stored by default. The possible names are in the following
475 list: ["BMA", "OMA", "OMB", "Innovation"].
480 *"Background", "BackgroundError",
481 "Observation", "ObservationError",
482 "ObservationOperator"*
485 This key allow to give an integer in order to fix the seed of the random
486 generator used to generate the ensemble. A convenient value is for example
487 1000. By default, the seed is left uninitialized, and so use the default
488 initialization from the computer.
493 *"Background", "BackgroundError",
494 "Observation", "ObservationError",
495 "ObservationOperator"*
498 This key allows to choose the type of estimation to be performed. It can be
499 either state-estimation, with a value of "State", or parameter-estimation,
500 with a value of "Parameters". The default choice is "State".
502 StoreInternalVariables
503 This Boolean key allows to store default internal variables, mainly the
504 current state during iterative optimization process. Be careful, this can be
505 a numerically costly choice in certain calculation cases. The default is
508 StoreSupplementaryCalculations
509 This list indicates the names of the supplementary variables that can be
510 available at the end of the algorithm. It involves potentially costly
511 calculations. The default is a void list, none of these variables being
512 calculated and stored by default. The possible names are in the following
513 list: ["APosterioriCovariance", "BMA", "Innovation"].
515 **"ExtendedKalmanFilter"**
518 *"Background", "BackgroundError",
519 "Observation", "ObservationError",
520 "ObservationOperator"*
523 This key allows to define upper and lower bounds for every state variable
524 being optimized. Bounds can be given by a list of list of pairs of
525 lower/upper bounds for each variable, with extreme values every time there
526 is no bound. The bounds can always be specified, but they are taken into
527 account only by the constrained optimizers.
530 This key allows to define the method to take bounds into account. The
531 possible methods are in the following list: ["EstimateProjection"].
534 This key allows to choose the type of estimation to be performed. It can be
535 either state-estimation, with a value of "State", or parameter-estimation,
536 with a value of "Parameters". The default choice is "State".
538 StoreInternalVariables
539 This Boolean key allows to store default internal variables, mainly the
540 current state during iterative optimization process. Be careful, this can be
541 a numerically costly choice in certain calculation cases. The default is
544 StoreSupplementaryCalculations
545 This list indicates the names of the supplementary variables that can be
546 available at the end of the algorithm. It involves potentially costly
547 calculations. The default is a void list, none of these variables being
548 calculated and stored by default. The possible names are in the following
549 list: ["APosterioriCovariance", "BMA", "Innovation"].
551 **"UnscentedKalmanFilter"**
554 *"Background", "BackgroundError",
555 "Observation", "ObservationError",
556 "ObservationOperator"*
559 This key allows to define upper and lower bounds for every state variable
560 being optimized. Bounds can be given by a list of list of pairs of
561 lower/upper bounds for each variable, with extreme values every time there
562 is no bound. The bounds can always be specified, but they are taken into
563 account only by the constrained optimizers.
566 This key allows to define the method to take bounds into account. The
567 possible methods are in the following list: ["EstimateProjection"].
570 This key allows to choose the type of estimation to be performed. It can be
571 either state-estimation, with a value of "State", or parameter-estimation,
572 with a value of "Parameters". The default choice is "State".
574 Alpha, Beta, Kappa, Reconditioner
575 These keys are internal scaling parameters. "Alpha" requires a value between
576 1.e-4 and 1. "Beta" has an optimal value of 2 for Gaussian *a priori*
577 distribution. "Kappa" requires an integer value, and the right default is
578 obtained by setting it to 0. "Reconditioner" requires a value between 1.e-3
579 and 10, it defaults to 1.
581 StoreInternalVariables
582 This Boolean key allows to store default internal variables, mainly the
583 current state during iterative optimization process. Be careful, this can be
584 a numerically costly choice in certain calculation cases. The default is
587 StoreSupplementaryCalculations
588 This list indicates the names of the supplementary variables that can be
589 available at the end of the algorithm. It involves potentially costly
590 calculations. The default is a void list, none of these variables being
591 calculated and stored by default. The possible names are in the following
592 list: ["APosterioriCovariance", "BMA", "Innovation"].
594 **"ParticleSwarmOptimization"**
597 *"Background", "BackgroundError",
598 "Observation", "ObservationError",
599 "ObservationOperator"*
602 This key indicates the maximum number of iterations allowed for iterative
603 optimization. The default is 50, which is an arbitrary limit. It is then
604 recommended to adapt this parameter to the needs on real problems.
607 This key indicates the number of insects or particles in the swarm. The
608 default is 100, which is a usual default for this algorithm.
611 This key indicates the part of the insect velocity which is imposed by the
612 swarm. It is a positive floating point value. The default value is 1.
615 This key indicates the recall rate at the best swarm insect. It is a
616 floating point value between 0 and 1. The default value is 0.5.
619 This key indicates the quality criterion, minimized to find the optimal
620 state estimate. The default is the usual data assimilation criterion named
621 "DA", the augmented weighted least squares. The possible criteria has to
622 be in the following list, where the equivalent names are indicated by "=":
623 ["AugmentedPonderatedLeastSquares"="APLS"="DA",
624 "PonderatedLeastSquares"="PLS", "LeastSquares"="LS"="L2",
625 "AbsoluteValue"="L1", "MaximumError"="ME"]
628 This key allow to give an integer in order to fix the seed of the random
629 generator used to generate the ensemble. A convenient value is for example
630 1000. By default, the seed is left uninitialized, and so use the default
631 initialization from the computer.
633 StoreInternalVariables
634 This Boolean key allows to store default internal variables, mainly the
635 current state during iterative optimization process. Be careful, this can be
636 a numerically costly choice in certain calculation cases. The default is
639 StoreSupplementaryCalculations
640 This list indicates the names of the supplementary variables that can be
641 available at the end of the algorithm. It involves potentially costly
642 calculations. The default is a void list, none of these variables being
643 calculated and stored by default. The possible names are in the following
644 list: ["BMA", "OMA", "OMB", "Innovation"].
646 **"QuantileRegression"**
651 "ObservationOperator"*
654 This key allows to define the real value of the desired quantile, between
655 0 and 1. The default is 0.5, corresponding to the median.
658 This key allows to choose the optimization minimizer. The default choice
659 and only available choice is "MMQR" (Majorize-Minimize for Quantile
663 This key indicates the maximum number of iterations allowed for iterative
664 optimization. The default is 15000, which is very similar to no limit on
665 iterations. It is then recommended to adapt this parameter to the needs on
668 CostDecrementTolerance
669 This key indicates a limit value, leading to stop successfully the
670 iterative optimization process when the cost function or the surrogate
671 decreases less than this tolerance at the last step. The default is 1.e-6,
672 and it is recommended to adapt it to the needs on real problems.
674 StoreInternalVariables
675 This Boolean key allows to store default internal variables, mainly the
676 current state during iterative optimization process. Be careful, this can be
677 a numerically costly choice in certain calculation cases. The default is
680 StoreSupplementaryCalculations
681 This list indicates the names of the supplementary variables that can be
682 available at the end of the algorithm. It involves potentially costly
683 calculations. The default is a void list, none of these variables being
684 calculated and stored by default. The possible names are in the following
685 list: ["BMA", "OMA", "OMB", "Innovation"].
687 Reference description for ADAO checking cases
688 ---------------------------------------------
690 List of commands and keywords for an ADAO checking case
691 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
693 .. index:: single: CHECKING_STUDY
694 .. index:: single: Algorithm
695 .. index:: single: AlgorithmParameters
696 .. index:: single: CheckingPoint
697 .. index:: single: Debug
698 .. index:: single: ObservationOperator
699 .. index:: single: Study_name
700 .. index:: single: Study_repertory
701 .. index:: single: UserDataInit
703 The second set of commands is related to the description of a checking case,
704 that is a procedure to check required properties on information somewhere else
705 by a calculation case. The terms are ordered in alphabetical order, except the
706 first, which describes choice between calculation or checking. The different
707 commands are the following:
710 *Required command*. This is the general command describing the checking
711 case. It hierarchically contains all the other commands.
714 *Required command*. This is a string to indicate the test algorithm chosen.
715 The choices are limited and available through the GUI. There exists for
716 example "FunctionTest", "AdjointTest"... See below the list of algorithms
717 and associated parameters in the following subsection `Optional and required
718 commands for checking algorithms`_.
720 **AlgorithmParameters** *Optional command*. This command allows to add some
721 optional parameters to control the data assimilation or optimization
722 algorithm. It is defined as a "*Dict*" type object, that is, given as a
723 script. See below the list of algorithms and associated parameters in the
724 following subsection `Optional and required commands for checking
728 *Required command*. This indicates the vector used as the state around which
729 to perform the required check, noted :math:`\mathbf{x}` and similar to the
730 background :math:`\mathbf{x}^b`. It is defined as a "*Vector*" type object.
733 *Optional command*. This define the level of trace and intermediary debug
734 information. The choices are limited between 0 (for False) and 1 (for
737 **ObservationOperator**
738 *Required command*. This indicates the observation operator, previously
739 noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}` to
740 results :math:`\mathbf{y}` to be compared to observations
741 :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object. Different
742 functional forms can be used, as described in the following subsection
743 `Requirements for functions describing an operator`_. If there is some
744 control :math:`U` included in the observation, the operator has to be
745 applied to a pair :math:`(X,U)`.
748 *Optional command*. This command allows to set internal observers, that are
749 functions linked with a particular variable, which will be executed each
750 time this variable is modified. It is a convenient way to monitor variables
751 of interest during the data assimilation or optimization process, by
752 printing or plotting it, etc. Common templates are provided to help the user
753 to start or to quickly make his case.
756 *Required command*. This is an open string to describe the study by a name
760 *Optional command*. If available, this directory is used as base name for
761 calculation, and used to find all the script files, given by name without
762 path, that can be used to define some other commands by scripts.
765 *Optional command*. This commands allows to initialize some parameters or
766 data automatically before data assimilation algorithm processing.
768 Optional and required commands for checking algorithms
769 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
771 .. index:: single: AdjointTest
772 .. index:: single: FunctionTest
773 .. index:: single: GradientTest
774 .. index:: single: LinearityTest
775 .. index:: single: ObserverTest
776 .. index:: single: TangentTest
778 .. index:: single: AlgorithmParameters
779 .. index:: single: AmplitudeOfInitialDirection
780 .. index:: single: EpsilonMinimumExponent
781 .. index:: single: InitialDirection
782 .. index:: single: ResiduFormula
783 .. index:: single: SetSeed
785 We recall that each algorithm can be controlled using some generic or specific
786 options, given through the "*AlgorithmParameters*" optional command, as follows
789 AlgorithmParameters = {
790 "AmplitudeOfInitialDirection" : 1,
791 "EpsilonMinimumExponent" : -8,
794 If an option is specified by the user for an algorithm that doesn't support it,
795 the option is simply left unused and don't stop the treatment. The meaning of
796 the acronyms or particular names can be found in the :ref:`genindex` or the
797 :ref:`section_glossary`. In addition, for each algorithm, the required
798 commands/keywords are given, being described in `List of commands and keywords
799 for an ADAO checking case`_.
805 "ObservationOperator"*
807 AmplitudeOfInitialDirection
808 This key indicates the scaling of the initial perturbation build as a vector
809 used for the directional derivative around the nominal checking point. The
810 default is 1, that means no scaling.
812 EpsilonMinimumExponent
813 This key indicates the minimal exponent value of the power of 10 coefficient
814 to be used to decrease the increment multiplier. The default is -8, and it
815 has to be between 0 and -20. For example, its default value leads to
816 calculate the residue of the formula with a fixed increment multiplied from
820 This key indicates the vector direction used for the directional derivative
821 around the nominal checking point. It has to be a vector. If not specified,
822 this direction defaults to a random perturbation around zero of the same
823 vector size than the checking point.
826 This key allow to give an integer in order to fix the seed of the random
827 generator used to generate the ensemble. A convenient value is for example
828 1000. By default, the seed is left uninitialized, and so use the default
829 initialization from the computer.
835 "ObservationOperator"*
837 NumberOfPrintedDigits
838 This key indicates the number of digits of precision for floating point
839 printed output. The default is 5, with a minimum of 0.
842 This key indicates the number of time to repeat the function evaluation. The
846 This key requires the activation, or not, of the debug mode during the
847 function evaluation. The default is "True", the choices are "True" or
854 "ObservationOperator"*
856 AmplitudeOfInitialDirection
857 This key indicates the scaling of the initial perturbation build as a vector
858 used for the directional derivative around the nominal checking point. The
859 default is 1, that means no scaling.
861 EpsilonMinimumExponent
862 This key indicates the minimal exponent value of the power of 10 coefficient
863 to be used to decrease the increment multiplier. The default is -8, and it
864 has to be between 0 and -20. For example, its default value leads to
865 calculate the residue of the scalar product formula with a fixed increment
866 multiplied from 1.e0 to 1.e-8.
869 This key indicates the vector direction used for the directional derivative
870 around the nominal checking point. It has to be a vector. If not specified,
871 this direction defaults to a random perturbation around zero of the same
872 vector size than the checking point.
875 This key indicates the residue formula that has to be used for the test. The
876 default choice is "Taylor", and the possible ones are "Taylor" (residue of
877 the Taylor development of the operator, which has to decrease with the
878 square power of the perturbation) and "Norm" (residue obtained by taking the
879 norm of the Taylor development at zero order approximation, which
880 approximate the gradient, and which has to remain constant).
883 This key allow to give an integer in order to fix the seed of the random
884 generator used to generate the ensemble. A convenient value is for example
885 1000. By default, the seed is left uninitialized, and so use the default
886 initialization from the computer.
892 "ObservationOperator"*
894 AmplitudeOfInitialDirection
895 This key indicates the scaling of the initial perturbation build as a vector
896 used for the directional derivative around the nominal checking point. The
897 default is 1, that means no scaling.
899 EpsilonMinimumExponent
900 This key indicates the minimal exponent value of the power of 10 coefficient
901 to be used to decrease the increment multiplier. The default is -8, and it
902 has to be between 0 and -20. For example, its default value leads to
903 calculate the residue of the scalar product formula with a fixed increment
904 multiplied from 1.e0 to 1.e-8.
907 This key indicates the vector direction used for the directional derivative
908 around the nominal checking point. It has to be a vector. If not specified,
909 this direction defaults to a random perturbation around zero of the same
910 vector size than the checking point.
913 This key indicates the residue formula that has to be used for the test. The
914 default choice is "CenteredDL", and the possible ones are "CenteredDL"
915 (residue of the difference between the function at nominal point and the
916 values with positive and negative increments, which has to stay very small),
917 "Taylor" (residue of the Taylor development of the operator normalized by
918 the nominal value, which has to stay very small), "NominalTaylor" (residue
919 of the order 1 approximations of the operator, normalized to the nominal
920 point, which has to stay close to 1), and "NominalTaylorRMS" (residue of the
921 order 1 approximations of the operator, normalized by RMS to the nominal
922 point, which has to stay close to 0).
925 This key allow to give an integer in order to fix the seed of the random
926 generator used to generate the ensemble. A convenient value is for example
927 1000. By default, the seed is left uninitialized, and so use the default
928 initialization from the computer.
935 *Tip for this command:*
936 Because *"CheckingPoint"* and *"ObservationOperator"* are required commands
937 for ALL checking algorithms in the interface, you have to provide a value
938 for them, despite the fact that these commands are not required for
939 *"ObserverTest"*, and will not be used. The easiest way is to give "1" as a
940 STRING for both, *"ObservationOperator"* having to be of type *Matrix*.
946 "ObservationOperator"*
948 AmplitudeOfInitialDirection
949 This key indicates the scaling of the initial perturbation build as a vector
950 used for the directional derivative around the nominal checking point. The
951 default is 1, that means no scaling.
953 EpsilonMinimumExponent
954 This key indicates the minimal exponent value of the power of 10 coefficient
955 to be used to decrease the increment multiplier. The default is -8, and it
956 has to be between 0 and -20. For example, its default value leads to
957 calculate the residue of the scalar product formula with a fixed increment
958 multiplied from 1.e0 to 1.e-8.
961 This key indicates the vector direction used for the directional derivative
962 around the nominal checking point. It has to be a vector. If not specified,
963 this direction defaults to a random perturbation around zero of the same
964 vector size than the checking point.
967 This key allow to give an integer in order to fix the seed of the random
968 generator used to generate the ensemble. A convenient value is for example
969 1000. By default, the seed is left uninitialized, and so use the default
970 initialization from the computer.
972 Requirements for functions describing an operator
973 -------------------------------------------------
975 The operators for observation and evolution are required to implement the data
976 assimilation or optimization procedures. They include the physical simulation by
977 numerical calculations, but also the filtering and restriction to compare the
978 simulation to observation. The evolution operator is considered here in its
979 incremental form, representing the transition between two successive states, and
980 is then similar to the observation operator.
982 Schematically, an operator has to give a output solution given the input
983 parameters. Part of the input parameters can be modified during the optimization
984 procedure. So the mathematical representation of such a process is a function.
985 It was briefly described in the section :ref:`section_theory` and is generalized
986 here by the relation:
988 .. math:: \mathbf{y} = O( \mathbf{x} )
990 between the pseudo-observations :math:`\mathbf{y}` and the parameters
991 :math:`\mathbf{x}` using the observation or evolution operator :math:`O`. The
992 same functional representation can be used for the linear tangent model
993 :math:`\mathbf{O}` of :math:`O` and its adjoint :math:`\mathbf{O}^*`, also
994 required by some data assimilation or optimization algorithms.
996 On input and output of these operators, the :math:`\mathbf{x}` and
997 :math:`\mathbf{y}` variables or their increments are mathematically vectors,
998 and they are given as non-oriented vectors (of type list or Numpy array) or
999 oriented ones (of type Numpy matrix).
1001 Then, **to describe completely an operator, the user has only to provide a
1002 function that fully and only realize the functional operation**.
1004 This function is usually given as a script that can be executed in a YACS node.
1005 This script can without difference launch external codes or use internal SALOME
1006 calls and methods. If the algorithm requires the 3 aspects of the operator
1007 (direct form, tangent form and adjoint form), the user has to give the 3
1008 functions or to approximate them.
1010 There are 3 practical methods for the user to provide an operator functional
1011 representation. These methods are chosen in the "*FROM*" field of each operator
1012 having a "*Function*" value as "*INPUT_TYPE*", as shown by the following figure:
1014 .. eficas_operator_function:
1015 .. image:: images/eficas_operator_function.png
1019 **Choosing an operator functional representation**
1021 First functional form: using "*ScriptWithOneFunction*"
1022 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1024 .. index:: single: ScriptWithOneFunction
1025 .. index:: single: DirectOperator
1026 .. index:: single: DifferentialIncrement
1027 .. index:: single: CenteredFiniteDifference
1029 The first one consist in providing only one potentially non-linear function, and
1030 to approximate the tangent and the adjoint operators. This is done by using the
1031 keyword "*ScriptWithOneFunction*" for the description of the chosen operator in
1032 the ADAO GUI. The user have to provide the function in a script, with a
1033 mandatory name "*DirectOperator*". For example, the script can follow the
1036 def DirectOperator( X ):
1037 """ Direct non-linear simulation operator """
1043 In this case, the user has also provide a value for the differential increment
1044 (or keep the default value), using through the GUI the keyword
1045 "*DifferentialIncrement*", which has a default value of 1%. This coefficient
1046 will be used in the finite differences approximation to build the tangent and
1047 adjoint operators. The finite differences approximation order can also be chosen
1048 through the GUI, using the keyword "*CenteredFiniteDifference*", with 0 for an
1049 uncentered schema of first order (which is the default value), and with 1 for a
1050 centered schema of second order (of twice the first order computational cost).
1052 This first operator definition form allows easily to test the functional form
1053 before its use in an ADAO case, greatly reducing the complexity of
1054 operator implementation.
1056 **Important warning:** the name "*DirectOperator*" is mandatory, and the type of
1057 the ``X`` argument can be either a list, a numpy array or a numpy 1D-matrix. The
1058 user has to treat these cases in his function.
1060 Second functional form: using "*ScriptWithFunctions*"
1061 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1063 .. index:: single: ScriptWithFunctions
1064 .. index:: single: DirectOperator
1065 .. index:: single: TangentOperator
1066 .. index:: single: AdjointOperator
1068 **In general, it is recommended to use the first functional form rather than
1069 the second one. A small performance improvement is not a good reason to use a
1070 detailed implementation as this second functional form.**
1072 The second one consist in providing directly the three associated operators
1073 :math:`O`, :math:`\mathbf{O}` and :math:`\mathbf{O}^*`. This is done by using
1074 the keyword "*ScriptWithFunctions*" for the description of the chosen operator
1075 in the ADAO GUI. The user have to provide three functions in one script, with
1076 three mandatory names "*DirectOperator*", "*TangentOperator*" and
1077 "*AdjointOperator*". For example, the script can follow the template::
1079 def DirectOperator( X ):
1080 """ Direct non-linear simulation operator """
1084 return something like Y
1086 def TangentOperator( (X, dX) ):
1087 """ Tangent linear operator, around X, applied to dX """
1091 return something like Y
1093 def AdjointOperator( (X, Y) ):
1094 """ Adjoint operator, around X, applied to Y """
1098 return something like X
1100 Another time, this second operator definition allow easily to test the
1101 functional forms before their use in an ADAO case, reducing the complexity of
1102 operator implementation.
1104 For some algorithms, it is required that the tangent and adjoint functions can
1105 return the matrix equivalent to the linear operator. In this case, when
1106 respectively the ``dX`` or the ``Y`` arguments are ``None``, the user has to
1107 return the associated matrix.
1109 **Important warning:** the names "*DirectOperator*", "*TangentOperator*" and
1110 "*AdjointOperator*" are mandatory, and the type of the ``X``, Y``, ``dX``
1111 arguments can be either a python list, a numpy array or a numpy 1D-matrix. The
1112 user has to treat these cases in his script.
1114 Third functional form: using "*ScriptWithSwitch*"
1115 +++++++++++++++++++++++++++++++++++++++++++++++++
1117 .. index:: single: ScriptWithSwitch
1118 .. index:: single: DirectOperator
1119 .. index:: single: TangentOperator
1120 .. index:: single: AdjointOperator
1122 **It is recommended not to use this third functional form without a solid
1123 numerical or physical reason. A performance improvement is not a good reason to
1124 use the implementation complexity of this third functional form. Only an
1125 inability to use the first or second forms justifies the use of the third.**
1127 This third form give more possibilities to control the execution of the three
1128 functions representing the operator, allowing advanced usage and control over
1129 each execution of the simulation code. This is done by using the keyword
1130 "*ScriptWithSwitch*" for the description of the chosen operator in the ADAO GUI.
1131 The user have to provide a switch in one script to control the execution of the
1132 direct, tangent and adjoint forms of its simulation code. The user can then, for
1133 example, use other approximations for the tangent and adjoint codes, or
1134 introduce more complexity in the argument treatment of the functions. But it
1135 will be far more complicated to implement and debug.
1137 If, however, you want to use this third form, we recommend using the following
1138 template for the switch. It requires an external script or code named here
1139 "*Physical_simulation_functions.py*", containing three functions named
1140 "*DirectOperator*", "*TangentOperator*" and "*AdjointOperator*" as previously.
1141 Here is the switch template::
1143 import Physical_simulation_functions
1144 import numpy, logging
1147 for param in computation["specificParameters"]:
1148 if param["name"] == "method":
1149 method = param["value"]
1150 if method not in ["Direct", "Tangent", "Adjoint"]:
1151 raise ValueError("No valid computation method is given")
1152 logging.info("Found method is \'%s\'"%method)
1154 logging.info("Loading operator functions")
1155 Function = Physical_simulation_functions.DirectOperator
1156 Tangent = Physical_simulation_functions.TangentOperator
1157 Adjoint = Physical_simulation_functions.AdjointOperator
1159 logging.info("Executing the possible computations")
1161 if method == "Direct":
1162 logging.info("Direct computation")
1163 Xcurrent = computation["inputValues"][0][0][0]
1164 data = Function(numpy.matrix( Xcurrent ).T)
1165 if method == "Tangent":
1166 logging.info("Tangent computation")
1167 Xcurrent = computation["inputValues"][0][0][0]
1168 dXcurrent = computation["inputValues"][0][0][1]
1169 data = Tangent(numpy.matrix(Xcurrent).T, numpy.matrix(dXcurrent).T)
1170 if method == "Adjoint":
1171 logging.info("Adjoint computation")
1172 Xcurrent = computation["inputValues"][0][0][0]
1173 Ycurrent = computation["inputValues"][0][0][1]
1174 data = Adjoint((numpy.matrix(Xcurrent).T, numpy.matrix(Ycurrent).T))
1176 logging.info("Formatting the output")
1177 it = numpy.ravel(data)
1178 outputValues = [[[[]]]]
1180 outputValues[0][0][0].append(val)
1183 result["outputValues"] = outputValues
1184 result["specificOutputInfos"] = []
1185 result["returnCode"] = 0
1186 result["errorMessage"] = ""
1188 All various modifications could be done from this template hypothesis.
1190 Special case of controled evolution or observation operator
1191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1193 In some cases, the evolution or the observation operator is required to be
1194 controlled by an external input control, given *a priori*. In this case, the
1195 generic form of the incremental model is slightly modified as follows:
1197 .. math:: \mathbf{y} = O( \mathbf{x}, \mathbf{u})
1199 where :math:`\mathbf{u}` is the control over one state increment. In this case,
1200 the direct operator has to be applied to a pair of variables :math:`(X,U)`.
1201 Schematically, the operator has to be set as::
1203 def DirectOperator( (X, U) ):
1204 """ Direct non-linear simulation operator """
1208 return something like X(n+1) (evolution) or Y(n+1) (observation)
1210 The tangent and adjoint operators have the same signature as previously, noting
1211 that the derivatives has to be done only partially against :math:`\mathbf{x}`.
1212 In such a case with explicit control, only the second functional form (using
1213 "*ScriptWithFunctions*") and third functional form (using "*ScriptWithSwitch*")
1216 Requirements to describe covariance matrices
1217 --------------------------------------------
1219 Multiple covariance matrices are required to implement the data assimilation or
1220 optimization procedures. The main ones are the background error covariance
1221 matrix, noted as :math:`\mathbf{B}`, and the observation error covariance matrix,
1222 noted as :math:`\mathbf{R}`. Such a matrix is required to be a squared symetric
1223 semi-definite positive matrix.
1225 There are 3 practical methods for the user to provide a covariance matrix. These
1226 methods are chosen by the "*INPUT_TYPE*" keyword of each defined covariance
1227 matrix, as shown by the following figure:
1229 .. eficas_covariance_matrix:
1230 .. image:: images/eficas_covariance_matrix.png
1234 **Choosing covariance matrix representation**
1236 First matrix form: using "*Matrix*" representation
1237 ++++++++++++++++++++++++++++++++++++++++++++++++++
1239 .. index:: single: Matrix
1240 .. index:: single: BackgroundError
1241 .. index:: single: EvolutionError
1242 .. index:: single: ObservationError
1244 This first form is the default and more general one. The covariance matrix
1245 :math:`\mathbf{M}` has to be fully specified. Even if the matrix is symmetric by
1246 nature, the entire :math:`\mathbf{M}` matrix has to be given.
1248 .. math:: \mathbf{M} = \begin{pmatrix}
1249 m_{11} & m_{12} & \cdots & m_{1n} \\
1250 m_{21} & m_{22} & \cdots & m_{2n} \\
1251 \vdots & \vdots & \vdots & \vdots \\
1252 m_{n1} & \cdots & m_{nn-1} & m_{nn}
1255 It can be either a Python Numpy array or a matrix, or a list of lists of values
1256 (that is, a list of rows). For example, a simple diagonal unitary background
1257 error covariance matrix :math:`\mathbf{B}` can be described in a Python script
1260 BackgroundError = [[1, 0 ... 0], [0, 1 ... 0] ... [0, 0 ... 1]]
1264 BackgroundError = numpy.eye(...)
1266 Second matrix form: using "*ScalarSparseMatrix*" representation
1267 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1269 .. index:: single: ScalarSparseMatrix
1270 .. index:: single: BackgroundError
1271 .. index:: single: EvolutionError
1272 .. index:: single: ObservationError
1274 On the opposite, this second form is a very simplified method to provide a
1275 matrix. The covariance matrix :math:`\mathbf{M}` is supposed to be a positive
1276 multiple of the identity matrix. This matrix can then be specified in a unique
1277 way by the multiplier :math:`m`:
1279 .. math:: \mathbf{M} = m \times \begin{pmatrix}
1280 1 & 0 & \cdots & 0 \\
1281 0 & 1 & \cdots & 0 \\
1282 \vdots & \vdots & \vdots & \vdots \\
1286 The multiplier :math:`m` has to be a floating point or integer positive value
1287 (if it is negative, which is impossible for a positive covariance matrix, it is
1288 converted to positive value). For example, a simple diagonal unitary background
1289 error covariance matrix :math:`\mathbf{B}` can be described in a python script
1292 BackgroundError = 1.
1294 or, better, by a "*String*" directly in the ADAO case.
1296 Third matrix form: using "*DiagonalSparseMatrix*" representation
1297 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1299 .. index:: single: DiagonalSparseMatrix
1300 .. index:: single: BackgroundError
1301 .. index:: single: EvolutionError
1302 .. index:: single: ObservationError
1304 This third form is also a simplified method to provide a matrix, but a little
1305 more powerful than the second one. The covariance matrix :math:`\mathbf{M}` is
1306 already supposed to be diagonal, but the user has to specify all the positive
1307 diagonal values. The matrix can then be specified only by a vector
1308 :math:`\mathbf{V}` which will be set on a diagonal matrix:
1310 .. math:: \mathbf{M} = \begin{pmatrix}
1311 v_{1} & 0 & \cdots & 0 \\
1312 0 & v_{2} & \cdots & 0 \\
1313 \vdots & \vdots & \vdots & \vdots \\
1314 0 & \cdots & 0 & v_{n}
1317 It can be either a Python Numpy array or a matrix, or a list or a list of list
1318 of positive values (in all cases, if some are negative, which is impossible,
1319 they are converted to positive values). For example, a simple diagonal unitary
1320 background error covariance matrix :math:`\mathbf{B}` can be described in a
1321 python script file as::
1323 BackgroundError = [1, 1 ... 1]
1327 BackgroundError = numpy.ones(...)