================================================================================
Reference description of the ADAO commands and keywords
================================================================================

This section presents the reference description of the ADAO commands and
keywords available through the GUI or through scripts.

Each command or keyword to be defined through the ADAO GUI has two properties.
The first property is to be *required*, *optional* or only factual, describing a
type of input. The second property is to be an "open" variable with a fixed type
but with any value allowed by the type, or a "restricted" variable, limited to
some specified values. Because the EFICAS editor GUI has built-in validation
capabilities, the properties of the commands or keywords given through this GUI
are automatically correct.

The mathematical notations used afterward are explained in the section
:ref:`section_theory`.

Examples of using these commands are available in the section
:ref:`section_examples` and in the example files installed with the ADAO module.

List of possible input types
----------------------------

.. index:: single: Dict
.. index:: single: Function
.. index:: single: Matrix
.. index:: single: ScalarSparseMatrix
.. index:: single: DiagonalSparseMatrix
.. index:: single: String
.. index:: single: Script
.. index:: single: Vector

Each ADAO variable has a pseudo-type that helps to fill it in and to validate
it. The different pseudo-types are:

**Dict**
  This indicates a variable that has to be filled by a Python dictionary
  ``{"key":"value"...}``, usually given either as a string or as a script file.

**Function**
  This indicates a variable that has to be filled by a Python function,
  usually given as a script file or a component method.

**Matrix**
  This indicates a variable that has to be filled by a matrix, usually given
  either as a string or as a script file.

**ScalarSparseMatrix**
  This indicates a variable that has to be filled by a single number (which
  will be used to multiply an identity matrix), usually given either as a
  string or as a script file.

**DiagonalSparseMatrix**
  This indicates a variable that has to be filled by a vector (which will be
  used to replace the diagonal of an identity matrix), usually given either as
  a string or as a script file.

**Script**
  This indicates a script given as an external file. It can be described by a
  full absolute path name or only by the file name without path. If the file
  is given only by a file name without path, and if a study directory is also
  indicated, the file is searched in the given directory.

**String**
  This indicates a string giving a literal representation of a matrix, a
  vector or a vector series, such as "1 2 ; 3 4" or "[[1,2],[3,4]]" for a
  2x2 matrix.

**Vector**
  This indicates a variable that has to be filled by a vector, usually given
  either as a string or as a script file.

**VectorSerie**
  This indicates a variable that has to be filled by a list of
  vectors, usually given either as a string or as a script file.

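As an illustration of the two sparse pseudo-types, the following sketch expands
a scalar and a vector into the full matrices they stand for. The helper names
``expand_scalar_sparse`` and ``expand_diagonal_sparse`` are hypothetical and are
not part of ADAO, and the matrix size is an assumption supplied by the caller.

```python
# Hypothetical helpers, not part of ADAO: they only illustrate which full
# matrix each sparse pseudo-type represents.

def expand_scalar_sparse(value, size):
    """Return value * identity(size) as a list of rows."""
    return [[value if i == j else 0.0 for j in range(size)]
            for i in range(size)]

def expand_diagonal_sparse(diagonal):
    """Return the matrix whose diagonal is the given vector."""
    size = len(diagonal)
    return [[diagonal[i] if i == j else 0.0 for j in range(size)]
            for i in range(size)]

# "ScalarSparseMatrix": a single number multiplying an identity matrix
scalar_form = expand_scalar_sparse(2.0, 3)

# "DiagonalSparseMatrix": a vector replacing the diagonal of an identity matrix
diagonal_form = expand_diagonal_sparse([1.0, 4.0, 9.0])
```

Both forms avoid writing out the zero entries of a large covariance matrix.
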
When a command or keyword can be filled by a script file name, the script has to
contain a variable or a method that has the same name as the one to be filled.
In other words, when importing the script in a YACS Python node, it must create
a variable with the right name in the current namespace of the node.

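The naming rule can be sketched as follows. The script content and the
``node_namespace`` dictionary are assumptions made for this illustration: the
sketch only mimics the effect of executing a script in a namespace, and does
not reproduce how a YACS node actually loads it.

```python
# Hypothetical content of a script file used to fill the "Background" command:
# the variable it defines carries the name of the command.
script_source = """
Background = [0.0, 1.0, 2.0]
"""

# Executing the script populates a namespace, here a plain dictionary
node_namespace = {}
exec(script_source, node_namespace)

# The command can be filled only if a variable of the same name now exists
command_name = "Background"
found = command_name in node_namespace
value = node_namespace.get(command_name)
```

A script that defined ``background`` or ``xb`` instead would leave ``found``
false in this sketch.
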
Reference description for ADAO calculation cases
------------------------------------------------

List of commands and keywords for an ADAO calculation case
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ASSIMILATION_STUDY
.. index:: single: Algorithm
.. index:: single: AlgorithmParameters
.. index:: single: Background
.. index:: single: BackgroundError
.. index:: single: ControlInput
.. index:: single: Debug
.. index:: single: EvolutionError
.. index:: single: EvolutionModel
.. index:: single: InputVariables
.. index:: single: Observation
.. index:: single: ObservationError
.. index:: single: ObservationOperator
.. index:: single: Observers
.. index:: single: OutputVariables
.. index:: single: Study_name
.. index:: single: Study_repertory
.. index:: single: UserDataInit
.. index:: single: UserPostAnalysis

The first set of commands is related to the description of a calculation case,
that is a *Data Assimilation* procedure or an *Optimization* procedure. The
terms are listed in alphabetical order, except the first, which describes the
choice between calculation and checking. The different commands are the
following:

**ASSIMILATION_STUDY**
  *Required command*. This is the general command describing the data
  assimilation or optimization case. It hierarchically contains all the other
  commands.

**Algorithm**
  *Required command*. This is a string to indicate the data assimilation or
  optimization algorithm chosen. The choices are limited and available through
  the GUI. Examples are "3DVAR", "Blue"... See the list of algorithms and
  associated parameters in the following subsection `Optional
  and required commands for calculation algorithms`_.

**AlgorithmParameters**
  *Optional command*. This command adds some optional parameters to
  control the data assimilation or optimization algorithm. Its value is
  defined as a "*Dict*" type object. See the list of algorithms and
  associated parameters in the following subsection `Optional and required
  commands for calculation algorithms`_.

**Background**
  *Required command*. This indicates the background or initial vector used,
  previously noted as :math:`\mathbf{x}^b`. Its value is defined as a
  "*Vector*" type object.

**BackgroundError**
  *Required command*. This indicates the background error covariance matrix,
  previously noted as :math:`\mathbf{B}`. Its value is defined as a "*Matrix*"
  type object, a "*ScalarSparseMatrix*" type object, or a
  "*DiagonalSparseMatrix*" type object.

**ControlInput**
  *Optional command*. This indicates the control vector used to force the
  evolution model at each step, usually noted as :math:`\mathbf{U}`. Its value
  is defined as a "*Vector*" or a "*VectorSerie*" type object. When there is no
  control, it has to be an empty string ``''``.

**Debug**
  *Optional command*. This defines the level of trace and intermediary debug
  information. The choices are limited to 0 (for False) and 1 (for True).

**EvolutionError**
  *Optional command*. This indicates the evolution error covariance matrix,
  usually noted as :math:`\mathbf{Q}`. It is defined as a "*Matrix*" type
  object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
  type object.

**EvolutionModel**
  *Optional command*. This indicates the evolution model operator, usually
  noted :math:`M`, which describes an elementary step of evolution. Its value
  is defined as a "*Function*" type object. Different functional forms can be
  used, as described in the following subsection `Requirements for functions
  describing an operator`_. If there is some control :math:`U` included in the
  evolution model, the operator has to be applied to a pair :math:`(X,U)`.

**InputVariables**
  *Optional command*. This command indicates the name and size of the
  physical variables that are bundled together in the state vector. This
  information is dedicated to data processed inside an algorithm.

**Observation**
  *Required command*. This indicates the observation vector used for data
  assimilation or optimization, previously noted as :math:`\mathbf{y}^o`. It
  is defined as a "*Vector*" or a "*VectorSerie*" type object.

**ObservationError**
  *Required command*. This indicates the observation error covariance matrix,
  previously noted as :math:`\mathbf{R}`. It is defined as a "*Matrix*" type
  object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
  type object.

**ObservationOperator**
  *Required command*. This indicates the observation operator, previously
  noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}` to
  results :math:`\mathbf{y}` to be compared to observations
  :math:`\mathbf{y}^o`. Its value is defined as a "*Function*" type object.
  Different functional forms can be used, as described in the following
  subsection `Requirements for functions describing an operator`_. If there is
  some control :math:`U` included in the observation, the operator has to be
  applied to a pair :math:`(X,U)`.

**Observers**
  *Optional command*. This command sets internal observers, which are
  functions linked with a particular variable and executed each time this
  variable is modified. It is a convenient way to monitor variables
  of interest during the data assimilation or optimization process, by
  printing or plotting them, etc. Common templates are provided to help the
  user to start or to quickly build their case.

**OutputVariables**
  *Optional command*. This command indicates the name and size of the
  physical variables that are bundled together in the output observation
  vector. This information is dedicated to data processed inside an algorithm.

**Study_name**
  *Required command*. This is an open string to describe the ADAO study by a
  name.

**Study_repertory**
  *Optional command*. If available, this directory is used as the base for
  calculation, and is used to find all the script files, given by name without
  path, that can be used to define some other commands by scripts.

**UserDataInit**
  *Optional command*. This command initializes some parameters or
  data automatically before the input processing of the data assimilation or
  optimization algorithm. It indicates a script file name to be executed
  before entering the initialization phase of the chosen variables.

**UserPostAnalysis**
  *Optional command*. This command processes some parameters or data
  automatically after data assimilation or optimization algorithm processing.
  Its value is defined as a script file or a string, which makes it possible
  to put post-processing code directly inside the ADAO case. Common templates
  are provided to help the user to start or to quickly build their case.

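To make the difference between an operator applied to a state alone and an
operator applied to a pair :math:`(X,U)` more concrete, here is a minimal
sketch on plain Python lists. The function names and the linear models are
assumptions chosen for the illustration. The calling conventions actually
expected by ADAO are the ones given in `Requirements for functions describing
an operator`_.

```python
# Hypothetical operators; names and models are illustrative assumptions only.

def observation_operator(x):
    """H applied to a state alone: returns the simulated observation y."""
    return [x[0] + x[1], x[0] - x[1]]

def evolution_model_with_control(pair):
    """M applied to a pair (X, U): one elementary step of evolution."""
    x, u = pair
    return [xi + ui for xi, ui in zip(x, u)]

state = [1.0, 2.0]
control = [0.5, -0.5]

simulated_observation = observation_operator(state)
next_state = evolution_model_with_control((state, control))
```
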
Optional and required commands for calculation algorithms
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: 3DVAR
.. index:: single: Blue
.. index:: single: ExtendedBlue
.. index:: single: EnsembleBlue
.. index:: single: KalmanFilter
.. index:: single: ExtendedKalmanFilter
.. index:: single: UnscentedKalmanFilter
.. index:: single: LinearLeastSquares
.. index:: single: NonLinearLeastSquares
.. index:: single: ParticleSwarmOptimization
.. index:: single: QuantileRegression

.. index:: single: AlgorithmParameters
.. index:: single: Bounds
.. index:: single: CostDecrementTolerance
.. index:: single: GradientNormTolerance
.. index:: single: GroupRecallRate
.. index:: single: MaximumNumberOfSteps
.. index:: single: Minimizer
.. index:: single: NumberOfInsects
.. index:: single: ProjectedGradientTolerance
.. index:: single: QualityCriterion
.. index:: single: Quantile
.. index:: single: SetSeed
.. index:: single: StoreInternalVariables
.. index:: single: StoreSupplementaryCalculations
.. index:: single: SwarmVelocity

Each algorithm can be controlled using some generic or specific options, given
through the "*AlgorithmParameters*" optional command in a script file or a
string, as follows for example in a file::

    AlgorithmParameters = {
        "Minimizer" : "LBFGSB",
        "MaximumNumberOfSteps" : 25,
        "StoreSupplementaryCalculations" : ["APosterioriCovariance","OMA"],
        }

To give the "*AlgorithmParameters*" values by string, one must enclose a
standard dictionary definition between single quotes, as for example::

    '{"Minimizer":"LBFGSB","MaximumNumberOfSteps":25}'

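The string form describes the same dictionary as the script form. The sketch
below checks this with the standard library function ``ast.literal_eval``.
This is only an illustration of the equivalence: how ADAO itself converts the
string is not described here, and it may use a different mechanism.

```python
import ast

# The value as it would be typed in the string form
parameters_as_string = '{"Minimizer":"LBFGSB","MaximumNumberOfSteps":25}'

# Safe evaluation of a Python literal; raises ValueError or SyntaxError
# when the string is not a valid literal
parameters = ast.literal_eval(parameters_as_string)

is_dictionary = isinstance(parameters, dict)
```

Checking a string this way before pasting it into a case can catch a missing
quote or brace early.
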
This section describes the available options algorithm by algorithm. In
addition, for each algorithm, the required commands/keywords are given, being
described in `List of commands and keywords for an ADAO calculation case`_. If
an option is specified by the user for an algorithm that doesn't support it, the
option is simply left unused and does not stop the treatment. The meaning of the
acronyms or particular names can be found in the :ref:`genindex` or the
:ref:`section_glossary`.

**"Blue"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
    "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].

**"ExtendedBlue"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
    "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].

**"LinearLeastSquares"**

  *Required commands*
    *"Observation", "ObservationError",
    "ObservationOperator"*

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following

**"3DVAR"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  Minimizer
    This key chooses the optimization minimizer. The default choice
    is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
    minimizer, see [Byrd95]_ and [Zhu97]_), "TNC" (nonlinear constrained
    minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS" (nonlinear
    unconstrained minimizer), "NCG" (Newton CG minimizer). It is recommended to
    stay with the default.

  Bounds
    This key defines upper and lower bounds for every state variable
    being optimized. Bounds can be given by a list of pairs of
    lower/upper bounds, one pair for each variable, with ``None`` wherever
    there is no bound. The bounds can always be specified, but they are taken
    into account only by the constrained minimizers.

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for iterative
    optimization. The default is 15000, which is very similar to no limit on
    iterations. It is therefore recommended to adapt this parameter to the
    needs of real problems. For some minimizers, the effective stopping step
    can be slightly different from the limit due to algorithm internal control
    requirements.

  CostDecrementTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the cost function decreases less than
    this tolerance at the last step. The default is 1.e-7, and it is
    recommended to adapt it to the needs of real problems.

  ProjectedGradientTolerance
    This key indicates a limit value, leading to stop successfully the iterative
    optimization process when all the components of the projected gradient are
    under this limit. It is only used for constrained minimizers. The default is
    -1, that is the internal default of each minimizer (generally 1.e-5), and it
    is not recommended to change it.

  GradientNormTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the norm of the gradient is under this
    limit. It is only used for non-constrained minimizers. The default is
    1.e-5 and it is not recommended to change it.

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
    "SigmaObs2", "MahalanobisConsistency"].

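As an illustration of how these keys combine, the following sketch builds an
"*AlgorithmParameters*" dictionary for a state with three variables. All the
numerical values are assumptions chosen for the example, not recommended
settings, and the final loop is only a sanity check added for this sketch.

```python
# Illustrative values only: a three-variable state is assumed.
AlgorithmParameters = {
    "Minimizer": "LBFGSB",              # constrained minimizer, so bounds apply
    "Bounds": [
        [0.0, 10.0],                    # variable 1: lower and upper bound
        [None, 5.0],                    # variable 2: upper bound only
        [None, None],                   # variable 3: no bound
    ],
    "MaximumNumberOfSteps": 100,
    "CostDecrementTolerance": 1.e-7,
    "StoreSupplementaryCalculations": ["APosterioriCovariance", "OMA"],
}

# Sanity check of the sketch: one pair per variable, lower <= upper when both
# bounds are given
bounds_are_consistent = all(
    len(pair) == 2
    and (pair[0] is None or pair[1] is None or pair[0] <= pair[1])
    for pair in AlgorithmParameters["Bounds"]
)
```

With an unconstrained minimizer such as "CG", the same "*Bounds*" entry could
stay in the dictionary but would not be taken into account.
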
**"NonLinearLeastSquares"**

  *Required commands*
    *"Background",
    "Observation", "ObservationError",
    "ObservationOperator"*

  Minimizer
    This key chooses the optimization minimizer. The default choice is
    "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
    minimizer, see [Byrd95]_ and [Zhu97]_), "TNC" (nonlinear constrained
    minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS" (nonlinear
    unconstrained minimizer), "NCG" (Newton CG minimizer). It is recommended to
    stay with the default.

  Bounds
    This key defines upper and lower bounds for every state variable
    being optimized. Bounds can be given by a list of pairs of
    lower/upper bounds, one pair for each variable, with ``None`` wherever
    there is no bound. The bounds can always be specified, but they are taken
    into account only by the constrained minimizers.

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for iterative
    optimization. The default is 15000, which is very similar to no limit on
    iterations. It is therefore recommended to adapt this parameter to the
    needs of real problems. For some minimizers, the effective stopping step
    can be slightly different due to algorithm internal control requirements.

  CostDecrementTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the cost function decreases less than
    this tolerance at the last step. The default is 1.e-7, and it is
    recommended to adapt it to the needs of real problems.

  ProjectedGradientTolerance
    This key indicates a limit value, leading to stop successfully the iterative
    optimization process when all the components of the projected gradient are
    under this limit. It is only used for constrained minimizers. The default is
    -1, that is the internal default of each minimizer (generally 1.e-5), and it
    is not recommended to change it.

  GradientNormTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the norm of the gradient is under this
    limit. It is only used for non-constrained minimizers. The default is
    1.e-5 and it is not recommended to change it.

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["BMA", "OMA", "OMB", "Innovation"].

**"EnsembleBlue"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  SetSeed
    This key gives an integer used to fix the seed of the random generator
    that generates the ensemble. A convenient value is for example 1000. By
    default, the seed is left uninitialized, and so uses the default
    initialization from the computer.

**"KalmanFilter"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  EstimationOf
    This key chooses the type of estimation to be performed. It can be
    either state-estimation, with a value of "State", or parameter-estimation,
    with a value of "Parameters". The default choice is "State".

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "Innovation"].

**"ExtendedKalmanFilter"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  Bounds
    This key defines upper and lower bounds for every state variable
    being optimized. Bounds can be given by a list of pairs of
    lower/upper bounds, one pair for each variable, with extreme values
    wherever there is no bound. The bounds can always be specified, but they
    are taken into account only by the constrained minimizers.

  ConstrainedBy
    This key defines the method used to take bounds into account. The
    possible methods are in the following list: ["EstimateProjection"].

  EstimationOf
    This key chooses the type of estimation to be performed. It can be
    either state-estimation, with a value of "State", or parameter-estimation,
    with a value of "Parameters". The default choice is "State".

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "Innovation"].

**"UnscentedKalmanFilter"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  Bounds
    This key defines upper and lower bounds for every state variable
    being optimized. Bounds can be given by a list of pairs of
    lower/upper bounds, one pair for each variable, with extreme values
    wherever there is no bound. The bounds can always be specified, but they
    are taken into account only by the constrained minimizers.

  ConstrainedBy
    This key defines the method used to take bounds into account. The
    possible methods are in the following list: ["EstimateProjection"].

  EstimationOf
    This key chooses the type of estimation to be performed. It can be
    either state-estimation, with a value of "State", or parameter-estimation,
    with a value of "Parameters". The default choice is "State".

  Alpha, Beta, Kappa, Reconditioner
    These keys are internal scaling parameters. "Alpha" requires a value between
    1.e-4 and 1. "Beta" has an optimal value of 2 for a Gaussian *a priori*
    distribution. "Kappa" requires an integer value, and the right default is
    obtained by setting it to 0. "Reconditioner" requires a value between 1.e-3
    and 10, and defaults to 1.

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["APosterioriCovariance", "BMA", "Innovation"].

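To give an idea of the role of "Alpha", "Beta" and "Kappa", the sketch below
computes the sigma-point weights of the textbook scaled unscented transform.
It is an assumption that ADAO follows exactly this formulation. The function
name ``unscented_weights`` is hypothetical, and "Reconditioner" is not
represented.

```python
def unscented_weights(n, alpha=1.0, beta=2.0, kappa=0):
    """Weights of the 2n+1 sigma points for a state of dimension n.

    Textbook scaled unscented transform, shown as an illustration only.
    """
    lam = alpha ** 2 * (n + kappa) - n            # composite scaling parameter
    w_mean = [lam / (n + lam)] + [1.0 / (2.0 * (n + lam))] * (2 * n)
    w_cov = list(w_mean)
    w_cov[0] += 1.0 - alpha ** 2 + beta           # Beta corrects the central weight
    return lam, w_mean, w_cov

# Illustrative call: dimension 3, Alpha = 1, Beta = 2, Kappa = 0
lam, w_mean, w_cov = unscented_weights(n=3, alpha=1.0, beta=2.0, kappa=0)
weights_sum = sum(w_mean)
```

In this formulation the mean weights always sum to 1, whatever the scaling
parameters.
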
**"ParticleSwarmOptimization"**

  *Required commands*
    *"Background", "BackgroundError",
    "Observation", "ObservationError",
    "ObservationOperator"*

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for iterative
    optimization. The default is 50, which is an arbitrary limit. It is
    therefore recommended to adapt this parameter to the needs of real
    problems.

  NumberOfInsects
    This key indicates the number of insects or particles in the swarm. The
    default is 100, which is a usual default for this algorithm.

  SwarmVelocity
    This key indicates the part of the insect velocity which is imposed by the
    swarm. It is a positive floating point value. The default value is 1.

  GroupRecallRate
    This key indicates the recall rate at the best swarm insect. It is a
    floating point value between 0 and 1. The default value is 0.5.

  QualityCriterion
    This key indicates the quality criterion, minimized to find the optimal
    state estimate. The default is the usual data assimilation criterion named
    "DA", the augmented weighted least squares. The possible criteria have to
    be in the following list, where the equivalent names are indicated by "=":
    ["AugmentedPonderatedLeastSquares"="APLS"="DA",
    "PonderatedLeastSquares"="PLS", "LeastSquares"="LS"="L2",
    "AbsoluteValue"="L1", "MaximumError"="ME"]

  SetSeed
    This key gives an integer used to fix the seed of the random generator
    that generates the ensemble. A convenient value is for example 1000. By
    default, the seed is left uninitialized, and so uses the default
    initialization from the computer.

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["BMA", "OMA", "OMB", "Innovation"].

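The quality criteria can be read as different ways of measuring the misfit.
The sketch below is one possible reading of each name, assuming diagonal
covariances given as lists of variances and omitting any constant factor. The
function ``quality`` and its arguments are hypothetical, and the exact
formulas used by ADAO may differ.

```python
def quality(criterion, x, xb, y, hx, b_var, r_var):
    """Evaluate a criterion; b_var and r_var hold diagonal variances."""
    d = [yi - hi for yi, hi in zip(y, hx)]                    # y - H(x)
    jb = sum((xi - bi) ** 2 / v for xi, bi, v in zip(x, xb, b_var))
    jo = sum(di ** 2 / v for di, v in zip(d, r_var))
    if criterion in ("AugmentedPonderatedLeastSquares", "APLS", "DA"):
        return jb + jo                  # background term plus observation term
    if criterion in ("PonderatedLeastSquares", "PLS"):
        return jo                       # weighted observation term only
    if criterion in ("LeastSquares", "LS", "L2"):
        return sum(di ** 2 for di in d)
    if criterion in ("AbsoluteValue", "L1"):
        return sum(abs(di) for di in d)
    if criterion in ("MaximumError", "ME"):
        return max(abs(di) for di in d)
    raise ValueError("Unknown quality criterion: %s" % criterion)

# Illustrative data: x, xb, y, H(x), background and observation variances
args = ([1.0, 2.0], [0.0, 2.0], [3.0, 0.0], [1.0, 1.0], [1.0, 1.0], [4.0, 1.0])
values = {name: quality(name, *args) for name in ("DA", "PLS", "LS", "L1", "ME")}
```

Only "DA" uses the background, so the other criteria measure the fit to the
observations alone.
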
**"QuantileRegression"**

  *Required commands*
    *"Background",
    "Observation",
    "ObservationOperator"*

  Quantile
    This key defines the real value of the desired quantile, between
    0 and 1. The default is 0.5, corresponding to the median.

  Minimizer
    This key chooses the optimization minimizer. The default choice
    and only available choice is "MMQR" (Majorize-Minimize for Quantile
    Regression).

  MaximumNumberOfSteps
    This key indicates the maximum number of iterations allowed for iterative
    optimization. The default is 15000, which is very similar to no limit on
    iterations. It is therefore recommended to adapt this parameter to the
    needs of real problems.

  CostDecrementTolerance
    This key indicates a limit value, leading to stop successfully the
    iterative optimization process when the cost function or the surrogate
    decreases less than this tolerance at the last step. The default is 1.e-6,
    and it is recommended to adapt it to the needs of real problems.

  StoreInternalVariables
    This boolean key stores the default internal variables, mainly the
    current state during the iterative optimization process. Be careful, this
    can be a numerically costly choice in certain calculation cases. The
    default is "False".

  StoreSupplementaryCalculations
    This list indicates the names of the supplementary variables that can be
    made available at the end of the algorithm. It involves potentially costly
    calculations. The default is an empty list, none of these variables being
    calculated and stored by default. The possible names are in the following
    list: ["BMA", "OMA", "OMB", "Innovation"].

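The effect of the "*Quantile*" key can be illustrated with the classical
quantile regression loss, often called the pinball loss. The sketch below
evaluates that loss on given residuals. It is not the MMQR algorithm itself,
which minimizes a surrogate of such a cost, and the function name
``pinball_loss`` is hypothetical.

```python
def pinball_loss(residuals, quantile=0.5):
    """Sum of the quantile loss over residuals r = y - H(x)."""
    total = 0.0
    for r in residuals:
        # Positive residuals weigh "quantile", negative ones "1 - quantile"
        total += quantile * r if r >= 0 else (quantile - 1.0) * r
    return total

residuals = [2.0, -1.0]

# With the default 0.5, the loss is half the sum of absolute residuals
median_loss = pinball_loss(residuals, 0.5)

# A quantile of 0.9 penalizes positive residuals more than negative ones
upper_loss = pinball_loss(residuals, 0.9)
```
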
Reference description for ADAO checking cases
---------------------------------------------

List of commands and keywords for an ADAO checking case
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: CHECKING_STUDY
.. index:: single: Algorithm
.. index:: single: AlgorithmParameters
.. index:: single: CheckingPoint
.. index:: single: Debug
.. index:: single: ObservationOperator
.. index:: single: Study_name
.. index:: single: Study_repertory
.. index:: single: UserDataInit

The second set of commands is related to the description of a checking case,
that is a procedure to check required properties on information used somewhere
else by a calculation case. The terms are listed in alphabetical order, except
the first, which describes the choice between calculation and checking. The
different commands are the following:

705 *Required command*. This is the general command describing the checking
706 case. It hierarchically contains all the other commands.
709 *Required command*. This is a string to indicate the test algorithm chosen.
710 The choices are limited and available through the GUI. There exists for
711 example "FunctionTest", "AdjointTest"... See below the list of algorithms
712 and associated parameters in the following subsection `Optional and required
713 commands for checking algorithms`_.
715 **AlgorithmParameters** *Optional command*. This command allows to add some
716 optional parameters to control the data assimilation or optimization
717 algorithm. It is defined as a "*Dict*" type object, that is, given as a
718 script. See below the list of algorithms and associated parameters in the
719 following subsection `Optional and required commands for checking
723 *Required command*. This indicates the vector used, previously noted as
724 :math:`\mathbf{x}^b`. It is defined as a "*Vector*" type object.
727 *Optional command*. This define the level of trace and intermediary debug
728 information. The choices are limited between 0 (for False) and 1 (for
731 **ObservationOperator**
732 *Required command*. This indicates the observation operator, previously
733 noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}` to
734 results :math:`\mathbf{y}` to be compared to observations
735 :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object. Different
736 functional forms can be used, as described in the following subsection
737 `Requirements for functions describing an operator`_. If there is some
738 control :math:`U` included in the observation, the operator has to be
739 applied to a pair :math:`(X,U)`.
742 *Required command*. This is an open string to describe the study by a name
746 *Optional command*. If available, this directory is used as base name for
747 calculation, and used to find all the script files, given by name without
748 path, that can be used to define some other commands by scripts.
*Optional command*. This command allows initializing some parameters or
data automatically before the data assimilation algorithm processing.
754 Optional and required commands for checking algorithms
755 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
757 .. index:: single: AdjointTest
758 .. index:: single: FunctionTest
759 .. index:: single: GradientTest
760 .. index:: single: LinearityTest
762 .. index:: single: AlgorithmParameters
763 .. index:: single: AmplitudeOfInitialDirection
764 .. index:: single: EpsilonMinimumExponent
765 .. index:: single: InitialDirection
766 .. index:: single: ResiduFormula
767 .. index:: single: SetSeed
We recall that each algorithm can be controlled using some generic or specific
options, given through the "*AlgorithmParameters*" optional command, as in the
following example::

    AlgorithmParameters = {
        "AmplitudeOfInitialDirection" : 1,
        "EpsilonMinimumExponent" : -8,
        }
If an option is specified by the user for an algorithm that doesn't support it,
the option is simply left unused and doesn't stop the processing. The meaning of
the acronyms or particular names can be found in the :ref:`genindex` or the
:ref:`section_glossary`. In addition, for each algorithm, the required
commands/keywords are given, as described in `List of commands and keywords
for an ADAO checking case`_.
789 "ObservationOperator"*
AmplitudeOfInitialDirection
This key indicates the scaling of the initial perturbation, built as a vector
used for the directional derivative around the nominal checking point. The
default is 1, which means no scaling.

EpsilonMinimumExponent
This key indicates the minimal exponent value of the power of 10 coefficient
to be used to decrease the increment multiplier. The default is -8, and it
has to be between 0 and -20. For example, the default value leads to
calculating the residue of the formula with a fixed increment multiplied by
factors from 1.e0 to 1.e-8.

InitialDirection
This key indicates the vector direction used for the directional derivative
around the nominal checking point. It has to be a vector. If not specified,
this direction defaults to a random perturbation around zero, of the same
vector size as the checking point.

SetSeed
This key allows giving an integer in order to fix the seed of the random
generator used to generate the ensemble. A convenient value is for example
1000. By default, the seed is left uninitialized, and so uses the default
initialization from the computer.
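The way these four keys combine can be pictured with the following minimal sketch. It is only an illustration under stated assumptions: the function ``perturbations`` and its arguments are hypothetical names that do not belong to ADAO, and the random direction is drawn here from a standard normal law, which is one possible reading of "a random perturbation around zero".

```python
import numpy

def perturbations(checking_point, amplitude=1.0, epsilon_min_exponent=-8,
                  initial_direction=None, seed=None):
    """Return the list of (multiplier, increment) pairs built around the point."""
    x = numpy.asarray(checking_point, dtype=float).ravel()
    # seed=None leaves the generator with its default initialization
    rng = numpy.random.default_rng(seed)
    if initial_direction is None:
        # Assumed choice: random perturbation around zero, same size as x
        direction = rng.normal(0.0, 1.0, size=x.size)
    else:
        direction = numpy.asarray(initial_direction, dtype=float).ravel()
    direction = amplitude * direction
    # Multipliers 1.e0, 1.e-1, ... down to 10**epsilon_min_exponent
    multipliers = [10.0**(-k) for k in range(0, -epsilon_min_exponent + 1)]
    return [(m, m * direction) for m in multipliers]

steps = perturbations([1.0, 2.0, 3.0], seed=1000)
for multiplier, increment in steps:
    print(multiplier, increment)
```

With the default exponent of -8, nine increments are produced, and fixing the seed makes the random direction reproducible from one run to the next.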
819 "ObservationOperator"*
821 NumberOfPrintedDigits
822 This key indicates the number of digits of precision for floating point
823 printed output. The default is 5, with a minimum of 0.
This key indicates the number of times to repeat the function evaluation. The
This key activates, or not, the debug mode during the function evaluation.
The default is "True", the choices are "True" or "False".
838 "ObservationOperator"*
AmplitudeOfInitialDirection
This key indicates the scaling of the initial perturbation, built as a vector
used for the directional derivative around the nominal checking point. The
default is 1, which means no scaling.

EpsilonMinimumExponent
This key indicates the minimal exponent value of the power of 10 coefficient
to be used to decrease the increment multiplier. The default is -8, and it
has to be between 0 and -20. For example, the default value leads to
calculating the residue of the scalar product formula with a fixed increment
multiplied by factors from 1.e0 to 1.e-8.

InitialDirection
This key indicates the vector direction used for the directional derivative
around the nominal checking point. It has to be a vector. If not specified,
this direction defaults to a random perturbation around zero, of the same
vector size as the checking point.

ResiduFormula
This key indicates the residue formula that has to be used for the test. The
default choice is "Taylor", and the possible ones are "Taylor" (residue of
the Taylor expansion of the operator, which has to decrease with the
square of the perturbation) and "Norm" (residue obtained by taking the
norm of the Taylor expansion at zero order approximation, which
approximates the gradient, and which has to remain constant).

SetSeed
This key allows giving an integer in order to fix the seed of the random
generator used to generate the ensemble. A convenient value is for example
1000. By default, the seed is left uninitialized, and so uses the default
initialization from the computer.
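The expected behaviour of the "Taylor" and "Norm" residues can be illustrated by the following sketch. The operator ``F``, its exact Jacobian ``gradF`` and the two residue functions are hypothetical examples written for this illustration, and the formulas are a generic reading of the descriptions above, not necessarily the exact ones implemented by the algorithm.

```python
import numpy

def F(x):
    """Hypothetical non-linear operator used only for this illustration."""
    return numpy.array([x[0]**2 + x[1], numpy.sin(x[1])])

def gradF(x):
    """Exact Jacobian matrix of F at x."""
    return numpy.array([[2.0 * x[0], 1.0], [0.0, numpy.cos(x[1])]])

def taylor_residue(x, dx, alpha):
    """|| F(x + alpha*dx) - F(x) - alpha * gradF(x).dx ||"""
    return numpy.linalg.norm(F(x + alpha * dx) - F(x) - alpha * gradF(x) @ dx)

def norm_residue(x, dx, alpha):
    """|| F(x + alpha*dx) - F(x) || / alpha"""
    return numpy.linalg.norm(F(x + alpha * dx) - F(x)) / alpha

x  = numpy.array([1.0, 0.5])
dx = numpy.array([0.3, -0.2])
for k in range(0, 5):
    alpha = 10.0**(-k)
    print(alpha, taylor_residue(x, dx, alpha), norm_residue(x, dx, alpha))
```

When the gradient is correct, the first residue is divided by about 100 each time the multiplier is divided by 10, while the second one stabilizes around the norm of the gradient applied to the direction.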
876 "ObservationOperator"*
AmplitudeOfInitialDirection
This key indicates the scaling of the initial perturbation, built as a vector
used for the directional derivative around the nominal checking point. The
default is 1, which means no scaling.

EpsilonMinimumExponent
This key indicates the minimal exponent value of the power of 10 coefficient
to be used to decrease the increment multiplier. The default is -8, and it
has to be between 0 and -20. For example, the default value leads to
calculating the residue of the scalar product formula with a fixed increment
multiplied by factors from 1.e0 to 1.e-8.

InitialDirection
This key indicates the vector direction used for the directional derivative
around the nominal checking point. It has to be a vector. If not specified,
this direction defaults to a random perturbation around zero, of the same
vector size as the checking point.

ResiduFormula
This key indicates the residue formula that has to be used for the test. The
default choice is "CenteredDL", and the possible ones are "CenteredDL"
(residue of the difference between the function at nominal point and the
values with positive and negative increments, which has to stay very small),
"Taylor" (residue of the Taylor expansion of the operator normalized by
the nominal value, which has to stay very small), "NominalTaylor" (residue
of the order 1 approximations of the operator, normalized to the nominal
point, which has to stay close to 1), and "NominalTaylorRMS" (residue of the
order 1 approximations of the operator, normalized by RMS to the nominal
point, which has to stay close to 0).

SetSeed
This key allows giving an integer in order to fix the seed of the random
generator used to generate the ensemble. A convenient value is for example
1000. By default, the seed is left uninitialized, and so uses the default
initialization from the computer.
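As an illustration of the idea behind the "CenteredDL" residue, the sketch below compares a linear and a non-linear operator. The function ``centered_residue`` and the two operators are hypothetical, and the formula is only one plausible reading of the description above, not necessarily the one used by the algorithm.

```python
import numpy

def centered_residue(F, x, dx, alpha):
    """|| F(x+alpha*dx) + F(x-alpha*dx) - 2*F(x) || / || F(x) ||"""
    Fx = F(x)
    delta = F(x + alpha * dx) + F(x - alpha * dx) - 2.0 * Fx
    return numpy.linalg.norm(delta) / numpy.linalg.norm(Fx)

A = numpy.array([[1.0, 2.0], [0.0, 3.0]])

def linear(x):
    """Hypothetical linear operator."""
    return A @ x

def quadratic(x):
    """Hypothetical non-linear operator."""
    return (A @ x)**2

x  = numpy.array([1.0, 1.0])
dx = numpy.array([0.5, -0.5])
r_lin  = centered_residue(linear, x, dx, 1.0)
r_quad = centered_residue(quadratic, x, dx, 1.0)
print(r_lin, r_quad)
```

For the linear operator the residue stays at the level of rounding errors whatever the increment, whereas it is clearly non-zero for the quadratic one.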
914 Requirements for functions describing an operator
915 -------------------------------------------------
917 The operators for observation and evolution are required to implement the data
918 assimilation or optimization procedures. They include the physical simulation by
919 numerical calculations, but also the filtering and restriction to compare the
920 simulation to observation. The evolution operator is considered here in its
921 incremental form, representing the transition between two successive states, and
922 is then similar to the observation operator.
Schematically, an operator has to give an output solution given the input
parameters. Part of the input parameters can be modified during the optimization
procedure. So the mathematical representation of such a process is a function.
927 It was briefly described in the section :ref:`section_theory` and is generalized
928 here by the relation:
930 .. math:: \mathbf{y} = O( \mathbf{x} )
932 between the pseudo-observations :math:`\mathbf{y}` and the parameters
933 :math:`\mathbf{x}` using the observation or evolution operator :math:`O`. The
934 same functional representation can be used for the linear tangent model
935 :math:`\mathbf{O}` of :math:`O` and its adjoint :math:`\mathbf{O}^*`, also
936 required by some data assimilation or optimization algorithms.
On input and output of these operators, the :math:`\mathbf{x}` and
:math:`\mathbf{y}` variables or their increments are mathematically vectors,
and they are given as non-oriented vectors (of type list or Numpy array) or
oriented ones (of type Numpy matrix).
Then, **to describe completely an operator, the user has only to provide a
function that fully and only realizes the functional operation**.
This function is usually given as a script that can be executed in a YACS node.
This script can equally launch external codes or use internal SALOME
calls and methods. If the algorithm requires the 3 aspects of the operator
(direct form, tangent form and adjoint form), the user has to give the 3
functions or to approximate them.
952 There are 3 practical methods for the user to provide an operator functional
953 representation. These methods are chosen in the "*FROM*" field of each operator
954 having a "*Function*" value as "*INPUT_TYPE*", as shown by the following figure:
956 .. eficas_operator_function:
957 .. image:: images/eficas_operator_function.png
961 **Choosing an operator functional representation**
963 First functional form: using "*ScriptWithOneFunction*"
964 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
966 .. index:: single: ScriptWithOneFunction
967 .. index:: single: DirectOperator
968 .. index:: single: DifferentialIncrement
969 .. index:: single: CenteredFiniteDifference
The first one consists in providing only one potentially non-linear function, and
in approximating the tangent and the adjoint operators. This is done by using the
keyword "*ScriptWithOneFunction*" for the description of the chosen operator in
the ADAO GUI. The user has to provide the function in a script, with a
mandatory name "*DirectOperator*". For example, the script can follow the
template::

    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        # ... compute here the simulation result, something like Y = O(X) ...
        return result
In this case, the user also has to provide a value for the differential increment
(or keep the default value), using through the GUI the keyword
"*DifferentialIncrement*", which has a default value of 1%. This coefficient
will be used in the finite differences approximation to build the tangent and
adjoint operators. The finite differences approximation order can also be chosen
through the GUI, using the keyword "*CenteredFiniteDifference*", with 0 for an
uncentered scheme of first order (which is the default value), and with 1 for a
centered scheme of second order (of twice the first order computational cost).
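The difference between the two schemes can be sketched as follows. This is a generic finite differences illustration, not the ADAO implementation: the helper ``tangent_matrix`` is a hypothetical name, and applying the increment relative to each component of ``X`` is an assumption made for this example.

```python
import numpy

def tangent_matrix(direct_operator, X, increment=0.01, centered=False):
    """Approximate the tangent matrix of direct_operator at X, column by column."""
    X = numpy.asarray(X, dtype=float).ravel()
    Y = numpy.asarray(direct_operator(X), dtype=float).ravel()
    J = numpy.zeros((Y.size, X.size))
    for i in range(X.size):
        # Assumed choice: step relative to X[i], or the bare increment at zero
        h = increment * X[i] if X[i] != 0.0 else increment
        e = numpy.zeros(X.size)
        e[i] = h
        Yp = numpy.asarray(direct_operator(X + e), dtype=float).ravel()
        if centered:
            # Second order scheme: two extra evaluations per column
            Ym = numpy.asarray(direct_operator(X - e), dtype=float).ravel()
            J[:, i] = (Yp - Ym) / (2.0 * h)
        else:
            # First order scheme: one extra evaluation per column
            J[:, i] = (Yp - Y) / h
    return J

def DirectOperator(X):
    """Hypothetical operator: Y = (x0**2, x0*x1)."""
    return numpy.array([X[0]**2, X[0] * X[1]])

J1 = tangent_matrix(DirectOperator, [1.0, 2.0], centered=False)
J2 = tangent_matrix(DirectOperator, [1.0, 2.0], centered=True)
print(J1)
print(J2)
```

In this sketch the adjoint approximation would simply be the transpose of the approximated tangent matrix. On this quadratic example the centered scheme recovers the exact derivatives, while the uncentered one carries an error proportional to the increment.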
This first operator definition form makes it easy to test the functional form
before its use in an ADAO case, greatly reducing the complexity of
operator implementation.
**Important warning:** the name "*DirectOperator*" is mandatory, and the type of
the ``X`` argument can be either a list, a numpy array or a numpy 1D-matrix. The
user has to treat these cases in the function.
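One simple way to treat all these input types, given here as a sketch and not as a required pattern, is to convert the argument to a flat array at the beginning of the function. The operator itself (a multiplication by 2) is a hypothetical placeholder.

```python
import numpy

def DirectOperator(X):
    """Hypothetical operator Y = 2*X, accepting a list, an array or a column."""
    # asarray + ravel turn any of the accepted inputs into a flat 1D array
    x = numpy.asarray(X, dtype=float).ravel()
    return 2.0 * x

Y_list   = DirectOperator([1, 2, 3])
Y_array  = DirectOperator(numpy.array([1.0, 2.0, 3.0]))
Y_column = DirectOperator(numpy.array([[1.0], [2.0], [3.0]]))
print(Y_list, Y_array, Y_column)
```

The same conversion also applies to a ``numpy.matrix`` column, because ``numpy.asarray`` returns a plain array that ``ravel`` can flatten.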
1002 Second functional form: using "*ScriptWithFunctions*"
1003 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1005 .. index:: single: ScriptWithFunctions
1006 .. index:: single: DirectOperator
1007 .. index:: single: TangentOperator
1008 .. index:: single: AdjointOperator
**In general, it is recommended to use the first functional form rather than
the second one. A small performance improvement is not a good reason to use a
detailed implementation such as this second functional form.**
The second one consists in providing directly the three associated operators
:math:`O`, :math:`\mathbf{O}` and :math:`\mathbf{O}^*`. This is done by using
the keyword "*ScriptWithFunctions*" for the description of the chosen operator
in the ADAO GUI. The user has to provide three functions in one script, with
three mandatory names "*DirectOperator*", "*TangentOperator*" and
"*AdjointOperator*". For example, the script can follow the template::

    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        # ... compute here the simulation result, something like Y ...
        return result

    def TangentOperator( pair ):
        """ Tangent linear operator, around X, applied to dX """
        X, dX = pair
        # ... compute here the tangent result, something like Y ...
        return result

    def AdjointOperator( pair ):
        """ Adjoint operator, around X, applied to Y """
        X, Y = pair
        # ... compute here the adjoint result, something like X ...
        return result
Once again, this second operator definition makes it easy to test the
functional forms before their use in an ADAO case, reducing the complexity of
operator implementation.
For some algorithms, it is required that the tangent and adjoint functions can
return the matrix equivalent to the linear operator. In this case, when
respectively the ``dX`` or the ``Y`` arguments are ``None``, the user has to
return the associated matrix.
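For a linear operator, this requirement can be met as in the following sketch, where the matrix ``A`` is a hypothetical example and each function receives its two arguments as a single pair. It is an illustration under these assumptions, not a mandatory implementation.

```python
import numpy

# Hypothetical linear operator, from a state of size 2 to a result of size 3
A = numpy.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

def DirectOperator(X):
    """Direct operator Y = A.X"""
    return A @ numpy.asarray(X, dtype=float).ravel()

def TangentOperator(pair):
    """Tangent around X applied to dX, or the tangent matrix if dX is None."""
    X, dX = pair
    if dX is None:
        return A
    return A @ numpy.asarray(dX, dtype=float).ravel()

def AdjointOperator(pair):
    """Adjoint around X applied to Y, or the adjoint matrix if Y is None."""
    X, Y = pair
    if Y is None:
        return A.T
    return A.T @ numpy.asarray(Y, dtype=float).ravel()

X0 = [1.0, 1.0]
print(TangentOperator((X0, None)).shape, AdjointOperator((X0, None)).shape)
```

A quick consistency check for such a script is that the scalar product of ``TangentOperator((X, dX))`` with ``Y`` equals the scalar product of ``dX`` with ``AdjointOperator((X, Y))``.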
**Important warning:** the names "*DirectOperator*", "*TangentOperator*" and
"*AdjointOperator*" are mandatory, and the type of the ``X``, ``Y``, ``dX``
arguments can be either a python list, a numpy array or a numpy 1D-matrix. The
user has to treat these cases in the script.
1056 Third functional form: using "*ScriptWithSwitch*"
1057 +++++++++++++++++++++++++++++++++++++++++++++++++
1059 .. index:: single: ScriptWithSwitch
1060 .. index:: single: DirectOperator
1061 .. index:: single: TangentOperator
1062 .. index:: single: AdjointOperator
1064 **It is recommended not to use this third functional form without a solid
1065 numerical or physical reason. A performance improvement is not a good reason to
1066 use the implementation complexity of this third functional form. Only an
1067 inability to use the first or second forms justifies the use of the third.**
This third form gives more possibilities to control the execution of the three
functions representing the operator, allowing advanced usage and control over
each execution of the simulation code. This is done by using the keyword
"*ScriptWithSwitch*" for the description of the chosen operator in the ADAO GUI.
The user has to provide a switch in one script to control the execution of the
direct, tangent and adjoint forms of the simulation code. The user can then, for
example, use other approximations for the tangent and adjoint codes, or
introduce more complexity in the argument treatment of the functions. But it
will be far more complicated to implement and debug.
1079 If, however, you want to use this third form, we recommend using the following
1080 template for the switch. It requires an external script or code named here
1081 "*Physical_simulation_functions.py*", containing three functions named
1082 "*DirectOperator*", "*TangentOperator*" and "*AdjointOperator*" as previously.
1083 Here is the switch template::

    import Physical_simulation_functions
    import numpy, logging
    #
    # "computation" is assumed to be provided by the calling environment
    method = ""
    for param in computation["specificParameters"]:
        if param["name"] == "method":
            method = param["value"]
    if method not in ["Direct", "Tangent", "Adjoint"]:
        raise ValueError("No valid computation method is given")
    logging.info("Found method is '%s'"%method)
    #
    logging.info("Loading operator functions")
    Function = Physical_simulation_functions.DirectOperator
    Tangent  = Physical_simulation_functions.TangentOperator
    Adjoint  = Physical_simulation_functions.AdjointOperator
    #
    logging.info("Executing the possible computations")
    data = []
    if method == "Direct":
        logging.info("Direct computation")
        Xcurrent = computation["inputValues"][0][0][0]
        data = Function(numpy.matrix( Xcurrent ).T)
    if method == "Tangent":
        logging.info("Tangent computation")
        Xcurrent  = computation["inputValues"][0][0][0]
        dXcurrent = computation["inputValues"][0][0][1]
        data = Tangent((numpy.matrix(Xcurrent).T, numpy.matrix(dXcurrent).T))
    if method == "Adjoint":
        logging.info("Adjoint computation")
        Xcurrent = computation["inputValues"][0][0][0]
        Ycurrent = computation["inputValues"][0][0][1]
        data = Adjoint((numpy.matrix(Xcurrent).T, numpy.matrix(Ycurrent).T))
    #
    logging.info("Formatting the output")
    it = numpy.ravel(data)
    outputValues = [[[[]]]]
    for val in it:
        outputValues[0][0][0].append(val)
    #
    result = {}
    result["outputValues"]        = outputValues
    result["specificOutputInfos"] = []
    result["returnCode"]          = 0
    result["errorMessage"]        = ""
Any kind of modification can be made starting from this template.
Special case of controlled evolution or observation operator
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
In some cases, the evolution or the observation operator is required to be
controlled by an external input control, given *a priori*. In this case, the
generic form of the incremental model is slightly modified as follows:
1139 .. math:: \mathbf{y} = O( \mathbf{x}, \mathbf{u})
1141 where :math:`\mathbf{u}` is the control over one state increment. In this case,
1142 the direct operator has to be applied to a pair of variables :math:`(X,U)`.
1143 Schematically, the operator has to be set as::

    def DirectOperator( pair ):
        """ Direct non-linear simulation operator """
        X, U = pair
        # ... compute here something like X(n+1) (evolution) or Y(n+1) (observation) ...
        return result
The tangent and adjoint operators have the same signature as previously, noting
that the derivatives have to be taken only partially with respect to
:math:`\mathbf{x}`. In such a case with explicit control, only the second
functional form (using "*ScriptWithFunctions*") and third functional form
(using "*ScriptWithSwitch*") can be used.
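A controlled operator can be pictured with the following sketch of a linear evolution model. The matrices ``M`` and ``B`` are hypothetical values chosen for the illustration, and the operator receives the state and the control as a single pair.

```python
import numpy

# Hypothetical linear dynamics X(n+1) = M.X(n) + B.U(n)
M = numpy.array([[1.0, 0.1], [0.0, 1.0]])
B = numpy.array([[0.0], [0.1]])

def DirectOperator(pair):
    """Direct controlled evolution operator applied to the pair (X, U)."""
    X, U = pair
    X = numpy.asarray(X, dtype=float).ravel()
    U = numpy.asarray(U, dtype=float).ravel()
    return M @ X + B @ U

def TangentOperator(pair):
    """Derivative with respect to X only, applied to dX (the control is fixed)."""
    X, dX = pair
    if dX is None:
        return M
    return M @ numpy.asarray(dX, dtype=float).ravel()

Xnext = DirectOperator(([1.0, 2.0], [3.0]))
print(Xnext)
```

In this linear case the tangent with respect to the state is the matrix ``M`` itself and does not depend on the control.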
1158 Requirements to describe covariance matrices
1159 --------------------------------------------
Multiple covariance matrices are required to implement the data assimilation or
optimization procedures. The main ones are the background error covariance
matrix, noted as :math:`\mathbf{B}`, and the observation error covariance matrix,
noted as :math:`\mathbf{R}`. Such a matrix is required to be a square symmetric
positive semi-definite matrix.
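These three properties can be checked on a candidate matrix before using it, for example with the following sketch. The function ``is_valid_covariance`` and its tolerance are hypothetical and are not part of ADAO.

```python
import numpy

def is_valid_covariance(M, tol=1.e-12):
    """Check that M is square, symmetric and positive semi-definite."""
    M = numpy.asarray(M, dtype=float)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False
    if not numpy.allclose(M, M.T):
        return False
    # eigvalsh is dedicated to symmetric matrices and returns real eigenvalues
    return bool(numpy.all(numpy.linalg.eigvalsh(M) >= -tol))

print(is_valid_covariance(numpy.eye(3)))
print(is_valid_covariance([[1.0, 2.0], [2.0, 1.0]]))
```

The small negative tolerance only absorbs rounding errors on eigenvalues that are theoretically zero.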
1167 There are 3 practical methods for the user to provide a covariance matrix. These
1168 methods are chosen by the "*INPUT_TYPE*" keyword of each defined covariance
1169 matrix, as shown by the following figure:
1171 .. eficas_covariance_matrix:
1172 .. image:: images/eficas_covariance_matrix.png
1176 **Choosing covariance matrix representation**
1178 First matrix form: using "*Matrix*" representation
1179 ++++++++++++++++++++++++++++++++++++++++++++++++++
1181 .. index:: single: Matrix
1182 .. index:: single: BackgroundError
1183 .. index:: single: EvolutionError
1184 .. index:: single: ObservationError
This first form is the default and most general one. The covariance matrix
:math:`\mathbf{M}` has to be fully specified. Even if the matrix is symmetric by
nature, the entire :math:`\mathbf{M}` matrix has to be given.
.. math:: \mathbf{M} = \begin{pmatrix}
    m_{11} & m_{12} & \cdots   & m_{1n} \\
    m_{21} & m_{22} & \cdots   & m_{2n} \\
    \vdots & \vdots & \vdots   & \vdots \\
    m_{n1} & \cdots & m_{nn-1} & m_{nn}
    \end{pmatrix}
It can be either a Python Numpy array or a matrix, or a list of lists of values
(that is, a list of rows). For example, a simple diagonal unitary background
error covariance matrix :math:`\mathbf{B}` can be described in a Python script
file as::

    BackgroundError = [[1, 0 ... 0], [0, 1 ... 0] ... [0, 0 ... 1]]

or as::

    BackgroundError = numpy.eye(...)
1208 Second matrix form: using "*ScalarSparseMatrix*" representation
1209 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1211 .. index:: single: ScalarSparseMatrix
1212 .. index:: single: BackgroundError
1213 .. index:: single: EvolutionError
1214 .. index:: single: ObservationError
In contrast, this second form is a very simplified method to provide a
matrix. The covariance matrix :math:`\mathbf{M}` is supposed to be a positive
multiple of the identity matrix. This matrix can then be specified in a unique
way by the multiplier :math:`m`:
.. math:: \mathbf{M} = m \times \begin{pmatrix}
    1      & 0      & \cdots & 0      \\
    0      & 1      & \cdots & 0      \\
    \vdots & \vdots & \vdots & \vdots \\
    0      & \cdots & 0      & 1
    \end{pmatrix}
The multiplier :math:`m` has to be a floating point or integer positive value
(if it is negative, which is impossible for a positive covariance matrix, it is
converted to a positive value). For example, a simple diagonal unitary background
error covariance matrix :math:`\mathbf{B}` can be described in a Python script
file as::

    BackgroundError = 1.

or, better, by a "*String*" directly in the ADAO case.
1238 Third matrix form: using "*DiagonalSparseMatrix*" representation
1239 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1241 .. index:: single: DiagonalSparseMatrix
1242 .. index:: single: BackgroundError
1243 .. index:: single: EvolutionError
1244 .. index:: single: ObservationError
1246 This third form is also a simplified method to provide a matrix, but a little
1247 more powerful than the second one. The covariance matrix :math:`\mathbf{M}` is
1248 already supposed to be diagonal, but the user has to specify all the positive
1249 diagonal values. The matrix can then be specified only by a vector
1250 :math:`\mathbf{V}` which will be set on a diagonal matrix:
.. math:: \mathbf{M} = \begin{pmatrix}
    v_{1}  & 0      & \cdots & 0      \\
    0      & v_{2}  & \cdots & 0      \\
    \vdots & \vdots & \vdots & \vdots \\
    0      & \cdots & 0      & v_{n}
    \end{pmatrix}
It can be either a Python Numpy array or a matrix, or a list or a list of lists
of positive values (in all cases, if some are negative, which is impossible,
they are converted to positive values). For example, a simple diagonal unitary
background error covariance matrix :math:`\mathbf{B}` can be described in a
Python script file as::

    BackgroundError = [1, 1 ... 1]

or as::

    BackgroundError = numpy.ones(...)
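To make the correspondence between the three representations concrete, the following sketch expands each of them into a full matrix for a state of size 3. The size and the variable names are assumptions made for the illustration, and the explicit expansion is only shown for comparison, it is not something the user has to do.

```python
import numpy

n = 3
# "Matrix": the full matrix is given, here as a list of rows
B_matrix = numpy.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
# "ScalarSparseMatrix": only the multiplier m is given
m = 1.
B_scalar = abs(m) * numpy.identity(n)
# "DiagonalSparseMatrix": only the vector V of diagonal values is given
V = [1., 1., 1.]
B_diagonal = numpy.diag(numpy.abs(V))

print(numpy.array_equal(B_matrix, B_scalar))
print(numpy.array_equal(B_matrix, B_diagonal))
```

The absolute values mirror the conversion of negative entries to positive ones described above.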