doc/reference.rst

   1 .. _section_reference:
   2
   3 ================================================================================
   4 Reference description of the ADAO commands and keywords
   5 ================================================================================
   6
   7 This section presents the reference description of the ADAO commands and
   8 keywords available through the GUI or through scripts.
   9
  10 Each command or keyword to be defined through the ADAO GUI has some properties.
  11 The first property is to be *required*, *optional* or only factual, describing a
  12 type of input. The second property is to be an "open" variable with a fixed type
  13 but with any value allowed by the type, or a "restricted" variable, limited to
   14 some specified values. Since the EFICAS editor GUI has built-in validation
   15 capabilities, the properties of the commands or keywords given through this
   16 GUI are automatically correct.
  17
  18 The mathematical notations used afterward are explained in the section
  19 :ref:`section_theory`.
  20
  21 Examples of using these commands are available in the section
   22 :ref:`section_examples` and in example files installed with the ADAO module.
  23
  24 List of possible input types
  25 ----------------------------
  26
  27 .. index:: single: Dict
  28 .. index:: single: Function
  29 .. index:: single: Matrix
  30 .. index:: single: String
  31 .. index:: single: Script
  32 .. index:: single: Vector
  33
   34 Each ADAO variable has a pseudo-type to help fill it in and validate it. The
  35 different pseudo-types are:
  36
  37 **Dict**
  38     This indicates a variable that has to be filled by a dictionary, usually
  39     given as a script.
  40
  41 **Function**
  42     This indicates a variable that has to be filled by a function, usually given
  43     as a script or a component method.
  44
  45 **Matrix**
  46     This indicates a variable that has to be filled by a matrix, usually given
  47     either as a string or as a script.
  48
  49 **String**
  50     This indicates a string giving a literal representation of a matrix, a
   51     vector or a vector series, such as "1 2 ; 3 4" for a square 2x2 matrix.
  52
  53 **Script**
  54     This indicates a script given as an external file. It can be described by a
  55     full absolute path name or only by the file name without path.
  56
  57 **Vector**
  58     This indicates a variable that has to be filled by a vector, usually given
  59     either as a string or as a script.
  60
   61 **VectorSerie**
   62     This indicates a variable that has to be filled by a list of vectors, usually given either as a string or as a script.
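As an illustration of the *String* pseudo-type, the following minimal sketch parses such a literal into nested lists. The helper name ``parse_matrix_string`` is hypothetical and is not part of ADAO. It only assumes that rows are separated by ";" and values by whitespace, as in the "1 2 ; 3 4" example above.

```python
def parse_matrix_string(text):
    """Parse a literal such as "1 2 ; 3 4" into a list of rows of floats."""
    # Rows are separated by ";" and values inside a row by whitespace.
    rows = [row.split() for row in text.split(";")]
    # Empty rows (for example after a trailing ";") are ignored.
    return [[float(value) for value in row] for row in rows if row]


matrix = parse_matrix_string("1 2 ; 3 4")   # square 2x2 matrix
vector = parse_matrix_string("1 2 3")[0]    # a vector is a single row
```

With this convention, a vector is simply a literal without any ";" separator.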
  63
  64 When a command or keyword can be filled by a script file name, the script has to
  65 contain a variable or a method that has the same name as the one to be filled.
  66 In other words, when importing the script in a YACS Python node, it must create
   67 a variable with the correct name in the current namespace.
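For instance, a script used to fill the *Background* command could look like the following sketch. The file name and the numerical values are purely illustrative assumptions. The only point taken from the text above is that the script defines a variable named after the command to be filled.

```python
# Hypothetical content of a script file, for example "script_background.py".
# The variable name must match the command it fills: here "Background".
Background = [0.0, 1.0, 2.0]  # illustrative values only
```

A script meant to fill another command, such as *Observation*, would define a variable called ``Observation`` in the same way.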
  68
  69 Reference description for ADAO calculation cases
  70 ------------------------------------------------
  71
  72 List of commands and keywords for an ADAO calculation case
  73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  74
  75 .. index:: single: ASSIMILATION_STUDY
  76 .. index:: single: Algorithm
  77 .. index:: single: AlgorithmParameters
  78 .. index:: single: Background
  79 .. index:: single: BackgroundError
  80 .. index:: single: ControlInput
  81 .. index:: single: Debug
  82 .. index:: single: EvolutionError
  83 .. index:: single: EvolutionModel
  84 .. index:: single: InputVariables
  85 .. index:: single: Observation
  86 .. index:: single: ObservationError
  87 .. index:: single: ObservationOperator
  88 .. index:: single: Observers
  89 .. index:: single: OutputVariables
  90 .. index:: single: Study_name
  91 .. index:: single: Study_repertory
  92 .. index:: single: UserDataInit
  93 .. index:: single: UserPostAnalysis
  94
  95 The first set of commands is related to the description of a calculation case,
  96 that is a *Data Assimilation* procedure or an *Optimization* procedure. The
  97 terms are ordered in alphabetical order, except the first, which describes
   98 the choice between calculation or checking. The different commands are the
  99 following:
 100
 101 **ASSIMILATION_STUDY**
 102     *Required command*. This is the general command describing the data
 103     assimilation or optimization case. It hierarchically contains all the other
 104     commands.
 105
 106 **Algorithm**
 107     *Required command*. This is a string to indicate the data assimilation or
 108     optimization algorithm chosen. The choices are limited and available through
  109     the GUI. Examples include "3DVAR" and "Blue". See the list of
 110     algorithms and associated parameters in the following subsection `Options
 111     and required commands for calculation algorithms`_.
 112
 113 **AlgorithmParameters**
  114     *Optional command*. This command allows adding optional parameters to
  115     control the data assimilation or optimization algorithm. It is defined as a
  116     "*Dict*" type object, that is, given as a script. See the list of
 117     algorithms and associated parameters in the following subsection `Options
 118     and required commands for calculation algorithms`_.
 119
 120 **Background**
 121     *Required command*. This indicates the background or initial vector used,
 122     previously noted as :math:`\mathbf{x}^b`. It is defined as a "*Vector*" type
 123     object, that is, given either as a string or as a script.
 124
 125 **BackgroundError**
 126     *Required command*. This indicates the background error covariance matrix,
 127     previously noted as :math:`\mathbf{B}`. It is defined as a "*Matrix*" type
 128     object, that is, given either as a string or as a script.
 129
 130 **ControlInput**
 131     *Optional command*. This indicates the control vector used to force the
 132     evolution model at each step, usually noted as :math:`\mathbf{U}`. It is
  133     defined as a "*Vector*" or a "*VectorSerie*" type object, that is, given
 134     either as a string or as a script. When there is no control, it has to be a
 135     void string ''.
 136
 137 **Debug**
  138     *Required command*. This defines the level of trace and intermediate debug
  139     information. The choices are limited to 0 (for False) and 1 (for
 140     True).
 141
 142 **EvolutionError**
 143     *Optional command*. This indicates the evolution error covariance matrix,
 144     usually noted as :math:`\mathbf{Q}`. It is defined as a "*Matrix*" type
 145     object, that is, given either as a string or as a script.
 146
 147 **EvolutionModel**
 148     *Optional command*. This indicates the evolution model operator, usually
 149     noted :math:`M`, which describes a step of evolution. It is defined as a
 150     "*Function*" type object, that is, given as a script. Different functional
 151     forms can be used, as described in the following subsection `Requirements
 152     for functions describing an operator`_. If there is some control :math:`U`
 153     included in the evolution model, the operator has to be applied to a pair
 154     :math:`(X,U)`.
 155
 156 **InputVariables**
  157     *Optional command*. This command allows indicating the name and size of
 158     physical variables that are bundled together in the control vector. This
 159     information is dedicated to data processed inside an algorithm.
 160
 161 **Observation**
 162     *Required command*. This indicates the observation vector used for data
 163     assimilation or optimization, previously noted as :math:`\mathbf{y}^o`. It
  164     is defined as a "*Vector*" or a "*VectorSerie*" type object, that is, given
 165     either as a string or as a script.
 166
 167 **ObservationError**
 168     *Required command*. This indicates the observation error covariance matrix,
 169     previously noted as :math:`\mathbf{R}`. It is defined as a "*Matrix*" type
 170     object, that is, given either as a string or as a script.
 171
 172 **ObservationOperator**
 173     *Required command*. This indicates the observation operator, previously
 174     noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}` to
 175     results :math:`\mathbf{y}` to be compared to observations
 176     :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object, that is,
 177     given as a script. Different functional forms can be used, as described in
 178     the following subsection `Requirements for functions describing an
 179     operator`_. If there is some control :math:`U` included in the observation,
 180     the operator has to be applied to a pair :math:`(X,U)`.
 181
 182 **Observers**
  183     *Optional command*. This command allows setting internal observers, that is,
  184     functions linked with a particular variable, which are executed each
  185     time this variable is modified. It is a convenient way to monitor variables
  186     of interest during the data assimilation or optimization process, by printing
  187     or plotting them, etc.
 188
 189 **OutputVariables**
  190     *Optional command*. This command allows indicating the name and size of
 191     physical variables that are bundled together in the output observation
 192     vector. This information is dedicated to data processed inside an algorithm.
 193
 194 **Study_name**
 195     *Required command*. This is an open string to describe the study by a name
 196     or a sentence.
 197
 198 **Study_repertory**
  199     *Optional command*. If available, this directory is used to find all the
 200     script files that can be used to define some other commands by scripts.
 201
 202 **UserDataInit**
  203     *Optional command*. This command allows initializing some parameters or
 204     data automatically before data assimilation algorithm processing.
 205
 206 **UserPostAnalysis**
  207     *Optional command*. This command allows processing some parameters or data
  208     automatically after data assimilation algorithm processing. It is defined as
  209     a script or a string, so that post-processing code can be put directly inside
  210     the ADAO case.
 211
 212 Options and required commands for calculation algorithms
 213 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 214
 215 .. index:: single: 3DVAR
 216 .. index:: single: Blue
 217 .. index:: single: EnsembleBlue
 218 .. index:: single: KalmanFilter
 219 .. index:: single: ExtendedKalmanFilter
 220 .. index:: single: LinearLeastSquares
 221 .. index:: single: NonLinearLeastSquares
 222 .. index:: single: ParticleSwarmOptimization
 223 .. index:: single: QuantileRegression
 224
 225 .. index:: single: AlgorithmParameters
 226 .. index:: single: Bounds
 227 .. index:: single: CostDecrementTolerance
 228 .. index:: single: GradientNormTolerance
 229 .. index:: single: GroupRecallRate
 230 .. index:: single: MaximumNumberOfSteps
 231 .. index:: single: Minimizer
 232 .. index:: single: NumberOfInsects
 233 .. index:: single: ProjectedGradientTolerance
 234 .. index:: single: QualityCriterion
 235 .. index:: single: Quantile
 236 .. index:: single: SetSeed
 237 .. index:: single: StoreInternalVariables
 238 .. index:: single: StoreSupplementaryCalculations
 239 .. index:: single: SwarmVelocity
 240
 241 Each algorithm can be controlled using some generic or specific options given
 242 through the "*AlgorithmParameters*" optional command, as follows for example::
 243
 244     AlgorithmParameters = {
 245         "Minimizer" : "LBFGSB",
 246         "MaximumNumberOfSteps" : 25,
 247         "StoreSupplementaryCalculations" : ["APosterioriCovariance","OMA"],
 248         }
 249
 250 This section describes the available options algorithm by algorithm. If an
 251 option is specified for an algorithm that doesn't support it, the option is
 252 simply left unused. The meaning of the acronyms or particular names can be found
 253 in the :ref:`genindex` or the :ref:`section_glossary`. In addition, for each
 254 algorithm, the required commands/keywords are given, being described in `List of
 255 commands and keywords for an ADAO calculation case`_.
 256
 257 **"Blue"**
 258
 259   *Required commands*
 260     *"Background", "BackgroundError",
 261     "Observation", "ObservationError",
 262     "ObservationOperator"*
 263
 264   StoreSupplementaryCalculations
 265     This list indicates the names of the supplementary variables that can be
 266     available at the end of the algorithm. It involves potentially costly
 267     calculations. The default is a void list, none of these variables being
 268     calculated and stored by default. The possible names are in the following
 269     list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
 270     "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].
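To make these names more concrete, here is a minimal scalar sketch of a BLUE analysis in plain Python. It assumes a one-dimensional state, a linear observation operator reduced to a scalar ``H``, and the usual meaning of the acronyms (OMB as observation minus background, which is also the innovation, OMA as observation minus analysis, BMA as background minus analysis). The values are illustrative and this is not the ADAO implementation.

```python
xb, B = 1.0, 1.0      # background and its error variance
yo, R = 2.0, 1.0      # observation and its error variance
H = 1.0               # scalar linear observation operator

K = B * H / (H * B * H + R)      # scalar gain
xa = xb + K * (yo - H * xb)      # analysis
A = (1.0 - K * H) * B            # a posteriori variance

OMB = yo - H * xb                # observation minus background (innovation)
OMA = yo - H * xa                # observation minus analysis
BMA = xb - xa                    # background minus analysis
```

With equal background and observation variances, the analysis lies halfway between the background and the observation.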
 271
 272 **"LinearLeastSquares"**
 273
 274   *Required commands*
 275     *"Observation", "ObservationError",
 276     "ObservationOperator"*
 277
 278   StoreSupplementaryCalculations
 279     This list indicates the names of the supplementary variables that can be
 280     available at the end of the algorithm. It involves potentially costly
 281     calculations. The default is a void list, none of these variables being
 282     calculated and stored by default. The possible names are in the following
 283     list: ["OMA"].
 284
 285 **"3DVAR"**
 286
 287   *Required commands*
 288     *"Background", "BackgroundError",
 289     "Observation", "ObservationError",
 290     "ObservationOperator"*
 291
 292   Minimizer
  293     This key allows choosing the optimization minimizer. The default choice
 294     is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
 295     minimizer, see [Byrd95]_ and [Zhu97]_), "TNC" (nonlinear constrained
 296     minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS" (nonlinear
 297     unconstrained minimizer), "NCG" (Newton CG minimizer).
 298
 299   Bounds
  300     This key allows defining upper and lower bounds for every control
  301     variable being optimized. Bounds can be given by a list of pairs of
  302     lower/upper bounds, one pair for each variable, with ``None`` every time
 303     there is no bound. The bounds can always be specified, but they are taken
 304     into account only by the constrained minimizers.
 305
 306   MaximumNumberOfSteps
 307     This key indicates the maximum number of iterations allowed for iterative
  308     optimization. The default is 15000, which is very similar to no limit on
  309     iterations. It is therefore recommended to adapt this parameter to the needs
  310     of real problems. For some minimizers, the effective stopping step can be
 311     slightly different due to algorithm internal control requirements.
 312
 313   CostDecrementTolerance
  314     This key indicates a limit value that leads to a successful stop of the
  315     iterative optimization process when the cost function decreases less than
  316     this tolerance at the last step. The default is 1.e-7, and it is
  317     recommended to adapt it to the needs of real problems.
  318
  319   ProjectedGradientTolerance
  320     This key indicates a limit value that leads to a successful stop of the
  321     iterative optimization process when all the components of the projected
  322     gradient are under this limit. It is only used for constrained minimizers.
  323     The default is -1, that is the internal default of each minimizer
  324     (generally 1.e-5), and it is not recommended to change it.
  325
  326   GradientNormTolerance
  327     This key indicates a limit value that leads to a successful stop of the
  328     iterative optimization process when the norm of the gradient is under this
  329     limit. It is only used for unconstrained minimizers. The default is
  330     1.e-5 and it is not recommended to change it.
 331
 332   StoreInternalVariables
  333     This boolean key allows storing default internal variables, mainly the
 334     current state during iterative optimization process. Be careful, this can be
 335     a numerically costly choice in certain calculation cases. The default is
 336     "False".
 337
 338   StoreSupplementaryCalculations
 339     This list indicates the names of the supplementary variables that can be
 340     available at the end of the algorithm. It involves potentially costly
 341     calculations. The default is a void list, none of these variables being
 342     calculated and stored by default. The possible names are in the following
 343     list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
 344     "SigmaObs2", "MahalanobisConsistency"].
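The options above can be combined in a single dictionary. The following sketch shows a possible script content for a "3DVAR" case with three control variables. The numerical values are illustrative assumptions, and only keys described above are used.

```python
# Hypothetical script content filling the "AlgorithmParameters" command
# for a "3DVAR" case with three control variables.
AlgorithmParameters = {
    "Minimizer": "LBFGSB",            # constrained minimizer, so bounds are used
    "MaximumNumberOfSteps": 100,      # far below the 15000 default
    "CostDecrementTolerance": 1.e-7,
    # One [lower, upper] pair per variable; None means "no bound".
    "Bounds": [[0.0, 10.0], [None, 5.0], [None, None]],
    "StoreSupplementaryCalculations": ["APosterioriCovariance", "OMA"],
}
```

Here the first variable is bounded on both sides, the second only from above, and the third is left free.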
 345
 346 **"NonLinearLeastSquares"**
 347
 348   *Required commands*
 349     *"Background",
 350     "Observation", "ObservationError",
 351     "ObservationOperator"*
 352
 353   Minimizer
  354     This key allows choosing the optimization minimizer. The default choice
 355     is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
 356     minimizer, see [Byrd95]_ and [Zhu97]_), "TNC" (nonlinear constrained
 357     minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS" (nonlinear
 358     unconstrained minimizer), "NCG" (Newton CG minimizer).
 359
 360   Bounds
  361     This key allows defining upper and lower bounds for every control
  362     variable being optimized. Bounds can be given by a list of pairs of
  363     lower/upper bounds, one pair for each variable, with ``None`` every time
 364     there is no bound. The bounds can always be specified, but they are taken
 365     into account only by the constrained minimizers.
 366
 367   MaximumNumberOfSteps
 368     This key indicates the maximum number of iterations allowed for iterative
  369     optimization. The default is 15000, which is very similar to no limit on
  370     iterations. It is therefore recommended to adapt this parameter to the needs
  371     of real problems. For some minimizers, the effective stopping step can be
 372     slightly different due to algorithm internal control requirements.
 373
 374   CostDecrementTolerance
  375     This key indicates a limit value that leads to a successful stop of the
  376     iterative optimization process when the cost function decreases less than
  377     this tolerance at the last step. The default is 1.e-7, and it is
  378     recommended to adapt it to the needs of real problems.
  379
  380   ProjectedGradientTolerance
  381     This key indicates a limit value that leads to a successful stop of the
  382     iterative optimization process when all the components of the projected
  383     gradient are under this limit. It is only used for constrained minimizers.
  384     The default is -1, that is the internal default of each minimizer
  385     (generally 1.e-5), and it is not recommended to change it.
  386
  387   GradientNormTolerance
  388     This key indicates a limit value that leads to a successful stop of the
  389     iterative optimization process when the norm of the gradient is under this
  390     limit. It is only used for unconstrained minimizers. The default is
  391     1.e-5 and it is not recommended to change it.
 392
 393   StoreInternalVariables
  394     This boolean key allows storing default internal variables, mainly the
 395     current state during iterative optimization process. Be careful, this can be
 396     a numerically costly choice in certain calculation cases. The default is
 397     "False".
 398
 399   StoreSupplementaryCalculations
 400     This list indicates the names of the supplementary variables that can be
 401     available at the end of the algorithm. It involves potentially costly
 402     calculations. The default is a void list, none of these variables being
 403     calculated and stored by default. The possible names are in the following
 404     list: ["BMA", "OMA", "OMB", "Innovation"].
 405
 406 **"EnsembleBlue"**
 407
 408   *Required commands*
 409     *"Background", "BackgroundError",
 410     "Observation", "ObservationError",
 411     "ObservationOperator"*
 412
 413   SetSeed
  414     This key allows giving an integer in order to fix the seed of the random
  415     generator used to generate the ensemble. A convenient value is for example
  416     1000. By default, the seed is left uninitialized, and so uses the default
  417     initialization from the computer.
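The effect of fixing a seed can be sketched with the Python standard ``random`` module, used here as a stand-in: the generator actually used inside ADAO may differ, and the helper name ``draw_ensemble`` is hypothetical. The point is only that two runs with the same seed draw the same ensemble.

```python
import random


def draw_ensemble(size, seed=None):
    """Draw `size` Gaussian perturbations, optionally with a fixed seed."""
    # With seed=None, the generator is initialized from the system.
    generator = random.Random(seed)
    return [generator.gauss(0.0, 1.0) for _ in range(size)]


first = draw_ensemble(5, seed=1000)
second = draw_ensemble(5, seed=1000)   # same seed, same perturbations
```

Leaving the seed unset gives different perturbations from one run to the next, which is usually what is wanted outside of debugging or regression testing.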
 418
 419 **"KalmanFilter"**
 420
 421   *Required commands*
 422     *"Background", "BackgroundError",
 423     "Observation", "ObservationError",
 424     "ObservationOperator",
 425     "EvolutionModel", "EvolutionError",
 426     "ControlInput"*
 427
 428   EstimationOf
  429     This key allows choosing the type of estimation to be performed. It can be
 430     either state-estimation, named "State", or parameter-estimation, named
 431     "Parameters". The default choice is "State".
 432
 433   StoreSupplementaryCalculations
 434     This list indicates the names of the supplementary variables that can be
 435     available at the end of the algorithm. It involves potentially costly
 436     calculations. The default is a void list, none of these variables being
 437     calculated and stored by default. The possible names are in the following
 438     list: ["APosterioriCovariance", "BMA", "Innovation"].
 439
 440 **"ExtendedKalmanFilter"**
 441
 442   *Required commands*
 443     *"Background", "BackgroundError",
 444     "Observation", "ObservationError",
 445     "ObservationOperator",
 446     "EvolutionModel", "EvolutionError",
 447     "ControlInput"*
 448
 449   Bounds
  450     This key allows defining upper and lower bounds for every control variable
  451     being optimized. Bounds can be given by a list of pairs of lower/upper
  452     bounds, one pair for each variable, with extreme values every time there
 453     is no bound. The bounds can always be specified, but they are taken into
 454     account only by the constrained minimizers.
 455
 456   ConstrainedBy
  457     This key allows defining the method to take bounds into account. The
 458     possible methods are in the following list: ["EstimateProjection"].
 459
 460   EstimationOf
  461     This key allows choosing the type of estimation to be performed. It can be
 462     either state-estimation, named "State", or parameter-estimation, named
 463     "Parameters". The default choice is "State".
 464
 465   StoreSupplementaryCalculations
 466     This list indicates the names of the supplementary variables that can be
 467     available at the end of the algorithm. It involves potentially costly
 468     calculations. The default is a void list, none of these variables being
 469     calculated and stored by default. The possible names are in the following
 470     list: ["APosterioriCovariance", "BMA", "Innovation"].
 471
 472 **"ParticleSwarmOptimization"**
 473
 474   *Required commands*
 475     *"Background", "BackgroundError",
 476     "Observation", "ObservationError",
 477     "ObservationOperator"*
 478
 479   MaximumNumberOfSteps
 480     This key indicates the maximum number of iterations allowed for iterative
  481     optimization. The default is 50, which is an arbitrary limit. It is therefore
  482     recommended to adapt this parameter to the needs of real problems.
 483
 484   NumberOfInsects
 485     This key indicates the number of insects or particles in the swarm. The
 486     default is 100, which is a usual default for this algorithm.
 487
 488   SwarmVelocity
 489     This key indicates the part of the insect velocity which is imposed by the
 490     swarm. It is a positive floating point value. The default value is 1.
 491
 492   GroupRecallRate
 493     This key indicates the recall rate at the best swarm insect. It is a
 494     floating point value between 0 and 1. The default value is 0.5.
 495
 496   QualityCriterion
 497     This key indicates the quality criterion, minimized to find the optimal
 498     state estimate. The default is the usual data assimilation criterion named
  499     "DA", the augmented ponderated least squares. The possible criteria have to
 500     be in the following list, where the equivalent names are indicated by "=":
 501     ["AugmentedPonderatedLeastSquares"="APLS"="DA",
 502     "PonderatedLeastSquares"="PLS", "LeastSquares"="LS"="L2",
 503     "AbsoluteValue"="L1", "MaximumError"="ME"]
 504
 505   SetSeed
  506     This key allows giving an integer in order to fix the seed of the random
  507     generator used to generate the ensemble. A convenient value is for example
  508     1000. By default, the seed is left uninitialized, and so uses the default
  509     initialization from the computer.
 510
 511   StoreInternalVariables
  512     This boolean key allows storing default internal variables, mainly the
 513     current state during iterative optimization process. Be careful, this can be
 514     a numerically costly choice in certain calculation cases. The default is
 515     "False".
 516
 517   StoreSupplementaryCalculations
 518     This list indicates the names of the supplementary variables that can be
 519     available at the end of the algorithm. It involves potentially costly
 520     calculations. The default is a void list, none of these variables being
 521     calculated and stored by default. The possible names are in the following
 522     list: ["BMA", "OMA", "OMB", "Innovation"].
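The quality criteria named above can be sketched as follows, assuming their usual definitions and, for simplicity, diagonal error covariances given as lists of variances. The function name ``quality`` and the 1/2 factors are assumptions of this sketch, not a transcription of the ADAO code.

```python
def quality(criterion, x, xb, b_var, y, hx, r_var):
    """Evaluate a quality criterion for a state x with simulation hx = H(x).

    b_var and r_var hold the diagonals (variances) of B and R.
    """
    dx = [xi - xbi for xi, xbi in zip(x, xb)]      # distance to background
    dy = [yi - hxi for yi, hxi in zip(y, hx)]      # distance to observations
    if criterion in ("AugmentedPonderatedLeastSquares", "APLS", "DA"):
        jb = 0.5 * sum(d * d / v for d, v in zip(dx, b_var))
        jo = 0.5 * sum(d * d / v for d, v in zip(dy, r_var))
        return jb + jo
    if criterion in ("PonderatedLeastSquares", "PLS"):
        return 0.5 * sum(d * d / v for d, v in zip(dy, r_var))
    if criterion in ("LeastSquares", "LS", "L2"):
        return 0.5 * sum(d * d for d in dy)
    if criterion in ("AbsoluteValue", "L1"):
        return sum(abs(d) for d in dy)
    if criterion in ("MaximumError", "ME"):
        return max(abs(d) for d in dy)
    raise ValueError("Unknown criterion: %s" % criterion)


# Illustrative values: one state component, two observations.
args = ([1.0], [0.0], [1.0], [3.0, 1.0], [1.0, 2.0], [4.0, 1.0])
values = {name: quality(name, *args) for name in ("DA", "PLS", "LS", "L1", "ME")}
```

Only the "DA" criterion uses the background term, which is why it is described as "augmented".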
 523
 524 **"QuantileRegression"**
 525
 526   *Required commands*
 527     *"Background",
 528     "Observation",
 529     "ObservationOperator"*
 530
 531   Quantile
  532     This key allows defining the real value of the desired quantile, between
 533     0 and 1. The default is 0.5, corresponding to the median.
 534
 535   Minimizer
  536     This key allows choosing the optimization minimizer. The default choice
 537     and only available choice is "MMQR" (Majorize-Minimize for Quantile
 538     Regression).
 539
 540   MaximumNumberOfSteps
 541     This key indicates the maximum number of iterations allowed for iterative
  542     optimization. The default is 15000, which is very similar to no limit on
  543     iterations. It is therefore recommended to adapt this parameter to the needs
  544     of real problems.
 545
 546   CostDecrementTolerance
  547     This key indicates a limit value that leads to a successful stop of the
  548     iterative optimization process when the cost function or the surrogate
  549     decreases less than this tolerance at the last step. The default is 1.e-6,
  550     and it is recommended to adapt it to the needs of real problems.
 551
 552   StoreInternalVariables
  553     This boolean key allows storing default internal variables, mainly the
 554     current state during iterative optimization process. Be careful, this can be
 555     a numerically costly choice in certain calculation cases. The default is
 556     "False".
 557
 558   StoreSupplementaryCalculations
 559     This list indicates the names of the supplementary variables that can be
 560     available at the end of the algorithm. It involves potentially costly
 561     calculations. The default is a void list, none of these variables being
 562     calculated and stored by default. The possible names are in the following
 563     list: ["BMA", "OMA", "OMB", "Innovation"].
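The role of the *Quantile* key can be illustrated with the classical check (or "pinball") function of quantile regression. The sketch below is a generic illustration, not the MMQR algorithm: it simply searches the data values for the constant that minimizes the total loss. The function names and the data are assumptions made for this example.

```python
def pinball_loss(residual, quantile):
    """Check (pinball) function used in quantile regression."""
    if residual >= 0:
        return quantile * residual
    return (quantile - 1.0) * residual


def best_constant(data, quantile):
    """Return the data value minimizing the total pinball loss."""
    def total_loss(c):
        return sum(pinball_loss(d - c, quantile) for d in data)
    return min(data, key=total_loss)


data = [1.0, 2.0, 3.0, 4.0, 100.0]
median = best_constant(data, 0.5)   # quantile 0.5: robust to the outlier
upper = best_constant(data, 0.9)    # quantile 0.9: pulled to the upper values
```

With a quantile of 0.5 the optimum is the median of the data, which is the behaviour described for the default value.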
 564
 565 Reference description for ADAO checking cases
 566 ---------------------------------------------
 567
 568 List of commands and keywords for an ADAO checking case
 569 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 570
 571 .. index:: single: CHECKING_STUDY
 572 .. index:: single: Algorithm
 573 .. index:: single: AlgorithmParameters
 574 .. index:: single: CheckingPoint
 575 .. index:: single: Debug
 576 .. index:: single: ObservationOperator
 577 .. index:: single: Study_name
 578 .. index:: single: Study_repertory
 579 .. index:: single: UserDataInit
 580
 581 The second set of commands is related to the description of a checking case,
  582 that is a procedure to check required properties on information used somewhere
  583 else by a calculation case. The terms are ordered in alphabetical order, except
  584 the first, which describes the choice between calculation or checking. The
  585 different commands are the following:
 586
 587 **CHECKING_STUDY**
 588     *Required command*. This is the general command describing the checking
 589     case. It hierarchically contains all the other commands.
 590
 591 **Algorithm**
  592     *Required command*. This is a string to indicate the test or checking
  593     algorithm chosen. The choices are limited and available through the GUI.
  594     Examples include "FunctionTest" and "AdjointTest". See the list of
  595     algorithms and associated parameters in the following subsection
  596     `Options and required commands for checking algorithms`_.
 597
 598 **AlgorithmParameters**
  599     *Optional command*. This command allows adding optional parameters to
  600     control the test or checking algorithm. It is defined as a
  601     "*Dict*" type object, that is, given as a script. See the list of
 602     algorithms and associated parameters in the following subsection `Options
 603     and required commands for checking algorithms`_.
 604
 605 **CheckingPoint**
 606     *Required command*. This indicates the vector used,
 607     previously noted as :math:`\mathbf{x}^b`. It is defined as a "*Vector*" type
 608     object, that is, given either as a string or as a script.
 609
 610 **Debug**
  611     *Required command*. This defines the level of trace and intermediate debug
  612     information. The choices are limited to 0 (for False) and 1 (for
 613     True).
 614
 615 **ObservationOperator**
 616     *Required command*. This indicates the observation operator, previously
 617     noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}` to
 618     results :math:`\mathbf{y}` to be compared to observations
 619     :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object, that is,
 620     given as a script. Different functional forms can be used, as described in
 621     the following subsection `Requirements for functions describing an
 622     operator`_.
 623
 624 **Study_name**
 625     *Required command*. This is an open string to describe the study by a name
 626     or a sentence.
 627
 628 **Study_repertory**
  629     *Optional command*. If available, this directory is used to find all the
 630     script files that can be used to define some other commands by scripts.
 631
 632 **UserDataInit**
  633     *Optional command*. This command allows initializing some parameters or
 634     data automatically before data assimilation algorithm processing.
 635
 636 Options and required commands for checking algorithms
 637 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 638
 639 .. index:: single: AdjointTest
 640 .. index:: single: FunctionTest
 641 .. index:: single: GradientTest
 642
 643 .. index:: single: AlgorithmParameters
 644 .. index:: single: AmplitudeOfInitialDirection
 645 .. index:: single: EpsilonMinimumExponent
 646 .. index:: single: InitialDirection
 647 .. index:: single: ResiduFormula
 648 .. index:: single: SetSeed
 649
 650 We recall that each algorithm can be controlled using some generic or specific
 651 options given through the "*AlgorithmParameters*" optional command, as follows
 652 for example::
 653
 654     AlgorithmParameters = {
 655         "AmplitudeOfInitialDirection" : 1,
 656         "EpsilonMinimumExponent" : -8,
 657         }
 658
 659 If an option is specified for an algorithm that doesn't support it, the option
 660 is simply left unused. The meaning of the acronyms or particular names can be
 661 found in the :ref:`genindex` or the :ref:`section_glossary`. In addition, for
 662 each algorithm, the required commands/keywords are given, being described in
 663 `List of commands and keywords for an ADAO checking case`_.
 664
 665 **"AdjointTest"**
 666
 667   *Required commands*
 668     *"CheckingPoint",
 669     "ObservationOperator"*
 670
 671   AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation, built as a
    vector used for the directional derivative around the nominal checking
    point. The default is 1, which means no scaling.
 675
 676   EpsilonMinimumExponent
    This key indicates the minimal exponent of the power of 10 coefficient
    used to decrease the increment multiplier. The default is -8, and the value
    has to be between 0 and -20. For example, the default value leads to
    calculating the residue of the scalar product formula with a fixed
    increment multiplied by coefficients from 1.e0 down to 1.e-8.
 682
 683   InitialDirection
 684     This key indicates the vector direction used for the directional derivative
 685     around the nominal checking point. It has to be a vector. If not specified,
    this direction defaults to a random perturbation around zero, of the same
    size as the checking point.
 688
 689   SetSeed
    This key allows an integer to be given in order to fix the seed of the
    random generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so uses the
    default initialization from the computer.
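
As an illustration of the roles of these options, the following sketch computes
a scalar product residue for a small hypothetical operator. It is not the ADAO
implementation: the operator, the names ``tangent``, ``adjoint`` and
``adjoint_residues``, and the exact form of the residue are assumptions chosen
for this example only::

    import numpy

    def tangent(X, dX):
        """ Hypothetical tangent of O(X) = (X0**2, X0*X1), applied to dX """
        J = numpy.array([[2.0 * X[0], 0.0], [X[1], X[0]]])
        return J.dot(dX)

    def adjoint(X, Y):
        """ Hypothetical adjoint of the same operator, applied to Y """
        J = numpy.array([[2.0 * X[0], 0.0], [X[1], X[0]]])
        return J.T.dot(Y)

    def adjoint_residues(X, dX0, Y, amplitude=1.0, epsilon_minimum_exponent=-8):
        """ Residue |<T(dX),Y> - <dX,A(Y)>| for multipliers 1.e0 to 1.e-8 """
        residues = []
        for exponent in range(0, epsilon_minimum_exponent - 1, -1):
            # Scaled direction, decreased by a power of 10 at each step
            dX = amplitude * 10.0**exponent * dX0
            r = abs(numpy.dot(tangent(X, dX), Y) - numpy.dot(dX, adjoint(X, Y)))
            residues.append((10.0**exponent, r))
        return residues

    numpy.random.seed(1000)                    # plays the role of "SetSeed"
    X   = numpy.array([1.0, 2.0])              # nominal checking point
    dX0 = numpy.random.normal(0., 1., X.size)  # random direction around zero
    Y   = numpy.array([0.5, -1.0])
    for multiplier, residue in adjoint_residues(X, dX0, Y):
        print("%8.1e  %10.3e" % (multiplier, residue))

When the tangent and adjoint codes are consistent, as in this sketch, the
residue stays close to the machine precision for every multiplier.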
 694
 695 **"FunctionTest"**
 696
 697   *Required commands*
 698     *"CheckingPoint",
 699     "ObservationOperator"*
 700
 701   No option
 702
 703 **"GradientTest"**
 704
 705   *Required commands*
 706     *"CheckingPoint",
 707     "ObservationOperator"*
 708
 709   AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation, built as a
    vector used for the directional derivative around the nominal checking
    point. The default is 1, which means no scaling.
 713
 714   EpsilonMinimumExponent
    This key indicates the minimal exponent of the power of 10 coefficient
    used to decrease the increment multiplier. The default is -8, and the value
    has to be between 0 and -20. For example, the default value leads to
    calculating the residue of the scalar product formula with a fixed
    increment multiplied by coefficients from 1.e0 down to 1.e-8.
 720
 721   InitialDirection
 722     This key indicates the vector direction used for the directional derivative
 723     around the nominal checking point. It has to be a vector. If not specified,
    this direction defaults to a random perturbation around zero, of the same
    size as the checking point.
 726
 727   ResiduFormula
    This key indicates the residue formula that has to be used for the test. The
    default choice is "Taylor", and the possible ones are "Taylor" (residue of
    the Taylor expansion of the operator, which has to decrease as the square
    of the perturbation) and "Norm" (residue obtained by taking the norm of the
    Taylor expansion at zero order approximation, which approximates the
    gradient, and which has to remain constant).
 734
 735   SetSeed
    This key allows an integer to be given in order to fix the seed of the
    random generator used to generate the ensemble. A convenient value is for
    example 1000. By default, the seed is left uninitialized, and so uses the
    default initialization from the computer.
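
The expected behaviour of the two residue formulas can be sketched on a small
hypothetical operator. The functions ``direct``, ``gradient`` and ``residue``
below, and the precise expressions of the residues, are assumptions made for
illustration and are not taken from the ADAO code::

    import numpy

    def direct(X):
        """ Hypothetical non-linear operator O(X) = (X0**2, X0*X1) """
        return numpy.array([X[0]**2, X[0] * X[1]])

    def gradient(X, dX):
        """ Exact tangent of the operator above, around X, applied to dX """
        return numpy.array([2.0 * X[0] * dX[0], X[1] * dX[0] + X[0] * dX[1]])

    def residue(X, dX, alpha, formula="Taylor"):
        """ Residue of the chosen formula for the increment alpha * dX """
        delta = direct(X + alpha * dX) - direct(X)
        if formula == "Taylor":
            # Remainder of the first order expansion: decreases as alpha**2
            return numpy.linalg.norm(delta - alpha * gradient(X, dX))
        if formula == "Norm":
            # Finite difference quotient: tends to the norm of the gradient
            return numpy.linalg.norm(delta) / alpha
        raise ValueError("Unknown residue formula")

    X  = numpy.array([1.0, 2.0])
    dX = numpy.array([0.3, -0.4])
    for exponent in range(0, -5, -1):
        alpha = 10.0**exponent
        print("%8.1e  %10.3e  %10.3e" % (
            alpha,
            residue(X, dX, alpha, "Taylor"),
            residue(X, dX, alpha, "Norm")))

In this sketch, dividing the increment by 10 divides the "Taylor" residue by
100, while the "Norm" residue stays nearly constant.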
 740
 741 Requirements for functions describing an operator
 742 -------------------------------------------------
 743
The operators for observation and evolution are required to implement the data
assimilation or optimization procedures. They include the physical simulation
by numerical calculations, but also the filtering and restriction needed to
compare the simulation to the observations. The evolution operator is
considered here in its incremental form, representing the transition between
two successive states, and is then similar to the observation operator.
 750
Schematically, an operator has to give an output solution for given input
parameters. Part of the input parameters can be modified during the optimization
 753 procedure. So the mathematical representation of such a process is a function.
 754 It was briefly described in the section :ref:`section_theory` and is generalized
 755 here by the relation:
 756
 757 .. math:: \mathbf{y} = O( \mathbf{x} )
 758
 759 between the pseudo-observations :math:`\mathbf{y}` and the parameters
 760 :math:`\mathbf{x}` using the observation or evolution operator :math:`O`. The
 761 same functional representation can be used for the linear tangent model
 762 :math:`\mathbf{O}` of :math:`O` and its adjoint :math:`\mathbf{O}^*`, also
 763 required by some data assimilation or optimization algorithms.
 764
Then, **to describe an operator completely, the user only has to provide a
function that fully and only realizes the functional operation**.
 767
 768 This function is usually given as a script that can be executed in a YACS node.
This script can equally launch external codes or use internal SALOME
 770 calls and methods. If the algorithm requires the 3 aspects of the operator
 771 (direct form, tangent form and adjoint form), the user has to give the 3
 772 functions or to approximate them.
 773
 774 There are 3 practical methods for the user to provide the operator functional
 775 representation.
 776
 777 First functional form: using "*ScriptWithOneFunction*"
 778 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 779
The first one consists in providing only one potentially non-linear function,
and in approximating the tangent and adjoint operators. This is done by using
the keyword "*ScriptWithOneFunction*" for the description of the chosen operator
in the ADAO GUI. The user has to provide the function in a script, with the
mandatory name "*DirectOperator*". For example, the script can follow the
template::
 786
    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return Y  # the result Y=O(X)
 793
In this case, the user can also provide a value for the differential increment,
using the keyword "*DifferentialIncrement*" through the GUI, which has a default
value of 1%. This coefficient is used in the finite difference approximation
to build the tangent and adjoint operators.
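
To fix ideas, a finite difference approximation of this kind can be sketched as
follows. This is only an illustration under stated assumptions: the operator is
hypothetical, the helper names ``approximate_jacobian``, ``ApproximateTangent``
and ``ApproximateAdjoint`` do not exist in ADAO, and the forward difference
scheme with a step relative to each component is one possible choice, not
necessarily the one used internally::

    import numpy

    def DirectOperator( X ):
        """ Hypothetical direct operator, used only for this illustration """
        X = numpy.ravel(X)
        return numpy.array([X[0]**2, X[0] * X[1]])

    def approximate_jacobian(X, increment=0.01):
        """ Forward finite difference Jacobian of DirectOperator around X """
        X  = numpy.ravel(X).astype(float)
        Y0 = DirectOperator(X)
        J  = numpy.zeros((Y0.size, X.size))
        for i in range(X.size):
            # Step relative to the component, with a fallback for a zero value
            step = increment * X[i] if X[i] != 0. else increment
            Xp = X.copy()
            Xp[i] += step
            J[:, i] = (DirectOperator(Xp) - Y0) / step
        return J

    def ApproximateTangent(X, dX, increment=0.01):
        """ Approximate tangent operator, around X, applied to dX """
        return approximate_jacobian(X, increment).dot(numpy.ravel(dX))

    def ApproximateAdjoint(X, Y, increment=0.01):
        """ Approximate adjoint operator, around X, applied to Y """
        return approximate_jacobian(X, increment).T.dot(numpy.ravel(Y))

    X = numpy.array([1., 2.])
    print(approximate_jacobian(X))
    print(ApproximateTangent(X, [0.1, 0.2]))
    print(ApproximateAdjoint(X, [1., -1.]))

A smaller increment reduces the truncation error of the approximation, at the
price of a higher sensitivity to rounding errors in the simulation code.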
 798
This first operator definition makes it easy to test the functional form
before its use in an ADAO case, reducing the complexity of implementation.
 801
 802 Second functional form: using "*ScriptWithFunctions*"
 803 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 804
The second one consists in providing directly the three associated operators
:math:`O`, :math:`\mathbf{O}` and :math:`\mathbf{O}^*`. This is done by using
the keyword "*ScriptWithFunctions*" for the description of the chosen operator
in the ADAO GUI. The user has to provide three functions in one script, with
the three mandatory names "*DirectOperator*", "*TangentOperator*" and
"*AdjointOperator*". For example, the script can follow the template::
 811
    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return Y  # something like Y

    def TangentOperator( pair ):
        """ Tangent linear operator, around X, applied to dX """
        X, dX = pair  # the argument is the pair (X, dX)
        ...
        ...
        ...
        return dY  # something like Y

    def AdjointOperator( pair ):
        """ Adjoint operator, around X, applied to Y """
        X, Y = pair  # the argument is the pair (X, Y)
        ...
        ...
        ...
        return dX  # something like X
 832
Again, this second operator definition makes it easy to test the functional
forms before their use in an ADAO case, greatly reducing the complexity of
implementation.
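
As an example of such a test outside of any ADAO case, the following sketch
fills the three mandatory functions for a hypothetical operator and checks
their consistency. The matrix ``H``, the operator itself and the two checks are
assumptions made for illustration; each pair argument is unpacked inside the
function::

    import numpy

    H = numpy.array([[1., 0., 2.], [0., 1., -1.]])  # hypothetical 2x3 matrix

    def DirectOperator( X ):
        """ Direct non-linear simulation operator: Y = H.(X**3) """
        X = numpy.ravel(X)
        return H.dot(X**3)

    def TangentOperator( pair ):
        """ Tangent linear operator, around X, applied to dX """
        X, dX = [numpy.ravel(v) for v in pair]
        return H.dot(3. * X**2 * dX)

    def AdjointOperator( pair ):
        """ Adjoint operator, around X, applied to Y """
        X, Y = [numpy.ravel(v) for v in pair]
        return 3. * X**2 * H.T.dot(Y)

    X  = numpy.array([1., 2., 3.])
    dX = numpy.array([0.1, -0.2, 0.3])
    Y  = numpy.array([1., -1.])

    # Scalar product check: <T(dX), Y> has to equal <dX, A(Y)>
    left  = numpy.dot(TangentOperator((X, dX)), Y)
    right = numpy.dot(dX, AdjointOperator((X, Y)))
    print(abs(left - right))

    # Finite difference check of the tangent with a small increment
    eps = 1.e-6
    fd  = (DirectOperator(X + eps * dX) - DirectOperator(X)) / eps
    print(numpy.linalg.norm(fd - TangentOperator((X, dX))))

Both printed values should be small: the first one close to the machine
precision, the second one of the order of the increment.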
 836
 837 Third functional form: using "*ScriptWithSwitch*"
 838 +++++++++++++++++++++++++++++++++++++++++++++++++
 839
This third form gives more possibilities to control the execution of the three
functions representing the operator, allowing advanced usage and control over
each execution of the simulation code. This is done by using the keyword
"*ScriptWithSwitch*" for the description of the chosen operator in the ADAO GUI.
The user has to provide a switch in one script to control the execution of the
direct, tangent and adjoint forms of the simulation code. The user can then, for
example, use other approximations for the tangent and adjoint codes, or
introduce more complexity in the argument treatment of the functions. But it
will be far more complicated to implement and debug.
 849
 850 **It is recommended not to use this third functional form without a solid
 851 numerical or physical reason.**
 852
 853 If, however, you want to use this third form, we recommend using the following
 854 template for the switch. It requires an external script or code named
 855 "*Physical_simulation_functions.py*", containing three functions named
 856 "*DirectOperator*", "*TangentOperator*" and "*AdjointOperator*" as previously.
 857 Here is the switch template::
 858
 859     import Physical_simulation_functions
 860     import numpy, logging
 861     #
 862     method = ""
 863     for param in computation["specificParameters"]:
 864         if param["name"] == "method":
 865             method = param["value"]
 866     if method not in ["Direct", "Tangent", "Adjoint"]:
 867         raise ValueError("No valid computation method is given")
 868     logging.info("Found method is \'%s\'"%method)
 869     #
 870     logging.info("Loading operator functions")
 871     Function = Physical_simulation_functions.DirectOperator
 872     Tangent  = Physical_simulation_functions.TangentOperator
 873     Adjoint  = Physical_simulation_functions.AdjointOperator
 874     #
 875     logging.info("Executing the possible computations")
 876     data = []
 877     if method == "Direct":
 878         logging.info("Direct computation")
 879         Xcurrent = computation["inputValues"][0][0][0]
 880         data = Function(numpy.matrix( Xcurrent ).T)
 881     if method == "Tangent":
 882         logging.info("Tangent computation")
 883         Xcurrent  = computation["inputValues"][0][0][0]
 884         dXcurrent = computation["inputValues"][0][0][1]
        data = Tangent((numpy.matrix(Xcurrent).T, numpy.matrix(dXcurrent).T))
 886     if method == "Adjoint":
 887         logging.info("Adjoint computation")
 888         Xcurrent = computation["inputValues"][0][0][0]
 889         Ycurrent = computation["inputValues"][0][0][1]
 890         data = Adjoint((numpy.matrix(Xcurrent).T, numpy.matrix(Ycurrent).T))
 891     #
 892     logging.info("Formatting the output")
 893     it = numpy.ravel(data)
 894     outputValues = [[[[]]]]
    for val in it:
        outputValues[0][0][0].append(val)
 897     #
 898     result = {}
 899     result["outputValues"]        = outputValues
 900     result["specificOutputInfos"] = []
 901     result["returnCode"]          = 0
 902     result["errorMessage"]        = ""
 903
Many modifications can be made starting from this template.
 905
 906 Special case of controled evolution operator
 907 ++++++++++++++++++++++++++++++++++++++++++++
 908
 909 In some cases, the evolution or the observation operators are required to be
controlled by an external input control, given a priori. In this case, the
 911 generic form of the incremental evolution model is slightly modified as follows:
 912
 913 .. math:: \mathbf{y} = O( \mathbf{x}, \mathbf{u})
 914
 915 where :math:`\mathbf{u}` is the control over one state increment. In this case,
 916 the direct operator has to be applied to a pair of variables :math:`(X,U)`.
 917 Schematically, the operator has to be set as::
 918
    def DirectOperator( pair ):
        """ Direct non-linear simulation operator """
        X, U = pair  # the argument is the pair (X, U)
        ...
        ...
        ...
        return Xnext  # something like X(n+1) or Y(n+1)
 925
The tangent and adjoint operators have the same signature as previously, noting
that the derivatives have to be taken only partially, with respect to
:math:`\mathbf{x}`. In such a case with explicit control, only the second
functional form (using "*ScriptWithFunctions*") and the third functional form
(using "*ScriptWithSwitch*") can be used.
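
A minimal sketch of a controlled evolution operator is given below, assuming a
hypothetical linear model with a transition matrix ``A`` and a control matrix
``B``. These matrices and the numerical values are assumptions made for
illustration only; the derivatives are taken with respect to the state alone::

    import numpy

    A = numpy.array([[1., 0.1], [0., 1.]])  # hypothetical state transition
    B = numpy.array([[0.], [0.1]])          # hypothetical control matrix

    def DirectOperator( pair ):
        """ Direct controlled evolution: X(n+1) = A.X(n) + B.U(n) """
        X, U = [numpy.ravel(v) for v in pair]
        return A.dot(X) + B.dot(U)

    def TangentOperator( pair ):
        """ Tangent linear operator with respect to X only, applied to dX """
        X, dX = [numpy.ravel(v) for v in pair]
        return A.dot(dX)

    def AdjointOperator( pair ):
        """ Adjoint operator with respect to X only, applied to Y """
        X, Y = [numpy.ravel(v) for v in pair]
        return A.T.dot(Y)

    X = numpy.array([0., 1.])
    U = numpy.array([2.])
    print(DirectOperator((X, U)))

The control :math:`\mathbf{u}` enters only the direct operator, so the tangent
and adjoint functions keep the pairs :math:`(X,dX)` and :math:`(X,Y)` as
arguments.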