doc/reference.rst

   1 .. _section_reference:
   2
   3 ================================================================================
   4 Reference description of the ADAO commands and keywords
   5 ================================================================================
   6
   7 This section presents the reference description of the ADAO commands and
   8 keywords available through the GUI or through scripts.
   9
  10 Each command or keyword to be defined through the ADAO GUI has some properties.
  11 The first property is to be *required*, *optional* or only factual, describing a
  12 type of input. The second property is to be an "open" variable with a fixed type
  13 but with any value allowed by the type, or a "restricted" variable, limited to
  14 some specified values. Because the EFICAS editor GUI has built-in validation
  15 capabilities, the properties of the commands or keywords given through this
  16 GUI are automatically correct.
  17
  18 The mathematical notations used below are explained in the section
  19 :ref:`section_theory`.
  20
  21 Examples of using these commands are available in the section
  22 :ref:`section_examples` and in example files installed with the ADAO module.
  23
  24 List of possible input types
  25 ----------------------------
  26
  27 .. index:: single: Dict
  28 .. index:: single: Function
  29 .. index:: single: Matrix
  30 .. index:: single: ScalarSparseMatrix
  31 .. index:: single: DiagonalSparseMatrix
  32 .. index:: single: String
  33 .. index:: single: Script
  34 .. index:: single: Vector
  35
  36 Each ADAO variable has a pseudo-type that helps to fill it in and to validate
  37 it. The different pseudo-types are:
  38
  39 **Dict**
  40     This indicates a variable that has to be filled by a dictionary, usually
  41     given as a script.
  42
  43 **Function**
  44     This indicates a variable that has to be filled by a function, usually given
  45     as a script or a component method.
  46
  47 **Matrix**
  48     This indicates a variable that has to be filled by a matrix, usually given
  49     either as a string or as a script.
  50
  51 **ScalarSparseMatrix**
  52     This indicates a variable that has to be filled by a single number, which
  53     will be used to multiply an identity matrix, usually given either as a
  54     string or as a script.
  55
  56 **DiagonalSparseMatrix**
  57     This indicates a variable that has to be filled by a vector, which will be
  58     placed on the diagonal of an identity matrix, usually given either as a string
  59     or as a script.
  60
  61 **Script**
  62     This indicates a script given as an external file. It can be described by a
  63     full absolute path name or only by the file name without path.
  64
  65 **String**
  66     This indicates a string giving a literal representation of a matrix, a
  67     vector or a vector series, such as "1 2 ; 3 4" for a square 2x2 matrix.
  68
  69 **Vector**
  70     This indicates a variable that has to be filled by a vector, usually given
  71     either as a string or as a script.
  72
  73 **VectorSerie**
  74     This indicates a variable that has to be filled by a list of vectors, usually given either as a string or as a script.
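
As an illustration of the matrix-like pseudo-types listed above, the following
NumPy sketch shows what each kind of input is understood to represent. It is
only an illustration written with plain NumPy, not the internal ADAO
representation. The variable names and the sizes are hypothetical.

```python
import numpy

# "String" form of a square 2x2 matrix, as in the example above
m = numpy.matrix("1 2 ; 3 4")

# "ScalarSparseMatrix": a single number multiplying an identity matrix
# (the size 3 is chosen only for this illustration)
scalar_sparse = 2.0 * numpy.identity(3)

# "DiagonalSparseMatrix": a vector placed on the diagonal of a matrix
diagonal_sparse = numpy.diag([1.0, 4.0, 9.0])

# "VectorSerie": a list of vectors
vector_serie = [numpy.array([0.0, 1.0]), numpy.array([2.0, 3.0])]
```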
  75
  76 When a command or keyword can be filled by a script file name, the script has to
  77 contain a variable or a method that has the same name as the one to be filled.
  78 In other words, when importing the script in a YACS Python node, it must create
  79 a variable with the correct name in the current namespace.
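
For instance, a script used to fill the "*Background*" keyword could look like
the following minimal sketch. The file name ``script_background.py`` and the
numerical values are hypothetical. The only requirement taken from the text
above is that the script defines a variable named ``Background``.

```python
# Hypothetical content of a file such as script_background.py
import numpy

# The variable has the same name as the keyword it fills
Background = numpy.array([0.0, 1.0, 2.0])
```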
  80
  81 Reference description for ADAO calculation cases
  82 ------------------------------------------------
  83
  84 List of commands and keywords for an ADAO calculation case
  85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  86
  87 .. index:: single: ASSIMILATION_STUDY
  88 .. index:: single: Algorithm
  89 .. index:: single: AlgorithmParameters
  90 .. index:: single: Background
  91 .. index:: single: BackgroundError
  92 .. index:: single: ControlInput
  93 .. index:: single: Debug
  94 .. index:: single: EvolutionError
  95 .. index:: single: EvolutionModel
  96 .. index:: single: InputVariables
  97 .. index:: single: Observation
  98 .. index:: single: ObservationError
  99 .. index:: single: ObservationOperator
 100 .. index:: single: Observers
 101 .. index:: single: OutputVariables
 102 .. index:: single: Study_name
 103 .. index:: single: Study_repertory
 104 .. index:: single: UserDataInit
 105 .. index:: single: UserPostAnalysis
 106
 107 The first set of commands is related to the description of a calculation case,
 108 that is, a *Data Assimilation* procedure or an *Optimization* procedure. The
 109 terms are listed in alphabetical order, except the first one, which describes
 110 the choice between calculation and checking. The different commands are the
 111 following:
 112
 113 **ASSIMILATION_STUDY**
 114     *Required command*. This is the general command describing the data
 115     assimilation or optimization case. It hierarchically contains all the other
 116     commands.
 117
 118 **Algorithm**
 119     *Required command*. This is a string to indicate the data assimilation or
 120     optimization algorithm chosen. The choices are limited and available through
 121     the GUI. Examples are "3DVAR" and "Blue". The list of algorithms and
 122     associated parameters is given in the following subsection `Options
 123     and required commands for calculation algorithms`_.
 124
 125 **AlgorithmParameters**
 126     *Optional command*. This command allows adding some optional parameters to
 127     control the data assimilation or optimization algorithm. It is defined as a
 128     "*Dict*" type object, that is, given as a script. The list of algorithms
 129     and associated parameters is given in the following subsection `Options
 130     and required commands for calculation algorithms`_.
 131
 132 **Background**
 133     *Required command*. This indicates the background or initial vector used,
 134     previously noted as :math:`\mathbf{x}^b`. It is defined as a "*Vector*" type
 135     object, that is, given either as a string or as a script.
 136
 137 **BackgroundError**
 138     *Required command*. This indicates the background error covariance matrix,
 139     previously noted as :math:`\mathbf{B}`. It is defined as a "*Matrix*" type
 140     object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
 141     type object, that is, given either as a string or as a script.
 142
 143 **ControlInput**
 144     *Optional command*. This indicates the control vector used to force the
 145     evolution model at each step, usually noted as :math:`\mathbf{U}`. It is
 146     defined as a "*Vector*" or a "*VectorSerie*" type object, that is, given
 147     either as a string or as a script. When there is no control, it has to be an
 148     empty string ''.
 149
 150 **Debug**
 151     *Required command*. This defines the level of trace and intermediary debug
 152     information. The choices are limited to 0 (for False) and 1 (for
 153     True).
 154
 155 **EvolutionError**
 156     *Optional command*. This indicates the evolution error covariance matrix,
 157     usually noted as :math:`\mathbf{Q}`. It is defined as a "*Matrix*" type
 158     object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
 159     type object, that is, given either as a string or as a script.
 160
 161 **EvolutionModel**
 162     *Optional command*. This indicates the evolution model operator, usually
 163     noted :math:`M`, which describes a step of evolution. It is defined as a
 164     "*Function*" type object, that is, given as a script. Different functional
 165     forms can be used, as described in the following subsection `Requirements
 166     for functions describing an operator`_. If there is some control :math:`U`
 167     included in the evolution model, the operator has to be applied to a pair
 168     :math:`(X,U)`.
 169
 170 **InputVariables**
 171     *Optional command*. This command allows indicating the name and size of
 172     physical variables that are bundled together in the control vector. This
 173     information is dedicated to data processed inside an algorithm.
 174
 175 **Observation**
 176     *Required command*. This indicates the observation vector used for data
 177     assimilation or optimization, previously noted as :math:`\mathbf{y}^o`. It
 178     is defined as a "*Vector*" or a "*VectorSerie*" type object, that is, given
 179     either as a string or as a script.
 180
 181 **ObservationError**
 182     *Required command*. This indicates the observation error covariance matrix,
 183     previously noted as :math:`\mathbf{R}`. It is defined as a "*Matrix*" type
 184     object, a "*ScalarSparseMatrix*" type object, or a "*DiagonalSparseMatrix*"
 185     type object, that is, given either as a string or as a script.
 186
 187 **ObservationOperator**
 188     *Required command*. This indicates the observation operator, previously
 189     noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}` to
 190     results :math:`\mathbf{y}` to be compared to observations
 191     :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object, that is,
 192     given as a script. Different functional forms can be used, as described in
 193     the following subsection `Requirements for functions describing an
 194     operator`_. If there is some control :math:`U` included in the observation,
 195     the operator has to be applied to a pair :math:`(X,U)`.
 196
 197 **Observers**
 198     *Optional command*. This command allows setting internal observers, which
 199     are functions linked with a particular variable and executed each time this
 200     variable is modified. It is a convenient way to monitor variables of
 201     interest during the data assimilation or optimization process, by printing
 202     or plotting them, etc.
 203
 204 **OutputVariables**
 205     *Optional command*. This command allows indicating the name and size of
 206     physical variables that are bundled together in the output observation
 207     vector. This information is dedicated to data processed inside an algorithm.
 208
 209 **Study_name**
 210     *Required command*. This is an open string to describe the study by a name
 211     or a sentence.
 212
 213 **Study_repertory**
 214     *Optional command*. If available, this directory is used to find all the
 215     script files that can be used to define some other commands by scripts.
 216
 217 **UserDataInit**
 218     *Optional command*. This command allows initializing some parameters or
 219     data automatically before data assimilation algorithm processing.
 220
 221 **UserPostAnalysis**
 222     *Optional command*. This command allows processing some parameters or data
 223     automatically after data assimilation algorithm processing. It is defined as
 224     a script or a string, which makes it possible to put post-processing code
 225     directly inside the ADAO case.
 226
 227 Options and required commands for calculation algorithms
 228 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 229
 230 .. index:: single: 3DVAR
 231 .. index:: single: Blue
 232 .. index:: single: EnsembleBlue
 233 .. index:: single: KalmanFilter
 234 .. index:: single: ExtendedKalmanFilter
 235 .. index:: single: LinearLeastSquares
 236 .. index:: single: NonLinearLeastSquares
 237 .. index:: single: ParticleSwarmOptimization
 238 .. index:: single: QuantileRegression
 239
 240 .. index:: single: AlgorithmParameters
 241 .. index:: single: Bounds
 242 .. index:: single: CostDecrementTolerance
 243 .. index:: single: GradientNormTolerance
 244 .. index:: single: GroupRecallRate
 245 .. index:: single: MaximumNumberOfSteps
 246 .. index:: single: Minimizer
 247 .. index:: single: NumberOfInsects
 248 .. index:: single: ProjectedGradientTolerance
 249 .. index:: single: QualityCriterion
 250 .. index:: single: Quantile
 251 .. index:: single: SetSeed
 252 .. index:: single: StoreInternalVariables
 253 .. index:: single: StoreSupplementaryCalculations
 254 .. index:: single: SwarmVelocity
 255
 256 Each algorithm can be controlled using some generic or specific options given
 257 through the "*AlgorithmParameters*" optional command, as follows for example::
 258
 259     AlgorithmParameters = {
 260         "Minimizer" : "LBFGSB",
 261         "MaximumNumberOfSteps" : 25,
 262         "StoreSupplementaryCalculations" : ["APosterioriCovariance","OMA"],
 263         }
 264
 265 This section describes the available options algorithm by algorithm. If an
 266 option is specified for an algorithm that doesn't support it, the option is
 267 simply left unused. The meaning of the acronyms or particular names can be found
 268 in the :ref:`genindex` or the :ref:`section_glossary`. In addition, for each
 269 algorithm, the required commands/keywords are given, being described in `List of
 270 commands and keywords for an ADAO calculation case`_.
 271
 272 **"Blue"**
 273
 274   *Required commands*
 275     *"Background", "BackgroundError",
 276     "Observation", "ObservationError",
 277     "ObservationOperator"*
 278
 279   StoreInternalVariables
 280     This boolean key allows storing default internal variables, mainly the
 281     current state during the iterative optimization process. Be careful, this
 282     can be a numerically costly choice in certain calculation cases. The default
 283     is "False".
 284
 285   StoreSupplementaryCalculations
 286     This list indicates the names of the supplementary variables that can be
 287     available at the end of the algorithm. It involves potentially costly
 288     calculations. The default is an empty list, none of these variables being
 289     calculated and stored by default. The possible names are in the following
 290     list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
 291     "SigmaBck2", "SigmaObs2", "MahalanobisConsistency"].
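
To make the required commands and some supplementary names more concrete, here
is a minimal NumPy sketch of the standard BLUE estimator on a small linear
case. It is not the ADAO implementation. All numerical values are hypothetical,
and the reading of "Innovation", "OMB", "OMA" and "BMA" as the differences
computed below is an assumption to be checked against the
:ref:`section_glossary`.

```python
import numpy

# Hypothetical small linear case (all values are illustrative only)
xb = numpy.array([0.0, 0.0])          # Background
B = numpy.identity(2)                 # BackgroundError
yo = numpy.array([1.0, 2.0])          # Observation
R = numpy.identity(2)                 # ObservationError
H = numpy.identity(2)                 # linear ObservationOperator

# Gain matrix and analysis of the standard BLUE estimator
K = B @ H.T @ numpy.linalg.inv(H @ B @ H.T + R)
xa = xb + K @ (yo - H @ xb)

# Differences assumed to correspond to the supplementary names
innovation = yo - H @ xb              # "Innovation" and "OMB"
oma = yo - H @ xa                     # "OMA"
bma = xb - xa                         # "BMA"
```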
 292
 293 **"LinearLeastSquares"**
 294
 295   *Required commands*
 296     *"Observation", "ObservationError",
 297     "ObservationOperator"*
 298
 299   StoreInternalVariables
 300     This boolean key allows storing default internal variables, mainly the
 301     current state during the iterative optimization process. Be careful, this
 302     can be a numerically costly choice in certain calculation cases. The default
 303     is "False".
 304
 305   StoreSupplementaryCalculations
 306     This list indicates the names of the supplementary variables that can be
 307     available at the end of the algorithm. It involves potentially costly
 308     calculations. The default is an empty list, none of these variables being
 309     calculated and stored by default. The possible names are in the following
 310     list: ["OMA"].
 311
 312 **"3DVAR"**
 313
 314   *Required commands*
 315     *"Background", "BackgroundError",
 316     "Observation", "ObservationError",
 317     "ObservationOperator"*
 318
 319   Minimizer
 320     This key allows choosing the optimization minimizer. The default choice
 321     is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
 322     minimizer, see [Byrd95]_ and [Zhu97]_), "TNC" (nonlinear constrained
 323     minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS" (nonlinear
 324     unconstrained minimizer), "NCG" (Newton CG minimizer).
 325
 326   Bounds
 327     This key allows defining upper and lower bounds for every control
 328     variable being optimized. Bounds can be given as a list of pairs of
 329     lower/upper bounds, one pair for each variable, with ``None`` wherever
 330     there is no bound. The bounds can always be specified, but they are taken
 331     into account only by the constrained minimizers.
 332
 333   MaximumNumberOfSteps
 334     This key indicates the maximum number of iterations allowed for iterative
 335     optimization. The default is 15000, which is practically equivalent to no
 336     limit on iterations. It is therefore recommended to adapt this parameter to
 337     the needs of real problems. For some minimizers, the effective stopping step
 338     can be slightly different due to algorithm internal control requirements.
 339
 340   CostDecrementTolerance
 341     This key indicates a limit value: the iterative optimization process stops
 342     successfully when the cost function decreases by less than this tolerance
 343     at the last step. The default is 1.e-7, and it is recommended to adapt it
 344     to the needs of real problems.
 345
 346   ProjectedGradientTolerance
 347     This key indicates a limit value: the iterative optimization process stops
 348     successfully when all the components of the projected gradient are under
 349     this limit. It is only used for constrained minimizers. The default is -1,
 350     that is, the internal default of each minimizer (generally 1.e-5), and it
 351     is not recommended to change it.
 352
 353   GradientNormTolerance
 354     This key indicates a limit value: the iterative optimization process stops
 355     successfully when the norm of the gradient is under this limit. It is only
 356     used for unconstrained minimizers. The default is 1.e-5, and it is not
 357     recommended to change it.
 358
 359   StoreInternalVariables
 360     This boolean key allows storing default internal variables, mainly the
 361     current state during the iterative optimization process. Be careful, this
 362     can be a numerically costly choice in certain calculation cases. The default
 363     is "False".
 364
 365   StoreSupplementaryCalculations
 366     This list indicates the names of the supplementary variables that can be
 367     available at the end of the algorithm. It involves potentially costly
 368     calculations. The default is an empty list, none of these variables being
 369     calculated and stored by default. The possible names are in the following
 370     list: ["APosterioriCovariance", "BMA", "OMA", "OMB", "Innovation",
 371     "SigmaObs2", "MahalanobisConsistency"].
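
As an illustration of the options above, a parameter dictionary for a "3DVAR"
case could look like the following sketch. The number of control variables
(three) and all the numerical values are hypothetical. The pair ``[None, None]``
stands for a variable without any bound.

```python
# Hypothetical parameters for a 3DVAR case with three control variables
AlgorithmParameters = {
    "Minimizer": "LBFGSB",   # constrained minimizer, so bounds are used
    "Bounds": [[0.0, 10.0], [None, 5.0], [None, None]],
    "MaximumNumberOfSteps": 100,
    "CostDecrementTolerance": 1.e-7,
    "StoreSupplementaryCalculations": ["APosterioriCovariance", "OMA"],
}
```

With an unconstrained minimizer such as "CG" or "BFGS", the same "*Bounds*"
entry would simply be left unused, as stated above.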
 372
 373 **"NonLinearLeastSquares"**
 374
 375   *Required commands*
 376     *"Background",
 377     "Observation", "ObservationError",
 378     "ObservationOperator"*
 379
 380   Minimizer
 381     This key allows choosing the optimization minimizer. The default choice
 382     is "LBFGSB", and the possible ones are "LBFGSB" (nonlinear constrained
 383     minimizer, see [Byrd95]_ and [Zhu97]_), "TNC" (nonlinear constrained
 384     minimizer), "CG" (nonlinear unconstrained minimizer), "BFGS" (nonlinear
 385     unconstrained minimizer), "NCG" (Newton CG minimizer).
 386
 387   Bounds
 388     This key allows defining upper and lower bounds for every control
 389     variable being optimized. Bounds can be given as a list of pairs of
 390     lower/upper bounds, one pair for each variable, with ``None`` wherever
 391     there is no bound. The bounds can always be specified, but they are taken
 392     into account only by the constrained minimizers.
 393
 394   MaximumNumberOfSteps
 395     This key indicates the maximum number of iterations allowed for iterative
 396     optimization. The default is 15000, which is practically equivalent to no
 397     limit on iterations. It is therefore recommended to adapt this parameter to
 398     the needs of real problems. For some minimizers, the effective stopping step
 399     can be slightly different due to algorithm internal control requirements.
 400
 401   CostDecrementTolerance
 402     This key indicates a limit value: the iterative optimization process stops
 403     successfully when the cost function decreases by less than this tolerance
 404     at the last step. The default is 1.e-7, and it is recommended to adapt it
 405     to the needs of real problems.
 406
 407   ProjectedGradientTolerance
 408     This key indicates a limit value: the iterative optimization process stops
 409     successfully when all the components of the projected gradient are under
 410     this limit. It is only used for constrained minimizers. The default is -1,
 411     that is, the internal default of each minimizer (generally 1.e-5), and it
 412     is not recommended to change it.
 413
 414   GradientNormTolerance
 415     This key indicates a limit value: the iterative optimization process stops
 416     successfully when the norm of the gradient is under this limit. It is only
 417     used for unconstrained minimizers. The default is 1.e-5, and it is not
 418     recommended to change it.
 419
 420   StoreInternalVariables
 421     This boolean key allows storing default internal variables, mainly the
 422     current state during the iterative optimization process. Be careful, this
 423     can be a numerically costly choice in certain calculation cases. The default
 424     is "False".
 425
 426   StoreSupplementaryCalculations
 427     This list indicates the names of the supplementary variables that can be
 428     available at the end of the algorithm. It involves potentially costly
 429     calculations. The default is an empty list, none of these variables being
 430     calculated and stored by default. The possible names are in the following
 431     list: ["BMA", "OMA", "OMB", "Innovation"].
 432
 433 **"EnsembleBlue"**
 434
 435   *Required commands*
 436     *"Background", "BackgroundError",
 437     "Observation", "ObservationError",
 438     "ObservationOperator"*
 439
 440   SetSeed
 441     This key allows giving an integer in order to fix the seed of the random
 442     generator used to generate the ensemble. A convenient value is for example
 443     1000. By default, the seed is left uninitialized, and so uses the default
 444     initialization from the computer.
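
The effect of fixing a seed can be illustrated with the following NumPy
sketch. It only shows why a fixed seed makes random draws reproducible. The
helper ``draw_ensemble`` is hypothetical, and the use of NumPy for the random
generator is an assumption of this sketch, not a statement about ADAO
internals.

```python
import numpy

def draw_ensemble(seed, size=5):
    """Draw a small random ensemble, optionally with a fixed seed."""
    # With seed=None, the initialization is left to the system
    rng = numpy.random.RandomState(seed)
    return rng.normal(0.0, 1.0, size=size)

# The same seed gives the same ensemble at every run
first = draw_ensemble(1000)
second = draw_ensemble(1000)
```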
 445
 446 **"KalmanFilter"**
 447
 448   *Required commands*
 449     *"Background", "BackgroundError",
 450     "Observation", "ObservationError",
 451     "ObservationOperator",
 452     "EvolutionModel", "EvolutionError",
 453     "ControlInput"*
 454
 455   EstimationOf
 456     This key allows choosing the type of estimation to be performed. It can be
 457     either state-estimation, named "State", or parameter-estimation, named
 458     "Parameters". The default choice is "State".
 459
 460   StoreSupplementaryCalculations
 461     This list indicates the names of the supplementary variables that can be
 462     available at the end of the algorithm. It involves potentially costly
 463     calculations. The default is an empty list, none of these variables being
 464     calculated and stored by default. The possible names are in the following
 465     list: ["APosterioriCovariance", "BMA", "Innovation"].
 466
 467 **"ExtendedKalmanFilter"**
 468
 469   *Required commands*
 470     *"Background", "BackgroundError",
 471     "Observation", "ObservationError",
 472     "ObservationOperator",
 473     "EvolutionModel", "EvolutionError",
 474     "ControlInput"*
 475
 476   Bounds
 477     This key allows defining upper and lower bounds for every control variable
 478     being optimized. Bounds can be given as a list of pairs of lower/upper
 479     bounds, one pair for each variable, with extreme values wherever there
 480     is no bound. The bounds can always be specified, but they are taken into
 481     account only by the constrained minimizers.
 482
 483   ConstrainedBy
 484     This key allows defining the method used to take bounds into account. The
 485     possible methods are in the following list: ["EstimateProjection"].
 486
 487   EstimationOf
 488     This key allows choosing the type of estimation to be performed. It can be
 489     either state-estimation, named "State", or parameter-estimation, named
 490     "Parameters". The default choice is "State".
 491
 492   StoreSupplementaryCalculations
 493     This list indicates the names of the supplementary variables that can be
 494     available at the end of the algorithm. It involves potentially costly
 495     calculations. The default is an empty list, none of these variables being
 496     calculated and stored by default. The possible names are in the following
 497     list: ["APosterioriCovariance", "BMA", "Innovation"].
 498
 499 **"ParticleSwarmOptimization"**
 500
 501   *Required commands*
 502     *"Background", "BackgroundError",
 503     "Observation", "ObservationError",
 504     "ObservationOperator"*
 505
 506   MaximumNumberOfSteps
 507     This key indicates the maximum number of iterations allowed for iterative
 508     optimization. The default is 50, which is an arbitrary limit. It is therefore
 509     recommended to adapt this parameter to the needs of real problems.
 510
 511   NumberOfInsects
 512     This key indicates the number of insects or particles in the swarm. The
 513     default is 100, which is a usual default for this algorithm.
 514
 515   SwarmVelocity
 516     This key indicates the part of the insect velocity which is imposed by the
 517     swarm. It is a positive floating point value. The default value is 1.
 518
 519   GroupRecallRate
 520     This key indicates the recall rate at the best swarm insect. It is a
 521     floating point value between 0 and 1. The default value is 0.5.
 522
 523   QualityCriterion
 524     This key indicates the quality criterion, minimized to find the optimal
 525     state estimate. The default is the usual data assimilation criterion named
 526     "DA", the augmented weighted ("ponderated") least squares. The possible
 527     criteria are in the following list, with equivalent names indicated by "=":
 528     ["AugmentedPonderatedLeastSquares"="APLS"="DA",
 529     "PonderatedLeastSquares"="PLS", "LeastSquares"="LS"="L2",
 530     "AbsoluteValue"="L1", "MaximumError"="ME"]
 531
 532   SetSeed
 533     This key allows giving an integer in order to fix the seed of the random
 534     generator used to generate the ensemble. A convenient value is for example
 535     1000. By default, the seed is left uninitialized, and so uses the default
 536     initialization from the computer.
 537
 538   StoreInternalVariables
 539     This boolean key allows storing default internal variables, mainly the
 540     current state during the iterative optimization process. Be careful, this
 541     can be a numerically costly choice in certain calculation cases. The default
 542     is "False".
 543
 544   StoreSupplementaryCalculations
 545     This list indicates the names of the supplementary variables that can be
 546     available at the end of the algorithm. It involves potentially costly
 547     calculations. The default is an empty list, none of these variables being
 548     calculated and stored by default. The possible names are in the following
 549     list: ["BMA", "OMA", "OMB", "Innovation"].
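
The quality criteria named above can be read as different measures of the
misfit between observations and simulation. The following sketch gives one
plausible reading of each name, using usual textbook definitions. The function
``quality``, its arguments, and the factor 0.5 are assumptions of this
illustration and are not taken from the ADAO sources. ``BI`` and ``RI`` denote
the inverses of the background and observation error covariance matrices, and
``H`` is a linear observation operator given as a matrix.

```python
import numpy

def quality(name, x, xb, BI, yo, RI, H):
    """Evaluate one quality criterion for the state x (illustrative only)."""
    d = yo - H @ x                       # observation misfit
    if name in ("AugmentedPonderatedLeastSquares", "APLS", "DA"):
        return 0.5 * (x - xb) @ BI @ (x - xb) + 0.5 * d @ RI @ d
    if name in ("PonderatedLeastSquares", "PLS"):
        return 0.5 * d @ RI @ d
    if name in ("LeastSquares", "LS", "L2"):
        return 0.5 * d @ d
    if name in ("AbsoluteValue", "L1"):
        return numpy.sum(numpy.abs(d))
    if name in ("MaximumError", "ME"):
        return numpy.max(numpy.abs(d))
    raise ValueError("Unknown quality criterion: %s" % name)
```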
 550
 551 **"QuantileRegression"**
 552
 553   *Required commands*
 554     *"Background",
 555     "Observation",
 556     "ObservationOperator"*
 557
 558   Quantile
 559     This key allows defining the real value of the desired quantile, between
 560     0 and 1. The default is 0.5, corresponding to the median.
 561
 562   Minimizer
 563     This key allows choosing the optimization minimizer. The default and only
 564     available choice is "MMQR" (Majorize-Minimize for Quantile
 565     Regression).
 566
 567   MaximumNumberOfSteps
 568     This key indicates the maximum number of iterations allowed for iterative
 569     optimization. The default is 15000, which is practically equivalent to no
 570     limit on iterations. It is therefore recommended to adapt this parameter to
 571     the needs of real problems.
 572
 573   CostDecrementTolerance
 574     This key indicates a limit value: the iterative optimization process stops
 575     successfully when the cost function or the surrogate decreases by less than
 576     this tolerance at the last step. The default is 1.e-6, and it is
 577     recommended to adapt it to the needs of real problems.
 578
 579   StoreInternalVariables
 580     This boolean key allows storing default internal variables, mainly the
 581     current state during the iterative optimization process. Be careful, this
 582     can be a numerically costly choice in certain calculation cases. The default
 583     is "False".
 584
 585   StoreSupplementaryCalculations
 586     This list indicates the names of the supplementary variables that can be
 587     available at the end of the algorithm. It involves potentially costly
 588     calculations. The default is an empty list, none of these variables being
 589     calculated and stored by default. The possible names are in the following
 590     list: ["BMA", "OMA", "OMB", "Innovation"].
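
To illustrate the role of the "*Quantile*" key, the following sketch evaluates
the usual textbook cost of quantile regression, the so-called check function,
on a vector of residuals (observation minus simulation). The function name
``quantile_cost`` is hypothetical, and the exact cost or surrogate minimized by
the "MMQR" minimizer is not described in this section, so this is only an
assumed illustration.

```python
import numpy

def quantile_cost(residuals, quantile=0.5):
    """Sum of the check-function losses for a quantile between 0 and 1."""
    r = numpy.asarray(residuals, dtype=float)
    # Positive residuals are weighted by q, negative ones by (1 - q)
    return numpy.sum(numpy.where(r >= 0, quantile * r, (quantile - 1.0) * r))
```

With the default value 0.5, positive and negative residuals are weighted
equally, which corresponds to the median.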
 591
 592 Reference description for ADAO checking cases
 593 ---------------------------------------------
 594
 595 List of commands and keywords for an ADAO checking case
 596 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 597
 598 .. index:: single: CHECKING_STUDY
 599 .. index:: single: Algorithm
 600 .. index:: single: AlgorithmParameters
 601 .. index:: single: CheckingPoint
 602 .. index:: single: Debug
 603 .. index:: single: ObservationOperator
 604 .. index:: single: Study_name
 605 .. index:: single: Study_repertory
 606 .. index:: single: UserDataInit
 607
 608 The second set of commands is related to the description of a checking case,
 609 that is, a procedure to check required properties on information used somewhere
 610 else by a calculation case. The terms are listed in alphabetical order, except
 611 the first one, which describes the choice between calculation and checking. The
 612 different commands are the following:
 613
 614 **CHECKING_STUDY**
 615     *Required command*. This is the general command describing the checking
 616     case. It hierarchically contains all the other commands.
 617
 618 **Algorithm**
 619     *Required command*. This is a string to indicate the checking algorithm
 620     chosen. The choices are limited and available through the GUI. Examples are
 621     "FunctionTest" and "AdjointTest". The list of algorithms and associated
 622     parameters is given in the following subsection
 623     `Options and required commands for checking algorithms`_.
 624
 625 **AlgorithmParameters**
 626     *Optional command*. This command allows adding some optional parameters to
 627     control the data assimilation or optimization algorithm. It is defined as a
 628     "*Dict*" type object, that is, given as a script. The list of algorithms
 629     and associated parameters is given in the following subsection `Options
 630     and required commands for checking algorithms`_.
 631
 632 **CheckingPoint**
 633     *Required command*. This indicates the vector used,
 634     previously noted as :math:`\mathbf{x}^b`. It is defined as a "*Vector*" type
 635     object, that is, given either as a string or as a script.
 636
 637 **Debug**
 638     *Required command*. This defines the level of trace and intermediary debug
 639     information. The choices are limited to 0 (for False) and 1 (for
 640     True).
 641
 642 **ObservationOperator**
 643     *Required command*. This indicates the observation operator, previously
 644     noted :math:`H`, which transforms the input parameters :math:`\mathbf{x}` to
 645     results :math:`\mathbf{y}` to be compared to observations
 646     :math:`\mathbf{y}^o`. It is defined as a "*Function*" type object, that is,
 647     given as a script. Different functional forms can be used, as described in
 648     the following subsection `Requirements for functions describing an
 649     operator`_.
 650
 651 **Study_name**
 652     *Required command*. This is an open string to describe the study by a name
 653     or a sentence.
 654
 655 **Study_repertory**
 656     *Optional command*. If available, this directory is used to find all the
 657     script files that can be used to define some other commands by scripts.
 658
 659 **UserDataInit**
 660     *Optional command*. This commands allows to initialize some parameters or
 661     data automatically before data assimilation algorithm processing.

Options and required commands for checking algorithms
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: AdjointTest
.. index:: single: FunctionTest
.. index:: single: GradientTest
.. index:: single: LinearityTest

.. index:: single: AlgorithmParameters
.. index:: single: AmplitudeOfInitialDirection
.. index:: single: EpsilonMinimumExponent
.. index:: single: InitialDirection
.. index:: single: ResiduFormula
.. index:: single: SetSeed

We recall that each algorithm can be controlled using some generic or specific
options given through the "*AlgorithmParameters*" optional command, as in the
following example::

    AlgorithmParameters = {
        "AmplitudeOfInitialDirection" : 1,
        "EpsilonMinimumExponent" : -8,
        }

If an option is specified for an algorithm that doesn't support it, the option
is simply left unused. The meaning of the acronyms or particular names can be
found in the :ref:`genindex` or the :ref:`section_glossary`. In addition, for
each algorithm, the required commands/keywords are given; they are described in
`List of commands and keywords for an ADAO checking case`_.

**"AdjointTest"**

  *Required commands*
    *"CheckingPoint",
    "ObservationOperator"*

  AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation built as a vector
    used for the directional derivative around the nominal checking point. The
    default is 1, which means no scaling.

  EpsilonMinimumExponent
    This key indicates the minimal exponent value of the power of 10 coefficient
    to be used to decrease the increment multiplier. The default is -8, and it
    has to be between 0 and -20. For example, its default value leads to
    calculating the residue of the scalar product formula with a fixed increment
    multiplied by factors from 1.e0 down to 1.e-8.

  InitialDirection
    This key indicates the vector direction used for the directional derivative
    around the nominal checking point. It has to be a vector. If not specified,
    this direction defaults to a random perturbation around zero of the same
    vector size as the checking point.

  SetSeed
    This key allows giving an integer in order to fix the seed of the random
    generator used to generate the ensemble. A convenient value is for example
    1000. By default, the seed is left uninitialized, so the random generator
    uses the default initialization from the computer.
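
As an illustration of what these options control, the following minimal sketch
evaluates a scalar product residue for increment multipliers going from 1.e0
down to 1.e-8. It assumes the classical adjoint identity
:math:`\langle \mathbf{O}\,d\mathbf{x}, \mathbf{y} \rangle = \langle d\mathbf{x}, \mathbf{O}^*\mathbf{y} \rangle`;
the exact residue computed by ADAO is not reproduced here, and the matrix "*A*"
and all the function names are hypothetical::

    import numpy

    # Hypothetical linear operator, chosen only for illustration
    A = numpy.array([[1., 2., 0.], [0., 1., 3.]])

    def tangent(x, dx):
        """ Tangent operator around x applied to dx (independent of x here) """
        return A @ dx

    def adjoint(x, y):
        """ Adjoint operator around x applied to y """
        return A.T @ y

    def adjoint_residues(x, dx0, y, amplitude=1., epsilon_minimum_exponent=-8):
        """ Residues |<O dx, y> - <dx, O* y>| for dx = amplitude * 10**e * dx0 """
        residues = []
        for exponent in range(0, epsilon_minimum_exponent - 1, -1):
            dx = amplitude * 10.**exponent * dx0
            lhs = numpy.dot(tangent(x, dx), y)
            rhs = numpy.dot(dx, adjoint(x, y))
            residues.append(abs(lhs - rhs))
        return residues

    rng = numpy.random.default_rng(1000)   # fixed seed, in the spirit of "SetSeed"
    x   = numpy.array([1., 2., 3.])        # nominal checking point
    dx0 = rng.normal(size=x.size)          # random initial direction around zero
    y   = rng.normal(size=A.shape[0])
    residues = adjoint_residues(x, dx0, y)

With a consistent tangent and adjoint pair, as in this example, all the residues
stay at the level of round-off errors.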

**"FunctionTest"**

  *Required commands*
    *"CheckingPoint",
    "ObservationOperator"*

  No option

**"GradientTest"**

  *Required commands*
    *"CheckingPoint",
    "ObservationOperator"*

  AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation built as a vector
    used for the directional derivative around the nominal checking point. The
    default is 1, which means no scaling.

  EpsilonMinimumExponent
    This key indicates the minimal exponent value of the power of 10 coefficient
    to be used to decrease the increment multiplier. The default is -8, and it
    has to be between 0 and -20. For example, its default value leads to
    calculating the residue of the scalar product formula with a fixed increment
    multiplied by factors from 1.e0 down to 1.e-8.

  InitialDirection
    This key indicates the vector direction used for the directional derivative
    around the nominal checking point. It has to be a vector. If not specified,
    this direction defaults to a random perturbation around zero of the same
    vector size as the checking point.

  ResiduFormula
    This key indicates the residue formula that has to be used for the test. The
    default choice is "Taylor", and the possible ones are "Taylor" (residue of
    the Taylor expansion of the operator, which has to decrease as the square of
    the perturbation) and "Norm" (residue obtained by taking the norm of the
    Taylor expansion at zero order approximation, which approximates the
    gradient, and which has to remain constant).

  SetSeed
    This key allows giving an integer in order to fix the seed of the random
    generator used to generate the ensemble. A convenient value is for example
    1000. By default, the seed is left uninitialized, so the random generator
    uses the default initialization from the computer.
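
The two residue formulas can be illustrated by the following sketch. It assumes
that the "Taylor" residue is
:math:`\| F(\mathbf{x}+\alpha\,d\mathbf{x}) - F(\mathbf{x}) - \alpha \nabla F(\mathbf{x})\,d\mathbf{x} \|`
and that the "Norm" residue is
:math:`\| F(\mathbf{x}+\alpha\,d\mathbf{x}) - F(\mathbf{x}) \| / \alpha`. These
usual forms match the descriptions above but are not taken from the ADAO
sources, and the operator and its gradient are hypothetical::

    import numpy

    def F(x):
        """ Hypothetical non-linear operator: component-wise square """
        return x**2

    def gradF_times(x, dx):
        """ Exact gradient of F at x applied to dx """
        return 2. * x * dx

    def gradient_residues(x, dx, formula="Taylor", epsilon_minimum_exponent=-8):
        """ Residues for increment multipliers 10**0 down to 10**exponent """
        residues = []
        for exponent in range(0, epsilon_minimum_exponent - 1, -1):
            alpha = 10.**exponent
            increment = F(x + alpha*dx) - F(x)
            if formula == "Taylor":
                r = numpy.linalg.norm(increment - alpha*gradF_times(x, dx))
            elif formula == "Norm":
                r = numpy.linalg.norm(increment) / alpha
            else:
                raise ValueError("Unknown residue formula")
            residues.append(r)
        return residues

    x  = numpy.array([1., 2., 3.])
    dx = numpy.array([1., 1., 1.])
    taylor = gradient_residues(x, dx, "Taylor")
    norm   = gradient_residues(x, dx, "Norm")

In this example the "Taylor" residue is divided by about 100 each time the
increment is divided by 10, until round-off errors dominate for the smallest
increments, while the "Norm" residue tends to a constant.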

**"LinearityTest"**

  *Required commands*
    *"CheckingPoint",
    "ObservationOperator"*

  AmplitudeOfInitialDirection
    This key indicates the scaling of the initial perturbation built as a vector
    used for the directional derivative around the nominal checking point. The
    default is 1, which means no scaling.

  EpsilonMinimumExponent
    This key indicates the minimal exponent value of the power of 10 coefficient
    to be used to decrease the increment multiplier. The default is -8, and it
    has to be between 0 and -20. For example, its default value leads to
    calculating the residue of the scalar product formula with a fixed increment
    multiplied by factors from 1.e0 down to 1.e-8.

  InitialDirection
    This key indicates the vector direction used for the directional derivative
    around the nominal checking point. It has to be a vector. If not specified,
    this direction defaults to a random perturbation around zero of the same
    vector size as the checking point.

  ResiduFormula
    This key indicates the residue formula that has to be used for the test. The
    default choice is "CenteredDL", and the possible ones are "CenteredDL"
    (residue of the difference between the function at nominal point and the
    values with positive and negative increments, which has to stay very small),
    "Taylor" (residue of the Taylor expansion of the operator normalized by
    the nominal value, which has to stay very small), "NominalTaylor" (residue
    of the order 1 approximations of the operator, normalized to the nominal
    point, which has to stay close to 1), and "NominalTaylorRMS" (residue of the
    order 1 approximations of the operator, normalized by RMS to the nominal
    point, which has to stay close to 0).

  SetSeed
    This key allows giving an integer in order to fix the seed of the random
    generator used to generate the ensemble. A convenient value is for example
    1000. By default, the seed is left uninitialized, so the random generator
    uses the default initialization from the computer.
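
As an illustration of the default "CenteredDL" choice, the sketch below assumes
the residue
:math:`\| F(\mathbf{x}+\alpha\,d\mathbf{x}) + F(\mathbf{x}-\alpha\,d\mathbf{x}) - 2 F(\mathbf{x}) \| / \| F(\mathbf{x}) \|`.
This form matches the description above, but the exact normalization is an
assumption, and both operators are hypothetical::

    import numpy

    def centered_dl_residues(F, x, dx, epsilon_minimum_exponent=-8):
        """ Residue of the centered difference around x, normalized by ||F(x)|| """
        Fx = F(x)
        residues = []
        for exponent in range(0, epsilon_minimum_exponent - 1, -1):
            alpha = 10.**exponent
            r = numpy.linalg.norm(F(x + alpha*dx) + F(x - alpha*dx) - 2.*Fx)
            residues.append(r / numpy.linalg.norm(Fx))
        return residues

    def linear_operator(x):
        """ Hypothetical linear operator """
        return 3. * x

    def quadratic_operator(x):
        """ Hypothetical non-linear operator """
        return x**2

    x  = numpy.array([1., 2., 3.])
    dx = numpy.array([1., 1., 1.])
    linear_residues    = centered_dl_residues(linear_operator, x, dx)
    quadratic_residues = centered_dl_residues(quadratic_operator, x, dx)

The residue stays at round-off level for the linear operator, while for the
quadratic one it is significant at the largest increments, which reveals the
non-linearity.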

Requirements for functions describing an operator
-------------------------------------------------

The operators for observation and evolution are required to implement the data
assimilation or optimization procedures. They include the physical simulation
by numerical calculations, but also the filtering and restriction needed to
compare the simulation to observation. The evolution operator is considered here
in its incremental form, representing the transition between two successive
states, and is then similar to the observation operator.

Schematically, an operator has to give an output solution given the input
parameters. Part of the input parameters can be modified during the optimization
procedure. So the mathematical representation of such a process is a function.
It was briefly described in the section :ref:`section_theory` and is generalized
here by the relation:

.. math:: \mathbf{y} = O( \mathbf{x} )

between the pseudo-observations :math:`\mathbf{y}` and the parameters
:math:`\mathbf{x}` using the observation or evolution operator :math:`O`. The
same functional representation can be used for the linear tangent model
:math:`\mathbf{O}` of :math:`O` and its adjoint :math:`\mathbf{O}^*`, also
required by some data assimilation or optimization algorithms.

Then, **to describe completely an operator, the user has only to provide a
function that fully and only realizes the functional operation**.

This function is usually given as a script that can be executed in a YACS node.
This script can equally launch external codes or use internal SALOME calls and
methods. If the algorithm requires the 3 aspects of the operator (direct form,
tangent form and adjoint form), the user has to give the 3 functions or to
approximate them.

There are 3 practical methods for the user to provide the operator functional
representation.

First functional form: using "*ScriptWithOneFunction*"
++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ScriptWithOneFunction
.. index:: single: DirectOperator
.. index:: single: DifferentialIncrement
.. index:: single: CenteredFiniteDifference

The first one consists in providing only one potentially non-linear function,
and in approximating the tangent and the adjoint operators. This is done by
using the keyword "*ScriptWithOneFunction*" for the description of the chosen
operator in the ADAO GUI. The user has to provide the function in a script, with
a mandatory name "*DirectOperator*". For example, the script can follow the
template::

    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return Y    # with Y = O(X)

In this case, the user can also provide a value for the differential increment,
using the keyword "*DifferentialIncrement*" through the GUI, which has a default
value of 1%. This coefficient will be used in the finite difference
approximation to build the tangent and adjoint operators. The finite difference
approximation order can also be chosen through the GUI, using the keyword
"*CenteredFiniteDifference*", with 0 for an uncentered scheme of first order,
and with 1 for a centered scheme of second order (of twice the first order
computational cost). The keyword has a default value of 0.
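
To illustrate the role of these two keywords, here is a minimal sketch of a
finite difference approximation of the tangent operator built only from
"*DirectOperator*". It assumes that the increment is applied relatively to each
component of X, with an absolute fallback for zero components, which is an
assumption and not necessarily what ADAO does internally. The operator and the
helper "*approximate_tangent*" are hypothetical::

    import numpy

    def DirectOperator( X ):
        """ Hypothetical non-linear operator, for illustration only """
        X = numpy.asarray(X, dtype=float).ravel()
        return numpy.array([X[0]**2, X[0]*X[1]])

    def approximate_tangent( X, increment = 0.01, centered = False ):
        """ Finite difference approximation of the tangent matrix at X """
        X = numpy.asarray(X, dtype=float).ravel()
        Y = DirectOperator(X)
        tangent = numpy.zeros((Y.size, X.size))
        for i in range(X.size):
            # Relative step on component i, absolute step if it is zero
            step = increment * X[i] if X[i] != 0. else increment
            dX = numpy.zeros(X.size)
            dX[i] = step
            if centered:
                # Second order scheme: two extra evaluations per component
                difference = DirectOperator(X + dX) - DirectOperator(X - dX)
                tangent[:, i] = difference / (2.*step)
            else:
                # First order scheme: one extra evaluation per component
                tangent[:, i] = (DirectOperator(X + dX) - Y) / step
        return tangent

    X = [1., 2.]
    exact      = numpy.array([[2., 0.], [2., 1.]])
    uncentered = approximate_tangent(X, centered = False)
    centered   = approximate_tangent(X, centered = True)
    adjoint    = centered.T   # the approximate adjoint is the transpose

For this quadratic example the centered scheme recovers the exact derivative up
to round-off errors, while the uncentered one has an error of the order of the
increment.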

This first operator definition makes it easy to test the functional form before
its use in an ADAO case, greatly reducing the complexity of implementation.

**Important warning:** the name "*DirectOperator*" is mandatory, and the type of
the X argument can be either a python list, a numpy array or a numpy 1D-matrix.
The user has to treat these cases in the script.
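
One possible way to treat these cases uniformly, given here as a sketch and not
as an ADAO requirement, is to convert the argument to a flat numpy array at the
beginning of the function. The body of the operator is a hypothetical
placeholder::

    import numpy

    def DirectOperator( X ):
        """ Direct operator accepting a list, a numpy array or a 1D-matrix """
        # numpy.asarray also converts a numpy.matrix into a plain array
        x = numpy.asarray(X, dtype=float).ravel()
        return 2. * x   # hypothetical simulation

    Y_list   = DirectOperator([1., 2., 3.])
    Y_array  = DirectOperator(numpy.array([1., 2., 3.]))
    Y_column = DirectOperator(numpy.array([[1.], [2.], [3.]]))

The three calls return the same flat array, whatever the container used for the
input.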

Second functional form: using "*ScriptWithFunctions*"
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ScriptWithFunctions
.. index:: single: DirectOperator
.. index:: single: TangentOperator
.. index:: single: AdjointOperator

The second one consists in providing directly the three associated operators
:math:`O`, :math:`\mathbf{O}` and :math:`\mathbf{O}^*`. This is done by using
the keyword "*ScriptWithFunctions*" for the description of the chosen operator
in the ADAO GUI. The user has to provide three functions in one script, with
three mandatory names "*DirectOperator*", "*TangentOperator*" and
"*AdjointOperator*". For example, the script can follow the template::

    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        ...
        ...
        ...
        return Y     # something like Y

    def TangentOperator( pair ):
        """ Tangent linear operator, around X, applied to dX """
        X, dX = pair
        ...
        ...
        ...
        return dY    # something like Y

    def AdjointOperator( pair ):
        """ Adjoint operator, around X, applied to Y """
        X, Y = pair
        ...
        ...
        ...
        return Xa    # something like X

Once again, this second operator definition makes it easy to test the
functional forms before their use in an ADAO case, reducing the complexity of
implementation.

**Important warning:** the names "*DirectOperator*", "*TangentOperator*" and
"*AdjointOperator*" are mandatory, and the type of the X, Y, dX arguments can be
either a python list, a numpy array or a numpy 1D-matrix. The user has to treat
these cases in the script.
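
As a concrete illustration, here is a sketch of the three functions for a small
hypothetical operator. The tangent and adjoint functions take a single pair
argument that is unpacked inside the body, which keeps the calling convention of
the template while remaining valid in Python 3. The operator itself and the
helper "*_jacobian*" are assumptions made for the example::

    import numpy

    def _jacobian( X ):
        """ Jacobian matrix of the hypothetical operator below at X """
        return numpy.array([[2.*X[0], 0.], [X[1], X[0]]])

    def DirectOperator( X ):
        """ Direct non-linear simulation operator """
        X = numpy.asarray(X, dtype=float).ravel()
        return numpy.array([X[0]**2, X[0]*X[1]])

    def TangentOperator( pair ):
        """ Tangent linear operator, around X, applied to dX """
        X, dX = (numpy.asarray(v, dtype=float).ravel() for v in pair)
        return _jacobian(X) @ dX

    def AdjointOperator( pair ):
        """ Adjoint operator, around X, applied to Y """
        X, Y = (numpy.asarray(v, dtype=float).ravel() for v in pair)
        return _jacobian(X).T @ Y

    # Consistency check: <O dX, Y> must equal <dX, O* Y>
    X, dX, Y = [1., 2.], [0.5, -1.], [3., 4.]
    lhs = numpy.dot(TangentOperator((X, dX)), Y)
    rhs = numpy.dot(dX, AdjointOperator((X, Y)))

Checking that the two scalar products agree is a simple way to test the
functions before using them in an ADAO case.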

Third functional form: using "*ScriptWithSwitch*"
+++++++++++++++++++++++++++++++++++++++++++++++++

.. index:: single: ScriptWithSwitch
.. index:: single: DirectOperator
.. index:: single: TangentOperator
.. index:: single: AdjointOperator

This third form gives more possibilities to control the execution of the three
functions representing the operator, allowing advanced usage and control over
each execution of the simulation code. This is done by using the keyword
"*ScriptWithSwitch*" for the description of the chosen operator in the ADAO GUI.
The user has to provide a switch in one script to control the execution of the
direct, tangent and adjoint forms of the simulation code. The user can then, for
example, use other approximations for the tangent and adjoint codes, or
introduce more complexity in the argument treatment of the functions. But it
will be far more complicated to implement and debug.

**It is recommended not to use this third functional form without a solid
numerical or physical reason.**

If, however, you want to use this third form, we recommend using the following
template for the switch. It requires an external script or code named
"*Physical_simulation_functions.py*", containing three functions named
"*DirectOperator*", "*TangentOperator*" and "*AdjointOperator*" as previously.
Here is the switch template::

    import Physical_simulation_functions
    import numpy, logging
    #
    # The "computation" dictionary is expected to be provided by the caller
    method = ""
    for param in computation["specificParameters"]:
        if param["name"] == "method":
            method = param["value"]
    if method not in ["Direct", "Tangent", "Adjoint"]:
        raise ValueError("No valid computation method is given")
    logging.info("Found method is '%s'"%method)
    #
    logging.info("Loading operator functions")
    Function = Physical_simulation_functions.DirectOperator
    Tangent  = Physical_simulation_functions.TangentOperator
    Adjoint  = Physical_simulation_functions.AdjointOperator
    #
    logging.info("Executing the possible computations")
    data = []
    if method == "Direct":
        logging.info("Direct computation")
        Xcurrent = computation["inputValues"][0][0][0]
        data = Function(numpy.matrix( Xcurrent ).T)
    if method == "Tangent":
        logging.info("Tangent computation")
        Xcurrent  = computation["inputValues"][0][0][0]
        dXcurrent = computation["inputValues"][0][0][1]
        # The pair (X, dX) is passed as one argument, as for the adjoint
        data = Tangent((numpy.matrix(Xcurrent).T, numpy.matrix(dXcurrent).T))
    if method == "Adjoint":
        logging.info("Adjoint computation")
        Xcurrent = computation["inputValues"][0][0][0]
        Ycurrent = computation["inputValues"][0][0][1]
        data = Adjoint((numpy.matrix(Xcurrent).T, numpy.matrix(Ycurrent).T))
    #
    logging.info("Formatting the output")
    it = numpy.ravel(data)
    outputValues = [[[[]]]]
    for val in it:
        outputValues[0][0][0].append(val)
    #
    result = {}
    result["outputValues"]        = outputValues
    result["specificOutputInfos"] = []
    result["returnCode"]          = 0
    result["errorMessage"]        = ""

Many modifications can be made starting from this template.

Special case of controlled evolution operator
+++++++++++++++++++++++++++++++++++++++++++++

In some cases, the evolution or the observation operators are required to be
controlled by an external input control, given a priori. In this case, the
generic form of the incremental evolution model is slightly modified as follows:

.. math:: \mathbf{y} = O( \mathbf{x}, \mathbf{u})

where :math:`\mathbf{u}` is the control over one state increment. In this case,
the direct operator has to be applied to a pair of variables :math:`(X,U)`.
Schematically, the operator has to be set as::

    def DirectOperator( pair ):
        """ Direct non-linear simulation operator """
        X, U = pair
        ...
        ...
        ...
        return Xnext    # something like X(n+1) or Y(n+1)

The tangent and adjoint operators have the same signature as previously, noting
that the derivatives have to be taken only partially with respect to
:math:`\mathbf{x}`. In such a case with explicit control, only the second
functional form (using "*ScriptWithFunctions*") and third functional form (using
"*ScriptWithSwitch*") can be used.
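
For instance, a linear controlled evolution
:math:`\mathbf{x}_{n+1} = M \mathbf{x}_n + B \mathbf{u}_n` could be sketched as
follows with the second functional form. The matrices "*M*" and "*B*" are
hypothetical, and each pair argument is unpacked inside the function so that the
code is valid in Python 3::

    import numpy

    # Hypothetical evolution and control matrices
    M = numpy.array([[1., 0.1], [0., 1.]])
    B = numpy.array([[0.], [0.1]])

    def DirectOperator( pair ):
        """ Direct evolution operator applied to the pair (X, U) """
        X, U = (numpy.asarray(v, dtype=float).ravel() for v in pair)
        return M @ X + B @ U

    def TangentOperator( pair ):
        """ Tangent operator around X applied to dX, the control being fixed """
        X, dX = (numpy.asarray(v, dtype=float).ravel() for v in pair)
        return M @ dX

    def AdjointOperator( pair ):
        """ Adjoint operator around X applied to Y """
        X, Y = (numpy.asarray(v, dtype=float).ravel() for v in pair)
        return M.T @ Y

    Xnext = DirectOperator(([1., 2.], [3.]))

Only the derivative with respect to the state appears in the tangent and adjoint
functions, the control being treated as a given input.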