Probabilistic Toolkit
User Manual
Version: 2.0
SVN Revision: 00
12 December 2023
Contents
List of Figures
List of Tables

I Introduction
1 Reading guide
2 Introduction
2.1 Workflow
2.2 Run model
2.3 Sensitivity
2.4 Output uncertainty
2.5 Calibration
2.6 Reliability
3 Run
3.1 Application
3.2 Console
3.3 Python

II User Guide
4 General
4.1 Persistency
4.2 Tables
4.3 Charts
4.4 Validation
4.5 Calculation
4.6 Options
5 Model
5.1 Internal models
5.1.1 C# syntax
5.1.2 Python syntax
5.1.3 Arrays
5.2 External models
5.2.1 External model types
5.2.1.1 Executable
5.2.1.2 Command shells
5.2.1.3 Batch files
5.2.1.4 Python script
5.2.1.5 Assembly
5.2.1.6 Excel
5.2.1.7 Cloud
5.2.1.8 Manual
5.2.2 Input file
5.2.2.1 Multiple scenarios
5.2.2.2 Stochastic input file
5.2.3 Parameters
5.2.3.1 Parameter definitions
5.2.3.2 Components
5.2.3.3 File values
5.2.3.4 Model factor
5.2.4 File formats
5.2.4.1 Xml files
5.2.4.2 Json files
5.2.4.3 Config files
5.2.4.4 Table files
5.2.4.5 Other ascii files
5.2.4.6 Keyword enriched files
5.2.4.7 NetCDF files
5.2.4.8 Zip files
5.2.4.9 Proprietary files
5.2.5 Running
5.2.5.1 Arguments
5.2.5.2 Working directory
5.2.5.3 Pre and postprocessing
5.2.5.4 Unsuccessful runs
5.3 Imported models
5.3.1 Fragility curves
5.3.1.1 Define fragility curves
5.3.1.2 Import fragility curves
5.3.1.3 Composite fragility curves
5.3.1.4 Two-dimensional fragility curves
5.3.2 Response surface
5.4 Known applications
5.5 No model
5.6 Composite model
5.6.1 Multiple rounds
5.6.2 Connections
5.6.3 Connection by file
6 Analysis
6.1 Analysis type
6.2 Performance
6.3 Logging
6.4 Pre and postprocessing
7 Variables
7.1 Distributions
7.1.1 Distribution depending on realization
7.2 Data and test
7.2.1 Prior distribution
7.2.2 Data sets
7.3 Series
7.4 Correlations
7.4.1 Auto correlations
8 Run model
8.1 Design values
8.2 Run model results
8.2.1 Single run
8.2.2 Table
8.2.3 Batch processing
8.2.4 Search
8.2.5 Response surface
9 Sensitivity
10 Output uncertainty
11 Calibration
12 Reliability
12.1 Reliability algorithms
12.2 Failure definition
12.2.1 Multiple failure definitions
12.2.2 Import design points
12.2.3 Updating reliability
12.2.4 Upscaling
12.2.5 Fragility curves
12.2.5.1 Combining fragility curves
12.3 Design point
12.3.1 Reliability index
12.3.2 Contribution per variable
12.3.3 Limit state point
12.4 Reliability results
12.4.1 Single run
12.4.2 Table
12.4.3 Search
12.4.4 Fragility curve
13 Response surfaces
14 Realizations
14.1 Overview of realizations
14.2 Realizations of a design point
14.3 Inspect a realization
15 Notes
IV Scientific Background
17 Distributions
17.1 Standard normal distribution
17.2 Distribution properties
17.3 Distribution types
17.3.1 Deterministic distribution
17.3.2 Normal distribution
17.3.3 Log normal distribution
17.3.4 Uniform distribution
17.3.5 Triangular distribution
17.3.6 Trapezoidal distribution
17.3.7 Exponential distribution
17.3.8 Gumbel distribution
17.3.8.1 Decimation value
17.3.9 Weibull distribution
17.3.10 Frechet distribution
17.3.11 Generalized Extreme Value distribution
17.3.12 Rayleigh distribution
17.3.13 Pareto distribution
17.3.14 Generalized Pareto distribution
17.3.15 Student's T distribution
17.3.16 Gamma distribution
17.3.17 Beta distribution
17.3.18 Poisson distribution
17.3.19 Discrete distribution
17.3.20 Bernoulli distribution
17.3.21 CDF curve distribution
17.3.22 Histogram distribution
17.3.23 Inverted distribution
17.3.24 Truncated distribution
17.4 Goodness of fit
17.5 Prior distribution
17.6 Design values
18 Correlations
18.1 Correlation factor
18.1.1 Distance based correlation factor
18.2 Processing of correlation factors
19 Sensitivity
19.1 Single value variation
19.2 Sobol indices
21 Calibration
21.1 Cost function
21.2 Calibration algorithms
21.2.1 Grid
21.2.1.1 Move grid if minimum is on edge
21.2.1.2 Refinement of the grid
21.2.2 Genetic algorithm
21.2.2.1 Initialization
21.2.2.2 Parental selection
21.2.2.3 Crossover
21.2.2.4 Mutation
21.2.2.5 Child selection
21.2.3 Adaptive Particle Swarm Optimization
21.2.4 Levenberg-Marquardt
21.2.5 Dud
21.2.6 Simplex
21.2.7 Powell
21.2.8 Conjugate gradient
21.2.9 Broyden-Fletcher-Goldfarb-Shanno
21.2.10 Shuffled Complex Evolution
21.2.11 Generalized Likelihood Uncertainty Estimation
22 Reliability
22.1 Reliability algorithms
22.1.1 Numerical integration
22.1.2 Numerical bisection
22.1.3 Crude Monte Carlo
22.1.3.1 Algorithm
22.1.3.2 Convergence
22.1.4 Importance sampling
22.1.4.1 Algorithm
22.1.4.2 Convergence
22.1.4.3 Mean realization
22.1.4.4 Adaptive importance sampling
22.1.5 Subset simulation
22.1.6 Directional sampling
22.1.6.1 Algorithm
22.1.6.2 Convergence
22.1.7 Latin hypercube
22.1.8 Cobyla
22.1.9 FORM
22.1.9.1 Algorithm
22.1.9.2 Convergence
22.1.9.3 Start point
22.1.9.4 Loops
22.1.10 External method
22.2 Contribution per variable
22.2.1 Center of gravity
22.2.2 Center of angles
22.2.3 Nearest to mean
22.3 Combining design points
22.3.1 Integrated
22.3.2 Equivalent planes
22.3.2.1 Directional sampling
24 References
List of Figures
1.1 Examples
7.1 Distribution
7.2 Distribution depending on realized variable
7.3 Data
7.4 Fitted distribution
7.5 Prior distribution
7.6 Prior fitted distribution
7.7 Series
7.8 Correlations
7.9 Self correlations
11.1 Calibration
12.1 Settings
12.2 Failure definition
12.3 Map of reliabilities
12.4 Failure of fragility curve
12.5 Design point
12.6 Design point
12.7 Build fragility curve
14.1 Realizations

List of Tables
I Introduction
1 Reading guide
This document is the user manual for the Probabilistic Toolkit. It leads the user through a probabilistic analysis with the Probabilistic Toolkit.
This manual uses a number of examples, which are installed as part of the Probabilistic Toolkit. They can be found in "Documents\Deltares\Probabilistic Toolkit" or, using the open dialog in the Probabilistic Toolkit, in the Probabilistic Toolkit section.
2 Introduction
The Probabilistic Toolkit is able to perform probabilistic analyses on any model, ranging from Python scripts to dedicated applications in the geotechnical or hydrodynamic field (or any other field), as well as combinations of such models.
The Probabilistic Toolkit offers a number of analysis types. They are performed in the order described in the next section.
2.1 Workflow
The user should take a number of steps to perform a probabilistic analysis. These steps are:
1 Set up the model (outside the Probabilistic Toolkit)
2 Define the variables
3 Perform the probabilistic analysis. Possible analyses are:
3.1 Run the model and compare results
3.2 Investigate the sensitivity of the model
3.3 Obtain input variables by calibration
3.4 Determine the output uncertainty
3.5 Determine the failure probability
Figure 2.1: Workflow through the Probabilistic Toolkit (start → Attach model → Define variables → Inspect realizations → end)
The workflow sequence is reflected by the tabs in the main tab control of the application. Depending on the selected analysis type, model and calculation options, the set of visible tabs may vary slightly.
The order of the steps 'Attach model', 'Define variables' and 'Perform analysis' is fixed, although one can always return to a previous step. The steps within 'Perform analysis', which are 'Run model', 'Sensitivity', 'Output uncertainty', 'Calibration' and 'Reliability', are optional and can be performed in any order.
2.2 Run model
Running a model helps the user get a feel for the model and check whether the model has been attached to the Probabilistic Toolkit in the right way. See chapter 8 for more information.
2.3 Sensitivity
In a sensitivity analysis the effects of changes to input variables are investigated. This gives insight into which input parameters are important and helps the user decide which input parameters must be measured more precisely. See chapter 9 for more information.
2.4 Output uncertainty
Output uncertainty analysis is useful when the user is interested in the possible future values of a physical property. For example, due to a load, subsidence of the soil surface will occur, and it is interesting to know how much subsidence will occur in the next ten years. As another example, due to a side stream a gully will move sideways, and it is interesting to know the location of the gully in the next year.
2.5 Calibration
Calibration is a method by which input parameters, including their uncertainty, are derived from measured output parameters. Calibration can be used to obtain parameters which are difficult to obtain directly.
For example, the roughness of a river bed is hard to obtain. Using measured values, such as time series of water levels at different locations, the roughness coefficients can be determined.
2.6 Reliability
Reliability analysis determines the reliability, or probability of failure, of a physical construction. This gives the user insight into the probability that an unwanted phenomenon will happen, possibly within a given period of time.
For example, the probability that a dike will fail can be calculated. This is used in the assessment of dikes, and based on this probability a decision can be made to strengthen the dike. An accurate calculation of the probability of failure is important, since it can save money by not strengthening more than necessary.
To calculate an accurate probability of failure, survived situations can be used. Even when a lot of effort has been put into measuring input values, it is impossible to know the subsoil of long stretches of dikes completely. So it is possible that a model predicts dike failure, while in reality the dike has been observed to survive. The Probabilistic Toolkit uses these events to update the probability of failure.
Another example is risk based asset management, in which a maintenance schedule is set up so that the total costs over a long period are minimal. The total costs include strengthening actions and risk (risk is probability times damage).
3 Run
The Probabilistic Toolkit supports the following ways of running a probabilistic analysis:
3.1 Application
The Probabilistic Toolkit can be started with its user interface, which can be found in the start menu or as an icon on the desktop. The user can then enter all data required by the workflow and press the run button. All data and results can be saved to a tkx file and opened again later.
3.2 Console
The console version enables execution of a probabilistic analysis without a user interface. An input file must be available (made with the user interface). The console version performs the calculation as specified in the input file.
If no output file is specified, the input file is used and overwritten as the file containing the calculation results.
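For example, a console run could be started as follows (a sketch only: the console executable name and the argument order are illustrative assumptions, not documented here):
PTKConsole.exe project.tkx project-results.tkx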
3.3 Python
Using Python, the input file of the Probabilistic Toolkit can be modified and run (only for reliability analysis). The Probabilistic Toolkit installs a Python file, named toolkit.py, with which properties in an existing toolkit file can be modified and run. This file is located in the subdirectory Python of the installation directory (usually C:/Program Files (x86)/Deltares/Probabilistic toolkit/Python).
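A minimal sketch of such a script is shown below; the function names used here (load, set_value, run) are illustrative assumptions, not the documented API of toolkit.py:
import sys
# location of toolkit.py (default installation path, see above)
sys.path.append(r"C:/Program Files (x86)/Deltares/Probabilistic toolkit/Python")
import toolkit

# hypothetical usage: open a tkx file, modify a property and run it
project = toolkit.load("dike.tkx")         # illustrative function name
project.set_value("WaterLevel.Mean", 2.5)  # illustrative function name
project.run()                              # reliability analysis only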
II User Guide
4 General
This chapter describes the general handling of the Probabilistic Toolkit.
4.1 Persistency
All data entered by the user, such as models, variables and calculation settings, are saved in a tkx file. All calculation results are saved in this file too. Intermediate calculation results are saved in a dat file with the same name as the tkx file; since these data can be big, they are stored in a separate file. It is not mandatory to keep the dat file: when opening a tkx file with a missing dat file, the intermediate results will be missing, but all input and final results will be available.
The file name of the current file in use is displayed in the caption of the application.
4.2 Tables
Tables are used to present similar data. Several actions can be performed using the toolbar at the top of the table.
The buttons in the toolbar correspond to the following actions. Several actions are also available in the right mouse menu.
Add: Adds a new data item to the table. This action is not always available;
Remove: Removes the selected rows from the table. This action is not always available;
Copy: Copies the selected cells to the clipboard. It can be used to copy data from the table to another table or to another application, such as Excel;
Paste: Pastes data in tabular form into the table, e.g. from another table or from Excel. If not enough rows are available, new rows are added;
Edit: Edits the data of all selected cells in a dialog. This action can be used to modify several data items at once. It is only available when all selected cells are of the same type, such as numeric values, booleans or text;
Export: Exports the entire table to a file. A dialog is displayed in which the user can specify the file and type (e.g. csv);
Fit columns: Fits the columns so that the content is readable;
Clear sorting: Restores the original sorting of the data;
Grouping: Groups or ungroups the data;
Swap panels: Arranges the panel of the table and a corresponding chart with a horizontal or vertical splitter. This option is not always available;
Import: Imports data from another tkx file. This option is not always available.
The table supports the following actions by performing actions on the headers:
4.3 Charts
Charts are used to display data in a graphical way. Charts support the following actions via the right mouse menu.
Bring to front: Brings the series at which the mouse is located to the front. This is only available if the mouse is positioned on a point (for which a popup with values is displayed) or on a legend item;
Send to back: Sends the series to the back. See the previous item for when this option is available;
Logarithmic: Converts the horizontal axis to a logarithmic axis;
Display table: Displays the data in a table;
Save image: Exports the chart to an image.
4.4 Validation
Before a calculation can be performed, all data must have acceptable values. Therefore automatic validation takes place after each user edit action. The results are displayed at the bottom of the application. The user should solve all errors in this table.
Severity: Error or warning. Errors must be solved before a calculation can be started;
Message: The error or warning text;
Subject: The data item to which the message refers. When clicking on this link, the Probabilistic Toolkit will make this data item visible. The part of the data item which causes the message will be displayed in orange (for warnings) or red (for errors);
Repair: When clicking on this link, a repair action will be performed as suggested by the text of the link.
4.5 Calculation
When all data are acceptable (see section 4.4), the calculation can be started. The following actions are available in the calculation menu. These actions are available in the main toolbar too.
While the calculation is running, the user cannot modify any data, and the Probabilistic Toolkit cannot be closed.
During the calculation a progress bar is displayed at the bottom of the Probabilistic Toolkit, indicating the progress of the calculation.
4.6 Options
Some options are available under the Tools, Options menu. These options are:
Initial project: Indicates whether a new project or the last project should be loaded when starting the Probabilistic Toolkit;
Language: The language of the user interface;
Units: Indicates the units in which data should be displayed.
5 Model
A model is a process which takes some input variables and produces some response variables. Most often a model simulates a physical process, such as the hydrodynamics of a river or the geotechnical behaviour of the subsoil.
The first step in the workflow is setting up the model. We distinguish the following types of models:
1 Internal model. The model is defined within the Probabilistic Toolkit.
2 External model. The model is defined outside the Probabilistic Toolkit.
3 Data model. No calculation takes place; only values in a data file are used.
4 Fragility curves. Data which describe failure for a certain situation. They can only be used for reliability calculations.
5 No model. No model is attached, since stochastic variables are sufficient.
6 Composite models. Combinations of the models above.
5.1 Internal models
The list of input parameters and output parameters must be a unique list of parameters (containing alphanumeric characters and no special symbols, digits nor spaces). Parameters can be added using the '+' sign. A parameter can be renamed.
The source code must be entered in C# or Python style, according to the preference of the user. The code should be designed such that the output parameters get a value based on the input variables. Multiple lines are allowed, intermediate variables can be used and external libraries can be used. Variables do not have to be declared (they are all declared as doubles in C#; Python does not require declaration of variables), except when intermediate variables are introduced.
Compilation takes place automatically when the coding area is left by clicking elsewhere. Any errors are reported in the validation tab. Errors must be solved before the calculation can be performed.
5.1.1 C# syntax
Select the C-Sharp option in the language drop down box to use C# as the language.
An example of a C# script is given below (with input parameters a and b and output parameter z):
var radius = a*a + b*b;
Console.WriteLine("radius = {0}", radius);
z = Math.Sqrt(radius);
5.1.2 Python syntax
When Python style is selected, Python must have been installed (it is not part of the Probabilistic Toolkit install file). All Python 3.* versions are supported (only C-Python, not IronPython nor Python for .Net). Python can be downloaded from https://www.python.org/downloads or https://www.anaconda.com/products/individual.
When specifying a version, the newest known Python interpreter which fulfills the version specification is used (e.g. Python versions 3.1, 3.7 and 3.7.1 all fulfill version 3; Python versions 3.7 and 3.7.1 fulfill version 3.7, but Python version 3.1 does not).
Corresponding to the Python version, the Python interpreter must have been installed. If it has been installed in its default location, the Probabilistic Toolkit will find it automatically. If not, the user should select it by pressing the dots in the version field. In the dialog which appears, the interpreter can be selected.
As an example, consider the wave run-up formula
Ru = ξ × Hs (5.1)
where Ru is the wave run-up height, ξ is the surf similarity parameter and Hs is the significant wave height.
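In Python syntax this model is a single line (a sketch, with input parameters xi and Hs and output parameter Ru following the formula above):
# run-up model: output Ru is computed from inputs xi and Hs
Ru = xi * Hs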
The input file for this example can be found in the Examples.
5.1.3 Arrays
Internal models support the usage of arrays. When used, the parameters are interpreted as
arrays. The size of the arrays should be given by the user.
All array values get the same distribution defined in section 7.1. Their mutual correlation can
be defined in section 7.4.1.
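For example, an internal Python model with an array input a and a scalar output z could aggregate the array values as follows (a sketch; the parameter names are illustrative):
# a is interpreted as an array of stochastic values; z is a scalar output
z = sum(a) / len(a)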
5.2 External models
External models always use input files and result files. The selection of variables and results to be used by the Probabilistic Toolkit is described in ??.
5.2.1 External model types
5.2.1.1 Executable
When using an executable, the location of the executable must be specified. The executable is supposed to run without user interaction.
Executables always use input files. The default input file has to be specified, and it should be specified which numeric values in this file should be regarded as stochastic variables (see ??).
When executables are used, input and output are communicated via files. The input file which is passed to the executable must be entered as {input} in the command line arguments. This will be used by the calculations to pass the essential information to the executable. It is also used to extract the input variables if the output file is not specified. Other arguments are allowed too, see section 5.2.5.1.
Note that the Probabilistic Toolkit creates copies of the original input file with altered data, which are saved in another location. Therefore do not write the literal name of the input file in the arguments, but use the codes above.
5.2.1.2 Command shells
To let the Probabilistic Toolkit know that a command shell is involved, the arguments must be preceded by a < sign, so the argument should be <{input}.
5.2.1.3 Batch files
A batch file cannot be started with arguments. To make use of an input file, the batch script itself should contain codes specifying the input. The codes, such as {input}, are the same as the codes which can be used in the executable arguments (see section 5.2.5.1).
mkdir {name}
move {input} {name}
Convert.exe -convert "{directory}\{name}"
move {name}\converted_{file} {file}
del {name}\{file}
rmdir {name}
In this example we assume there is an executable Convert.exe, which converts all files in a given directory to files with the prefix "converted_". This batch script runs the conversion in a temporary directory (named after the original input file).
Another example, which copies a file from an evaluation of a composite model run to a common directory, is defined as follows. It should be placed after the model that generates the calcpnt.his file, and the indication "Use files from previous model" should be switched on.
copy calcpnt.his ..\hisfiles\calcpnt-{realization}-{round}.his
5.2.1.4 Python script
A Python script can handle an input file or accept numeric values via a method call. The user must supply the method to be used or, if the complete script is to be used, the method <main>. Depending on the signature of the method, the toolkit derives whether a file or numeric input values are used as arguments to the method. If the signature contains more than one argument, or if a default value is supplied with the argument, numeric values are used; otherwise a file name is assumed.
The <main> method always assumes that a file is used. In this case arguments should be supplied to the complete Python script (which usually contains a reference to the input file name).
In case a file is used, the user should supply the file name in the same way as for an executable. In case numeric values are used, the toolkit derives the input and output variable names from the signature of the method.
When an input file is used, the working directory is the directory where the input file is located. Otherwise the working directory is the directory where the Python script is located.
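For example, a method with the signature below would be interpreted as accepting numeric values, because it has more than one argument and a default value (a sketch; the names run, a, b and z are illustrative):
def run(a, b=1.0):
    # more than one argument, so numeric values are passed directly
    z = a * a + b * b
    return z  # the return value becomes the output variable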
The input and output variables are detected by the Probabilistic Toolkit by parsing this method. After a run, this results in the corresponding input and output variables.
5.2.1.5 Assembly
The assembly model is a library model (dll). The model must have been built in a .Net compliant language.
Depending on whether the model accepts numeric values or an input file, implement a corresponding signature of the method which must be used.
If numeric values are passed to the model, implement a method accepting numeric values. The results of the method must be provided by the return value or as an out parameter. If a file is passed, implement a method with a single string argument. This value will be populated by the Probabilistic Toolkit with the full path to the input file. Methods can be, but do not have to be, static.
See the example below for an implementation of a model, which can be used in this way:
public class Model
{
    public void Run(string fileName) { ... }       // file based
    public double Run(double a, double b) { ... }  // numeric values, result via return value
}
5.2.1.6 Excel
Excel files (only *.xlsx) can be used as a model.
Numeric cells in the Excel file can be used as input parameters, but only if the cell is not defined as a formula. All numeric cells can be used as a response parameter.
Not all features in Excel can be used. The Probabilistic Toolkit raises an error if unsupported features are encountered.
5.2.1.7 Cloud
The Probabilistic Toolkit can run models (executables and python scripts which accept an input
file) in the cloud. The Microsoft Azure cloud platform is supported. The model to be run in
the cloud platform and a pool should be configured using the Microsoft Azure cloud platform
utilities. This should be done as follows:
1 Acquire a storage account and batch account;
2 Create a pool. The pool must have the Windows operating system. Make sure at least one
node is available;
3 Upload the application to the pool. This must be a zip file containing the executable or
python script and all needed additional files (such as dlls). Include a version number;
4 In case of a python script, install Python in the pool. First a container should be created
in the storage containing the Python installer. Then a start task should be specified in the
pool as follows (including numpy to be installed): cmd /c "python-3.9.6-amd64.exe /quiet
InstallAllUsers=1 PrependPath=1 Include_test=0" & "pip install numpy". This task should
be performed with administrator privileges. If the task was added after pool creation, reboot
all nodes.
Within the Probabilistic Toolkit, enter the accounts and keys. Specify the command, which is the name of the executable or Python script. Specify the URL of the batch server, which is something like "mybatchserver.westeurope.batch.azure.com".
5.2.1.8 Manual
The manual model does not perform a calculation, but only generates the files to be calculated. This allows the user to perform the calculations himself. This is useful when the calculations are to be performed on a cluster or require action by the user in a user interface.
When this calculation is performed, a list of tasks is generated which must be performed by the user. When the user has completed the tasks, meaning that the result files are available at the location where the Probabilistic Toolkit expects them, the user should indicate that the tasks are completed by checking a check box. The Probabilistic Toolkit will resume from this point on.
The user is allowed to exit the Probabilistic Toolkit while he performs the calculations. To do so, he should perform the following steps:
1 Pause the calculation by pressing the pause button.
2 Save and leave the application. Saving and leaving are not allowed when the calculation is not paused.
3 Perform the tasks outside the Probabilistic Toolkit.
4 Open the Probabilistic Toolkit and open the toolkit file.
5 Specify which tasks have been performed.
6 Press the run button. The Probabilistic Toolkit will resume the calculation when all tasks have been performed. This step may also be taken before the previous step.
Note: When the calculation is stopped, all generated task files will be removed.
5.2.2 Input file
When using an external model (see section 5.2.1), an input file is usually required for running. When the input file is specified, the user should indicate which numeric values in the input file are stochastic variables. The same applies to the output: the user should specify which numeric values are output and desired within the Probabilistic Toolkit.
The open button opens the file in the application belonging to the model. This application can be set in the 'Run options' tab; if it is not available, this button will not be displayed.
In the 'Run options' the user should specify whether only this file is used by the model or other files as well. For example, a number of files in the same directory could be used too. The Probabilistic Toolkit needs this information when performing a model run: it makes a copy of the input file or a copy of the directory in which the input file is located.
If the output file is different from the input file, this should be indicated in the 'Run options'.
When scenarios are used, the user should indicate in the input templates whether different values are to be used in the 'Variable per scenario' column.
5.2.2.1 Multiple scenarios
Scenarios are useful when one wants to correct for realizations which were predicted by the model as failing, but have been observed as surviving. See section 12.2.3.
5.2.2.2 Stochastic input file
The user must enter a number of input files, each with a probability. The probabilities should sum up to 1.
Per scenario a set of input files should be selected, if scenarios are applicable and have input files (see section 5.2.2.1).
5.2.3 Parameters
5.2.3.1 Parameter definitions
File: The file which should be used. This is a file referenced from the main file. If left blank, the input file is used.
Variable per scenario: Input only. Indicates whether a variable should be generated for each scenario, see section 5.2.2.1.
Aggregation: Output only. Indicates how the output value is generated when multiple values are available:
First: takes the first value;
Last: takes the last value;
Min: takes the minimum value;
Max: takes the maximum value;
Mean: uses the mean of all values;
Sum: the sum of all values;
Height: in case of a series, the difference between the most extreme value and the base level of the series (see section 7.3);
Area: in case of a series, the area enclosed between the base level and the series values (see section 7.3).
A variable name is composed from the items above as follows (run, file and variation are omitted if they have default values):
The Probabilistic Toolkit provides drop down lists from which the user can select the desired value. This happens for both input and output. It is therefore necessary that the calculation has already run once; otherwise the output characteristics are not available. The user can enforce this by executing the command 'Generate output files'.
5.2.3.2 Components
Components are used to distinguish numeric values which have the same caption and attribute
in the parameter definition (see section 5.2.3.1).
The user can make a selection of the numeric values he is interested in.
Groups can be derived automatically from the input file by using the 'group by' field.
The 'proportional to highest value' indication, which becomes available when merge type group or collection is selected, indicates whether the value of the variable is assigned to each component or whether a proportional value, relative to the highest value, is assigned to the components.
Component decompositions are shared between parameter definitions with the same caption. If this is not desired, a new component decomposition can be defined by using the green 'plus' button.
5.2.3.3 File values
When the input file is changed later, the values from the file will be used. This is useful when the Probabilistic Toolkit runs in an automated environment.
5.2.3.4 Model factor
The model factor is applied to the output variables which have "Apply model factor" checked. The output of the model is multiplied with or divided by the model factor.
In case a composite model is used (see section 5.6), each individual model can have its own model factor.
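Expressed as a simple formula (a sketch; y denotes the raw model output and m the model factor), the applied output is y × m when multiplying, or y / m when dividing.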
5.2.4.3 Config files
The following code is an example of a config file, which contains a nested structure and an array of numeric values in Values:
[Location]
Name=North
[Soil]
Name=Sand
Weight=18.2
Values=[5.3 6.7 8.0 9.8]
[Soil]
Name=Clay
Weight=16.1
Values=[2.3 4.5 8.8]
[End of Location]
[Location]
Name=South
[Soil]
Name=Sand
Weight=17.6
Values=[5.1 6.4 7.7 9.1]
[End of Location]
5.2.4.4 Table files
A header line is expected, followed by a number of value lines. Between the header line and the values a limited number of other lines may appear. The number of headers is expected to be the same as the number of values in one line.
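For example, a table file with two headers and three value lines could look as follows (illustrative values):
Weight Level
18.2 5.3
16.1 6.7
17.6 8.0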
5.2.4.5 Other ascii files
If non numeric values appear in the file, such a value is used as an attribute. If multiple numeric values appear after the non numeric value, attributes followed by an index are generated. But if these numeric values are between square brackets, they will be handled as a series in the same way as in section 5.2.4.3. An example of such a file is given below:
North Sand Weight 18.2 Values [5.3 6.7 8.0 9.8]
North Clay Weight 16.1 Values [2.3 4.5 8.8]
South Sand Weight 17.6 Values [5.1 6.4 7.7 9.1]
5.2.4.6 Keyword enriched files
In a keyword enriched file, keywords are placed between '%' signs. When such a file is used in a calculation, the keywords (the whole section between the '%' signs) are replaced with numeric values, which would result in the first example in section 5.2.4.5.
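As an illustration, such a file could look as follows (the keyword names are hypothetical; replacing them with numeric values yields the first example in section 5.2.4.5):
North Sand Weight %WeightSandNorth% Values [5.3 6.7 8.0 9.8]
North Clay Weight %WeightClayNorth% Values [2.3 4.5 8.8]
South Sand Weight %WeightSandSouth% Values [5.1 6.4 7.7 9.1]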
5.2.4.9 Proprietary files
His and Nefis files: binary formats which contain parameter/location/time values;
CMT files: Case Management Tool files, which refer to other files;
Sobek-3 files: ini files, but with some specific syntax to cover tables.
5.2.5 Running
When running a file based model, the following can be specified:
5.2.5.1 Arguments
Command line arguments are used, at least to specify which input file is to be used. To specify
the input location, use the following codes in the argument text field:
{input}: the file name of the input file, including the full path;
{file}: the file name of the input file, without the path;
{name}: the file name of the input file, without the path and without the extension;
{index}: the generated suffix of the input file or directory;
{dir}: the directory where the input file is located, also accessible as {directory};
{parentdir}: the parent directory of the directory where the input file is located, also accessible as {parentdirectory};
{source}: the full path to the original input file;
{calcdir}: the directory where the executable is located;
{realization}: the sequence number of a realization;
{round}: the round number in case of a composite model.
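For example, for an executable that expects the input file followed by an output file, the argument field could read as follows (the output file name is illustrative):
{input} {dir}\results.txt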
5.2.5.2 Working directory
It can be specified whether to clean up the copied files or directories after the run. By default this option is set, but it might be turned off when one wants to inspect the generated files.
The arguments can define variables. They will appear in the list of variables.
5.2.5.4 Unsuccessful runs
In case a run is not successful, it can be retried a few times. If it is still not successful, the algorithm decides how to handle it.
5.3 Imported models
5.3.1 Fragility curves
5.3.1.1 Define fragility curves
To enter a fragility curve, a table must be populated with failure probabilities per value of a certain parameter. This parameter, which will appear in the variables and to which a stochastic distribution is assigned, is fixed for each probability value in the fragility curve.
Often the water level is used for this parameter, but not necessarily so.
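For example, a fragility curve for the water level could be entered as follows (illustrative values):
Water level (m)    Probability of failure
1.0                1e-6
2.0                1e-4
3.0                1e-2
4.0                0.5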
The fragility curve model is treated as any other model, which means that it receives input data and produces output data. For each fragility curve in the model, one input value is accepted. This input value denotes the reliability, and its variable definition is a standard normal distribution. Since this input value is fixed and cannot be altered by the user, it is hidden and therefore not visible in any table. The output value is the critical value corresponding to the input value.
5.3.1.2 Import fragility curves
Fragility curves can be imported from the following sources:
Probabilistic Toolkit file (*.tkx): another Probabilistic Toolkit file in which fragility curves have been calculated, see section 12.4.4;
Fragility curve file (*.json): a file containing a fragility curve, generated by another application.
To import fragility curves, use the import button. In the dialog which appears, select the Prob-
abilistic Toolkit file from which to import the fragility curves and select the fragility curves.
This results in a copy of the fragility curves of the original file. In addition to the self defined fragility curves, for each point in the fragility curve the design point is added. This results in more precise reliability calculations.
When the fragility curve in the referenced file is renewed, a warning message appears that the imported fragility curve is out of date. The message has an option to import the fragility curve again.
A new calculation can be created from the File, New from menu. Select the ’External fragility
curve’ command and the user will be asked for a file containing the fragility curves. In this
case only the json file can be selected.
Select ’Internal fragility curve’ when the fragility curve is available in the currently loaded
calculation.
The Probabilistic Toolkit will be set up for a reliability calculation using the fragility curves.
The user only has to define the variable over which the fragility curve will be integrated.
5.3.1.3 Composite fragility curves
4 In the panel which appears to the right of the fragility curves, add all fragility curves of which the composite fragility curve consists;
5 In the Composing fragility curves panel, add a record and specify per fragility curve its contribution to the composite fragility curve. These contributions must add up to 1;
6 If the contribution depends on the parameter over which is integrated, enter more than one record. Between the defined contributions, the contributing fractions of the fragility curves are interpolated. Beyond the first and last contribution, the contribution remains the same as the first or last contribution.
5.3.2 Response surface
It is not necessary to import response surfaces in order to use them. They can also be used in the file where the response surfaces are derived (see chapter 13). The disadvantage of importing response surfaces is that they cannot be updated at run time. The advantage is that deriving them may take a huge computational effort, and an imported response surface can be reused several times.
5.4 Known applications
The 'Run Options' tab is hidden, because all the input fields in this tab have been preconfigured. The parameters (see section 5.2.3) appear in a simplified form, so that the user does not have to know the input file structure. When a template should be used, the 'Active' check box must be selected.
In case the preconfigured parameters are not sufficient or other advanced properties must be modified, press the 'Convert' button. This will convert the model to its underlying model type, for example an executable, and all configuration options will be available.
To go back from a converted model to an application, select the 'Application' model type. All added parameters will still be available, but other advanced properties will be reset.
A new calculation can be created from the File, New from menu. Select the 'Model input file' command and the user will be asked for an input file of one of the known applications. When using this command, some input and response data are selected automatically, and probabilistic data are copied from the model input file, if available.
5.5 No model
In case no model is selected, no model run will be performed, but variables can be added manually. This can be useful when stochastic variables are sufficient. In fact, the output variables have the same values as the input variables.
5.6 Composite model
The composite model is a sequence of the model types mentioned above. The models in this sequence run sequentially and are regarded as a whole for the probabilistic request.
In a composite model, a table is displayed where the user can add or remove sub models, or change their order. Each sub model can be of any type except, again, a composite model. Sub models should be treated as stand alone models regarding their specifications and input data.
Composite models are useful when output data from one model are used as input for a next model. The user can indicate these connections in the connections tab.
Sub models can be run once or per component (see section 5.6.2 for components). This is indicated by the 'for each component in' column in the table with all sub models.
5.6.1 Multiple rounds
Input variables for different rounds can have equal or different values, depending on the correlation factor between rounds (see ??). The input variables will always have the same stochastic definition for all rounds.
5.6.2 Connections
The user should specify which data must be transferred between models. For each input variable, he selects an output variable of a model which has run before the input variable is needed. If the value is left empty, the variable will appear in the tab Variables, where a stochastic distribution can be defined. The connections are available in the tab Connections, which appears when a composite model is selected.
When multiple components are in play, it is possible to create a connection for all components in one line. To do so, use the variable which has the component set in its name between square brackets, in the example above [Boundary] and [Location]. All variables which are part of the component will be connected to the corresponding output variable of a prior model, if "Default connection" is checked.
When multiple rounds are used, it is possible to make a connection to output variables calculated in a previous round. Variables connected in this way will still be available in the tab Variables, where their initial value can be entered.
When multiple rounds are used, it is also possible to use the round number, which is named Round. The first round is numbered 1 and the final round equals the number of rounds in the composite model definition.
6 Analysis
6.1 Analysis type
Next to the selection of the analysis type, the requested output can be selected for some analysis types.
6.2 Performance
The following options are available to reduce calculation time:
Recalculate realizations: The Probabilistic Toolkit stores the results of all model runs. In case a model run is requested which has been calculated before, the previous results can be used. If this option is set, previous results will not be reused. A reason to recalculate is when the model has changed, or when other input values have changed which are not used as an input parameter (see section 5.2.3).
Max parallel runs: The number of model runs which can be performed in parallel. The maximum value is the number of processors on the computer. A reason to limit this value is the memory needed by a model.
Use response surface: A response surface is an approximation of the real model. It should be trained and gives almost immediate response, but it remains an approximation and therefore introduces an error. See chapter 13.
6.3 Logging
This section is used to specify which data are logged.
Log messages: Textual information about the calculation. The following levels are supported: Error, Warning, Info and Debug.
Realizations: Realizations needed for a calculation are logged. They are presented in the tab 'Realizations' (see chapter 14).
Convergence: The convergence of a calculation is displayed.
6.4 Pre and postprocessing
Arguments can be used when the Python script is called, for example to specify a data source. The special phrase 'id' can be used, which takes the value of the identifier. The identifier can also be passed to the Probabilistic Toolkit; in that case it does not have to be passed as an argument.
Typical usage of preprocessing is to get input values from another data source and assign them to input values of the Probabilistic Toolkit. To get access to the Probabilistic Toolkit input values via Python, the Python interface should be used, see section 16.1.2.
Typical usage of postprocessing is saving the results of the Probabilistic Toolkit in another data source. Again, the Python interface should be used.
7 Variables
7.1 Distributions
Once the user has set up the model, including input and output, he can define the stochastic characteristics of the input variables. The list of available input parameters is derived from the model definition; each parameter is set by default to the deterministic distribution type, which indicates an invariant variable (during all calculations it will have the same value).
Depending on the task, uncertainty parameters are available. When the task is "Run Model" (except when using design values), only the mean value is available. In other cases all uncertainty parameters are available.
The user has to specify the distribution type and corresponding characteristics for each variable. Next to a number of usual distribution types, such as normal, log normal and truncated normal, a special distribution type is the table distribution type. For this distribution type the user can enter a table with ranges and occurrences. This enables the definition of a custom distribution type.
When specifying distributions, consider the possible values a variable might achieve during
probabilistic analysis. For example, if the model cannot handle values below zero, do not
select a distribution where realizations can be generated outside certain limits. So instead of
using a normal distribution, consider using a truncated normal distribution or a log normal
distribution.
For each variable, the user has to specify a number of distribution properties. Some properties
can be converted into each other. For example, the mean and standard deviation can be
converted into a scale and shape parameter (depending on the distribution type). The Probabilistic
Toolkit displays all parameters. When the user enters a mean, the location and scale are adjusted,
but the standard deviation is not changed. Conversely, when changing the scale, the mean
and standard deviation are adjusted, but not the shape parameter.
When the user has entered a distribution and selects the variable in the table, the probability
density function and the cumulative density function are displayed in a chart, to give insight into
the chosen characteristics.
When obtaining the distribution during the calculation, interpolation is applied between all
characteristic properties of the distribution, using the realized value of the variable on which
it depends. Beyond the first and last value, the properties at the first and last value are used.
The ’Variables’ tab displays the interpolated distribution for the mean value of the variable
on which it is dependent.
To enable this feature, specify per variable that it should be derived from data by setting the
Source value to Data.
Not only the distribution characteristics are calculated, but also an indication of the goodness
of fit and p-value. A goodness of fit value of 0 indicates a perfect fit and a value of 1 indicates
the worst possible fit. See section 17.4 for calculation of the goodness of fit value and p-value.
The p-value is related to the confidence level and the null hypothesis. The null hypothesis is the
assumption that the data follow the fitted distribution. The p-value tells whether this
assumption can be rejected. Suppose a confidence level of 95% is required, then the fit is
rejected when the p-value is lower than 1 - 0.95 = 0.05.
When the distribution should not be fitted, but only an indication of the goodness of fit and p-value
is required, set the Source to Test.
The result of the fit is displayed in the graphical representation of the distribution. In lighter
colors the distribution of the data is displayed. The data are fitted against a histogram distribution.
The similarity between the data and the fitted distribution gives a visual indication of the
goodness of the fit.
To enable this feature, check the ’Has prior distribution’ checkbox. A prior distribution must
be specified. The prior distribution has the same distribution type as the fitted distribution.
The result of the fit is displayed in the graphical representation of the distribution, see Figure ??.
In lighter colors the distribution of the data and the distribution of the prior are displayed.
The benefit of using data sets is that correlation factors (see section 18.1) are calculated between
all variables in the data set.
7.3 Series
The series properties define how a stochastic realization will change a series.
Usually a stochastic realization will update a series by setting a value, multiplying or shifting
all values in the series in the same way. This definition allows the user to modify the series in
a more advanced way.
Side of base level The side of the base level where the series will be modified. This can be on both sides, above or below.
Base level Horizontal level. One can think of this level as a minimum or maximum, beyond which no series modification will take place.
Modification The type of modification per point in the series:
Uniform: All points will get the same value.
Stretch: All points are stretched relative to the base level. The most extreme point will get the value applied to the series (for example, if the variable is of type value, this point will get the supplied value; if it is a shift, the shift will be applied to this point). Other points will have a modification smaller than that of the most extreme point, down to no modification at the base level (if the factor at base level is zero).
Proportional: Similar to stretch, but now the shape of the series is preserved. This means that there is a correction to each point to preserve the shape. No additional points will be added to the series; the correction only takes place at the existing X coordinates.
Factor to base level For stretch and proportional, the factor of the modification applied to points at the base level. Other points will get a factor interpolated between 1 (for the most extreme point) and this factor.
Factor at shift level The shift level is the level to which the most extreme point will be moved. Optionally, the user can set a factor for this level. The factor used for stretch and proportional will be derived by interpolation with this factor too. One can simulate a truncated series by setting this value to zero.
By using the quantile value, the user can see the effect on the series during editing.
7.4 Correlations
Next to the distributions, the correlations form an essential part of the variable definitions. The
Probabilistic Toolkit accepts correlation coefficients between any two variables. By default the
correlation coefficient between two variables is zero, meaning there is no correlation. The
allowed values are between -1 (fully anti-correlated) and 1 (fully correlated).
It is only possible to use the correlation matrix when it is positive definite. There will be an
error message if this rule is violated.
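Outside the toolkit, positive definiteness of a candidate correlation matrix can be checked with a Cholesky factorization, which succeeds exactly for positive definite matrices. A sketch using numpy (the matrix values are made up):

import numpy as np

R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])        # example correlation matrix
try:
    np.linalg.cholesky(R)              # succeeds only if R is positive definite
    print('positive definite: accepted')
except np.linalg.LinAlgError:
    print('not positive definite: the toolkit would report an error')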
To indicate the consequences of correlation coefficients, there is a chart with random realiza-
tions between two variables of the user’s choice. Based on the distributions and correlation
coefficients, the chart is populated with a number of realizations.
Arrays An internal model where variables are defined as arrays, see sec-
tion 5.1.3.
Groups Groups were used to merge components, see section 5.2.3.2.
Multiple rounds Multiple rounds were used in a composite model, see section 5.6.1.
Imported fragility curves If fragility curves (see section 5.3.1) have been imported from an-
other Probabilistic Toolkit file, the variables in the design points of
the points in the fragility curves can be correlated. Equally named
variables will be correlated. In this case the correlation between
fragility curves can be derived from the underlying design points
and variable correlations.
Imported design points If design points have been imported (see section 12.2.2), equally
named variables in different design points can be correlated. There
is an option to let the correlation factor be distance dependent (see
section 18.1.1). Therefore the location in the file from which the de-
sign point was imported, must have been specified (see chapter 15).
8 Run model
Now that the model and variables have been defined, they can be used. The simplest thing to do is
just run the model and inspect its results.
To do so, select the option ’Run model’ in the ’Analysis’ tab and then press the ’run’ button.
If it is not enabled, messages are present in the validation panel. They should be solved before
the model can be run.
To run the model, the probabilistic input is used in a limited way: the mean value of all
variables is used to calculate the result. The result consists of values for the output parameters.
The results are displayed in the evaluations tab.
8.2.2 Table
The user can also perform a number of runs and let one of the input variables vary by using the
option ’Table’. A tab named ’Table’ is added to the main tab control.
An input variable should be selected, together with the values for which to generate the table.
This can be done with the following Step type:
Table: By specifying the minimum, maximum and step size;
Variations over multiple variables can be specified. Therefore, press the button ’Add more
variables’ and a table appears in which the variables can be specified which will have variations.
All combinations of variables will be calculated.
When run, the realizations table will contain a record for each table value. A chart is displayed
giving insight into the results as a function of the selected variable value.
It is possible to make variations on more than one variable. To do so, press the button "Add
more variables". The scenario panel will switch to a mode where for each variable the input
values can be selected.
The panel which appears at the left side allows the user to select which data items in the file
should be used. This is similar to the specification of parameters (see section 5.2.3), so any
format supported by the Probabilistic Toolkit can be used, for example *.csv, *.xml, *.json,
etc. The panel in the center is used for creating the mapping between data items and variables.
The panel at the right allows selection of items in the file.
When the underlying model is the Probabilistic Toolkit, an identifier can be specified. This
value is passed to the Probabilistic Toolkit as the identifier value, see section 6.4.
A new batch calculation can be created from the ’File, New from’ menu. Select the ’Batch file’
command and the user will be asked for a file containing the batch data.
The Probabilistic Toolkit will be set up for batch processing. The user only has to define the
mappings between the batch file and the variable properties.
8.2.4 Search
The ’Search’ option finds the input variable value which leads to a specified output result. A
tab in the main tab control is added named ’Search’.
The Probabilistic Toolkit will iterate until the corresponding input variable value is found. Bisection
is used to get the input value. When completed, all realizations needed to find the result
are displayed. The last realization is the final result. These results are also displayed in the
’Search results’ panel.
When the response surface is generated, it can be used in another Probabilistic Toolkit file,
in which it is imported. This is useful when generation of the response surface takes a lot of
calculation time. The generated response surface cannot be updated during its usage any more.
9 Sensitivity
In performing a probabilistic analysis, we investigate how much influence individual parameters
have on the model output; in other words, an analysis of how much the output variables will
change due to a variation in the input values.
To do so, select the option "Sensitivity". In the tab calculation options specify what variation
of the input variables is to be used. For example, select the exceeding and non-exceeding
probability and the Probabilistic Toolkit will calculate the corresponding values of the variables,
based on the stochastic definitions entered in an earlier stage.
Now run the calculation. For each variable two calculations are made: one with the input
variable set such that it matches the exceeding probability and one such that it matches the
non-exceeding probability. All other variables are set to their expectation value.
The result is a table with all run evaluations, displaying the output variable values. A chart
displays the same information.
When correlations (see section 7.4) are used, only full correlations are used. Other correlations
are ignored.
10 Output uncertainty
An output uncertainty calculation results in the distribution of the output variables. Based on
the distribution of the input variables, different results are expected for the output.
These results are displayed for a low and a high value, which are quantiles of the distributions.
After the calculation, one can modify the quantiles and the distribution low and high values are
adapted without recalculation of the model.
The result of this calculation is a number of distributions, one for each output variable, and
a correlation matrix (not for all techniques). These distributions can be used in a subsequent
analysis, where they can be imported.
In principle, the probability of failure can be derived from output uncertainty results. However, the
probabilistic techniques focus on the failure definition and leave out as many non-interesting
calculations as possible, which leads to shorter calculation times. This depends on the selected
failure calculation technique. See chapter 20 for an explanation of the available techniques.
We split this analysis in two parts. First we will calculate the distribution of the vertical positions
at both ends and then we will use these results in a reliability calculation. In the first part we
have modeled the subsoil as two peat layers with equal properties. These properties have equal
uncertainties. The properties are partially correlated to each other, because we assume that they
are related to each other, but not equal: the properties may have slightly different values.
We run the output uncertainty calculation with the Monte Carlo method, which results in distributions
of settlements at verticals 2 and 3. These distributions are equal. Essential for the subsequent
analysis is the correlation factor found in this analysis.
We will use these results in a subsequent analysis in a new Probabilistic Toolkit file. Therefore
these results should be saved and, in the new file, in the tab "Variables", imported.
Press the import button and a dialog will be displayed, in which the user can select
the file where the uncertainties were calculated. Next the user identifies the variables in the new
file into which the uncertainties are copied. The correlation is imported too. Once imported,
the original file is not needed any more, but if it is changed after the import, the Probabilistic Toolkit
will generate a warning in the new file that the imported variables are out of date.
11 Calibration
Calibration is the process of deriving input values from given output values. In fact calibration
is a minimization process, in which a cost function is minimized. The cost function is a measure
of how much the model results differ from the observed values. Optionally, the cost function
includes how much input values differ from the initially defined values of the input parameters (keep
close to input). This option takes care that calibrated input values do not differ very much from
the initially defined values.
12 Reliability
An important analysis is the calculation of probability of failure, in other words the probability
that the model result is to be regarded as failure. For example, a model that computes water
supply might be regarded as failing if the water discharge is less than a certain amount. Or if
flooding is calculated, it might be regarded as failure if the water level is higher than a certain
level.
Once run, all realizations which were needed during the calculation are displayed under the
table Realizations. Progress is indicated during the calculation process.
The user has to select which output variable must be evaluated and when failure occurs. This
is a comparison with a critical value, which is either a fixed value or another variable. When
using another variable, the user can (but does not have to) select such a variable from a list of
additional variables. This is useful when this variable is not part of the input of one of the
models. The list of additional variables can be found under the Variables tab.
Series The combination is regarded as failing if at least one of the failure defini-
tions is failing;
Parallel The combination is regarded as failing if all failure definitions are failing.
Afterwards The failure probability for each failure definition is calculated separately
and then combined to the combined probability of failure (see ??);
Integrated Per realization the limit state value is calculated for all failure definitions
and then the minimum (at least one failing) or maximum (all failing) is
used (see section 22.3.1).
This is useful when individual design point calculations have a long calculation time. Also the
location of the design points can differ (see chapter 15), which is used when correlating
variables in the design point (see section 7.4.1). When the location is specified in a design
point, a map is displayed in the reliability results with the reliability per location. The color
scheme is fixed.
To define the realizations which must be included, the definition should be entered in the evi-
dence section. The included section serves as a condition when to take realizations into account.
This is the lower part in the panel in section 12.2.
When using Afterwards (see section 12.2.1), Bayes rule is applied. Using this rule there will
be a correction on the calculated probability of failure, because it calculates the probability of
failure only for included realizations. It is defined as follows:
$$P_{\text{failure}\,|\,\text{included}} = \frac{P_{\text{failure}\,\cap\,\text{included}}}{P_{\text{included}}} \tag{12.1}$$
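As a made-up numerical illustration: if $P_{\text{included}} = 0.5$ and $P_{\text{failure}\,\cap\,\text{included}} = 0.01$, the updated probability of failure is $0.01 / 0.5 = 0.02$.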
There are two ways of combining failure probabilities, depending on the value ’Calculation’:
Afterwards The afterwards method will combine the probabilities after they have been
calculated individually. Bayes rule is applied (see Equation (12.1)). All
individual probabilities are combined using directional sampling;
Integrated All failure definitions are combined within the processing of one realization.
If it has to be excluded, the realization is marked as not succeeded, so
it will be ignored by the probabilistic technique. For example, in Crude
Monte Carlo it does not contribute to the counting. Note that this option
should not be used in FORM, because FORM does not cover the whole parameter
space, but searches directly for the design point.
If it is known that in a specific observed situation no failure occurred, but in another (to be
assessed) situation (with another input file) one is not sure about this, it is useful to use runs
(see ??). With the same realization both the observed situation and the assessed situation
are calculated. By using results from both situations and excluding the failure of the observed
situation, the survival of the observed event is taken into account.
12.2.4 Upscaling
Upscaling is a special form of combining failure definitions. The resulting design point is
combined a number of times with itself. For example, if the design point is calculated for one
dike section, upscaling calculates the probability of failure for several dike sections. The number
of combinations is called the upscale factor f and is given by the user. The probability of at
least one failure is calculated.
The user also specifies a self correlation factor for each variable in the design point. A self
correlation of 1 means that all sections are dependent and the result would be the probability
of failure of the original design point.
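As an illustration, the two limiting cases follow from elementary probability, with $P_f$ the failure probability of the original design point; intermediate self correlations give values between these bounds:

$$\rho_{\text{self}} = 1:\quad P_{f,\text{upscaled}} = P_f \qquad\qquad \rho_{\text{self}} = 0:\quad P_{f,\text{upscaled}} = 1 - \left(1 - P_f\right)^{f}$$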
When a fragility curve is defined as function of a certain physical value and the distribution of
this physical value is known, one can calculate the probability of failure of this combination.
The failure definition should be a comparison of the critical value of the fragility curve with
the physical value.
The ’lower than’ comparison is used for descending fragility curves. The ’greater than’ comparison
should be used for ascending fragility curves. This results in the same probability when
integrating the reliabilities over the physical value domain. The ’integrate over’ option determines
automatically which comparison should be used.
Note that one can use a comparison with a fixed value. This means that one calculates the probability
of failure at a fixed value. If one has observed that in reality there was no failure, one can
use this comparison as evidence, thereby excluding all realizations where the model would
predict failure. In this way one can take into account the proven strength of a model.
The calculation options define the way the probability is calculated. The special option ’Fragility
curve integration’ is available for calculating the probability of failure of fragility curves. When
design points (see section 12.3) are available for each point in the fragility curve, their
contributions are used in the design point of the fragility curve calculation.
In reliability updating the fragility curves are kept separate. When calculating the limit state
function, all fragility curves are used and the most unfavourable result is used. Usually numer-
ical integration or numerical bisection are used.
The design point consists of a reliability index β and contributions per variable α.
The reliability index is the value of u in the standard normal distribution, which has the same
probability of non-exceeding as the probability of failure.
The following figure indicates the relation between the probability of failure and the reliabil-
ity. A subjective indication about the interpretation is added, which is also dependent on the
incurred damage when failure happens.
$$u_v = -\beta \alpha_v \tag{12.2}$$

where $u_v$ is the value of the variable in the u-space. This corresponds to a value in the x-space,
which is given in the results too. The sum of all squared alpha values (the influence factors) is
one.
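As a made-up illustration: with $\beta = 3$ and two variables with $\alpha_1 = 0.8$ and $\alpha_2 = -0.6$ (so that $\alpha_1^2 + \alpha_2^2 = 1$), the design point lies at $u_1 = -2.4$ and $u_2 = 1.8$ in u-space.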
The sign of the alpha values is important. It indicates in which direction of a parameter value
one moves into the failure area or the non-failure area.
1 Positive alpha: A higher value of the corresponding variable moves into the non-failure area (Z > 0) and thus the reliability increases;
2 Negative alpha: A higher value of the corresponding variable moves into the failure area (Z < 0) and thus the reliability decreases.
One should be able to see the same trend in the sensitivity analysis. Inspect the output variable
which is used in the failure definition and watch the sensitivity of the variable associated with
the alpha value.
The limit state point is the point on the limit state in the direction of the design point. This point
is an approximation of the point with the highest probability density which fails. In u-space
this point corresponds with the point closest to the origin (point with all 0 coordinates).
In the calculation settings, the option can be checked that the limit state point should be calcu-
lated. This results in a limit state index (the length of the u-vector in the direction of the design
point which is on the limit state) and physical values of the parameters on the limit state.
12.4.2 Table
This result type is similar to the table option of the ’Run model’ analysis, see section 8.2.2.
12.4.3 Search
This result type is similar to the search option of the ’Run model’ analysis, see section 8.2.4.
When multiple limit state functions are entered, the user decides how this influences the fragility
curves. If the combine value is left blank, for each limit state function a fragility curve is gen-
erated. If a combination is entered (series or parallel), each point in the fragility curve is com-
bined as requested and leads to one fragility curve. The combination per point is performed in
the same way as section 12.2.1.
To calculate the reliability index of the generated fragility curve, the fragility curve must be
imported in a new Probabilistic Toolkit file, see section 5.3.1.
13 Response surfaces
When performing a probabilistic analysis, most computing time will be used by the models.
They may become a limiting factor on the possibilities of probabilistic analysis, in terms of
precision and the number of variables with a statistical distribution.
Therefore response surfaces are used to simulate models. Response surfaces are used to ap-
proximate output values for a realization. They consist of a number of coefficients, which are
used to approximate the response value. Per response value there is a separate set of coeffi-
cients.
When using response surfaces during runtime, there is an option to specify whether model
results based on response surfaces should be displayed. If this option is selected, those realiza-
tions are displayed in a dimmed way.
For the function type ’Gaussian Process Regression’ it is possible to calculate an uncertainty
band, since this response surface is able to calculate the uncertainty of each predicted value.
This gives an impression of the quality of the response surface.
14 Realizations
When many columns are displayed, the user can filter them using the "Display" pane at the
right. When more than 100 columns are displayed, the user can scroll through them by using
the arrow buttons in the toolbar.
Succeeded Indicates whether a model run has been successful. If not, a tab "Log" is displayed at the bottom containing messages about this realization. The messages are generated by the model itself. If the Succeeded check box is greyed out, messages are available but not interpreted as an error.
Beta The distance in u-space to the realization.
Limit State For reliability only: the Z-value, which is the value calculated for the failure definition (see section 12.2), converted in such a way that a value less than zero means failing and greater than zero means no failure.
Response surface For response surfaces only: Indication whether a response surface was used (see chapter 13).
Limit state value For response surfaces only and if model run for comparison was checked: Limit state value by using the model instead of the response surface (see chapter 13).
Std Dev For response surfaces with Gaussian Process Regression only: The calculated uncertainty of a predicted value by the Gaussian Process Regression algorithm (see section 23.2.3).
Wrong qualification For response surfaces with Gaussian Process Regression only: The probability that a realization was wrongly classified regarding failure or non-failure in u-space (see section 23.2.3).
Weight For limited calculations only: The weight applied to a realization.
Input values Multiple columns indicating the input values.
Response values Multiple columns indicating the response values.
When a model is a multiple rounds model (see section 5.6.1) or contains series (see section 7.3),
special panels are displayed at the bottom to display the round values or series data.
The save and open buttons can be used to save the input file or open the input file in the user
interface application belonging to a model. This application must have been specified to enable this feature.
The same applies to the saved or opened realization and to the display of the realization in the
bottom tab: depending on whether the input file has been kept, the results are or are not up to
date.
15 Notes
The "Notes" tab consists of three sections. They are used for the following:
16 Python interface
Python scripts can be run in two ways: Python is in control, or the Probabilistic Toolkit is in
control.
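A minimal sketch of the ’Python is in control’ mode, using the ToolKit and Project members described in section 16.2 (the module name probabilistic_toolkit and the file path are assumptions):

from probabilistic_toolkit import ToolKit   # module name is an assumption

toolkit = ToolKit()                         # establishes a connection
project = toolkit.load(r'C:\models\example.tkx')   # made-up file path
if not project.validate():                  # empty list: no validation errors
    status = project.run()                  # waits; returns 'ok' or 'failed'
    print(status)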
After execution of this script, the Probabilistic Toolkit will perform the calculation. Then a
postprocessor can be invoked. A postprocessor will have the following structure:
1 Connect to the project
2 Retrieve the results of the calculation
project = Project()   # step 1: connect to the project
lines = []            # step 2: collect the retrieved results
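A possible continuation of this fragment, collecting the realization results and saving them to another data source (the full variable name model.variable is a made-up placeholder):

for r in project.realizations:                        # all realizations (chapter 14)
    lines.append(str(r.get_value('model.variable')))  # hypothetical full variable name
with open('results.txt', 'w') as f:
    f.write('\n'.join(lines))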
16.2 Reference
The following methods are available:
Constructor
Functionality Establishes a connection with the Probabilistic Toolkit;
Arguments None;
Finalizer
Functionality Disconnects from the Probabilistic Toolkit;
Method load
Functionality Loads a project. All subsequent operations are applied on this project. Only
one project can be loaded at a time;
Arguments full path to *.tkx file;
Returns Project;
Method save
Functionality Saves the current project;
Arguments full path to *.tkx file;
Returns Nothing;
Method exit
Functionality Optional method to stop the background server, which takes care of handling all commands. This call should not be necessary, since it is invoked automatically in the Finalizer.
Arguments None;
Returns Nothing;
Constructor
Functionality Creates a project referencing the current project in the Probabilistic Toolkit. Only call the constructor if Python is called in the preprocessor or postprocessor, otherwise use the load method in the class ToolKit;
Arguments None;
Method validate
Functionality Validates the project;
Arguments Nothing;
Returns List of strings containing error validation messages;
Method run
Functionality Runs the calculation and waits until the calculation has finished. Do not call
this method in a preprocessor or postprocessor;
Arguments Nothing;
Returns Indicator ’ok’ or ’failed’;
Property model
Functionality Gets the model instance;
Type Model;
Property settings
Functionality Gets an object containing all settings;
Type Settings;
Property identifier
Functionality Gets or sets the identifier used in section 6.4;
Type string;
Property uncertainty_variable
Functionality Gets the first uncertainty variable;
Type UncertaintyStochast;
Method get_uncertainty_variable
Functionality Gets the uncertainty variable with a given name;
Arguments string or ResponseStochast;
Returns UncertaintyStochast;
Property uncertainty_variables
Functionality Gets all resulting uncertainty variables;
Type List of UncertaintyStochast;
Property design_point
Functionality Gets the result of a reliability calculation;
Type DesignPoint;
Property design_points
Functionality Gets all resulting design points;
Type List of DesignPoints;
Property realizations
Functionality Gets all realizations;
Type List of Realizations;
Property input_file
Functionality Gets or sets the input file name of a file based model;
Type string (full path);
Property submodels
Functionality Gets a list of all sub models in case of a composite model;
Type List of Model;
Method get_submodel
Property variables
Functionality Gets a list of all variables;
Type array of strings;
Method get_variable
Functionality Gets the variable with a specified full name of the variable (name including
model name if the model is a composite model);
Arguments name of variable;
Returns Stochast or None if not found;
Property response_variables
Functionality Gets a list of all response variables;
Type List of ResponseStochast;
Method get_response_variable
Functionality Gets the response variable with a specified name;
Arguments name of variable;
Returns ResponseStochast or None if not found;
Method run
Functionality Runs the model with the input file specified for this model. This might be necessary to generate the output values in the output file, which is necessary for the Probabilistic Toolkit to perform a project.run();
Arguments None;
Returns Nothing;
Property input_file
Functionality Gets or sets the input file name of a file based model;
Type string (full path);
Method run
88 of 155 Deltares
Python interface
Functionality Runs the model with the input file specified for this submodel. This might be necessary to generate the output values in the output file, which is necessary for the Probabilistic Toolkit to perform a project.run();
Arguments None;
Returns Nothing;
Property name
Functionality Gets the name of the variable;
Type string;
Property fullname
Functionality If the model is a composite model, gets the model name and variable name,
else gets the variable name;
Type string;
Property distribution
Functionality Gets or sets the distribution type, one of: Deterministic, Normal, LogNormal, StudentT, Uniform, Triangular, Trapezoidal, Exponential, Gamma, Beta, Frechet, Weibull, Gumbel, GeneralizedExtremeValue, Rayleigh, Pareto, GeneralisedPareto, Table (= histogram), Discrete, Poisson, FragilityCurve. The distribution type is not case sensitive;
Type string;
Method get_quantile
Functionality Gets the value belonging to a given quantile;
Arguments Quantile;
Returns Value at quantile;
Method get_design_value
Functionality Gets the design value;
Arguments None;
Returns Value based on design_fraction and design_factor;
Method clear
Functionality Resets the distribution type to deterministic and removes all fragility values,
discrete values and histogram bins;
Arguments None;
Returns Nothing;
Method set_fragility_reliability_index
Functionality Adds or updates a fragility value of a variable which has distribution type
fragility curve;
Arguments Variable name;
Value of the conditional variable;
Reliability index at the value of the conditional variable;
Returns Nothing;
Method set_discrete_value
Functionality Adds or updates a discrete value of a variable which has distribution type
discrete;
Arguments Variable name;
Value of the conditional variable;
Occurrences at the value of the conditional variable;
Returns Nothing;
Method set_histogram_value
Functionality Adds a bin of a variable which has distribution type histogram;
Arguments Variable name;
Lower boundary of the bin;
Upper boundary of the bin;
Number of occurrences in the bin;
Returns Nothing;
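A hypothetical usage sketch of the variable members listed above (the variable name and the quantile are made up, and it is assumed that the variable object is obtained via get_variable as described earlier in this reference):

var = project.get_variable('phi')      # hypothetical variable name
if var is not None:
    var.distribution = 'LogNormal'     # not case sensitive
    print(var.name, var.get_quantile(0.95), var.get_design_value())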
Property name
Functionality Gets the name of the response variable;
Type string;
Property fullname
Functionality Gets the full name (name including model if the model is a composite model)
of the response variable;
Type string;
Property method
Functionality Gets or sets the calculation method;
Type Calculation method (one of: NumericalIntegration, NumericalBisection,
MonteCarlo, LatinHyperCube, DirectionalSampling, ImportanceSampling,
SubsetSimulation, Cobyla, FORM, FOSM, FragilityCurveIntegration, Ex-
perimental). The calculation method is not case sensitive;
Property start_method
Functionality Sets the calculation start method;
Type Calculation start method (one of: None, RaySearch, SensitivitySearch,
SphereSearch). The calculation start method is not case sensitive;
Method get_variable_settings
Functionality Gets an object containing calculation settings of a variable;
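A hypothetical sketch of adjusting the calculation settings; it is assumed that the settings object is reached via project.settings and that get_variable_settings takes a variable name (the names and values are made up):

settings = project.settings
settings.method = 'FORM'                      # not case sensitive
settings.start_method = 'RaySearch'
vs = settings.get_variable_settings('phi')    # hypothetical variable name
vs.start_value = 30.0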
Property name
Functionality Gets the variable name;
Type string;
Property start_value
Functionality Gets or sets the start value;
Type float;
Property name
Functionality Gets the name of the response variable;
Type string;
Method get_quantile
Functionality Gets the quantile of the uncertainty variable;
Arguments Quantile;
Returns Value at quantile;
Property identifier
Functionality Gets the numeric identifier of the design point, for example the varying parameters in a table scenario;
Type array of floats;
Method get_alpha
Functionality Gets the contribution of a variable to the design point;
Property realizations
Functionality Gets the realizations needed to calculate the design point;
Type List of Realizations;
Method get_value
Functionality Gets the value of a variable (input or response) in the realization;
Arguments Full name of the variable;
Returns Value of the stochastic variable in the realization;
Scientific Background
17 Distributions
The probability density function of the standard normal distribution is

$$\varphi(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2} \tag{17.1}$$
and the cumulative density function, or in other words the non-exceeding probability, is

$$\Phi(u) = \int_{-\infty}^{u} \varphi(v)\, dv \tag{17.2}$$
Other distribution types are converted to the standard normal distribution. The physical value in
another distribution type, called x, is converted to a value u in the standard normal distribution,
in such a way that the non-exceeding probability of x is equal to the non-exceeding probability of
u. With Φ the cumulative density function of the standard normal distribution and F the cumulative
density function of x, the converted value u is

$$u = \Phi^{-1}\left(F(x)\right) \tag{17.3}$$
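The conversion can be illustrated outside the toolkit with scipy (a log normal variable with made-up parameters is used as an example):

from scipy.stats import lognorm, norm

x = 1.5
F_x = lognorm(s=0.25, scale=1.0).cdf(x)   # non-exceeding probability of x
u = norm.ppf(F_x)                         # u with the same non-exceeding probability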
The probabilities of the standard normal distribution are displayed in the following figure.
Each distribution reflects actual values. From these values $x_i$ the following properties can be
derived:

Mean or expectation value µ: The long run average of randomly chosen values, calculated
as follows:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{17.4}$$
The distribution defines the following functions and predicts the values derived from the measurements
listed (µ, σ, etc.):

Probability density function (PDF) f(x): This is a function which indicates the likelihood
of occurrence of a randomly chosen value x, relative to other values;
Cumulative density function (CDF) F(x): This is a function which indicates the probability
that a randomly chosen value y is less than or equal to x. It is related to f(x) as

$$F(x) = P(y \leq x) = \int_{-\infty}^{x} f(y)\, dy \tag{17.7}$$
Therefore the distribution needs one or more of the following parameters (depending on the
distribution type):

Location m: An indication where the distribution is located. For some distributions the
mean µ is equal to the location m.
Scale s: An indication how much randomly chosen values differ from the location. For
some distributions the scale s is equal to the deviation σ.
Shape k: Describes the shape of the distribution.
Shift c: The distribution is shifted a certain amount. Not present in all distributions.
Minimum and maximum a and b: Minimum and maximum possible values of randomly
chosen values. Not present in all distributions.
Normal distribution:

$$\text{PDF:} \quad f(x) = \frac{1}{s\sqrt{2\pi}} \exp\left(-\frac{(x-m)^2}{2s^2}\right)$$
$$\text{CDF:} \quad F(x) = \Phi\left(\frac{x-m}{s}\right)$$
$$\text{Mean:} \quad \mu = m$$
$$\text{Deviation:} \quad \sigma = s$$
$$\text{Fit:} \quad m = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - m)^2$$
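The fit formulas above can be illustrated with numpy (the sample values are made up):

import numpy as np

x = np.array([2.1, 1.9, 2.4, 2.0, 2.2])
m = x.mean()          # m = (1/N) * sum(x_i)
s = x.std(ddof=1)     # s^2 = 1/(N-1) * sum((x_i - m)^2)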
Log normal distribution:

$$\text{PDF:} \quad f(x) = \frac{1}{(x-c)\, s\sqrt{2\pi}} \exp\left(-\frac{(\ln(x-c) - m)^2}{2s^2}\right)$$
$$\text{CDF:} \quad F(x) = \Phi\left(\frac{\ln(x-c) - m}{s}\right)$$
$$\text{Mean:} \quad \mu = c + \exp\left(m + \tfrac{1}{2}s^2\right) \iff m = \ln(\mu - c) - \tfrac{1}{2}s^2$$
$$\text{Deviation:} \quad \sigma^2 = (\mu - c)^2 \cdot \left(\exp\left(s^2\right) - 1\right) \iff s^2 = \ln\left(1 + \left(\frac{\sigma}{\mu - c}\right)^2\right)$$
$$\text{Fit:} \quad m = \frac{1}{N}\sum_{i=1}^{N} \ln(x_i - c), \qquad s^2 = \frac{1}{N-1}\sum_{i=1}^{N} \left(\ln(x_i - c) - m\right)^2$$
Triangular distribution:

$$\text{PDF:} \quad f(x) = \begin{cases} 0 & x < a \vee x > b \\ \dfrac{2(x-a)}{(b-a)(c-a)} & a \leq x \leq c \\ \dfrac{2(b-x)}{(b-a)(b-c)} & c \leq x \leq b \end{cases}$$
$$\text{CDF:} \quad F(x) = \begin{cases} 0 & x < a \\ \dfrac{(x-a)^2}{(b-a)(c-a)} & a \leq x \leq c \\ 1 - \dfrac{(b-x)^2}{(b-a)(b-c)} & c \leq x \leq b \\ 1 & x > b \end{cases}$$
$$\text{Mean:} \quad \mu = \tfrac{1}{3}(a + b + c)$$
$$\text{Deviation:} \quad \sigma^2 = \tfrac{1}{18}\left(a^2 + b^2 + c^2 - ab - ac - bc\right)$$
$$\text{Fit:} \quad a = x_{\min} - \delta, \qquad b = x_{\max} + \delta, \qquad c = 3\mu_x - (a + b)$$
with
$$\delta = \frac{x_{\max} - x_{\min}}{N}, \qquad \mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i$$
Trapezoidal distribution, with length

$$L = \frac{b - a + d - c}{2}$$

$$\text{PDF:} \quad f(x) = \begin{cases} 0 & x < a \vee x > b \\ \dfrac{x-a}{L(c-a)} & a \leq x \leq c \\ \dfrac{1}{L} & c \leq x \leq d \\ \dfrac{b-x}{L(b-d)} & d \leq x \leq b \end{cases}$$
$$\text{CDF:} \quad F(x) = \begin{cases} 0 & x < a \\ \dfrac{(x-a)^2}{2L(c-a)} & a \leq x \leq c \\ \dfrac{2x-a-c}{2L} & c \leq x \leq d \\ 1 - \dfrac{(b-x)^2}{2L(b-d)} & d \leq x \leq b \\ 1 & x > b \end{cases}$$
$$\text{Mean:} \quad \mu = \frac{1}{6L}\left(\frac{b^3 - d^3}{b - d} - \frac{c^3 - a^3}{c - a}\right)$$
$$\text{Deviation:} \quad \sigma^2 = \frac{1}{12L}\left(\frac{b^4 - d^4}{b - d} - \frac{c^4 - a^4}{c - a}\right) - \mu^2$$
Exponential distribution:

$$\text{PDF:} \quad f(x) = \frac{1}{s} \exp\left(-\frac{x-c}{s}\right)$$
$$\text{CDF:} \quad F(x) = 1 - \exp\left(-\frac{x-c}{s}\right)$$
$$\text{Mean:} \quad \mu = s + c$$
$$\text{Deviation:} \quad \sigma = s$$
$$\text{Rate:} \quad \lambda = \frac{1}{s}$$
$$\text{Fit:} \quad s = \frac{1}{N}\sum_{i=1}^{N} (x_i - c)$$
Gumbel distribution:

$$\text{PDF:} \quad f(x) = \frac{1}{s} \exp\left(-\left(\frac{x-c}{s} + \exp\left(-\frac{x-c}{s}\right)\right)\right)$$
$$\text{CDF:} \quad F(x) = \exp\left(-\exp\left(-\frac{x-c}{s}\right)\right)$$
$$\text{Mean:} \quad \mu = c + s \cdot \gamma \quad \text{(with } \gamma \text{ the Euler–Mascheroni constant)}$$
$$\text{Deviation:} \quad \sigma^2 = s^2\, \frac{\pi^2}{6}$$

Fit (Forbes, 2010), with s solved iteratively:

$$s = \frac{1}{N}\sum_{i=1}^{N} x_i - \frac{\sum_{i=1}^{N} x_i \exp\left(-\frac{x_i}{s}\right)}{\sum_{i=1}^{N} \exp\left(-\frac{x_i}{s}\right)}, \qquad c = -s \cdot \ln\left(\frac{1}{N}\sum_{i=1}^{N} \exp\left(-\frac{x_i}{s}\right)\right)$$
The Gumbel distribution can also be fitted from the decimation height D, defined by

$$F_{\text{exc}}(x_{\text{exc}} + D) = \frac{1}{10}\, F_{\text{exc}}(x_{\text{exc}}) \tag{17.8}$$

with $F_{\text{exc}}$ the exceeding probability. The distribution properties can be derived as follows from the decimation height:

$$\text{Fit from } D: \quad s = \frac{D}{z(F_{\text{exc}}) - z\left(\tfrac{1}{10} F_{\text{exc}}\right)} \approx \frac{D}{\ln(10)}, \qquad m = x_{\text{exc}} + s \cdot z\left(F_{\text{exc}}(x_{\text{exc}})\right)$$

with

$$z(F) = \ln(-\ln(1 - F))$$
Weibull distribution:

$$\text{PDF:} \quad f(x) = \frac{k}{s}\left(\frac{x-c}{s}\right)^{k-1} \exp\left(-\left(\frac{x-c}{s}\right)^{k}\right)$$
$$\text{CDF:} \quad F(x) = 1 - \exp\left(-\left(\frac{x-c}{s}\right)^{k}\right)$$
$$\text{Mean:} \quad \mu = c + s \cdot \Gamma\left(1 + \frac{1}{k}\right)$$
$$\text{Deviation:} \quad \sigma^2 = s^2 \cdot \Gamma\left(1 + \frac{2}{k}\right) - (\mu - c)^2$$

Fit (García, 1981), with k solved iteratively:

$$\frac{1}{k} = \frac{\sum_{i=1}^{N} x_i^k \ln(x_i)}{\sum_{i=1}^{N} x_i^k} - \frac{1}{N}\sum_{i=1}^{N} \ln(x_i), \qquad s^k = \frac{1}{N}\sum_{i=1}^{N} x_i^k$$
The generalized extreme value distribution is expressed in terms of the distributions above:

$$D_{\text{GEV}}(s, k, c) = \begin{cases} D_{\text{Gumbel}}(s, c) & k = 0 \\ D_{\text{Frechet}}\left(\dfrac{s}{k}, \dfrac{1}{k}, c - \dfrac{s}{k}\right) & k > 0 \\ D_{\text{Weibull, inverted}}\left(-\dfrac{s}{k}, -\dfrac{1}{k}, c - \dfrac{s}{k}\right) & k < 0 \end{cases} \tag{17.10}$$
Rayleigh distribution:

$$\text{PDF:} \quad f(x) = \frac{x-c}{s^2} \exp\left(-\frac{(x-c)^2}{2s^2}\right)$$
$$\text{CDF:} \quad F(x) = 1 - \exp\left(-\frac{(x-c)^2}{2s^2}\right)$$
$$\text{Mean:} \quad \mu = \sqrt{\frac{\pi}{2}} \cdot s + c$$
$$\text{Deviation:} \quad \sigma^2 = \frac{4-\pi}{2} \cdot s^2$$
$$\text{Fit:} \quad c = x_{\min} - \delta, \qquad s^2 = \frac{1}{2N}\sum_{i=1}^{N} (x_i - c)^2$$
with
$$\delta = \frac{x_{\max} - x_{\min}}{N}$$
Pareto distribution:

$$\text{PDF:} \quad f(x) = k \cdot \frac{s^k}{x^{k+1}}$$
$$\text{CDF:} \quad F(x) = 1 - \left(\frac{s}{x}\right)^{k}$$
$$\text{Mean:} \quad \mu = \frac{k \cdot s}{k-1}$$
$$\text{Deviation:} \quad \sigma^2 = \frac{k \cdot s^2}{(k-1)^2 \cdot (k-2)}$$
$$\text{Fit:} \quad s = x_{\min}, \qquad \frac{1}{k} = \frac{1}{N}\sum_{i=1}^{N} \left(\ln(x_i) - \ln(s)\right)$$
Generalized Pareto distribution:

$$\text{PDF:} \quad f(x) = \frac{1}{s}\left(1 + k\,\frac{x-m}{s}\right)^{-\left(\frac{1}{k}+1\right)}$$
$$\text{CDF:} \quad F(x) = 1 - \left(1 + k\,\frac{x-m}{s}\right)^{-\frac{1}{k}}$$
$$\text{Mean:} \quad \mu = m + \frac{s}{1-k}$$
$$\text{Deviation:} \quad \sigma^2 = \frac{s^2}{(1-k)^2 \cdot (1-2k)}$$
Student's t distribution:

$$\text{PDF:} \quad f(x) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)\, s} \left(1 + \frac{\left(\frac{x-m}{s}\right)^2}{\nu}\right)^{-\frac{\nu+1}{2}}$$
$$\text{CDF:} \quad F(x) = 1 - \tfrac{1}{2}\, I_{t(x)}\left(\tfrac{\nu}{2}, \tfrac{1}{2}\right)$$
with
$$t(x) = \frac{\nu}{\left(\frac{x-m}{s}\right)^2 + \nu}$$
$$\text{Mean:} \quad \mu = m$$
$$\text{Deviation:} \quad \sigma^2 = \frac{\nu}{\nu-2}\, s^2$$

Fit (Probabilityislogic, https://stats.stackexchange.com/users/2392/probabilityislogic), solved iteratively:

$$m = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}, \qquad s^2 = \frac{\sum_{i=1}^{N} w_i (x_i - m)^2}{\nu + 1}$$
with
$$w_i = \frac{(\nu + 1)\, s^2}{\nu s^2 + (x_i - m)^2}$$
Gamma distribution:

$$\text{PDF:} \quad f(x) = \frac{1}{\Gamma(k)\, s^k}\, x^{k-1} \exp\left(-\frac{x}{s}\right)$$
$$\text{CDF:} \quad F(x) = \frac{1}{\Gamma(k)}\, \gamma\left(k, \frac{x}{s}\right)$$
$$\text{Mean:} \quad \mu = k \cdot s$$
$$\text{Deviation:} \quad \sigma^2 = k \cdot s^2$$
$$\text{Fit:} \quad k = \frac{3 - z + \sqrt{(3-z)^2 + 24z}}{12z}, \qquad s = \frac{1}{kN}\sum_{i=1}^{N} x_i$$
with
$$z = \ln\left(\frac{1}{N}\sum_{i=1}^{N} x_i\right) - \frac{1}{N}\sum_{i=1}^{N} \ln(x_i)$$

where $\gamma(k, x/s)$ is the lower incomplete gamma function.
Beta distribution:

$$\text{PDF:} \quad f(x) = \frac{x^{k_1 - 1}\,(1-x)^{k_2 - 1}}{B(k_1, k_2)}$$
$$\text{CDF:} \quad F(x) = I_x(k_1, k_2)$$
$$\text{Mean:} \quad \mu = \frac{k_1}{k_1 + k_2}$$
$$\text{Deviation:} \quad \sigma^2 = \frac{k_1 k_2}{(k_1 + k_2)^2\,(k_1 + k_2 + 1)}$$
$$\text{Fit:} \quad k_1 = \left(\frac{1-m}{s^2} - \frac{1}{m}\right) m^2, \qquad k_2 = k_1\left(\frac{1}{m} - 1\right)$$
with
$$m = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad s^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - m)^2$$

where $B(k_1, k_2)$ is the beta function and $I_x(k_1, k_2)$ the regularized incomplete beta function.
Discrete distribution, with weights $w_i$ at values $x_i$:

$$\text{PDF:} \quad f(x) = \begin{cases} \dfrac{w_i}{\sum_{j=1}^{N} w_j} & x = x_i \\ 0 & x \neq x_i \end{cases}$$
$$\text{CDF:} \quad F(x) = \sum_{x_i \leq x} f(x_i)$$
$$\text{Mean:} \quad \mu = \frac{\sum_{i=1}^{N} w_i \cdot x_i}{\sum_{i=1}^{N} w_i}$$
$$\text{Deviation:} \quad \sigma^2 = \frac{\sum_{i=1}^{N} w_i \cdot (x_i - \mu)^2}{\sum_{i=1}^{N} w_i}$$
When fitted, the aim is to generate 100 non empty bins, all with the same width. The lowest
bin has a lower boundary a and the highest bin has an upper boundary b.
If the lowest fitted value appears multiple times (unrounded), it is assumed that this value is
a lower limit and no values lower than this value are possible. A bin with width zero and
boundaries equal to the lowest value is added to the histogram. The same is applied for the
highest model result.
$$\text{Fit:} \quad a = \begin{cases} x_{\min} - \delta & N(x_{\min}) = 1 \\ x_{\min} & N(x_{\min}) > 1 \end{cases}, \qquad b = \begin{cases} x_{\max} + \delta & N(x_{\max}) = 1 \\ x_{\max} & N(x_{\max}) > 1 \end{cases}$$

where $N(x_{\min})$ is the number of appearances of value $x_{\min}$, $N(x_{\max})$ the number of appearances of value $x_{\max}$, and $\delta = \frac{x_{\max} - x_{\min}}{N}$.
The goodness of fit value k is the largest distance between the fitted CDF and the empirical CDF of the N (sorted) data values:

$$k = \max_{i,\,\delta\in\{0,1\}} \left| F(x_i) - \frac{i-\delta}{N} \right| \tag{17.11}$$

The p-value follows from

$$\lim_{N\to\infty} p = 1 - \frac{k}{\sqrt{N}} \tag{17.12}$$
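A sketch of Equation (17.11) using numpy (F is the fitted CDF and x the sorted data; both are made up here):

import numpy as np
from scipy.stats import norm

x = np.sort(np.random.default_rng(1).normal(size=50))   # made-up data
N = len(x)
F = norm.cdf(x)                                         # fitted CDF at the data values
i = np.arange(1, N + 1)
k = max(np.abs(F - i / N).max(), np.abs(F - (i - 1) / N).max())   # delta in {0, 1}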
The goodness of fit is also applied to observation sets in the tab "Source". This is an indication
whether two series of observations originate from the same measurements.
data which are only applicable in a certain situation. The resulting updated distribution is
applicable in that situation. The advantage is that even with a limited number of measurements
a meaningful distribution can be derived.
Fitting a posterior distribution is only possible for normal distributions. The updated
posterior distribution is calculated as follows [Lynch (2007)]:

$$\mu_{\text{post}} = \frac{n \cdot \sigma^2_{\text{prior}} \cdot \mu_{\text{data}} + \sigma^2_{\text{data}} \cdot \mu_{\text{prior}}}{n \cdot \sigma^2_{\text{prior}} + \sigma^2_{\text{data}}} \tag{17.13}$$

and

$$\sigma^2_{\text{post}} = \frac{\sigma^2_{\text{data}} \cdot \sigma^2_{\text{prior}}}{n \cdot \sigma^2_{\text{prior}} + \sigma^2_{\text{data}}} \tag{17.14}$$
where
µpost , σpost are the mean and standard deviation of the updated (or posterior) distribu-
tion;
µprior , σprior are the mean and standard deviation of the prior distribution;
µdata , σdata are the mean and standard deviation of the distribution fitted from the data
only;
n is the number of data values;
Design values are values derived from the stochastic definition as follows:

$$V_{\text{design}} = \frac{Q(q)}{\gamma} \tag{17.15}$$

where Q(q) is the value of the variable at quantile q (the design fraction) and γ is the design factor.
The Pearson correlation factor, which is used in the Probabilistic Toolkit, is calculated as follows:

$$\rho_{p,q} = \frac{\sum_i u_{p,i}\, u_{q,i}}{\sqrt{\sum_i u_{p,i}^2}\, \sqrt{\sum_i u_{q,i}^2}} \tag{18.1}$$

where $u_{p,i}$ and $u_{q,i}$ are the realized values of variables p and q in u-space.
The distance dependent correlation factor between design points p and q is calculated as

$$\rho_{t,p,q} = \rho_{t,\text{rest}} + \left(1 - \rho_{t,\text{rest}}\right) \cdot \exp\left(-\frac{D_{p,q}^2}{d_t^2}\right) \tag{18.2}$$

where $D_{p,q}$ is the distance between the locations of design points p and q, $d_t$ is the correlation distance and $\rho_{t,\text{rest}}$ is the remaining correlation at large distance.
In other words, before a model evaluation is carried out, $u_{\text{uncorrelated}}$ is converted to $u_{\text{correlated}}$.
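A common way to perform this conversion, assumed here for illustration, is multiplication by the Cholesky factor L of the correlation matrix:

import numpy as np

R = np.array([[1.0, 0.7],
              [0.7, 1.0]])                  # made-up correlation matrix
L = np.linalg.cholesky(R)
u_uncorrelated = np.random.default_rng(0).standard_normal((2, 1000))
u_correlated = L @ u_uncorrelated           # correlated standard normal samples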
The major drawback of this method is that combined effects of input variables are not taken into
account.
The first step consists of two Monte Carlo simulations of N samples each, leading to the sets
A and B. Of the combined set, the variance v of the model results is calculated.
Then for each variable p new sample sets $A^p$ and $B^p$ are generated, with values for sample i:

$$u_{i,j,A^p} = \begin{cases} u_{i,j,B} & j = p \\ u_{i,j,A} & j \neq p \end{cases} \tag{19.1}$$

and

$$u_{i,j,B^p} = \begin{cases} u_{i,j,A} & j = p \\ u_{i,j,B} & j \neq p \end{cases} \tag{19.2}$$
Then the model values z belonging to samples u are calculated. The first order index and total
index of variable p are calculated as follows:
$$I_{\text{first order}} = \frac{1}{N} \cdot \frac{1}{v} \cdot \sum_{i=1}^{N} \left( z(A_i)\, z\left(B_i^p\right) - z(A_i)\, z(B_i) \right) \tag{19.3}$$

and

$$I_{\text{total}} = \frac{1}{2N} \cdot \frac{1}{v} \cdot \sum_{i=1}^{N} \left[ z(A_i) - z\left(A_i^p\right) \right]^2 \tag{19.4}$$
The total index $I_{\text{total}}$ gives the relative influence of the parameter on the model result.
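A sketch of Equations (19.1) to (19.4) using numpy (the model z and the sample sets are made up; two standard normal input variables):

import numpy as np

rng = np.random.default_rng(0)
N = 10000
A = rng.standard_normal((N, 2))
B = rng.standard_normal((N, 2))
z = lambda u: u[:, 0] + 0.5 * u[:, 1] ** 2          # made-up model

v = np.var(z(np.vstack((A, B))))                    # variance of the combined set
p = 0                                               # variable to assess
Ap = A.copy(); Ap[:, p] = B[:, p]                   # Equation (19.1)
Bp = B.copy(); Bp[:, p] = A[:, p]                   # Equation (19.2)
I_first = np.mean(z(A) * z(Bp) - z(A) * z(B)) / v   # Equation (19.3)
I_total = np.mean((z(A) - z(Ap)) ** 2) / (2 * v)    # Equation (19.4)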
The Crude Monte Carlo simulation results in a histogram distribution (see section 17.3.22) or,
if all model results are equal, a deterministic distribution (see section 17.3.1). The distribution
is fitted with all model results.
Importance sampling is useful when the user is interested in the tail of the distribution.
20.4 FORM
FORM (First Order Reliability Method) works like FORM in reliability analysis (see section 22.1.9).
Starting from the origin in u-space, steps of fixed length are taken along the steepest gradient.
For each step a realization is performed and its result is added to a CDF curve (see section 17.3.21).
20.5 FOSM
FOSM (First Order Second Moment) is a much faster technique than Crude Monte Carlo. It
takes the gradient in the origin and then it predicts the further shape of the result distributions.
It assumes the results have a normal distribution.
Although very fast, its assumption that results have a normal distribution is often not true.
But for a quick impression of the output uncertainty it can be very useful.
The cost function C is calculated as

$$C = \sqrt{\sum_i \left(\frac{m_{\text{model},i} - m_{\text{observed},i}}{\sigma_i}\right)^2} \tag{21.1}$$
where
mmodel, i is the calculated result value (by the model) of the ith parameter in the current
realization;
mobserved, i is the observed mean value of the ith parameter;
σi is the deviation of the ith parameter
In case the observed value and the calculated result value refer to series, all individual observed
points and their related calculated values are represented as a parameter in Equation (21.1). But
to prevent over-emphasis of series over individual parameters or series with many points over
series with few points, the deviation of a point in a series is corrected as follows:
$$\sigma_{\text{point}} = \sigma_{\text{series}} \times \sqrt{N} \tag{21.2}$$

where N is the number of points in the series.
21.2.1 Grid
A grid search has a number of dimensions, called N. Each dimension contains a number of
numeric items. All item combinations, also called points, are evaluated with the cost function.
In essence, the grid search does no more than sequentially evaluating all these points and returning
the point with the minimum cost value.
Each dimension has a number of items which should be evaluated. The items are defined as
a range with a minimum and maximum value and the number of steps into which the range
should be subdivided.
Possibly an item cannot be evaluated, because the realization cannot be calculated. It will
be ignored in the optimization process.
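A sketch of such a grid search (the cost function and the dimension ranges are made up):

import itertools
import numpy as np

def cost(x, y):                       # made-up cost function
    return (x - 1.2) ** 2 + (y + 0.4) ** 2

xs = np.linspace(0.0, 2.0, 11)        # dimension 1: minimum/maximum/steps
ys = np.linspace(-1.0, 1.0, 11)       # dimension 2: minimum/maximum/steps
best = min(itertools.product(xs, ys), key=lambda p: cost(*p))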
Only dimensions with at least three items can be moved and the items should be defined as
minimum/maximum/steps. Per dimension this feature is optional.
This process will be repeated until the optimum is not on the edge of the dimension area any
more.
Only dimensions with at least two items can be refined and the items should be defined as
minimum/maximum/steps. Per dimension this feature is optional.
Several refinement steps can be taken. Refinement stops after a provided number of refinement
steps.
The evolution starts from a population of randomly (parabolic) generated individuals and is an
iterative process, with the population in each iteration called a generation. In each generation,
the safety factor (fitness) of every individual in the population is evaluated. This happens
through a given limit equilibrium method (the fitness function). Slip planes with a low safety
factor are considered to be more fit.
When evolving from one generation to the next, two individuals are combined (see ?? and sec-
tion 21.2.2.2) to make a new genome (individual). This genome is possibly modified (mutated,
see section 21.2.2.4) to form a new child. Two children “fight” to decide which one will go to
the next generation, see section 21.2.2.5. Elitism is introduced (section 21.2.2.5) to make sure
the best solution thus far will not be lost. The algorithm terminates when a given number of
generations have passed.
21.2.2.1 Initialization
Like in any other optimization method an initial guess of the final outcome is required. This
guess concerns a choice of all individual genes in the population. In many problems the in-
fluence of the individual genes is fully independent. By implication, a random choice of the
values of the genes between its appropriate limits is adequate.
In some problems a weak connection between the gene values is observed. An example is a
free slip plane in slope analysis. Such a plane must be as flexible as possible in order to allow
a wide variety of shapes. However, buckling planes will not survive as they have a surplus of
resisting power. It is better to exclude these already at initialization.
In the uniform case every gene has its own random number. The corresponding value of the
gene ranges between its minimum and maximum value. These bounds may even be different
for every gene. The parabolic case is defined by three random numbers. They are chosen in
such a way that the parabola for all genes ranges between 0 and 1.
Note that ηc is the relative position of the parabola’s extreme. If this position is located outside
the gene’s position, the available space is not uniformly filled up, since the extreme enforces
an extra offset. Therefore, it could be decided to fix the extreme always in the gene’s range,
0 < ηc < 1.
This selection type is implemented in the kernel. More advanced options exist, based on the
fitness scaling function or a remainder approach but are not (yet) implemented.
21.2.2.3 Crossover
The crossover rules realize the combination of two parents to form children for the next gen-
eration. Several types are in use, which meet certain specific demands in the problem at hand.
The crossover mask is a vector of choices: 0 for the mother and 1 for the father.
Scattered:
The father and mother are combined by having a completely random crossover mask. This
works well for a small genome with independent genes. The crossover mask reads:
χ = (0|1)^n (21.6)
Single Point: A single point crossover will combine the first part of one parent with the
second part of the other parent. For genome types which specify a coherent object it will
facilitate a more smooth convergence. It is defined as:
χ = {0...0, 1...1} (21.7)
Double point: A double point crossover combines the middle of one slip plane with the
edges of another. The definition is:
χ = {0...0, 1...1, 0...0} (21.8)
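A sketch of applying the three crossover masks (the genome length and random choices are made up):

import numpy as np

rng = np.random.default_rng(0)
n = 8
mother, father = np.zeros(n), np.ones(n)

scattered = rng.integers(0, 2, n)                       # Equation (21.6)
single = np.r_[np.zeros(4), np.ones(4)]                 # Equation (21.7)
double = np.r_[np.zeros(2), np.ones(4), np.zeros(2)]    # Equation (21.8)

child = np.where(scattered == 0, mother, father)        # 0: mother, 1: father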
21.2.2.4 Mutation
Mutation options specify how the genetic algorithm makes small random changes in the indi-
viduals in the population to create mutation children. Mutation provides genetic diversity and
enables the genetic algorithm to search a broader space. Three options are defined, a jump,
creep and an inversion.
Jump: applies a completely random mutation. A jump helps finding the global minimum
by jumping out of a local minimum.
Creep: applies a small mutation. For example, one parameter can change 10% as a muta-
tion. The creep method applies only for one point of the genome. A Creep Reduction factor
must be specified. This reduction factor is the maximum percentage a point can change.
The mutation is random between zero and the maximum percentage.
Inversion: inverts the entire genome.
The genetic algorithm uses elitism. The elite is passed directly through to the next generation.
The size of the elite is variable.
The search terminates after a fixed number of generations. This number is variable.
$$X_{i,\text{APSO}}^{t+1} = \begin{cases} (1-\beta)\, X_i^t + \beta B_i^t + \alpha r_a & r_c < C \\ X_i^t & r_c > C \end{cases} \tag{21.9}$$

with

$$\alpha = \delta^t \tag{21.10}$$

where $B_i^t$ is the best particle found so far, $r_a$ and $r_c$ are random numbers generated per particle and generation, C is the crossover fraction, and β and δ are algorithm parameters.
The algorithm keeps an elite, which consists of the best particles found so far. The elite has size
K . To improve the population, the APSO algorithm is combined with a differential evolution
algorithm, which is defined by
$$X_{\text{DE},i}^{t+1} = \begin{cases} B_i^t + F \times \left(E_{r1,i}^t - E_{r2,i}^t + E_{r3,i}^t - E_{r4,i}^t\right) & r_c < C \\ X_i^t & r_c > C \end{cases} \tag{21.11}$$

where $E_{r1,i}^t$ ... $E_{r4,i}^t$ are randomly chosen particles from the elite and F is the differential evolution scale factor.
$$X_i^{t+1} = \begin{cases} X_{\text{APSO},i}^{t+1} & r_s < S \\ X_{\text{DE},i}^{t+1} & r_s > S \end{cases} \tag{21.12}$$
where
S is the APSO/DE separation factor, which linearly declines from 0.9 in the first
generation to 0.1 in the last generation;
rs is a random number, generated per particle and generation [0, 1];
Using a random factor, in each iteration and for each particle it is decided whether the new
particle comes from Equation (21.9) or from Equation (21.11). This is a random process which
uses the APSO/DE separation factor S. Next there is another random process, which uses the
crossover fraction C, which decides per parameter in the particle whether the value comes
from Equation (21.9) or from Equation (21.11).
21.2.4 Levenberg-Marquardt
The Levenberg-Marquardt algorithm, also known as the damped least-squares method, a steep-
est descent type of algorithm, provides a numerical solution to the problem of minimizing a
function, over a space of parameters of the function.
Levenberg-Marquardt needs a starting point, a minimal tolerated error, and the function to be
optimized.
The algorithm iterates a number of times to a situation where the drift δ is less than a predefined
value δmax . The drift is calculated as follows:
$$\delta = \sqrt{\frac{\sum_{i=1}^{N} \left(\frac{\partial f}{\partial x_i}\right)^2}{N}} \tag{21.13}$$
The iteration takes a situation (xi ) as input. In each iteration step, the algorithm calculates a
new situation (xi;new ). The relation between the old and new situation is
$$\left[\frac{\partial^2 f}{\partial x_i\, \partial x_j} \times \left(1 + \lambda_{i,j}\right)\right]_{N \times N} \times \left[x_{i;\text{new}} - x_i\right]_N = -\left[\frac{\partial f}{\partial x_i}\right]_N \tag{21.14}$$

with

$$\lambda_{i,j} = \begin{cases} \lambda & i = j \\ 0 & i \neq j \end{cases} \tag{21.15}$$
When the drift at the new situation is less than a maximum value, or if a maximum number of
iterations has taken place, the algorithm stops and delivers (xi;new ) as the solution.
21.2.5 Dud
Dud is one of the optimization algorithms which do not use any derivative of the function
being evaluated. It can be seen as a Gauss-Newton method, in the sense that it transforms
the nonlinear least square problem into the well-known linear square problem. The difference
is that instead of approximating the nonlinear function by its tangent function, Dud uses
an affine function for the linearization. For N calibration parameters, Dud requires (N+1)
sets of parameter estimates. The affine function for the linearization is formed through all
these (N+1) guesses. Note that the affine function gives the exact value at each of the (N+1)
points. The resulting least square problem is then solved along the affine function to get a new
estimate, whose cost is smaller than those of all other previous estimates. If it does not produce
a better estimate, Dud will perform different steps, like searching in the opposite direction
and/or decreasing the searching step, until a better estimate is found. Afterwards, the estimate
with the largest cost is replaced with the new one and the procedure is repeated for the new set
of (N+1) estimates. The procedure is stopped when one of the stopping criteria is fulfilled.
21.2.6 Simplex
The simplex algorithm used is the one due to Nelder and Mead, also called the Downhill Simplex Method. The simplex algorithm is one of the popular techniques for solving minimization problems without using any derivative of the cost function to be minimized. For minimization of a cost function with N parameters, it uses a polytope of (N+1) vertices. Each vertex represents a possible set of parameters, which is associated with a cost value. The minimization is performed by replacing the highest-cost vertex with a new vertex with a lower cost. The new vertex is sought systematically by trying a sequence of different steps of moving the vertex relative to the other N vertices, until a new vertex with a smaller cost is found. This procedure is repeated for the new polytope until convergence is reached, that is, when the costs of all vertices are within a predefined tolerance bound. The first trial step is simply to reflect the worst vertex with respect to the centroid of the remaining N vertices. Depending on whether the reflected cost is higher or lower than the previous ones, the algorithm will try out different steps to see if they produce a vertex with an even smaller cost. In short, the possible steps are reflection, expansion, contraction, and reduction. The figures below illustrate all these steps for a case of two parameters (N=2). The leftmost vertex is the one with the highest cost. The gray vertices are newly tried vertices.
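A minimal sketch of the method, using SciPy's Nelder-Mead implementation on a hypothetical two-parameter cost function (N = 2):

import numpy as np
from scipy.optimize import minimize

# Hypothetical cost function: the Rosenbrock function.
def cost(x):
    return (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2

# Nelder-Mead maintains a polytope of N+1 = 3 vertices and applies the
# reflection, expansion, contraction and reduction steps described above.
result = minimize(cost, x0=[-1.0, 1.0], method='Nelder-Mead',
                  options={'xatol': 1e-8, 'fatol': 1e-8})
print(result.x)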
21.2.7 Powell
The Powell algorithm, similar to the Simplex and Dud algorithms, is an optimization method which does not require any derivative of the cost function being minimized. In this method, for minimizing a function with N parameters, a user should provide an initial parameter guess as well as N search vectors. The method transforms the multi-dimensional minimization problem into a sequence of one-dimensional minimization problems. It is an iterative procedure. At each iteration, a new parameter guess is determined by a sequence of line searches, starting from the previous parameter guess. The new parameter guess can therefore be written as a linear combination of all search vectors. Afterwards, the search direction which gives the largest impact is replaced by a new search direction. The iteration is performed until convergence. The line search uses the so-called Brent's method.
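A minimal sketch using SciPy's Powell implementation; the cost function and the initial search vectors are illustrative:

import numpy as np
from scipy.optimize import minimize

# Hypothetical two-parameter cost function.
def cost(x):
    return (x[0] - 1.0)**2 + (x[1] + 2.0)**2 + x[0] * x[1]

# Powell performs successive Brent line searches along N search vectors;
# the initial search directions can be supplied via the 'direc' option.
directions = np.eye(2)   # start with the coordinate axes
result = minimize(cost, x0=[0.0, 0.0], method='Powell',
                  options={'direc': directions, 'xtol': 1e-8})
print(result.x)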
21.2.9 Broyden-Fletcher-Goldfarb-Shanno
The BFGS (Broyden-Fletcher-Goldfarb-Shanno) Algorithm is a Quasi Newton Method. It is
based on the Newton Method which iteratively determines a root of an N-dimensional function
(based on the first order Taylor approximation of this function). Since at a minimum of a
(cost-)function the gradient is equal to zero, the method can also be used as a minimization
algorithm. The algorithm needs to be able to evaluate the gradient of the function at any N-dimensional parameter vector. Since the inverse of the Hessian is also needed but hard to find, a certain approximation of this inverse is determined at each iteration (hence the term Quasi).
A user should provide an initial parameter guess (and possibly an initial approximation of
the inverse of the Hessian) and the method determines iteratively better approximations of
the actual minimum, after updating the approximation of the inverse of the Hessian. This is
done by reducing the multi-dimensional problem into a sequence of line minimizations. The
line minimizations are done by Brent’s method, which determines the minimum of the (cost-
)function on a line with starting point the current parameters and direction the so-called search
direction. The search direction is determined by the approximation of the inverse Hessian. The
iteration is performed until one of the stopping criteria is satisfied.
The L-BFGS (Limited Memory-BFGS) Algorithm does not compute and store an entire matrix
but instead uses a number of vectors of updates of the position and of the gradient, which
represent the approximation of the inverse Hessian implicitly.
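A minimal sketch of both variants, using SciPy's implementations on a hypothetical quadratic cost function with an analytical gradient:

import numpy as np
from scipy.optimize import minimize

# Hypothetical cost function and its gradient.
def cost(x):
    return np.sum((x - np.arange(len(x)))**2)

def gradient(x):
    return 2.0 * (x - np.arange(len(x)))

x0 = np.zeros(5)
# BFGS builds an approximation of the inverse Hessian from gradient updates.
res_bfgs = minimize(cost, x0, jac=gradient, method='BFGS')
# L-BFGS-B stores only a limited number of update vectors instead of the full matrix.
res_lbfgs = minimize(cost, x0, jac=gradient, method='L-BFGS-B')
print(res_bfgs.x, res_lbfgs.x)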
21.2.10 Shuffled Complex Evolution
Initially, a set of points is drawn randomly from the specified distributions. Each point consists of a set of values of the calibration parameters. For each point, a cost is assigned. These points are then ordered and grouped into "complexes" based on their costs. The next step is an iterative procedure, where the first step is to divide each complex into "simplexes" and propagate each simplex to find a new point with smaller cost using the simplex method. Afterwards, the complexes are merged back, all the points are reshuffled and regrouped into a new set of complexes. After each iteration the points will tend to become neighbours of each other around the global minimum of the cost function.
21.2.11 GLUE
The GLUE analysis consists of (1) random draws of sets of the calibration parameters from the respective distributions and (2) a run of the model with each parameter set, after which the likelihood of each set is evaluated. The user needs to select manually the most probable sets of parameters based on their likelihood. The random draw of the parameter sets can be done either from uniform distributions (with user specified ranges) or from a table of the most likely sets of the calibration parameters. For the latter, the user needs to prepare such a table manually and can use the results of a previous analysis.
When the underlying model does not succeed for a certain realization, its results will be ignored. The user should decide whether the number of non-succeeded realizations is acceptable.
The minimum and maximum values between which the integration runs are defined in the u-space. The numerical integration fills up the left-over space between these values and −8 and 8 with additional cells, so the whole integration domain is always covered. (The values −8 and 8 are used because numerical problems arise beyond them.)
This process ends when the cells which do not have a similar result represent a probability lower than the accepted difference in the reliability. This value can be supplied by the user.
22.1.3.1 Algorithm
Crude Monte Carlo is the "standard" Monte Carlo simulation. Random realizations are generated proportional to their probability density. It is counted how many realizations lead to failure and how many do not. The probability of failure is calculated as follows:
p_{failure} = \dfrac{N_{failure}}{N} \qquad (22.1)

where
N_{failure} is the number of realizations leading to failure;
N is the total number of realizations.
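A minimal sketch of Equation (22.1), assuming a hypothetical linear limit state z(u) in the standard normal space:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# Hypothetical limit state: failure when z < 0.
def z_value(u):
    return 3.0 - u[0] - 0.5 * u[1]

n_failure = 0
n_total = 100_000
for _ in range(n_total):
    u = rng.standard_normal(2)      # realization in the standard normal space
    if z_value(u) < 0:
        n_failure += 1

p_failure = n_failure / n_total          # Equation (22.1)
print(p_failure, -norm.ppf(p_failure))   # probability of failure and reliability index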
22.1.3.2 Convergence
The Monte Carlo simulation also leads to a standard deviation of the probability of failure.
This standard deviation σ p is based on a confidence level α.
\sigma_p = z \, \sqrt{\dfrac{p \, (1 - p)}{N}} \qquad (22.2)
with
z = \Phi^{-1}\!\left( 1 - \frac{\alpha}{2} \right) \qquad (22.3)
where the variation coefficient ε is defined as

\varepsilon = \begin{cases} \dfrac{\sigma_p}{p} = z \, \sqrt{\dfrac{1-p}{N \, p}} & p < \frac{1}{2} \\ \dfrac{\sigma_p}{1-p} = z \, \sqrt{\dfrac{p}{N \, (1-p)}} & p \geq \frac{1}{2} \end{cases} \qquad (22.4)
The user can set the maximum variation coefficient ε_max. The Monte Carlo simulation will stop when ε ≤ ε_max (limited by the minimum and maximum number of realizations). The confidence level α cannot be set by the user; its value is such that z equals 1.
When the underlying model does not succeed for a certain realization, its results will be ignored. The user should decide whether the number of non-succeeded realizations is acceptable.
The minimum and maximum values between which the integration runs are defined in the u-space. The Monte Carlo analysis detects with a minimum number of samples whether the left-over area fails or does not fail.
22.1.4.1 Algorithm
Importance sampling is a variation on Crude Monte Carlo (see section 22.1.3). The difference is that realizations are selected in a smarter way, preferably in the area near the limit state. Usually Monte Carlo realizations are drawn proportional to their probability density ϕ(u_i), since the Probabilistic Toolkit uses the standard normal space. With importance sampling each realization u is translated to u_imp. The Probabilistic Toolkit supports the following translation for each variable separately:

u_{var;imp} = \mu_{var} + \sigma_{var} \cdot u_{var} \qquad (22.5)

where
µ_var is the user defined mean value per variable. The combination of all variables is the mean realization of the importance sampling algorithm;
σ_var is the user defined standard deviation per variable.
A correction is applied in the calculation of failure to compensate for this translation. This
is done by giving each realization a weight, which is calculated as follows (the multiplication
with σvar is performed to compensate for the dimensionality):
w_{var} = \dfrac{\sigma_{var} \cdot \varphi\left( u_{var;imp} \right)}{\varphi\left( u_{var} \right)} \qquad (22.6)
and

W_{realization} = \prod_{variables} w_{var} \qquad (22.7)

The probability of failure is calculated from the weights of the failing realizations:

p_{failure} = \dfrac{\sum_{failing\ realizations} W}{\sum_{all\ realizations} W} \qquad (22.8)

Since

\lim_{N \to \infty} \sum_{all\ realizations} W = N \qquad (22.9)

the probability of failure can also be written as

p_{failure} = \dfrac{\sum_{failing\ realizations} W}{N} \qquad (22.10)
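A minimal sketch of Equations (22.5) to (22.10), with a hypothetical limit state and illustrative values for µ_var and σ_var:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Hypothetical limit state: failure when z < 0.
def z_value(u):
    return 3.0 - u[0] - 0.5 * u[1]

mu = np.array([2.0, 1.0])      # assumed mean per variable (Equation (22.5))
sigma = np.array([1.0, 1.0])   # assumed standard deviation per variable

n = 20_000
w_fail = 0.0
w_all = 0.0
for _ in range(n):
    u = rng.standard_normal(2)
    u_imp = mu + sigma * u                                # Equation (22.5)
    w = np.prod(sigma * norm.pdf(u_imp) / norm.pdf(u))    # Equations (22.6) and (22.7)
    w_all += w
    if z_value(u_imp) < 0:
        w_fail += w

print(w_fail / n)       # Equation (22.10)
print(w_fail / w_all)   # Equation (22.8)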
22.1.4.2 Convergence
Corresponding to Equation (22.2) and Equation (22.4), the standard deviation and variance
coefficient become:
\sigma_p = z \, \sqrt{\dfrac{p \left( W_{design\ point} - p \right)}{N}} \qquad (22.11)
and

\varepsilon = \begin{cases} \dfrac{\sigma_p}{p} = z \, \sqrt{\dfrac{W_{design\ point} - p}{N \, p}} & p < \frac{1}{2} \\ \dfrac{\sigma_p}{1-p} = z \, \sqrt{\dfrac{W_{design\ point} - (1-p)}{N \, (1-p)}} & p \geq \frac{1}{2} \end{cases} \qquad (22.12)
where
z is a value related to the confidence level (see Equation (22.3)). In the Proba-
bilistic Toolkit, z is 1;
Wdesign point is the weight of the realization of the design point;
N is the total number of realizations;
When the underlying model does not succeed for a certain realization, its results will be ignored. The user should decide whether the number of non-succeeded realizations is acceptable.
Multiple start points can be used. Therefore the number of clusters should be set to the desired number of clusters. By using the K-Means algorithm, multiple start points are detected from the samples which lead to failure in the previous loop.
(Figure: flow chart of the importance sampling loops; the steps Initialize start point, Importance sampling, Converged?, Enough failures?, Final round?, Increase variance, Use design point, Recalculate and End are explained below.)
These loops are executed until a required fraction of failed realizations εfailed is reached:
\min\left( \dfrac{N_{failed}}{N},\ 1 - \dfrac{N_{failed}}{N} \right) \geq \varepsilon_{failed} \qquad (22.13)
where
Initialize start point Find the start point with one of the start point algorithms (see
section 22.1.4.3).
Importance sampling The importance sampling algorithm (see section 22.1.4.1). The
number of samples in each importance sampling can be set to a
maximum. When the auto break option is used, the algorithm
decides when to break a loop and continue with the next loop
(see section 22.1.4.4.1).
Converged? Checks whether convergence is reached (see Equation (22.13))
Final round? Checks whether the last allowed round is reached, in order to
prevent endless loops
Recalculate Recalculates the last round with a possibly higher number of realizations.
Enough failures? Checks whether enough failures are found. Depending on this
value, the kind of modification of importance sampling settings
is determined.
Increase variance Increases the variance (see section 22.1.4.1) for the next loop.
Use design point Uses the design point found in the last loop as start point in the
next loop. In case no design point was found, the realization
closest to the limit state is used.
The loop is broken when the required number of runs with the current start point (n_additional) is more than the expected number of runs with an improved start point (n_expected):

n_{additional} > n_{expected} \qquad (22.14)

with

n_{additional} = n_{current} \cdot \left( \dfrac{W_{max}}{W_{total} \cdot \varepsilon_{weight}} - 1 \right) \qquad (22.15)

and

n_{expected} = \dfrac{2 \, (\beta_{current} + 1)}{\varepsilon_{weight}} \qquad (22.16)
where
When the criterion is not met, a new Crude Monte Carlo is taken using the k fraction of the previous Monte Carlo run. The realizations closest to failure are taken and from there new realizations are generated using a Markov chain. Per old realization, 1/k new realizations are generated. A new realization is generated in the following way, with u-values for each variable var:

u_{var;new} = \begin{cases} u_{var;prop} & r_{var} \geq R[0,1] \\ u_{var;prev} & r_{var} < R[0,1] \end{cases} \qquad (22.17)

with

r_{var} = \dfrac{\varphi\left( u_{var;prop} \right)}{\varphi\left( u_{var;prev} \right)} \qquad (22.18)
and
where
The proposed sample uprop is used if the corresponding z-value is less than the zk , which is the
highest z-value in the subset used to generate new realizations, otherwise the original sample
is used.
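A minimal sketch of the acceptance rule of Equations (22.17) and (22.18); the proposal mechanism (a normal perturbation of the previous realization) is an assumption of this sketch:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

def markov_step(u_prev, step=0.5):
    # Assumed proposal: a normal perturbation of the previous realization.
    u_prop = u_prev + step * rng.standard_normal(u_prev.shape)
    r = norm.pdf(u_prop) / norm.pdf(u_prev)           # Equation (22.18)
    accept = r >= rng.uniform(size=u_prev.shape)      # Equation (22.17), per variable
    return np.where(accept, u_prop, u_prev)

u = np.zeros(2)
for _ in range(10):
    u = markov_step(u)
print(u)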
p = k^i \cdot p_{i;MC} \qquad (22.20)

where k is the fraction of realizations used to generate new realizations, i is the number of performed subset steps and p_{i;MC} is the failure fraction found in the last subset.
22.1.6.1 Algorithm
Directional sampling is a kind of Monte Carlo simulation where the realizations are directions instead of points. A direction is a direction in the parameter space. Along this direction the point of failure is searched and its corresponding distance β to the origin is determined. The remaining probability of failure beyond this point is calculated and added to the total probability of failure as follows:
p_{failure} = \dfrac{\sum w_{dir}}{N_{realizations}} \qquad (22.21)

with

w_{dir} = \Gamma\!\left( \dfrac{N_{variables}}{2},\ \dfrac{\beta_{dir}^2}{2} \right) \qquad (22.22)

where
w_dir is the remaining probability of failure beyond the failure point in direction dir;
β_dir is the distance from the origin to the failure point along direction dir;
Γ is the regularized upper incomplete gamma function;
N_variables is the number of variables.
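A minimal sketch of Equations (22.21) and (22.22) for a hypothetical linear limit state, using the regularized upper incomplete gamma function from SciPy:

import numpy as np
from scipy.special import gammaincc

rng = np.random.default_rng(3)

# Hypothetical linear limit state z(u) = 3 - u1 - 0.5*u2; along a unit
# direction d, failure starts where z(beta_dir * d) = 0.
a = np.array([1.0, 0.5])

n_var = 2
n_dir = 5_000
weights = []
for _ in range(n_dir):
    d = rng.standard_normal(n_var)
    d /= np.linalg.norm(d)      # random direction on the unit sphere
    slope = a @ d
    if slope <= 0.0:
        weights.append(0.0)     # no failure along this direction
        continue
    beta_dir = 3.0 / slope      # distance to the limit state along d
    # Equation (22.22): remaining probability beyond beta_dir
    weights.append(gammaincc(n_var / 2.0, beta_dir**2 / 2.0))

p_failure = np.mean(weights)    # Equation (22.21)
print(p_failure)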
22.1.6.2 Convergence
The standard deviation of the failing probability is calculated as follows:
\sigma_p = \sqrt{\dfrac{\sum \left( w_{dir} - p \right)^2}{N \cdot (N - 1)}} \qquad (22.23)

and

\varepsilon = \begin{cases} \dfrac{\sigma_p}{p} & p < \frac{1}{2} \\ \dfrac{\sigma_p}{1-p} & p \geq \frac{1}{2} \end{cases} \qquad (22.24)
The directional sampling simulation stops if the variation coefficient ε is less than the user-given maximum variation coefficient ε_max. Also, the minimum and maximum number of directions must be satisfied.
The minimum and maximum number of iterations refer to an internal procedure to determine
the point along the direction where failure occurs.
Latin hypercube uses a fixed number of realizations N. The algorithm divides the u-space into N equal sections, starting from u_min and ending at u_max. The algorithm ensures that each section is represented in the set of realizations. In this way it is ensured that extreme values are used. Note that the weights of the realizations differ from each other.
If u_min and u_max differ from −8 and 8 (beyond these values no calculation is possible for numerical reasons), the remainder is regarded as one cell and is used in the failure analysis.
Latin hypercube uses the same definition for convergence as Crude Monte Carlo (see sec-
tion 22.1.3.2). It is only calculated for informative reasons, because a fixed number of realiza-
tions is used.
The advantage of Latin hypercube is calculation time, but it only gives a rough approximation of the probability of failure.
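A minimal sketch of drawing stratified realizations in the u-space with SciPy's Latin hypercube sampler:

import numpy as np
from scipy.stats import norm, qmc

# Each of the N equal-probability sections is represented exactly once per variable.
n = 100
n_var = 2
sampler = qmc.LatinHypercube(d=n_var, seed=11)
probabilities = sampler.random(n)    # stratified points in [0, 1)
u = norm.ppf(probabilities)          # transform to the standard normal space
print(u.mean(axis=0), u.std(axis=0))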
22.1.8 Cobyla
The Cobyla algorithm searches for the point in the u-space with the highest probability density which indicates failure. This point is assumed to be representative for the design point.
The Cobyla algorithm only gives a rough estimation of the reliability and the design point. It should only be used as a first method in a reliability study.
22.1.9 FORM
22.1.9.1 Algorithm
FORM is an acronym for First Order Reliability Method. The FORM procedure is not a Monte Carlo simulation, but searches directly for the design point. The design point is regarded to be representative for the total failure probability according to

p_{failure} = \Phi(-\beta) \qquad (22.25)
The FORM procedure starts from a certain user-defined start point in the parameter space. From there it tries to find a point which is closer to the design point by taking the gradient of the Z-value in the u-space of the parameters. The Z-value is an indication whether failure occurs and is derived from the failure definition. After a number of steps, the point is close enough to the design point and the calculation stops.
The FORM analysis searches for the design point: when the design point is found, the reliability index β can be evaluated as the distance between the origin and the design point (see Figure 22.8). The corresponding probability of failure is

p_{failure} = \Phi(-\beta) \qquad (22.26)

where Φ is the cumulative density function in the standard normal space (see Equation (17.2)). The probability of failure found in this way is regarded to be a good approximation of the "real" probability of failure.
To find the design point, the FORM analysis starts at a given start point, usually the origin, in
the standard normal space and iterates to the design point.
In each iteration step, the current point is moved closer to the design point. Using steepest descent, the next point is found, until a convergence criterion is fulfilled. To get from the current point u_i to the next point u_{i+1}, the predicted point u_pred is determined.
To calculate u_pred, we assume that z is linear in u close to the design point. The gradient is taken between u_i and u_pred:
\left| \frac{\partial z}{\partial u} \right| = \dfrac{z_{pred} - z_i}{u_{pred} - u_i} = - \dfrac{z_i}{u_{pred} - u_i} \qquad (22.27)

which is equivalent to

u_{pred} = u_i - \dfrac{z_i}{\left| \frac{\partial z}{\partial u} \right|} \qquad (22.28)
where z_i is the z-value at the current point u_i and z_pred is the z-value at the predicted point, which is zero by definition.
The step from the current point to the next point is a step into the direction of u_pred. The step is not taken completely, but partially with a relaxation factor f_relax, in order to prevent numerical instabilities:

u_{i+1} = u_i + f_{relax} \cdot \left( u_{pred} - u_i \right) \qquad (22.29)
The influence coefficients α_j are calculated from the gradient:

\alpha_j = \dfrac{\frac{\partial z}{\partial u_j}}{\left| \frac{\partial z}{\partial u} \right|} \qquad (22.30)
where α_j is the influence coefficient of variable j. The reliability index β is the distance of the current point to the origin:

\beta = |u_i| \qquad (22.31)
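A minimal sketch of the iteration of Equations (22.27) to (22.31), assuming a hypothetical linear limit state and a numerically evaluated gradient:

import numpy as np
from scipy.stats import norm

# Hypothetical linear limit state in the standard normal space.
def z_value(u):
    return 3.0 - u[0] - 0.5 * u[1]

def gradient(u, h=1e-6):
    # Numerical gradient of the z-value.
    g = np.zeros_like(u)
    for j in range(len(u)):
        up = u.copy()
        up[j] += h
        g[j] = (z_value(up) - z_value(u)) / h
    return g

u = np.zeros(2)    # start point: the origin
f_relax = 0.75     # relaxation factor
for _ in range(100):
    z = z_value(u)
    g = gradient(u)
    alpha = g / np.linalg.norm(g)                    # Equation (22.30)
    u_pred = u - (z / np.linalg.norm(g)) * alpha     # Equation (22.28), along the gradient
    if np.linalg.norm(u_pred - u) < 1e-8:            # convergence, section 22.1.9.2
        break
    u = u + f_relax * (u_pred - u)                   # Equation (22.29)

beta = np.linalg.norm(u)                             # Equation (22.31)
print(beta, norm.cdf(-beta))                         # reliability index and failure probability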
The benefit of this method is that it is quick. The disadvantage is that a local design point can be found, which then corresponds to a non-representative failure probability. Another disadvantage is that numerical problems may occur, since the Z-value must be continuous. One cannot determine whether a design point is a local design point, so the user must use this method with care and only after analyzing whether it is suitable for the kind of models used. Numerical problems can be detected in the convergence chart. If they occur, one can decrease the relaxation factor or modify the gradient step size (either change could improve or worsen the results).
22.1.9.2 Convergence
Convergence is reached when u_i, the value of the last iteration i, is close enough to the predicted value u_pred, so

\left| u_{pred} - u_i \right| < \varepsilon \qquad (22.32)

which by Equation (22.28) is equivalent to

\dfrac{|z|}{\left| \frac{\partial z}{\partial u} \right|} < \varepsilon \qquad (22.33)
Alternatively to Equation (22.27), the gradient is taken between u0 (all u-values equal to 0) and
upred . Then we get for the linear relation
\left| \frac{\partial z}{\partial u} \right| = \dfrac{z_{pred} - z_0}{u_{pred} - u_0} = - \dfrac{z_0}{u_{pred}} \qquad (22.34)
The value z0 is calculated with the value for z in the last iteration zi and the gradient
z_0 = z_i - \sum_j \frac{\partial z_i}{\partial u_j} \cdot u_j \qquad (22.35)
which leads to
\left| u_{pred} \right| = \dfrac{|z_0|}{\left| \frac{\partial z}{\partial u} \right|} \qquad (22.36)
The convergence criterion is defined as the relative difference in length of upred and u:
\dfrac{\left| u_{pred}^2 - u^2 \right|}{u^2} < \varepsilon \qquad (22.37)
22.1.9.3 Start point
The start point options are the same as the start point options of importance sampling (see section 22.1.4.3).
22.1.9.4 Loops
In case calculation options have been specified, which do not lead to convergence, loops can be
used to modify the calculation options. If the number of loops is greater than 1, the relaxation
factor frelax is modified until convergence is reached. The relaxation factor is modified as
follows:
f_{relax} = \frac{1}{2} \cdot f_{relax;prev} \qquad (22.38)
where f_{relax;prev} is the relaxation factor used in the previous loop.
The executable or Python script will be started with the following arguments.
To communicate with the Probabilistic Toolkit, two files are provided: modelrunner.cs and modelrunner.py, which can be found in the subdirectory External in the Probabilistic Toolkit install directory. The external failure method communicates with the Probabilistic Toolkit with the following commands.
GetZValue(float[] u): Gets the z-value for a given u-vector in the standard normal space (mandatory)
Report(float beta, float conv, int step, int maxSteps): Reports intermediate progress (optional)
SetDesignPoint(float beta, float[] alpha): Reports the design point. After this call, the Probabilistic Toolkit regards the failure calculation as completed (mandatory)
An example of a basic Monte Carlo method, with arguments "{n} {max}", is given below.

import sys
import math
import numpy as np
from scipy.stats import norm
import modelrunner

n = int(sys.argv[1])      # assumed meaning: number of variables
max = int(sys.argv[2])    # assumed meaning: maximum number of realizations

failed = 0
total = 0
while total < max:
    u = np.random.standard_normal(n)   # realization in the standard normal space
    z = modelrunner.GetZValue(u)       # ask the Probabilistic Toolkit for the z-value
    total = total + 1
    if z < 0:
        failed = failed + 1

# report the design point; the alpha values are not determined by this simple method
beta = -norm.ppf(failed / total) if failed > 0 else float("inf")
modelrunner.SetDesignPoint(beta, [0.0] * n)
\alpha_j = f_{normal} \cdot \dfrac{\sum_{failing\ realizations\ i} u_{i,j} \cdot W_i}{\sum_{failing\ realizations\ i} W_i} \qquad (22.39)

where f_normal is a factor which normalizes the vector of α values to unit length.
Although this is the fastest method, it is not recommended for Monte Carlo techniques. Its results suffer from the randomness in Monte Carlo methods, and unimportant variables may get a significant alpha value for the same reason.
Design points can be combined during the probabilistic analysis already (see section 22.3.1) or afterwards. When combining afterwards, the design points are simulated by equivalent planes. A number of methods exist to derive the combined design point from the equivalent planes (see section 22.3.2).
22.3.1 Integrated
When combining integrated, the failure definitions are merged into one failure definition. This combined failure definition is used in one of the reliability techniques (see section 22.1). When using FORM with integrated combining, results are not reliable. Therefore the Probabilistic Toolkit prevents this combination.
Each design point i is represented by an equivalent plane:

M_i(u) = \beta_i + \sum_j \alpha_{i,j} \cdot u_j \qquad (22.40)

and the planes are combined by

M_{combined}(u) = \begin{cases} \min_i M_i(u) & \text{series system} \\ \max_i M_i(u) & \text{parallel system} \end{cases} \qquad (22.41)

where β_i is the reliability index of design point i and α_{i,j} are its influence coefficients.
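A minimal sketch of Equations (22.40) and (22.41) for two hypothetical design points:

import numpy as np

# Equivalent planes (Equation (22.40)) of two hypothetical design points.
betas = np.array([2.5, 3.1])
alphas = np.array([[0.8, 0.6],
                   [0.2, 0.98]])   # one row of influence coefficients per design point

def m_values(u):
    return betas + alphas @ u      # Equation (22.40)

def m_combined(u, system='series'):
    # Equation (22.41): a series system takes the minimum, a parallel system the maximum.
    m = m_values(u)
    return m.min() if system == 'series' else m.max()

print(m_combined(np.array([0.0, 0.0])))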
P\left( X_{1 \ldots N+1} \right) = P\left( X_{1 \ldots N} \right) + P\left( X_{N+1} \cap \overline{X_{1 \ldots N}} \right) \qquad (22.42)

where \overline{X_{1 \ldots N}} denotes non-failure of the first N mechanisms.
The last term is calculated by importance sampling (see section 22.1.4). The start point of the importance sampling is set to the design point of X_{N+1}. The calculation is stopped when the remainder of the design points represents a small probability of failure.
\forall k,\ u_{min} < u_j < u_{max}:\ \exists u_i \;|\; M_k(u) = 0 \ \wedge\ u_{min} < u_i < u_{max} \qquad (22.44)
where
22.3.2.3 Hohenbichler
The kernel of the Hohenbichler algorithm is the combination of two models into M_combined (see Equation (22.41)). This combination is calculated by FORM (see section 22.1.9).
The result of this combination is a design point. This design point is represented as an equivalent plane again when it should be combined with a third design point. This is repeated until all design points have been combined.
First the two most contributing design points are combined, then the remaining design point with the highest probability of failure is added, and so on. In this way the error made by representing intermediate design points is as low as possible.
22.4 Upscaling
It is assumed that the probability of non-failure decreases exponentially with increasing upscale factor. This leads to the following calculation:

P_{f;nf} = P_{nf} \cdot \left( \dfrac{P_{2;nf}}{P_{nf}} \right)^{f-1} \qquad (22.45)
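As an illustration with assumed values, reading P_nf as the probability of non-failure at the original scale and P_{2;nf} as the probability of non-failure at upscale factor 2: with P_nf = 0.99, P_{2;nf} = 0.985 and f = 4, Equation (22.45) gives P_{4;nf} = 0.99 · (0.985/0.99)³ ≈ 0.975.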
A response surface is trained with a number of model realizations. Based on these model realizations, all values can be predicted with more or less accuracy.
It is useful to update the response surface during the calculation. During the calculation it is known which realizations to recalculate with the model, based on their distance to the limit state.
Response surfaces are defined in x-space or u-space. The advantage of x-space is that response surfaces do not need new model runs when variable distributions change.
23.2.1 Polynomial
The response value R_r is written as

R_r = a_r + \sum_{i=1}^{n} b_{r,i} \, x_i + \sum_{i=1}^{n} c_{r,i} \, x_i^2 + \sum_{i,j=1;\ i \neq j}^{n,n} d_{r,i,j} \, x_i \, x_j \qquad (23.1)
where a_r, b_{r,i}, c_{r,i} and d_{r,i,j} are the response surface coefficients and x_i are the input values.
With least squares regression, the response surface coefficients are calculated from the model
results. When there are more response surface coefficients than model runs, the cross terms (d)
and quadratic coefficients (c) are omitted. When later more model runs become available (see
section 23.2.1.1), these coefficients are added.
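A minimal sketch of calculating the coefficients of Equation (23.1) with least squares regression, for a hypothetical model with two inputs:

import numpy as np

rng = np.random.default_rng(5)

# Hypothetical model to be approximated.
def model(x):
    return 1.0 + 2.0 * x[0] - x[1] + 0.5 * x[0] * x[1]

# Training runs.
X = rng.standard_normal((20, 2))
R = np.array([model(x) for x in X])

# Row of the design matrix: constant, linear, quadratic and cross terms (Equation (23.1)).
def features(x):
    return np.array([1.0, x[0], x[1], x[0]**2, x[1]**2, x[0] * x[1]])

A = np.array([features(x) for x in X])
coefficients, *_ = np.linalg.lstsq(A, R, rcond=None)
print(coefficients)   # a, b1, b2, c1, c2, d12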
The update of the response surface coefficients does not use all available samples, but only the
ones closest to the design point found so far. The distance D from a sample run to the design
point is calculated in u-space by
D^2 = \sum_{i=1}^{n} \left( u_{i;design} - u_{i;sample} \right)^2 \qquad (23.2)
Then the n samples with the lowest distances D are used to update the response surface coefficients, where n is

n = f_{overdetermination} \cdot n_{required} \qquad (23.3)
where
f_overdetermination is the user supplied overdetermination factor (≥ 1). This value should not be too high, because then all realizations far from the interesting area are used and disturb proper predictions. The value should not be too low, because some noise could disturb proper predictions. The recommended value is 1.1;
nrequired is the number of coefficients (including quadratic and cross terms);
After each calculation of a sample using the response surface, it is evaluated whether it should be recalculated with the model. This will be done if the limit state value indicates failure and the sample represents a reliability index which is not very unlikely when compared with the design point found so far. The latter condition is true if
where
\beta^2 = \sum_{i=1}^{n} u_i^2 \qquad (23.5)
and
When samples meeting this condition are found, the calculation is stopped and the response surface is updated. Then the Monte Carlo calculation starts again. If there was no design point yet, because no failing sample was found so far, no response surface updates take place.
If this is true, the direction is recalculated with the model. Then the calculation is stopped and the response surface is updated. Then the calculation is started again. Subsequent calculations, which use exactly the same samples, will reuse prior directions which were recalculated with the model.
When a value is requested beyond the grid, the point closest to the grid is used. The linear grid response surface cannot be updated during the calculation.
When a value is requested at a point which coincides with one of the training runs, the uncertainty will be zero. A value in between will have a non-zero uncertainty. The value will be an interpolation between the trained points, taking into account the distance to these points. The distance is not the Euclidean distance, but is calculated with the kernel function. The option Matern 5/2 will give the smoothest result and is recommended.
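The implementation behind this response surface is not described here; as an illustration, a response surface with a Matern 5/2 kernel can be sketched with scikit-learn, using hypothetical training data:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(9)

# Hypothetical training runs of the model.
X = rng.uniform(-2.0, 2.0, size=(15, 1))
y = np.sin(X[:, 0]) + 0.1 * X[:, 0]**2

# nu=2.5 selects the Matern 5/2 kernel.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(X, y)

# Predictions return a value and an uncertainty; the uncertainty is (close to)
# zero at the training points and grows with the kernel distance to them.
x_new = np.array([[0.5], [3.0]])
mean, std = gp.predict(x_new, return_std=True)
print(mean, std)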