Suggest the user how to reduce the logic tree for a site-specific analysis #6867

micheles · 2021-06-15T15:13:46Z

Our Canadian friends want to run risk calculations on Vancouver and want to know how many samples they should take from the 21,000+ realizations of the full model. This is currently hard to guess and involves running many very slow calculations to check the stability of the results manually.
We could instead run a classical calculation on the site of interest with full enumeration (if possible, otherwise with a lot of samples) and then call a view

oq show clusterize_hcurves:<k>

that would group similar hazard curves into clusters (using scipy.cluster.vq.kmeans2) and print a representative for each cluster.
A possible syntax could be the following, for a case with 2187 realizations (1 source model, 7 TRTs of 3 GMPEs each, 3^7=2187) reduced to 9 clusters, assuming 5 TRTs are not relevant:

0~0[345][678]9[CDE][FGH][IJK]
0~2[345][678]A[CDE][FGH][IJK]
0~1[345][678]B[CDE][FGH][IJK]
0~1[345][678]9[CDE][FGH][IJK]
0~0[345][678]B[CDE][FGH][IJK]
0~2[345][678]B[CDE][FGH][IJK]
0~2[345][678]9[CDE][FGH][IJK]
0~1[345][678]A[CDE][FGH][IJK]
0~0[345][678]A[CDE][FGH][IJK]
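As a sanity check on the notation, here is a minimal sketch that expands the nine patterns above and counts the realizations they cover. It assumes that a bracketed group such as `[345]` means "any one of these branches" and that a bare character means a fixed branch; that reading is my interpretation of the proposed syntax, and the helper name `expand` is hypothetical.

```python
import itertools
import re

# The nine cluster patterns listed above, copied verbatim.
PATTERNS = [
    "0~0[345][678]9[CDE][FGH][IJK]",
    "0~2[345][678]A[CDE][FGH][IJK]",
    "0~1[345][678]B[CDE][FGH][IJK]",
    "0~1[345][678]9[CDE][FGH][IJK]",
    "0~0[345][678]B[CDE][FGH][IJK]",
    "0~2[345][678]B[CDE][FGH][IJK]",
    "0~2[345][678]9[CDE][FGH][IJK]",
    "0~1[345][678]A[CDE][FGH][IJK]",
    "0~0[345][678]A[CDE][FGH][IJK]",
]


def expand(pattern):
    """Yield every realization path matched by one pattern (hypothetical helper)."""
    smlt, gsim = pattern.split("~")
    # Each token is either a bracketed set of alternatives or a single character.
    tokens = re.findall(r"\[([^\]]+)\]|(.)", gsim)
    choices = [group or single for group, single in tokens]
    for combo in itertools.product(*choices):
        yield smlt + "~" + "".join(combo)


per_pattern = [len(list(expand(p))) for p in PATTERNS]
all_paths = {path for p in PATTERNS for path in expand(p)}
print(per_pattern[0], len(all_paths))
```

Under that reading each pattern covers 3^5 = 243 realizations (the five bracketed TRTs), and the nine patterns together cover all 3^7 = 2187 realizations without overlap.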

We already have a view to connect one-letter abbreviations to the branch IDs:

$ oq show branch_ids
| logic_tree      | abbrev | branch_id |
|-----------------+--------+-----------|
| source_model_lt | 0      | b1        |
| gsim_lt         | 0      | b31       |
| gsim_lt         | 1      | b32       |
| gsim_lt         | 2      | b33       |
| gsim_lt         | 3      | b11       |
| gsim_lt         | 4      | b12       |
| gsim_lt         | 5      | b13       |
| gsim_lt         | 6      | b61       |
| gsim_lt         | 7      | b62       |
| gsim_lt         | 8      | b63       |
| gsim_lt         | 9      | b71       |
| gsim_lt         | A      | b72       |
| gsim_lt         | B      | b73       |
| gsim_lt         | C      | b21       |
| gsim_lt         | D      | b22       |
| gsim_lt         | E      | b23       |
| gsim_lt         | F      | b41       |
| gsim_lt         | G      | b42       |
| gsim_lt         | H      | b43       |
| gsim_lt         | I      | b51       |
| gsim_lt         | J      | b52       |
| gsim_lt         | K      | b53       |
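To illustrate how the table could be used, the sketch below translates a cluster pattern into branch IDs. The two dictionaries are copied from the `branch_ids` output above; the function name `decode` and the bracket interpretation ("any one of these branches") are assumptions of mine, not part of the engine.

```python
import re

# Abbreviation -> branch ID, copied from the table above.
# The two logic trees are kept separate because "0" appears in both.
SMLT = {"0": "b1"}
GSIM = dict(zip(
    "0123456789ABCDEFGHIJK",
    ("b31 b32 b33 b11 b12 b13 b61 b62 b63 b71 b72 b73 "
     "b21 b22 b23 b41 b42 b43 b51 b52 b53").split(),
))


def decode(pattern):
    """Return (source model branch, list of GMPE branch choices per TRT).

    Hypothetical helper: a bracketed group yields several branch IDs,
    a bare character yields exactly one.
    """
    smlt, gsim = pattern.split("~")
    tokens = re.findall(r"\[([^\]]+)\]|(.)", gsim)
    return SMLT[smlt], [[GSIM[a] for a in (group or single)]
                        for group, single in tokens]


sm_branch, gsim_branches = decode("0~2[345][678]A[CDE][FGH][IJK]")
print(sm_branch, gsim_branches[0], gsim_branches[3])
```

Positions that decode to a single branch ID are the ones that distinguish the cluster; positions that decode to three IDs are the TRTs treated as not relevant.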

It is then possible to manually edit the files source_model_logic_tree.xml and gsim_logic_tree.xml to reduce the logic tree to 9 realizations instead of 2187, and to run the event_based_risk calculation on the reduced logic tree.

mmpagani · 2021-06-15T15:59:11Z

This is a good idea. We need to think carefully about the metric used to calculate distances (typically a key problem in cluster analysis). I would also suggest letting the user define a range of probabilities, so that only that part of each hazard curve is used in the cluster analysis.
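One way to picture both suggestions is the sketch below, which runs on synthetic curves. It is not the engine's implementation: the function name `cluster_curves`, the parameters `pmin` and `pmax`, the choice of selecting levels by the mean curve, and the log-space Euclidean distance are all assumptions made for illustration. It uses `scipy.cluster.vq.kmeans2` as proposed in the issue; the `seed` keyword needs SciPy 1.7 or later.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def cluster_curves(poes, k, pmin=1e-4, pmax=1e-1, seed=0):
    """Cluster hazard curves (hypothetical helper).

    poes: array of shape (R, L), one row of probabilities of exceedance
    per realization. Returns (labels, reps), where reps maps each cluster
    index to the realization closest to the cluster centroid.
    """
    # Keep only the intensity levels whose mean PoE lies in [pmin, pmax].
    mean = poes.mean(axis=0)
    mask = (mean >= pmin) & (mean <= pmax)
    # Assumed metric: Euclidean distance between log-probabilities,
    # so that differences at low probabilities are not swamped.
    feats = np.log(np.clip(poes[:, mask], 1e-12, None))
    centroids, labels = kmeans2(feats, k, minit="++", seed=seed)
    reps = {}
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        dist = np.linalg.norm(feats[idx] - centroids[c], axis=1)
        reps[int(c)] = int(idx[dist.argmin()])
    return labels, reps


# Synthetic data: 3 well-separated families of 5 curves each.
rng = np.random.default_rng(0)
imls = np.logspace(-2, 0, 20)
base = np.array([s * np.exp(-5 * imls) for s in (0.02, 0.1, 0.5)])
poes = np.repeat(base, 5, axis=0) * (1 + 0.01 * rng.standard_normal((15, 20)))

labels, reps = cluster_curves(poes, k=3)
print(labels, reps)
```

Changing `pmin` and `pmax` changes which part of the curve drives the clustering, which is why the metric and the probability range need to be chosen together.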

micheles added the enhancement label Jun 15, 2021

micheles assigned micheles, raoanirudh and mmpagani Jun 15, 2021
