ExaHyPE-Engine issues
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/161
AMR+Limiting leads to dispersion errors (2017-07-13, Ghost User)

![amr_limiter_error](/uploads/4f1591cfd8d0db8a2fbc76675deec3a6/amr_limiter_error.png)

* The shock is not at the correct position.
* The problem appears for both the Godunov and the MUSCL-Hancock method.
* The problem affects MUSCL-Hancock more.

The problem is not solved yet.
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/162
Combine adapters SolutionUpdate and TimeStepSizeComputation (2017-08-16, Ghost User)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/163
Have a maximum-timestep-size in the spec file (2017-07-12, Ghost User)

This value can then be used to prescribe a fixed time step size, at least as long
as the solver does not require a smaller one.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/164
Extrapolate space-time polynomials and wrap boundary condition imposition in time-integral (2017-07-12, Ghost User)

This will further be a first step towards local time stepping.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/165
Minimise limiter status based halo refinement (2017-08-15, Ghost User)

If the a-posteriori troubled cell indicators indicate a troubled cell, we refine
this cell down to the finest mesh level specified by the user.
I further have to ensure that I can place helper cell layers around the troubled cell
also on the finest mesh level. Thus, I need some halo refinement.
Currently, I refine all neighbours of a cell with limiter status 1 and 2 in order to ensure
that the diagonal neighbours are refined as well.
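The current rule can be sketched as a toy illustration (the 2D grid, the function name, and the status encoding are invented for this example and are not engine code): flagging every face neighbour of a cell with limiter status 1 or 2 covers the troubled cell's diagonal neighbours transitively, because each diagonal neighbour is a face neighbour of a status-1 cell.

```cpp
#include <vector>

// Hedged sketch, not engine code: limiter-status-driven halo refinement on a
// small 2D grid. status[i][j] is 2 for troubled cells, 1 for helper cells,
// 0 otherwise. Rule from the issue: refine every statused cell and all of its
// face neighbours; diagonal neighbours of the troubled cell are then covered
// transitively via the status-1 helper cells.
std::vector<std::vector<bool>> haloRefineFlags(
    const std::vector<std::vector<int>>& status) {
  const int nx = static_cast<int>(status.size());
  const int ny = static_cast<int>(status[0].size());
  std::vector<std::vector<bool>> refine(nx, std::vector<bool>(ny, false));
  for (int i = 0; i < nx; ++i) {
    for (int j = 0; j < ny; ++j) {
      if (status[i][j] >= 1) {
        refine[i][j] = true;  // the statused cell itself ...
        const int di[4] = {1, -1, 0, 0}, dj[4] = {0, 0, 1, -1};
        for (int k = 0; k < 4; ++k) {  // ... and its four face neighbours
          const int ni = i + di[k], nj = j + dj[k];
          if (ni >= 0 && ni < nx && nj >= 0 && nj < ny) refine[ni][nj] = true;
        }
      }
    }
  }
  return refine;
}
```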
In principle, however, I could refine only the cells which have a limiter status of 1 and
the cells which have two neighbours with a limiter status of 1.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/166
Peano Heaps and Intel MPI do not work properly together (2017-08-25, Ghost User)

(Open MPI seems to work.)
Single-node MPI tests show:

* MPI is currently not working correctly.
* TBB seems to work correctly.

It's always fun to debug MPI...

# 1 MPI+TBB

## 1.1

```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-TBB-Intel-n1-t1-c24.out
8.68453 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
8.68455 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
8.68459 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
9.77685 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
9.77688 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
9.7769 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
10.7874 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000540765
10.7875 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269703
10.7875 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
11.8094 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000810468
11.8094 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269363
11.8095 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
12.8234 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.00107983
12.8234 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269194
12.8235 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
13.8676 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 5 t_min =0.00134902
13.8676 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269109
13.8676 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
14.8835 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 6 t_min =0.00161813
14.8835 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269066
14.8836 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
15.8789 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 7 t_min =0.0018872
15.879 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269045
15.879 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
16.9168 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 8 t_min =0.00215624
16.9168 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269034
16.9168 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
17.9286 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 9 t_min =0.00242528
17.9286 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269029
17.9286 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
18.9628 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 10 t_min =0.00269431
18.9629 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269026
18.9629 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
19.9945 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 11 t_min =0.00296333
19.9945 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269025
19.9955 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
21.0106 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 12 t_min =0.00323236
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269024
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 0
```
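To compare such runs step by step, a throwaway helper (not part of ExaHyPE; name and approach are mine) can pull the `dt_min` values out of a runner log so two configurations can be diffed directly:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Throwaway debugging helper, not engine code: extract every "dt_min =<value>"
// from a runner log, in order of appearance, so that the admissible time step
// sizes of two configurations can be compared step by step.
std::vector<double> extractDtMin(const std::string& log) {
  std::vector<double> result;
  std::istringstream in(log);
  std::string line;
  while (std::getline(in, line)) {
    const std::size_t pos = line.find("dt_min =");
    if (pos != std::string::npos) {
      // "dt_min =" is 8 characters; the value follows immediately.
      result.push_back(std::stod(line.substr(pos + 8)));
    }
  }
  return result;
}
```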
## 1.2
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-TBB-Intel-n1-t24-c1.out
5.84491 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::createGrid(Repository) finished grid setup after 17 iterations
6.1414 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runAsMaster(...) initialised all data and computed first time step size
8.1216 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runAsMaster(...) plotted initial solution (if specified) and computed first predictor
8.12165 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
8.12167 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
8.12171 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
11.0395 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
13.2524 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
13.2525 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000251415 !!! DIFFERENCE !!!
13.2525 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
16.445 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
18.9004 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000521798
18.9005 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000242749
18.9005 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
22.8658 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
25.8135 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000764547
25.8136 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000235312
25.8136 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
30.957 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.000999859
30.9571 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000235312
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 13 t_min =0.00315762
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000242922
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 6
```
## 1.3
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-TBB-Intel-n1-t4-c6.out
16.3129 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
16.3129 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
16.313 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
25.2424 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
31.3638 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
31.3638 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000255681 !!! DIFFERENCE !!!
31.3639 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
40.3852 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000526063
40.3853 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000255681
40.3853 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
49.7392 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
57.1134 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000781745
57.1135 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000248662
57.1135 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
67.3553 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.00103041
67.3554 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000248662
...
...
...
175.134 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 12 t_min =0.00311021
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.00026858
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 2
```
# 2 MPI+None
## 2.1
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-None-Intel-n1-t1-c24.out
17.7097 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
17.7097 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
17.7098 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
28.1517 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
28.1518 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
28.1518 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
38.5223 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000540765
38.5223 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269703
38.5223 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
48.7899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000810468
48.79 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269363
48.79 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
59.1023 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.00107983
59.1023 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269194
...
...
...
130.907 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 12 t_min =0.00323236
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269024
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 0
```
## 2.2
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-None-Intel-n1-t24-c1.out
14.0067 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
16.5843 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
19.2036 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
19.2036 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000212189
19.2037 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
21.8836 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000482572
21.8836 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000212189
21.8836 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
25.2896 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000694761
25.2896 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000223607
25.2897 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
29.067 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.000918368
29.0671 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000229512
...
...
...
109.009 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
125.061 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 13 t_min =0.00303507
125.061 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000243652
125.061 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
125.062 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
125.062 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
125.062 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 1
```

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/167
Grid cells are not erased properly (2017-08-16, Ghost User)

... only the cell description is deleted. An empty cell remains in the grid.
Furthermore, we do not erase all cells and cell descriptions at the end of the simulation.
We rely on the OS's process manager to free the application memory.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/168
Metadata-merging currently not done properly (2017-08-14, Ghost User)

My latest changes introduced a bug into codes using AMR or the LimitingADERDGSolver:
a merge of local metadata is only performed in the first iteration.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/169
Do not impose initial conditions in every mesh refinement iteration (2017-11-21, Ghost User)

It turned out that this is quite costly.
It might be responsible for timeouts encountered during the
initial grid setup.
Min and max search:
--------------------
- This should be done once in FinaliseMeshRefinement
since we have a larger concurrency level here.
- We further should move some of the behaviour of LocalRecomputation
into FinaliseMeshRefinement.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/170
Clear out the private member variables in the solvers (2017-11-21, Ghost User)

Many of the private member variables in the ADERDGSolver, FiniteVolumesSolver, and
LimitingADERDGSolver can be computed. This includes cardinalities.
Only store the order of approximation, number of variables and number of parameters.
Compute all other cardinalities; provide getter functions.
- It would be better if the optimised solver dismissed the "getBnd.." functions and
overrode the existing, now virtual, "getUnknowns..." and "getData.." functions.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/171
Bug in LimitingADERDGSolver MPI implementation (2018-03-20, Ghost User)

The min and max are not sent correctly to the neighbour if Heap neighbour communication is
configured as non-blocking (CreateCopiesOfSentData=false).

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/172
High time outs required during mesh refinement iterations (2017-08-25, Ghost User)

I could get a p=9 243^3 Euler scenario to run (64x14 ranks), but only if
I set the timeout to a large value of 360 sec.
The issue is presumed to be related to imposing the initial conditions
on-the-fly during the grid setup iterations.
This would explain the worse behaviour in 3D as well as
for the limiting ADER-DG solver.
For the latter, the min and max are also computed on-the-fly.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/173
LimitingADERDGSolver with limiter layers = 1 crashes in certain scenarios (2017-08-29, Ghost User)

* ~~I might need to stop merging the limiter status on-the-fly if
there is only 1 limiter layer.~~
* Also affects helper-layers = 2 runs.
* The crash is prevented by choosing a stricter dmp-relaxation-parameter.
It seems to be a numerical issue with the MUSCL-Hancock scheme.
* I will check whether the issue disappears when I use the Godunov scheme again.
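For context, the dmp-relaxation-parameter mentioned above enters the a-posteriori troubled-cell check roughly as follows. This is a hedged sketch of the standard MOOD-style discrete-maximum-principle (DMP) criterion; the exact formula and the floor value in ExaHyPE may differ:

```cpp
#include <algorithm>

// Hedged sketch of the DMP troubled-cell check used in a-posteriori limiting
// (standard MOOD-style criterion; not copied from the engine). localMin and
// localMax are the min/max of the previous solution over the cell and its
// neighbours. A smaller relaxation parameter is "stricter": the admissible
// band is narrower, so more cells are marked as troubled and limited.
bool satisfiesDMP(double candidateMin, double candidateMax,
                  double localMin, double localMax,
                  double relaxationParameter) {
  // The 1e-7 floor guarding near-constant solutions is an assumption here.
  const double delta = std::max(1e-7,
                                relaxationParameter * (localMax - localMin));
  return candidateMin >= localMin - delta &&
         candidateMax <= localMax + delta;
}
```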
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/174
Generic kernels: Select fluxes,ncp similar as for optimised kernels? (2017-10-04, Ghost User)

This issue is open for discussion.
Issue
-----
Working with new ExaHyPE users has revealed that they often are not familiar
with the object-oriented programming concept of inheritance.
Especially, they do not know how to override virtual functions
of the AbstractMySolver class.
They are further not familiar with the keywords "virtual" and "override".
This issue makes it difficult for them to select the
right PDE kernels (flux,ncp,...) for their application.
Even worse: The code might even compile and run but it will not perform
the expected calculations.
Such an error is very hard to detect in practice,
especially for a new ExaHyPE user.
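The pattern users stumble over looks roughly like this (class names and the flux signature are simplified stand-ins, not the real engine API). The `override` keyword is exactly what protects against the silent-mismatch failure mode described above: with it, the compiler rejects a signature that does not match the virtual hook instead of silently adding an unrelated function.

```cpp
// Simplified stand-ins for the generated abstract solver and a user solver;
// not the real ExaHyPE API. The abstract class declares the PDE terms as
// virtual hooks with do-nothing defaults:
class AbstractMySolver {
 public:
  virtual ~AbstractMySolver() = default;
  virtual void flux(const double* Q, double** F) {}  // default: no flux
};

// The user solver must *override* the hook. Writing "override" makes the
// compiler verify the signature matches the base-class virtual function.
class MySolver : public AbstractMySolver {
 public:
  void flux(const double* Q, double** F) override {
    F[0][0] = Q[0];  // toy flux, stands in for the real PDE terms
  }
};
```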
Toolkit-based solution (open for discussion)
--------------------------------------------
JM has moved the selection of the
PDE kernels to the toolkit by requiring the
user to specify the kernels in the following way:
```
kernels const = optimised::fluxes::nonlinear // flux only
```
or
```
kernels const = optimised::fluxes::ncp::nonlinear // flux and ncp
```
or
```
kernels const = optimised::fluxes::ncp::source::nonlinear // flux and ncp, source
```
In my opinion, this is the better approach.
The "const" modifier of "kernels" indicates that the user has to
rerun the toolkit every time they select different PDE kernels.
The toolkit will then update the AbstractSolver Header file.
The compiler will deal with any inconsistencies between
the files:
* The compiler will tell you if you have not implemented a
kernel you have specified, rather than failing an assertion
as is the case right now.
(Users often do not even know about the Asserts Mode.)
* The compiler will tell you if you have implemented
a method which is not called. You should then
comment out the implementation or remove it.
What is your opinion?
---------------------
Please comment below.

(Assignee: Jean-Matthieu Gallard)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/175
Symbolic flux calculations reduce speed significantly (2019-03-21, Ghost User)

I just completed some Likwid performance measurements for the generic and optimised kernels.
For the optimised kernels, I tested a variant using "symbolic variables" in the
flux and eigenvalue computation and another variant using
classic array indexing (optimised-nonsymbolic).
The files are suffixed by a ".likwid.csv".
I further attached the measured Peano adapter times.
The files are suffixed by a ".csv".
Setup
--------
* Compressible Euler equations (Euler_Flow)
* pure ADER-DG scheme (no limiter)
* polynomial orders p=3,5,7,9;
regular 27^3 grid (3D)
* TBB threads=1,12,24.
* Intel icpc17 (USE_IPO=on).
* nonfused (3 algorithmic phases) vs. fused (a single pipelined algorithmic phase) ADER-DG implementation
* no predictor reruns occurred for the fused implementation
Preliminary Results
----------------------------
* Optimised kernels are faster than the generic ones (I kind of expected this 😉)
* Raw array access (optimised-nonsymbolic) is significantly faster than using the "symbolic variables" (optimised).
* The fused scheme pays off as long as the number of reruns is low; this is very interesting for linear PDEs (no reruns occur there).
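To make the symbolic-vs-nonsymbolic distinction concrete, here is a rough sketch of the two coding styles being compared (the wrapper and function names are invented for illustration; the real generated variables classes are richer). Both variants compute the same expression; the measurements above suggest the extra abstraction layer is not free with icpc17:

```cpp
// Rough sketch of the two flux-coding styles compared above; names invented.
// "Symbolic" style: named accessors over the conserved state vector Q
// (3D Euler layout assumed: density, 3 momentum components, energy).
struct ReadOnlyVariables {
  const double* Q;
  double rho() const { return Q[0]; }
  double j(int i) const { return Q[1 + i]; }  // momentum component i
  double E() const { return Q[4]; }
};

// Symbolic variant of one piece of the Euler flux (the mass flux).
double massFluxSymbolic(const double* Q, int direction) {
  ReadOnlyVariables vars{Q};
  return vars.j(direction);
}

// Raw array indexing ("nonsymbolic") variant of the same expression.
double massFluxRaw(const double* Q, int direction) {
  return Q[1 + direction];
}
```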
Files
-----
[Euler_ADERDG-no-output-generic.csv](/uploads/f671c93a33e70c9549853f1f51518c9d/Euler_ADERDG-no-output-generic.csv)
[Euler_ADERDG-no-output-generic.likwid.csv](/uploads/0d52ad214f0b1d6cfae4d658a9997cb5/Euler_ADERDG-no-output-generic.likwid.csv)
[Euler_ADERDG-no-output-optimised.csv](/uploads/422288cdc222b1b250b3aad9e2ad73a1/Euler_ADERDG-no-output-optimised.csv)
[Euler_ADERDG-no-output-optimised-nonsymbolic.csv](/uploads/0f84240941230d4bc9f06c76b154396b/Euler_ADERDG-no-output-optimised-nonsymbolic.csv)
[Euler_ADERDG-no-output-optimised.likwid.csv](/uploads/135bd71c668ca0c4c4ee5b7988ea2f17/Euler_ADERDG-no-output-optimised.likwid.csv)
[Euler_ADERDG-no-output-optimised-nonsymbolic.likwid.csv](/uploads/7606eefe559499a1f6de62b5f33905d0/Euler_ADERDG-no-output-optimised-nonsymbolic.likwid.csv)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/176
Bug in optimised kernel generation for limiting ader-dg solver (2017-09-06, Ghost User)

The include file is wrong: for a Limiting-ADER-DG solver the header is <MySolver>_ADERDG.h.
```
/ddn/home/jdmd33/dev/ExaHyPE-Engine/./Benchmarks/hamilton/Euler/kernels/EulerSolver/stableTimeStepSize.cpp(7): catastrophic error: cannot open source file "EulerSolver.h"
#include "EulerSolver.h"
^
```

(Assignee: Jean-Matthieu Gallard)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/177
Batching required for LTS and ExaHyPE's limiter-based refinement (2017-09-04, Ghost User)

Limiting ADER-DG in ExaHyPE
---------------------------
In ExaHyPE, we do not consider adaptive refinement for the Finite Volumes (FV)
solver.
We thus only use FV on the finest level of the adaptive mesh
if we employ a limiting ADER-DG solver.
If a limiting ADER-DG solver detects a shock, i.e. a cell where we need to
limit non-physical oscillations, we refine it down to
the finest adaptive mesh level.
And we further refine its neighbours down to the finest mesh level.
Local Time Stepping
-------------------
TBC
Local Time Stepping plus limiting ADER-DG in ExaHyPE
-----------------------------------------------------
It is possible that a cell is marked as troubled during the
local time stepping. This will potentially require
more refinement around the cell, or refinement of the cell itself.
The newly refined cells might not have done any time stepping
at all during the current LTS batch since
their parents stem from a coarser level.
They lag behind in time.
The question is what to do in those scenarios.
Batching
--------
One idea is to consider the number of local time steps to run as a batch.
We remember the solution before a batch starts.
In case (limiter-based) refinement is triggered, we memorise the cells to refine
and perform a rollback to the remembered solution.
Afterwards, we refine the memorised cells and run the batch again.
It can happen multiple times that we have to rerun a batch.
On the other hand, we ensure that our grid is always tracking a shock correctly.
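A minimal sketch of this batch-with-rollback loop (all types and names below are hypothetical, not the actual ExaHyPE API; the troubled-cell indicator is a stub standing in for the a-posteriori limiter):

```
#include <cassert>
#include <set>
#include <vector>

// Hypothetical sketch of the batch-rerun idea: remember the solution before a
// batch of local time steps and rerun the batch whenever limiter-based
// refinement is triggered. All names are invented for illustration.
struct Mesh {
  std::vector<double> solution;     // one value per cell, stands in for the full state
  std::set<int>       refinedCells; // cells already refined to the finest level
};

// Returns the cells that became troubled during the batch; empty means success.
std::set<int> runBatch(Mesh& mesh, int batchSize) {
  std::set<int> troubled;
  for (int step = 0; step < batchSize; ++step) {
    for (std::size_t cell = 0; cell < mesh.solution.size(); ++cell) {
      mesh.solution[cell] += 1.0;  // placeholder for a local time step
      // stub indicator: cell 2 is troubled until it has been refined
      if (cell == 2 && mesh.refinedCells.count(2) == 0) troubled.insert(2);
    }
  }
  return troubled;
}

int runBatchWithRollback(Mesh& mesh, int batchSize) {
  int reruns = 0;
  for (;;) {
    const std::vector<double> backup = mesh.solution;  // remember the solution
    const std::set<int> troubled = runBatch(mesh, batchSize);
    if (troubled.empty()) return reruns;               // batch accepted
    mesh.solution = backup;                            // rollback
    mesh.refinedCells.insert(troubled.begin(), troubled.end());  // refine memorised cells
    ++reruns;                                          // and rerun the batch
  }
}
```

As in the description above, a batch may have to be rerun several times, but the grid then always tracks the shock.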
Further techniques
------------------
At the expense of computational cost, we could refine the mesh generously (according to the restricted limiter status).
This would reduce the number of batch reruns.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/178Limiting ADER-DG: Solution min and max computation is very expensive2017-09-04T16:02:23+02:00Ghost UserLimiting ADER-DG: Solution min and max computation is very expensiveWe found that the min/max computation (in ExaHyPE) can currently be by a factor 10 more
expensive than the space-time predictor computation.
Test case was Euler equations in 3D with p=5 polynomials and a Sod shock tube scenario.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/179Split TimeStepSizeComputation and merge parts with SolutionUpdate and LocalRe...2017-10-04T17:11:38+02:00Ghost UserSplit TimeStepSizeComputation and merge parts with SolutionUpdate and LocalRecomputation- TimeStepSizeComputation will be reduced to a simple
time step size computation function.
SolutionUpdate and LocalRecomputation will also
compute a time step size and will further advance in time.
This will reduce logic.
- Next step will be a fusion of multiple algorithmic phases of the
ADER-DG and Limiting ADER-DG schemes in a single solver function.
This will hopefully make it easier for the compiler to optimise,
and make better use of the processor cache.
- We might be able to get rid of mapping FusedTimeSteppingInitialisation
with the above split.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/180Compression - take ghostLayers and padding into account2017-09-26T10:19:58+02:00Ghost UserCompression - take ghostLayers and padding into account- FiniteVolumesSolver - Compression should take ghostLayers+padding (alignment) into account
- ADERDGSolver - Compression should take padding (alignment) into accounthttps://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/181Get rid of TemporaryVariables2018-03-20T16:13:53+01:00Ghost UserGet rid of TemporaryVariables- We still have to debate if this makes sense.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/182Aligned heaps throw segmentation fault upon deleteData invocation2017-10-13T12:16:05+02:00Ghost UserAligned heaps throw segmentation fault upon deleteData invocation- This is especially an issue for adaptive simulations.
- Adaptive simulations are currently also not possible with the optimised kernels
since there are no optimised variants of the prolongation and restriction
kernels available yet.
TODO
- I will add a test for the heap allocation. Hopefully this will minimise the complexity
of the problem.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/183MPI FV doesnt compile2017-10-02T16:30:46+02:00Ghost UserMPI FV doesnt compileProblem occurs in a line with TODO(Dominic), so @di25cox :
The FiniteVolumeCellDescription does not contain a method getAdjacentToRemoteRank(), so building fails:
```
/home/sven/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp: In member function ‘virtual void exahype::solvers::FiniteVolumesSolver::preProcess(int, int) const’:
/home/sven/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp:2152:22: error: ‘exahype::solvers::FiniteVolumesSolver::CellDescription {aka class exahype::records::FiniteVolumesCellDescription}’ has no member named ‘getAdjacentToRemoteRank’
!cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here?
^
/home/sven/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp: In member function ‘virtual void exahype::solvers::FiniteVolumesSolver::postProcess(int, int)’:
/home/sven/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp:2168:24: error: ‘exahype::solvers::FiniteVolumesSolver::CellDescription {aka class exahype::records::FiniteVolumesCellDescription}’ has no member named ‘getAdjacentToRemoteRank’
!cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here?
^
```
This workaround works for me:
```
diff --git a/ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp b/ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp
index ddba006..949ba38 100644
--- a/ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp
+++ b/ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp
@@ -2149,7 +2149,7 @@ void exahype::solvers::FiniteVolumesSolver::preProcess(
cellDescription.getType()==CellDescription::Type::Cell
#ifdef Parallel
&&
- !cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here?
+ 1 // !cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here? // TODO FIX THIS LINE
#endif
) {
uncompress(cellDescription);
@@ -2165,7 +2165,7 @@ void exahype::solvers::FiniteVolumesSolver::postProcess(
cellDescription.getType()==CellDescription::Type::Cell
#ifdef Parallel
&&
- !cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here?
+ 1 // !cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here? // TODO FIX THIS LINE
#endif
&&
CompressionAccuracy>0.0
```
I pushed this to https://gitlab.lrz.de/exahype/ExaHyPE-Engine/commit/1d435a25b782e5cc30d10b27db30d8b8e6609c7d on the master. Should have used a merge request instead. Is a workaround anyway. Please fix it.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/184Decompose mapping Merging into two mappings2017-11-21T13:39:29+01:00Ghost UserDecompose mapping Merging into two mappingsMapping merging does currently perform neighbour merges (touchVertexFirstTime etc.) as well as the
merging of the time step data from the master with the worker.
Often only a merging of time step data is necessary. The touchVertexFirstTime merges
then reduce to a nop. In a shared memory context, this means that we add unnecessary overhead.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/185[Toolkit] Makefile generation with template engine + remove ARCHITECTURE2017-10-06T16:28:14+02:00Jean-Matthieu Gallard[Toolkit] Makefile generation with template engine + remove ARCHITECTURETODO
* Generate the Makefile using the template engine
* Remove the ARCHITECTURE parameter export, use the one from the spec fileJean-Matthieu GallardJean-Matthieu Gallardhttps://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/186Turn mapping events off on coarse levels2019-09-20T15:46:32+02:00Ghost UserTurn mapping events off on coarse levels* We basically will go from here:
```
peano::MappingSpecification
exahype::mappings::XYZ::enterCellSpecification(int level) const {
return peano::MappingSpecification(
peano::MappingSpecification::WholeTree,
peano::MappingSpecification::RunConcurrentlyOnFineGrid,true);
}
```
To here:
```
peano::MappingSpecification
exahype::mappings::XYZ::enterCellSpecification(int level) const {
if (level < exahype::solvers::getCoarsestMeshLevelOfAllSolvers()) {
return peano::MappingSpecification(
peano::MappingSpecification::Nop,
peano::MappingSpecification::RunConcurrentlyOnFineGrid,true);
}
return peano::MappingSpecification(
peano::MappingSpecification::WholeTree,
peano::MappingSpecification::RunConcurrentlyOnFineGrid,true);
}
```
* Second, we should set the alterState bool only to true in mappings where this is necessary, i.e.,
where we perform reductions or use temporary variables.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/187No time averaging and kernel's signature2017-10-24T15:25:13+02:00Jean-Matthieu GallardNo time averaging and kernel's signatureHi,
Small design decision. The no time averaging (NTA) option has the effect that the SpaceTimePredictor and VolumeIntegral don't take the same argument with or without the option.
In the case of the SpaceTimePredictor, it's a few data buffers that aren't required (the ones storing the time-averaged data), so it's not a big deal: I can just pass the pointers, which happen to be nullptr if NTA is enabled.
In the case of the VolumeIntegral, however, it's a bit dirtier: it needs either lFhi (the time-averaged flux) or lFi. Currently I have chosen to always give the kernel both buffers and let it use the right one, ignoring the other (lFhi might be nullptr, lFi is always correct). Another possibility, since both lFi and lFhi are double*, would be to use the same signature and just pass the correct one, which I could detect by checking whether lFhi is nullptr. But that would be a bit unclean because it would mean assuming it has to be nullptr when using NTA (which it currently is, but I don't like having such a hidden relationship in the code).
Do you agree with the current design (giving everything to the kernel, possibly nullptr, and letting it make the correct choice based on its template parameters), or do you have another design preference?
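A toy model of the current design (invented names, not the real kernel signatures): the kernel always receives both buffers, and a compile-time template parameter — not a runtime nullptr check — decides which one it reads.

```
#include <cassert>
#include <cstddef>

// Toy model of the discussed design (invented names, not the real ExaHyPE
// kernel signatures): the volume integral kernel always receives both the
// time-averaged flux lFhi and the space-time flux lFi; the template
// parameter selects which one it actually uses.
template <bool noTimeAveraging>
double volumeIntegral(const double* lFhi, const double* lFi, std::size_t n) {
  // With NTA enabled we read lFi; otherwise the time-averaged lFhi.
  const double* flux = noTimeAveraging ? lFi : lFhi;
  double result = 0.0;
  for (std::size_t i = 0; i < n; ++i) result += flux[i];  // stand-in for the real quadrature
  return result;
}
```

The unused buffer may legitimately be nullptr, because the template parameter encodes the relationship explicitly instead of hiding it in a runtime convention.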
Best,
JM
@svenk @di25cox @gi26detJean-Matthieu GallardJean-Matthieu Gallardhttps://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/188Global Reduction of Time Stepping Data2017-10-11T11:50:37+02:00Ghost UserGlobal Reduction of Time Stepping DataI currently do the following to reduce and broadcast global
values, like e.g. the minimum time step size:
* I broadcast time step data from master to worker
all the way down to the "lowest" worker.
* I reduce time step data from worker to master
all the way up to the global master rank.
~~I could use a simple MPI_Reduce and a simple MPI_Gather to
perform the above steps.~~
Postponed.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/189Bug in Solver mesh level computation if bounding box is virtually expanded2017-10-11T16:47:02+02:00Ghost UserBug in Solver mesh level computation if bounding box is virtually expandedThere is a bug in the mesh level computation if the bounding box is virtually expanded:
```
_coarsestMeshLevel =
exahype::solvers::Solver::computeMeshLevel(_maximumMeshSize,domainSize[0]);
```
Then, ``_domainSize != _boundingBoxSize``.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/190Seg Fault in MPI Build if distributed-memory section is missing2019-09-20T15:46:30+02:00Ghost UserSeg Fault in MPI Build if distributed-memory section is missingThis occurs in Runner::initHPCEnvironment(...) and has
nothing to do with other Seg Faults occurring in builds using Alignment.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/191Pass std::vectors to patchwise functions2017-10-13T20:55:05+02:00Ghost UserPass std::vectors to patchwise functions- It's safer - size is available e.g.
- Signatures are clearer. User kernels still get the raw pointers.
- We can just do vector.data() to pass a pointer to the existing kernels.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/192[Peano] Number of persistent subgrids is changing from iteration to iteration2017-10-16T11:09:26+02:00Ghost User[Peano] Number of persistent subgrids is changing from iteration to iterationWe observed that if a rank has distributed all its local work to workers,
the number of cells of this rank which are stored in regular subgrids
might change every second iteration.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/193Inconsistencies with the solver constructors (ParserView, cmd line passing)2017-10-18T09:07:53+02:00Ghost UserInconsistencies with the solver constructors (ParserView, cmd line passing)Note: This is a _minor_ long term issue.
I noticed (once again) that we have inconsistencies in the handling of the user solver constructor, the abstract solver constructor and the user `init` function, all this for FV, ADERDG and LimitingADERDG solvers. While passing the command line options (`std::vector<std::string>& cmdlineargs`) is mandatory in all solvers, the solvers only get the `exahype::Parser::ParserView& constants` when there *are* constants in the specfile. This creates a lot of trouble for users, for instance when they introduce constants in an existing application, because they then need to introduce the new signature on their own.
Even worse: Currently, the `ParserView&` works for FV solvers, but not for ADERDG solvers (at some point, a `ParserView` instead of a `ParserView&` is needed) and therefore also not for coupled LimitingADERDG solvers.
I know that Tobias wants to keep the first code the user sees in his solver as small as possible. So what about always passing the `cmdlineargs` and a `constants` reference and either storing everything in some class
```
struct SpecfileOptions { // or so
  std::vector<std::string>& cmdlineargs;
  exahype::Parser::ParserView& constants;
  // could store even more, for instance general access to the parser,
  // static build information, paths, etc.
};
```
and just pass a `SpecfileOptions& options` to the constructors and the `init` function.
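For illustration, the proposal could look as follows (a sketch only; `MySolver`, `AbstractMySolver` and the stubbed `ParserView` are placeholders, not the actual engine classes):

```
#include <cassert>
#include <string>
#include <vector>

// Sketch of the proposed single-argument constructor chain. ParserView is
// stubbed; MySolver and AbstractMySolver are placeholder names.
struct ParserView {                 // stand-in for exahype::Parser::ParserView
  std::vector<std::string> keys;
};

struct SpecfileOptions {            // as proposed above
  std::vector<std::string>& cmdlineargs;
  ParserView& constants;
};

class AbstractMySolver {
public:
  explicit AbstractMySolver(const SpecfileOptions& options)
      : _numArgs(options.cmdlineargs.size()) {}
  std::size_t numArgs() const { return _numArgs; }
private:
  std::size_t _numArgs;
};

class MySolver : public AbstractMySolver {
public:
  // The user constructor always has the same signature, whether or not the
  // specfile defines constants.
  explicit MySolver(const SpecfileOptions& options)
      : AbstractMySolver(options),
        _hasConstants(!options.constants.keys.empty()) {}
  bool hasConstants() const { return _hasConstants; }
private:
  bool _hasConstants;
};
```

The signature then never changes when constants are added to a specfile; only the contents of the options object do.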
As an alternative or additionally, we can make the `init` function virtual and put an empty implementation into the `Abstract*Solver.{cpp,h}`. Then the user could introduce it into his solver only when he needs it.
We are talking about startup code which is only called once, so this should really be as comfortable as possible, performance does not matter at all.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/194Erasing of Peano grid vertices2018-03-20T16:13:04+01:00Ghost UserErasing of Peano grid verticesCurrently, I remove only my data from the heap when I erase a cell.
The Peano grid structure still persists.
I tried a variety of tricks to remove the grid structure on the fly during
the mesh refinement but always ran into problems:
- Inconsistent adjacency indices
- Spurious erases of neighbours
- Data races with TBB
- ...
I now think it makes more sense to erase Peano vertices after the
mesh refinement. At that point, the refined grid is fixed, so refinement does not
interfere with erasing.
I think of either:
- Erase Peano vertices on the fly during the time stepping iterations;
  however, I do not know the overhead and the impact on the parallelisation.
- Run a few iterations of a dedicated mesh erasing adapter after the
refinement.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/195Additional refinement along MPI boundaries2018-03-20T16:13:38+01:00Ghost UserAdditional refinement along MPI boundariesPeano's spacetree traversal is inverted every second iteration.
This poses a problem for ExaHyPE's master-worker communication
if the master rank has local subtrees.
Example:
--------
Imagine a uniform grid where a fork was performed on Level 2. 3^d - 1 workers of our
master rank have been introduced. They and the master rank hold 1/3^d of the computational domain.
We call the portion belonging to the master rank, the local subtree of the master (rank).
In the first iteration, we correctly kick off all workers on the coarse grid before we descend into
the local subtree of the master.
In the second iteration, we start within the local subtree of the master and perform computations.
As soon as we reach the coarse grid, we kick off all workers.
The workers had to wait while we performed our computations on the master's local subtree.
Now finally, the workers will start their computations.
The whole tree traversal might take twice as long as assumed.
This becomes even worse if there are more Master-Worker boundaries.
The red bars in the plot below indicate such scenarios.
Tobias proposes to add additional refinement along the MPI boundary to prevent the need of vertical
communication. At least from the master to the worker. This might work.
I wonder however if it is really beneficial for ExaHyPE to change the traversal order in every second iteration.
In an MPI setting where we perform asynchronous communication, an inversion of the
traversal order is especially ill-suited since we then have to wait until the last message has been received
by the neighbour. Instead, we would need to wait (and block) until all messages for the currently
touched vertex are received.
**Update:** The traversal order is hardwired into Peano. It is necessary to run it forward and backward.
![master_worker_synchronisation](/uploads/e1fc83f445841cb3763e0bfd7d8ec701/master_worker_synchronisation.jpg)
Additional Refinement along the MPI boundary
----------------------------------------------
This can be accomplished by continuously refining the topmost parent patch (which is of type Cell) of every cell
of type Descendant which is at a Master-Worker boundary.
This has to be done until we end up with a patch of type Cell at the Master-Worker boundary.
At this point, we then need to send the solution values of the Master cell to the worker.
We further might need to impose initial conditions.
We further need to perform status flag merges in prepareSendToWorker.
**Problems with this approach:** It might introduce rippling refinings around
the artificially refined cell.
Introduce a no-operation traversal
----------------------------
We could further introduce a no-operation traversal before we perform
reductions and broadcasts which would rewind Peano's streams
but does not perform any computations and communication.
In this case, we would always follow the top-down traversal.
The observed Master-Worker synchronisation would not appear.
**Problems with this approach:**
- ~~Batching is currently not possible with~~
~~multiple adapters. We could maybe perform no operation in every second iteration.~~
~~However, we would then have still a Master Worker synchronisation. Or would we not?~~
We could have a single empty traversal in front of a batch. That would
work.
- The BoundaryDataExchanger of the heaps always assumes an inversion of the traversal in
every iteration.
To alter this behaviour, we would need to change the receive methods
in Peano's AbstractHeap,DoubleHeap, and BoundaryDataExchanger methods.
We are required to add a bool "assumeForwardTraversal" (defaults to false) to the signature.
We are further required to update ExaHyPE's solver implementations:
Whenever we receive boundary data after we have run the no-operation traversal,
we need to set this new flag to true when calling receiveData.
If we run a batch of iterations, this must be done only
in the first iteration of the batch.
Have both
----------------------------
For optimal performance, it might be useful to employ both techniques.
We might use "loop padding", i.e. insert empty traversals, in order to end up with the forward traversal
any time we need to broadcast / reduce something.
In general, it might be useful to handle broadcasts and reductions outside of the mapping.
Therefore, we would however need to plug into both, runAsMaster and runAsWorker.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/196Fix musclhancock scheme for certain patch sizes2018-02-22T09:45:39+01:00Ghost UserFix musclhancock scheme for certain patch sizesFor me, for certain patch sizes, our 2nd order FV scheme in ExaHyPE (musclhancock) fails. Godunov runs fine. This has to be debugged (it certainly depends on the patch size: Some work, some not).
2nd order is crucial for some applications (GRMHD,CCZ4).
I will do this, ticket just for bookkeeping.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/197Let tarch VTK plotters pass file creation errors to ExaHyPE plotters2018-03-02T12:46:18+01:00Ghost UserLet tarch VTK plotters pass file creation errors to ExaHyPE plottersThis is something where I want to patch Peano and subsequently send the patch to Tobias:
Currently it's annoying that some ExaHyPE plotters stop the program when they cannot open output files (the ASCII/CSV writers as well as my CarpetHDF5 writer) while others silently ignore this issue (Tobias' tarch-based VTK writers). Clearly, we want the user to be able to control whether problems at output are severe or not.
I can easily implement this in the ExaHyPE codebase, but I don't control Peano, so I have to make sure Peano passes such errors into ExaHyPE's domain.
=> TODO @ Sven.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/159DMP is illegally called for ordinary ADERDG-Solver2017-11-02T18:15:15+01:00Ghost UserDMP is illegally called for ordinary ADERDG-SolverThese are actually two problems:
1. For an ordinary ADERDG-Solver (not Limiter) where the variable `dmp-observables` is not given, the default value is **not** `0` but instead just a random number (NaN or MaX or whatever for int). This is very bad but solvable in the Parser for me.
2. The method `mapDiscreteMaximumPrincipleObservables` in the abstract ADERDG solver is called. This should not happen at all! Why does it happen?
![dmp](/uploads/fd5837af72fe6f92e871426712399023/dmp.png)https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/158AMR+LimitingADERDGSolver crashes for certain limiter status changes2017-11-02T18:15:15+01:00Ghost UserAMR+LimitingADERDGSolver crashes for certain limiter status changes# 1. Issue:
There is another issue with the MUSCL-Hancock solver which crashes in the min and max
determination after a global recomputation.
```
147.062 info exahype::runners::Runner::updateMeshFusedTimeStepping(...) recompute solution locally (if applicable) and compute new time step size
assertion in file /home/dominic/dev/codes/c/ExaHyPE/ExaHyPE-Engine/./ExaHyPE/exahype/solvers/LimitingADERDGSolver.cpp, line 1111 failed: *(observablesMin+i)<std::numeric_limits<double>::max()
parameter i: 1
parameter solverPatch.toString(): (solverNumber:0,neighbourMergePerformed:[1,1,1,1],isInside:[1,1,0,1],parentIndex:69,isAugmented:0,newlyCreated:0,type:Cell,refinementEvent:None,level:5,offset:[0.518519,0],size:[0.0123457,0.0123457],previousCorrectorTimeStamp:1.79769e+308,previousCorrectorTimeStepSize:1.79769e+308,correctorTimeStepSize:0.000497671,correctorTimeStamp:0.0909671,predictorTimeStepSize:0,predictorTimeStamp:0.0914648,solution:2838,solutionAverages:2843,solutionCompressed:-1,previousSolution:2839,previousSolutionAverages:2841,previousSolutionCompressed:-1,update:2840,updateAverages:2842,updateCompressed:-1,extrapolatedPredictor:2844,extrapolatedPredictorAverages:2846,extrapolatedPredictorCompressed:-1,fluctuation:2845,fluctuationAverages:2847,fluctuationCompressed:-1,solutionMin:2848,solutionMax:2849,facewiseAugmentationStatus:[0,0,0,0],augmentationStatus:0,facewiseHelperStatus:[2,2,2,2],helperStatus:2,facewiseLimiterStatus:[0,0,0,0],limiterStatus:0,previousLimiterStatus:3,iterationsToCureTroubledCell:10,compressionState:Uncompressed,bytesPerDoFInPreviousSolution:305,bytesPerDoFInSolution:-296662368,bytesPerDoFInUpdate:32765,bytesPerDoFInExtrapolatedPredictor:0,bytesPerDoFInFluctuation:0)
ExaHyPE-Euler: /home/dominic/dev/codes/c/ExaHyPE/ExaHyPE-Engine/./ExaHyPE/exahype/solvers/LimitingADERDGSolver.cpp:1111: void exahype::solvers::LimitingADERDGSolver::determineSolverMinAndMax(exahype::solvers::LimitingADERDGSolver::SolverPatch&): Assertion `false' failed.
```
I have to gather more information about this first.
# 2. Issue
Cell of type 3 or 4 (FV->DG) changes to Troubled.
Neighbour is of Type 1 or 2 (DG->FV).
Example:
Before:
![before](/uploads/0a30bada69ece5db04146246ff72c5c1/before.png)
After:
![after](/uploads/fcf89aaaa40453a52ead9622bd283033/after.png)
* Fix 1: Need to stop iterations if situation detected in neighbour merging
-> write stable values, i.e. own values to ghost layer
-> Perform rollback in affected cells (=>irregular
limiter domain change; requires local recomputation)
* Fix 2: Use less dissipative FV methods or higher order
ADERDG => finer subcell resolution
Or set parameter steps to cure troubled cells to a higher value
# 3. Found bugs:
* The whole global recomputation thing is more sophisticated than previously thought. I
have to be careful how I go back in time. This is based on the previous limiter status.
I am only allowed to delete patches after the recomputation.
# 4. Ways to increase stability
* Do not change Local Recomputation to Global Recomputation if limiter based mesh refinement is also necessary.
# 5. Algorithms:
(Stuff above is outdated. Keep it for reference.)
## Local Recomputation
1. limiter status spreading
2. local reinitialisation
3. local recomputation + local predictor computation
## Global Recomputation
1. limiter status spreading
2. global rollback (keep new limiter status) <-ensures we adjust the previous solution during mesh refinement
3. mesh refinement according to new limiter status
4. overwrite new limiter status with previous values
5. recompute time step size
6. reinitialise fused time stepping and recompute predictor
Problems:
* TimeStepSizeComputation does update the time stamps (solved)
* Need additional adapter for global rollback (global reinitialisation)
* I am not allowed to deallocate limiter patches during limiter status spreading
* Have to keep in mind to overwrite the limiter status in finalise mesh refinement.
## Mesh Refinement
1. mesh refinement according to ref. crit.
2. recompute time step size
3. reinitialise fused time stepping and recompute predictorhttps://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/156LimitingADERDGSolver currently crashing with TBB and AMR2017-11-02T18:15:15+01:00Ghost UserLimitingADERDGSolver currently crashing with TBB and AMR* ~~Uniform grids: LimitingADERDGSolver fails assertion if TBB is switched on.
Probably something with the status spreading with TBB.~~
Seems to be solved now.
* Adaptive grids: LimitingADERDGSolver mesh update iterates forever
* Adaptive grids: LimitingADERDGSolver fails assertion.
Probably something because of the new domain boundary treatment.
(Not sure if this is still an issue.)https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/155Bounding Box Scaling (virtually-expand-domain) Does Not Work2017-11-02T18:15:15+01:00Ghost UserBounding Box Scaling (virtually-expand-domain) Does Not WorkVirtually expanding the bounding box around the computational domain
is a Peano trick to shut off neighbour communication with the global master (rank 0).
Virtually expanding the bounding box does currently lead to problems in ExaHyPE:
* It enables scenarios where a coarse grid vertex is on the boundary/outside of the domain,
but a fine grid vertex (and its h environment) is inside of the domain.
* It confuses the isFaceInside function in class Cell.
# Virtually Expanding the Domain
Some background on the virtually expand domain flag:
* Peano only considers inside and boundary vertices for the (MPI) neighbour
merging. Outside vertices are ignored for this purpose.
* Since rank 1 is placed into a centre of 3^d child cells belonging to rank 0,
it will perform neighbour merging with rank 0 as long as those vertices are
either inside or directly at the boundary of the domain.
* Switching off neighbour merging directly at the domain boundary (``vertex.isBoundary()``) does not
make sense. The reason is that refinement at the boundary will introduce hanging nodes.
Boundary nodes should however be persistent. (Tobias' reasoning. Have to ask further why this is bad.)
* Virtually expanding the domain places the nodes located at the remote boundary to
rank 0 outside of the domain.
* From the above points, it is clear that virtually expanding the boundary is mandatory for reasonable MPI
scalability especially in 3d. This is exactly what we have observed in our 3D MPI experiments.
* This will be a little inconvenient for people who prescribe initial conditions at certain boundaries by means of (x,t).
(Seismic people know about this. Leonhard knows how to deal with it.)
# Remarks
* Virtually expanding the bounding box usually leads to a shrinking of the actual computational domain since
only inside cells are considered as within the computational domain.
ExaHyPE should thus tell the user what the shrunk domain looks like.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/150
# Buildinfo infrastructure does not update always (2017-11-02, Ghost User)

The `buildinfo.h` file is generated at every make call, but the macros are evaluated in `main.cpp` only once, i.e. the compilation of the main file is not retriggered.
To avoid this, the buildinfo should get its own compilation unit, i.e. some `buildinfo.cpp` going along with `buildinfo.h`. That is the only option to get accurate runtime build information.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/149
# MUSCL-Hancock BC are not imposed correctly? (2017-11-02, Ghost User)

I observe weird perturbations at the boundary when I use the MUSCL-Hancock FV limiter.
![boundary_effects](/uploads/108aab3f64a1c8d8ac541b212c9d4c39/boundary_effects.png)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/147
# Dynamic AMR crashes if we switch to global recomputation branch. (2017-11-02, Ghost User)

[webmize](/uploads/db32921836ea432a6ce41060cfb87f97/webmize)
Dynamic AMR crashes every time the solver switches to the global
recomputation branch.
Potential reason:
* ~~We reuse the ADERDGTimeStep adapter which does update the limiter
status again. This should not happen.~~
* ~~Erasing is triggered before the limiter status spreading has finished.
Add inertia for erasing requests.~~
This is now realised by also taking the previous limiter status into account.
Problems seem to be solved:
![sod_shock_tube](/uploads/2bdedd63eadf6eb0bde5d093e34dbf88/sod_shock_tube.webm)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/198
# Increase robustness of neighbour merging routines (2017-11-21, Ghost User)

Issue
-----
During the mesh refinement iterations, we have observed that cell description link maps stored in
each vertex sometimes hold wrong but not invalid adjacency indices.
This was observed on a new worker directly after a fork has been finished.
Solution
--------
Use geometry information to determine if a patch is really adjacent to
a vertex at a given face.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/199
# Extend batching to global time stepping (2017-11-21, Ghost User)

I am currently working on extending the batching from "globalfixed" to "global" time stepping.
The idea here is to only reduce the time step size and other data in the last iteration of the batch.
If something went wrong, we then roll back to the state at the beginning of the batch.
This WiP issue is for tracking my progress and issues I ran into.
Reduction of time step size
---------------------------
While we will not update the currently used CFL time step size in every single iteration
in the planned batched global time stepping scheme,
we will search the minimum CFL time step size over all batch iterations.
This allows us to check if the used time step size did violate the CFL constraint
during the execution of the batch.
The consequence might then be to use the newly determined minimum batch time step size as
time step size for a rerun of the whole batch.
Basic modifications to the fused time stepping algorithm
--------------------------------------------------------
- We will store the previous solution only at the beginning of a batch.
The same happens with the previous time stamp and time step size.
- TBC...
Rollbacks:
----------
TBC...
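The time step size reduction described above can be sketched as follows; the names are illustrative, not ExaHyPE's actual API:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative sketch of the planned batched reduction: the batch runs with
// a frozen time step size, the minimum admissible (CFL) time step size is
// tracked across all iterations, and only at the end of the batch do we
// decide whether the batch has to be rerun with the smaller step size.
struct BatchResult {
  double minAdmissibleDt;  // minimum CFL time step size seen in the batch
  bool   mustRerun;        // true if the frozen dt violated the CFL condition
};

BatchResult runBatchReduction(double usedDt,
                              const std::vector<double>& admissibleDtPerIteration) {
  double minDt = usedDt;
  for (double dt : admissibleDtPerIteration) {
    minDt = std::min(minDt, dt);  // reduction is only materialised at batch end
  }
  return {minDt, minDt < usedDt};
}
```

During the batch the frozen `usedDt` is used unchanged; only after the last iteration does the scheme inspect `mustRerun` and, if necessary, roll back and rerun the batch with `minAdmissibleDt`.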
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/200Cell-wise plotter2017-11-29T17:45:59+01:00Ghost UserCell-wise plotterCreate a VTK plotter which only plots a single cell. Plot then:
* LimitingStatus
* Which MPI rank hosts the cell

This also should support VTU (for MPI).

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/201
# Clean up vertical MPI communication (2019-09-20, Dominic Etienne Charrier)

There is too much data sent around.
I further use too many different structures and methods
(MeshUpdateFlags,SolverFlags,solver->sendDataToWorker/Master...).
This should be cleaned up.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/202
# Segfault in some VTK plotters if variables = 0 (2019-03-21, Ghost User)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/203
# Peano: Geometrical comparisons must always use relative tolerance (2018-02-02, Dominic Etienne Charrier)

The tarch::la:: routines usually use an absolute tolerance (default: machine precision)
for comparing floating point numbers.
This is problematic for numbers with large absolute values.
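One way to make such comparisons magnitude-aware is to scale the tolerance by the operands; the helper below is hypothetical and not the actual tarch::la routine:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical sketch: scale the tolerance by the largest magnitude involved,
// so that numbers with large absolute values are compared relatively instead
// of against raw machine precision.
bool equalsWithRelativeTolerance(double a, double b,
                                 double tolerance = std::numeric_limits<double>::epsilon()) {
  // Use at least 1.0 as scale so that small numbers fall back to an
  // absolute comparison with the unscaled tolerance.
  const double scale = std::max(1.0, std::max(std::fabs(a), std::fabs(b)));
  return std::fabs(a - b) <= tolerance * scale;
}
```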
Fix: These numbers must be normalised and scaled to range [-1,1] before
comparing with respect to the tolerance.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/204
# Guidebook should not be written in LaTeX (2018-01-26, Sven Köppel)

> Note: This issue is really low-importance. Do not continue reading if you care about your time.
The guidebook is currently written in LaTeX, with a primary focus on printing and much effort put into aligning figures and styling text. IMHO this is the wrong focus. Instead, the guidebook should *focus on content, not on layout*. It should be written in some simple markup language which allows rendering to a web page and a PDF similarly. This is useful for easier deep-linking and reading on screen. It also encourages a more uniform style across the book.
There are several instances of such a language. One very common in the Python world is [Sphinx](http://www.sphinx-doc.org); it is very widespread due to its simple syntax. I recently stumbled over the [Visit documentation](http://visit-sphinx-user-manual.readthedocs.io) which was rewritten with Sphinx and renders at the same time to a comprehensive website and a [400 page book/pdf](https://media.readthedocs.org/pdf/visit-sphinx-user-manual/latest/visit-sphinx-user-manual.pdf) with a table of contents, numbered images and all that. An example page is http://visit-sphinx-user-manual.readthedocs.io/en/latest/Quantitative/Expressions.html which is generated by the very readable reStructuredText source http://visit-sphinx-user-manual.readthedocs.io/en/latest/_sources/Quantitative/Expressions.rst.txt
Translating our current LaTeX guidebook to something like reStructuredText or Markdown is a <1h job, therefore switching the publication system is not a mammoth task at all. It is more something people have to agree on. I wonder if Tobias will?

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/205
# CCZ4: Run BH with (2018-01-20, Sven Köppel)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/206
# CCZ4: Run BH until T=1000 (2018-02-09, Sven Köppel)

We need some setup of ExaHyPE, CCZ4+Limiter on a reasonable grid (L>25, dx<0.1). This ticket shall report the progress. Just describe your runs in the comments.
*Explanation of notation: The unit "M/h" means "simulation time / physical time in hours". M means "Mass" and roughly indicates for us that we refer to the simulation time.*

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/207
# "error: Neighbours cannot communicate." (Minimal working example Application) (2018-02-12, Sven Köppel)

In commit https://gitlab.lrz.de/exahype/ExaHyPE-Engine/commit/baacbecadd7321cb25aec130707bd8b6dc08b11a I added the new LimitingADERDG application `ApplicationExamples/Experiments/GridDemonstrator` where I test different limiter criteria.
In the submitted example, the code crashes after the first timestep with **error: Neighbours cannot communicate**.
```
12.1608 info memoryUsage =333 MB
12.5276 info grid setup iteration #40, idle-nodes=1, vertical solver communication=0
12.5276 info memoryUsage =333 MB
12.8925 info grid setup iteration #41, idle-nodes=1, vertical solver communication=0
12.8925 info memoryUsage =333 MB
13.2603 info grid setup iteration #42, idle-nodes=1, vertical solver communication=0
13.2603 info memoryUsage =333 MB
13.2604 info finished grid setup after 42 iterations
13.2604 info finalise mesh refinement and compute first time step size
13.4246 info initialised all data and computed first time step size
13.7191 info plotted initial solution (if specified) and computed first predictor
13.7191 info step 0 t_min =0
13.7192 info dt_min =0.0125926
13.7192 info memoryUsage =333 MB
13.7192 info plot
13.78 error Neighbours cannot communicate.
cell1=(solverNumber:0,neighbourMergePerformed:[1,0,1,0],isInside:[1,1,1,1],parentIndex:920,isAugmented:0,newlyCreated:0,type:Cell,refinementEvent:None,level:6,offset:[12.428,8.47737],size:[0.164609,0.164609],previousCorrectorTimeStamp:0,previousCorrectorTimeStepSize:0,correctorTimeStepSize:0.0125926,correctorTimeStamp:0,predictorTimeStepSize:0.0125926,predictorTimeStamp:0.0125926,solution:42627,solutionAverages:42630,solutionCompressed:-1,previousSolution:42626,previousSolutionAverages:42629,previousSolutionCompressed:-1,update:42628,updateAverages:42631,updateCompressed:-1,extrapolatedPredictor:42632,extrapolatedPredictorAverages:42634,extrapolatedPredictorCompressed:-1,fluctuation:42633,fluctuationAverages:42635,fluctuationCompressed:-1,solutionMin:-1,solutionMax:-1,facewiseAugmentationStatus:[0,0,0,0],augmentationStatus:0,previousAugmentationStatus:0,facewiseHelperStatus:[1,0,1,0],helperStatus:2,facewiseLimiterStatus:[2,0,1,0],limiterStatus:2,previousLimiterStatus:0,iterationsToCureTroubledCell:0,compressionState:Uncompressed,bytesPerDoFInPreviousSolution:36444784,bytesPerDoFInSolution:0,bytesPerDoFInUpdate:8015768,bytesPerDoFInExtrapolatedPredictor:0,bytesPerDoFInFluctuation:-1287913656)
.cell2=(solverNumber:0,neighbourMergePerformed:[0,0,1,0],isInside:[1,1,1,1],parentIndex:933,isAugmented:0,newlyCreated:0,type:Cell,refinementEvent:None,level:6,offset:[12.5926,8.47737],size:[0.164609,0.164609],previousCorrectorTimeStamp:0,previousCorrectorTimeStepSize:0,correctorTimeStepSize:0.0125926,correctorTimeStamp:0,predictorTimeStepSize:0.0125926,predictorTimeStamp:0.0125926,solution:8218,solutionAverages:8221,solutionCompressed:-1,previousSolution:8217,previousSolutionAverages:8220,previousSolutionCompressed:-1,update:8219,updateAverages:8222,updateCompressed:-1,extrapolatedPredictor:8223,extrapolatedPredictorAverages:8225,extrapolatedPredictorCompressed:-1,fluctuation:8224,fluctuationAverages:42960,fluctuationCompressed:-1,solutionMin:-1,solutionMax:-1,facewiseAugmentationStatus:[0,0,0,0],augmentationStatus:0,previousAugmentationStatus:0,facewiseHelperStatus:[0,0,1,0],helperStatus:2,facewiseLimiterStatus:[0,0,0,0],limiterStatus:0,previousLimiterStatus:0,iterationsToCureTroubledCell:0,compressionState:Uncompressed,bytesPerDoFInPreviousSolution:36491056,bytesPerDoFInSolution:0,bytesPerDoFInUpdate:8015768,bytesPerDoFInExtrapolatedPredictor:0,bytesPerDoFInFluctuation:-1285749088) (file:/home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/solvers/LimitingADERDGSolver.cpp,line:1570)
terminate called without an active exception
Abgebrochen
```
The specfile describes a PDE with 1 scalar field (doing nothing), the setup runs on a notebook in <10 seconds.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/208
# Random initial conditions for LimitingADERDGSolver pose a problem (2019-04-15, Dominic Etienne Charrier)

Currently, random initial conditions likely lead to undefined
behaviour for the LimitingADERDGSolver.
Background
-----------
The LimitingADERDGSolver replaces the ADER-DG solution locally by an FV solution
if the former does not satisfy certain conditions (PAD, DMP).
During the imposition of initial conditions, we then allocate a new FV patch and
impose the initial conditions on its FV solution degrees of freedom.
Random initial conditions
-------------------------
We have to ensure that FV and ADER-DG solution are consistent.
This is currently only ensured if we have deterministic initial conditions.
In case of random initial conditions, the imposition of initial conditions
on the ADER-DG solution and on the FV solution might lead to two different
outcomes.
A cell might encounter rough initial conditions while imposing the initial conditions
on the ADER-DG solution but it then might encounter smooth initial conditions
for the FV solution.
But more severely, both differ from each other.
A solution?
-----------
In case a cell allocates an FV patch and has a limiter status such that it will
compute with Finite Volumes in the first iteration, we will then treat this FV solution as
the correct one and will project it back on the ADER-DG solution.
This way ADER-DG and FV solution are consistent.
On the other hand, if the limiter status of the cell is such that the cell
will compute with ADER-DG in the first iteration, we will treat the ADER-DG solution as the correct
one and project it onto the FV solution. We will not impose initial
conditions on the FV solution.
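The two cases can be sketched as follows; the types and projection stubs are hypothetical stand-ins for ExaHyPE's actual DG<->FV projection kernels:

```cpp
#include <cassert>
#include <vector>

enum class Scheme { ADERDG, FiniteVolumes };

// Hypothetical projection stubs; the real kernels map between the DG
// polynomial representation and the FV subcell averages.
std::vector<double> projectOntoDG(const std::vector<double>& fv) { return fv; }
std::vector<double> projectOntoFV(const std::vector<double>& dg) { return dg; }

// Declare the scheme that computes in the first iteration authoritative and
// project its data onto the other representation, so ADER-DG and FV solution
// cannot diverge even for non-deterministic initial conditions.
void synchroniseInitialData(Scheme firstIterationScheme,
                            std::vector<double>& dgSolution,
                            std::vector<double>& fvSolution) {
  if (firstIterationScheme == Scheme::FiniteVolumes) {
    dgSolution = projectOntoDG(fvSolution);  // FV solution is the correct one
  } else {
    fvSolution = projectOntoFV(dgSolution);  // ADER-DG solution is the correct one
  }
}
```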
FV and ADER-DG solution are consistent in this case as well.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/209
# Toolkit should accept command line arguments for finetuning (2019-04-15, Sven Köppel)

> **Note:** This is a **minor feature suggestion**. Something which should be done on the **long term** to improve the user's daily life. It is **not important right now**.
Currently, the toolkit has a way of overwriting some files always while creating some files only if they do not yet exist.
Sometimes, especially when the API or toolkit changes, one needs to recreate everything. The typical workflow is then to call `rm Abstract*Solver* KernelCalls*`. Instead, it would be nice if the toolkit supported some option `--overwrite` or similar to always recreate the files. It could also offer an option `--compare` to print whether generated files differ from existing files.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/210
# Efficient output format for large simulations (2018-03-02, Leonhard Rannabauer)

As soon as output reaches a certain size, plotting is our major bottleneck. (A mesh of ~40 mio dofs with 14 unknowns for pvtu on SuperMUC takes 1h per plot.)
We should look into different approaches like the ASYNC lib https://github.com/TUM-I5/ASYNC which sacrifices a single thread per rank for output.
For SuperMUC: We should also start to look into parallel file systems like LUSTRE https://en.wikipedia.org/wiki/Lustre_(file_system).

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/211
# Multicore-ise LoadBalancing mapping (2019-09-20, Dominic Etienne Charrier)

This affects the concurrency level within the grid setup iterations.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/212
# Number of Plotters is not detected (2018-09-10, Sven Köppel)

When you compile an ExaHyPE application with a Specfile with *n* plotters but run it with *m* plotters (say the types of the first *m-n* plotters are the same), there is no detection that `m != n`. This leads to weird errors which are hardly understandable even in DEBUG mode:
```
0.0310583 15:50:15 [nils]rank:0 debug exahype::parser::Parser::getIdentifierForPlotter() found token notoken (file:/home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp,line:999)
assertion in file /home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp, line 1001 failed: token.compare(_noTokenFound) != 0
parameter token: notoken
parameter solverNumber: 0
parameter plotterNumber: 3
ExaHyPE-GRMHD: /home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp:1001: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> exahype::parser::Parser::getIdentifierForPlotter(int, int) const: Assertion `false' failed.
Abgebrochen
```
but completely nonunderstandable in Release mode:
```
0.00535548 15:32:20 [nils]rank:0 error exahype::parser::Parser::getFirstSnapshotTimeForPlotter() 'GRMHDSolver_FV' - plotter 3: 'time' value must be a float. (file:/home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp,line:1046)
0.00538875 15:32:20 [nils]rank:0 error exahype::parser::Parser::getRepeatTimeForPlotter() 'GRMHDSolver_FV' - plotter 3: 'repeat' value must be a float. (file:/home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp,line:1067)
```
Note that in this example, `n=4` and `m=3`, so plotter 3 in particular looked all right, while the error message was actually trying to complain about the nonexistent plotter 4.
This is very bad. We need some better enforcement of this rule.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/213
# Named variables in VTK plots (2018-09-10, Sven Köppel)

I want to have them. I will implement this tonight via the plotter mapping class as an optional virtual function (isn't this already there? HDF5 uses it) and pass the data straight through VTK. This will result in a patch for Peano.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/214
# Faster mesh refinement (2018-03-02, Dominic Etienne Charrier)

Problem 1 (Costly operation is performed while no parallelism available)
------------------------------------------------------------------------
Peano's shared-memory parallelism is based on identifying regular subgrids in the
tree. If a new cell is introduced to the tree, it might not yet be identified
as part of a regular subgrid, and the operations performed on the cell
are thus not parallelised.
Solution: If imposing initial conditions or evaluating a refinement criterion is too
expensive we could let the user choose to perform it as background task.
This might lead to more mesh setup iterations but potentially to a better
exploitation of the available cores. It should also benefit the hiding of MPI communication
during the mesh setup.
Problem 2 (Overall concurrency)
-------------------------------
ExaHyPE's regular shared-memory parallelism during the mesh setup is currently further limited as
multiple cells might write to the same vertex in order to set refinement events.
Solution: We should be able to solve this by inverting the control. The
vertex checks in touchVertexLastTime if any cell has set a refinement event,
and refines if that is the case. This would increase the concurrency of the
enterCell operations.
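A minimal sketch of this inversion of control, with hypothetical types rather than Peano's actual mapping interface:

```cpp
#include <atomic>
#include <cassert>

// Sketch: cells only set a flag on the shared vertex in enterCell; the
// vertex itself decides to refine exactly once in touchVertexLastTime.
// Cells never race on the refinement event itself.
struct Vertex {
  std::atomic<bool> refinementRequested{false};
  bool refined = false;
};

void enterCell(Vertex& vertex, bool cellWantsRefinement) {
  if (cellWantsRefinement) {
    // Several adjacent cells may set this flag concurrently; the store is idempotent.
    vertex.refinementRequested.store(true, std::memory_order_relaxed);
  }
}

void touchVertexLastTime(Vertex& vertex) {
  // Runs once per vertex per traversal: consume the flag and refine.
  if (vertex.refinementRequested.exchange(false)) {
    vertex.refined = true;
  }
}
```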
Problem 3 (Memory)
-------------------------------
ExaHyPE's initial mesh setup is performed at the beginning by a single rank.
Gradually more and more ranks are added. In order to prevent that
any of the ranks runs out of memory during the initial mesh setup,
it might make sense to only temporarily allocate memory, impose initial
conditions, evaluate the refinement criterion, and then free the memory again (better:
recycle it).
After the initial mesh setup, we would then allocate memory on all ranks
and impose initial conditions.
Problem 4 (Load Balancing)
-------------------------------
The load-balancing does currently only count the number of cells.
It does not take the different cell types in ExaHyPE's grid into account.
The helper cell types Descendant and Ancestor have way less work to do than
the compute cells of type Cell.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/215
# User Defined plotters API: Pass the information from the specfile (2019-04-15, Sven Köppel)

How can we access:
* Name of output file
* Full information about cell (Limiting status, etc.)
I think the plotting API is hiding too much information. The UserDefinedADERDG plotter should pass more information to the user.
This is something I can do :)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/216
# Issue during the compilation (2018-04-10, m.tavelli)

Hi, I get the following error during the compilation with the last updates:
/WORK/maurizio_exa/ExaHyPE-Engine/ExaHyPE/exahype/solvers/ADERDGSolver.cpp: In member function ‘virtual void exahype::solvers::ADERDGSolver::mergeWithNeighbourData(int, const HeapEntries&, int, int, const tarch::la::Vector<2, int>&, const tarch::la::Vector<2, int>&, const tarch::la::Vector<2, double>&, int)’:
/WORK/maurizio_exa/ExaHyPE-Engine/ExaHyPE/exahype/solvers/ADERDGSolver.cpp:3566:27: error: ‘s’ was not declared in this scope
lFhbnd,dofPerFace,s
^

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/217
# MPI bug (2019-09-20, m.tavelli)

Hi, with the last updates I'm not able to run the code using MPI; in particular I get the message reported below. This can be reproduced with the GPR application that is in the repository, even if I turn off all the plotters. The serial version seems to work.
```
0.643946 [CERVINO],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
0.643966 [CERVINO],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =6.5714e-05
0.643983 [CERVINO],rank:0 info exahype::runners::Runner::runTimeStepsWithFusedAlgorithmicSteps(...) plot
[CERVINO:07241] *** Process received signal ***
[CERVINO:07241] Signal: Segmentation fault (11)
[CERVINO:07241] Signal code: Address not mapped (1)
[CERVINO:07241] Failing at address: (nil)
[CERVINO:07241] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x13150)[0x7f24fae22150]
[CERVINO:07241] [ 1] ./ExaHyPE-GPR(+0x48e949)[0x557f27611949]
[CERVINO:07241] [ 2] ./ExaHyPE-GPR(+0x494e77)[0x557f27617e77]
[CERVINO:07241] [ 3] ./ExaHyPE-GPR(+0x4a7170)[0x557f2762a170]
[CERVINO:07241] [ 4] ./ExaHyPE-GPR(+0x4a741e)[0x557f2762a41e]
[CERVINO:07241] [ 5] ./ExaHyPE-GPR(+0x2bbd7d)[0x557f2743ed7d]
[CERVINO:07241] [ 6] ./ExaHyPE-GPR(+0x2beaa6)[0x557f27441aa6]
[CERVINO:07241] [ 7] ./ExaHyPE-GPR(+0x32f84b)[0x557f274b284b]
[CERVINO:07241] [ 8] ./ExaHyPE-GPR(+0x365f95)[0x557f274e8f95]
[CERVINO:07241] [ 9] ./ExaHyPE-GPR(+0x3a67f6)[0x557f275297f6]
[CERVINO:07241] [10] ./ExaHyPE-GPR(+0x3a76f7)[0x557f2752a6f7]
[CERVINO:07241] [11] ./ExaHyPE-GPR(+0x3a68bd)[0x557f275298bd]
[CERVINO:07241] [12] ./ExaHyPE-GPR(+0x3a76f7)[0x557f2752a6f7]
[CERVINO:07241] [13] ./ExaHyPE-GPR(+0x3a68bd)[0x557f275298bd]
[CERVINO:07241] [14] ./ExaHyPE-GPR(+0x3a76f7)[0x557f2752a6f7]
[CERVINO:07241] [15] ./ExaHyPE-GPR(+0x3ca149)[0x557f2754d149]
[CERVINO:07241] [16] ./ExaHyPE-GPR(+0x3cc690)[0x557f2754f690]
[CERVINO:07241] [17] ./ExaHyPE-GPR(+0x306521)[0x557f27489521]
[CERVINO:07241] [18] ./ExaHyPE-GPR(+0x25419c)[0x557f273d719c]
[CERVINO:07241] [19] ./ExaHyPE-GPR(+0x25474e)[0x557f273d774e]
[CERVINO:07241] [20] ./ExaHyPE-GPR(+0x25671e)[0x557f273d971e]
[CERVINO:07241] [21] ./ExaHyPE-GPR(+0x375b0)[0x557f271ba5b0]
[CERVINO:07241] [22] ./ExaHyPE-GPR(+0x37b09)[0x557f271bab09]
[CERVINO:07241] [23] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f24faa501c1]
[CERVINO:07241] [24] ./ExaHyPE-GPR(+0x45a5a)[0x557f271c8a5a]
[CERVINO:07241] *** End of error message ***
```

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/218
# Horizontal detection of insufficiently refined mesh for LimitingADERDGSolver (2018-03-07, Dominic Etienne Charrier)

Currently, we restrict the limiter status up to the next coarser parent
in every iteration of the time stepping. We then evaluate on the coarser grids
if the limiter status is such that we need to refine. This then triggers refinement
requests which force the time stepping to stop.
Restricting to the next coarser parent implies non-global master-worker communication
in MPI builds. This is not good.
To get rid of this master-worker communication during the time-stepping,
I propose to extend the limiter status range by a few additional OK statuses.
If such an OK status is then propagating into a virtual child cell (Descendant),
we know that the mesh is not sufficiently refined and we halt the time stepping.
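The spreading and the halting criterion might be sketched like this; the status values and names are illustrative, not ExaHyPE's actual encoding:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch: troubled cells carry the maximum status and the status decays by
// one per cell layer while spreading to face neighbours. With a few extra
// OK states appended to the range, a still-positive status arriving in a
// virtual child cell (Descendant) signals an insufficiently refined mesh
// without any master-worker communication.
int mergeNeighbourStatuses(int ownStatus, const std::vector<int>& neighbourStatuses) {
  int result = ownStatus;
  for (int s : neighbourStatuses) {
    result = std::max(result, s - 1);  // decay by one per layer
  }
  return result;
}

bool mustHaltAndRefine(bool isDescendant, int mergedStatus) {
  return isDescendant && mergedStatus > 0;  // halo reached an unrefined region
}
```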
Some philosophy:
From the updates of the flags, we should further be able to predict in which direction a shock
propagates. We can then select more carefully which cell to refine next.
I should further rethink my whole limiter-based mesh refinement. Maybe it is more advantageous,
to do some bottom-up flagging for refinement. Instead of the current top-down approach
where I use halo-refinement around the limited regions.
With MPI switched on, I wonder however how well or badly this will interplay with the
load balancing during the initial mesh refinement.Dominic Etienne CharrierDominic Etienne Charrierhttps://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/219Cancel all predictor background jobs when a predictor rerun is necessary2018-03-26T11:15:21+02:00Dominic Etienne CharrierCancel all predictor background jobs when a predictor rerun is necessaryhttps://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/221Deeply check the public repository for CCZ42019-03-07T19:41:06+01:00Sven KÃ¶ppelDeeply check the public repository for CCZ4It is at https://github.com/exahype/exahype and we don't want to have the CCZ4 system in the code or any old commits. Make this sure by inspecting the code.
Cannot do it now since the repo is 70 MB in size and I'm on a train with bad wifi.

Sven Köppel

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/222: Fix the M/h computation in CCZ4/Writers/TimingStatisticsWriter.cpph (2019-04-15, Sven Köppel)
From a mail to Luke:
> I just noticed there is something wrong in the M/h determination:
>
> As explained in my last mail to Tobias and you, I just divide these
> two numbers. However, the time in the first column of stdout measures
> the time since program start.
>
> In ExaHyPE, the grid setup sometimes takes a considerable amount of time --
> like 10 minutes. If you measure the M/h straight after the first
> timesteps after these 10 minutes, you of course get totally wrong
> numbers. However, if you measure after 1000 minutes of runtime, the 10
> minutes of grid setup do not change the result so much.
>
> It is not hard to subtract the time the grid setup needs in order to
> improve the correctness of the number. You can do this either by hand
> (just look up when the first time step started) or we can do it in
> code (CCZ4/Writers/TimingStatisticsWriter.cpph).

Sven Köppel

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/223: Implement new space-time predictor without (variable number of) Picard loops (2019-04-15, Sven Köppel)

The task is relatively easy: diff Dumbser's Fortran prototype (in the repository) and check out the changes in the generic kernels.
From Dumbser, 11 March 2018, 11:20:
> It would of course be great if you could translate my Fortran code to C.
> If there are problems, we will do another Skype session.
> The format of the loops and of the necessary computations is essentially the same as
> for all other computations in the space-time predictor, i.e. a lot can be reused. The
> only difference is that you no longer have to carry the time index along and can work
> purely in space. The only kernel that has to be changed is the space-time predictor
> (in 2D and 3D).
>
> I would only implement the second- and third-order initial guess, see
> the code in SpaceTimePredictor.f90 under
>
> #ifdef SECOND_ORDER_INITIAL_GUESS
>
> #ifdef THIRD_ORDER_INITIAL_GUESS
>
> This week I have concentrated on scaling and on the comparison of Runge-Kutta DG and
> ADER-DG, i.e. I have not worked further on the 2D FO-fCCZ4. I first want to get
> the GRMHD paper off the table.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/224: Numerical details how to evolve CCZ4 with FD (2019-04-15, Sven Köppel)

Collection of what was written by Dumbser in various e-mails in order to proceed with the Runge-Kutta finite-differencing code (Cactus/Antelope/Okapi):
### Dumbser, 3 April 2018, 11:03: FO-CCZ4 with finite differencing works
> since Sven has reported some difficulties with the implementation of FO-CCZ4 in the Einstein toolkit last week, and since I wanted to understand the potential problems in depth, I have simply written my own finite difference code for the Einstein equations, based on central finite differences in space and Runge-Kutta time integration. According to Sven's and Elias' description, this is exactly what you are also doing in Cactus, right? The implementation is straightforward, since FD schemes are extremely simple.
>
> To do my tests, I have just copy-pasted my Fortran subroutine PDEFusedSrcNCP into the finite difference code, and I then insert FD point values of Q and central FD approximations for the first spatial derivatives. To save CPU time, I have done all computations in 1D so far.
>
> Please find attached the results that I have obtained for the Gauge wave with amplitude A=0.01 and A=0.1 until a final time of t=1000. I have used sixth order central FD in space and a classical third order Runge-Kutta scheme in time. For FO-CCZ4, I have set all damping coefficients to zero (kappa1=kappa2=kappa3=0), and I use c=0 with e=2. Zero shift (sk=0) and harmonic lapse. CFL number based on e is set to CFL=0.5 at the moment. Now the important points:
>
> 1. Runge-Kutta O3 is for now preferable over Runge-Kutta O4, since it is intrinsically dissipative. The reason is that the fourth order time derivative term in the Taylor series on the left hand side remains with RK3, while it cancels with RK4, and when moving the term q_tttt to the right hand side and after the Cauchy-Kowalevsky procedure, it becomes a fourth order spatial derivative term with negative sign, which is good for stability (second spatial derivatives must have positive sign on the right hand side, fourth spatial derivatives must have negative sign for stability; this is easy to check via Fourier series and the dispersion relation of the PDE).
>
> 2. The RK3 alone is not enough to stabilize the scheme for the larger amplitude A=0.1 of the Gauge Wave, but it is sufficient for A=0.01. I therefore explicitly needed to subtract a term of the type - dt/dx*u_xxxx, which is essentially a fourth order Kreiss-Oliger-type dissipation with appropriately chosen viscosity coefficient.
>
> I will now replace the Kreiss-Oliger dissipation which I do not like with the numerical dissipation that you would have obtained with a Rusanov flux in a fourth order accurate finite volume scheme. In the end, the dissipation operator will again be written as a finite difference, but there I know at least exactly what is going on and we will exactly know the amount of dissipation to be put (it can only be a function of the largest eigenvalue). So there will be NO parameter to be tuned. I will keep you updated on this.
>
> From my own results I can conclude that everything is working as expected, i.e. you must have at least one bug in your implementation of FO-CCZ4 in Cactus; or you have run the tests with the wrong parameters (please use CFL=0.5 for the moment, Kreiss-Oliger dissipation with a viscosity factor so that you get -dt/dx*u_xxxx on the right hand side in the end, please set kappa1=kappa2=kappa3=0 and set e=2 and c=0 in FO-CCZ4). If your code still does not run, I can send you my finite difference Fortran code to help you with the debugging.
>
> While running and implementing my FD code for FO-CCZ4 I have also been working on the vectorization of FO-CCZ4 in ADER-DG. The interesting news: the finite difference code requires more than 20 microseconds per FD point update, and ADER-DG with the good new initial guess for the space-time polynomial and proper vectorization also needs about 20 microseconds per DOF update for FO-CCZ4, i.e. ADER-DG is indeed becoming competitive with FD, who would have ever believed this last year :-) On the new 512-bit vector machines of RSC in Moscow, we expect the PDE to run even twice as fast, since the vector registers are twice as large as the current state of the art. We are aiming at a time per DOF update of about 10 microseconds. I will keep you informed.
### Dumbser, 4 April 2018, 10:51:
> However, my latest experiments show that you can also use RK4 together with a finite-volume type dissipative operator, which is very simple to
> implement and which does not require any parameters to be tuned. It will just replace the Kreiss-Oliger dissipation. And by the way: in this setting,
> the scheme can be run with CFL=0.9, which is what we want. I will send around more details later.
### Dumbser, 4 April 2018, 18:23
> There are again good news from the finite-difference FO-CCZ4 front. Instead of your classical Kreiss-Oliger dissipation, I suggest to use the following dissipation operator, which should simply be "added" to the time derivatives of all quantities on the right hand side, i.e.:
>
> ```
> dudt(:,i,j,k) = dudt(:,i,j,k) - 1.0/dx(1)* 3.0/256.0* smax * ( -15.0*u(:,i+1,j,k)-u(:,i+3,j,k)-15.0*u(:,i-1,j,k)+6.0*u(:,i+2,j,k)+20.0*u(:,i,j,k)+6.0*u(:,i-2,j,k)-u(:,i-3,j,k) )
> ```
>
> where dudt(:,i,j,k) is the time derivative of the discrete solution computed by the existing Fortran function PDEFusedSrcNCP and smax is the maximum eigenvalue in absolute value. This operator derives from
>
> ```
> - 1/dx(1)*( fp - fm ),
> ```
>
> where the dissipative flux fp is defined as
>
> ```
> fp = - 1/2 * smax * ( uR - uL ),
> ```
>
> and uR and uL are the central high-order polynomial reconstructions of u evaluated at the cell interface x_i+1/2. The flux fm is the same, but on the left interface x_i-1/2.
>
> Please find attached the new results for the Gauge Wave with A=0.1 amplitude. Everything looks fine, i.e., the ADM constraints as well as the waveform at the final time. Note that this simulation was now run with the fourth order Runge-Kutta scheme in time and using a CFL number of CFL=0.9 based on the maximum eigenvalue.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/225: Provide vectorized user functions in the optimized kernels (2019-03-07, Sven Köppel)

This is something Jean-Matthieu should do.
Then we can immediately test a couple of PDEs, such as Euler, GRMHD or CCZ4. Dumbser also shared his vectorized code somewhere.

Jean-Matthieu Gallard

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/226: Use sensible Fortran flags (2018-04-24, Ben Hazelwood)

At the moment the Fortran code is compiled with minimal flags, and therefore the compiler has its hands tied. However, the code is written in a way which should be easily vectorisable by the compiler.
Ekatherine from RSC is currently testing:
`-xCORE-AVX512 -fma -align array64byte`
and has mentioned that `-qopt-prefetch=3` worked well with the prototype code.
To be updated...

Ben Hazelwood

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/227: Reflection for the parameter system (2019-03-07, Sven Köppel)

Currently, using runtime parameters requires parsing them somewhere, leading to code like
```c++
void GeometricBallLimiting::readParameters(const mexa::mexafile& para) {
radius = para["radius"].as_double();
std::string where = para["where"].as_string();
toLower(where);
if(where == "inside") limit_inside = true;
else if(where == "outside") limit_inside = false;
else {
logError("readParameters()", "Valid values for where are 'inside' and 'outside'. Keeping default.");
}
logInfo("readParameters()", "Limiting " << (limit_inside ? "within" : "outside of") << " a ball with radius=" << radius);
}
```
associated with a structure where the parameters are stored,
```c++
struct GeometricBallLimiting : public LimitingCriterionCode {
double radius; ///< Radius of the ball (default -1 = no limiting)
bool limit_inside; ///< Whether to limit inside or outside (default inside)
GeometricBallLimiting() : radius(-1), limit_inside(true) {}
bool isPhysicallyAdmissible(IS_PHYSICALLY_ADMISSIBLE_SIGNATURE) const override;
void readParameters(const mexa::mexafile& parameters) override;
};
```
This is a lot of overhead and really redundant.
In Cactus, the user can declare parameters *including their description/meaning and valid values* in a nice language, they are then made available as a structure by the glue code, all the parsing is abstracted away. Example of a Cactus parameter file (CCL file):
```
real eta "Damping coefficient for the Gamma Driver" STEERABLE=always
{
0:* :: "should be 1-2/M"
}0.2
KEYWORD evol_type "Which set of equations to evolve"
{
"BSSN" :: "traditional BSSN"
"Z4c" :: "Z4c"
"CCZ4" :: "(Covariant) and conformal Z4"
"FOCCZ4" :: "First order formulation of the CCZ4"
}"Z4c"
boolean include_theta_source "Only FO-CCZ4: set to false to remove the algebraic source terms of the type -2*Theta" STEERABLE=always
{
} yes
```
In the code, one then just has something like
```c++
struct parameters {
double eta;
std::string evol_type;
boolean include_theta_source;
}
```
which is already filled nicely with values.
While I certainly don't want to write such glue code for ExaHyPE myself, with minimal effort we can get something much better than the current MEXA system. In fact, it would be nice to use *OOP reflection* to automatically register class attributes for common data types (int/double/bool/string). That means I would be fine with writing
```c++
struct GeometricBallLimiting {
double radius;
enum class limit_at { inside, outside };
REGISTER_PARAM(radius, DEFAULT(-1), "Radius of the ball");
REGISTER_PARAM(limit_at, DEFAULT(limit_at::inside), "Where to limit");
}
```
In fact, something like this is possible with some macro magic.

Sven Köppel

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/228: stableTimeStepSize kernels should return maximum eigenvalue for (classical) local time stepping (2018-06-14, Dominic Etienne Charrier)

In general, we cannot associate the smallest time step size with the finest mesh
level as there might be larger eigenvalues present on coarser mesh levels.
This is an issue for local time stepping where you scale the smallest time
step size by a factor k (k=3 in Peano) with decreasing mesh level.
The kernels should thus return the maximum eigenvalue, too, or only the maximum eigenvalue.
The minimum local time step size would then be computed according to:
```
dt_min = CFL * min_{over all cells} cellSize / max_{over all cells} lambda
```
which is different from what we are currently doing for global time stepping:
```
dt_min = CFL * min_{over all cells} ( cellSize / lambda )
```
The ADERDGSolver superclass would then decide which minimisation to
perform.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/157: Plotter variables are not constants (2018-09-10, Ghost User)

Actually the ExaHyPE runtime treats the number of writtenUnknowns per plotter as a runtime variable, i.e. in
```
plot hdf5::flash ConservedWriter
variables = 19
time = 0.0
repeat = 0.00001
output = ./hdf5-flash/conserved
end plot
```
one can change `variables = x` without recompiling. However, the toolkit wants this to be a constant:
```
plot hdf5::flash ConservedWriter
variables const = 19
time = 0.0
repeat = 0.00001
output = ./hdf5-flash/conserved
end plot
```
otherwise it says `ERROR: eu.exahype.parser.ParserException: [71,17] expecting: 'const'`
This should not happen, i.e., the toolkit grammar should also accept it without `const`.
As always, there is presumably no case where somebody wants to do this **except benchmarking plotter file formats, which is exactly what I'm doing now**. The typical code generated by the toolkit is not aware of a non-constexpr number of variables, but all the `ExaHyPE/exahype/plotters/` code actually treats the number as a runtime constant, and I see no reason to artificially introduce something constexpr here.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/153: Constants in specfiles (2018-09-10, Ghost User)

I have so many constants that a specfile syntax like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = initial_data:tovstar,boundary_x_lower:reflective,boundary_y_lower:reflective,boundary_z_upper:outgoing,tovstar-mass:1.234,tovstar-rl-ratio:2.345
```
would make sense. Tobias thinks in such a case a user would do something like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = configuration-file:foobar.txt
```
and outsource the configuration to a language the user likes. However, this breaks the idea of a single specfile for a single run. Therefore, I see two options.
## Add a DATA section after the specfile
This is what many script languages do, for instance Perl ([the DATA syntax in Perl](https://stackoverflow.com/questions/13463509/the-data-syntax-in-perl)). The idea is just that the parsers ignore what comes after the end of the specfile (i.e. the line containing `end exahype-project`). Users could dump any content there in their favourite language. I would vote for this as it is super easy to implement and allows file concatenation and flexibility.
## Allow user constant section
We could also just allow users to do something like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = parameters:appended
limiter-kernel const = generic::musclhancock
limiter-language const = C
dmp-observables = 2
dmp-relaxation-parameter = 1e-2
dmp-difference-scaling = 1e-3
steps-till-cured = 0
simulation-parameters
foo = bar
baz = bar
blo = bar
blu = bar
etc.
end simulation-parameters
plot vtk::Cartesian::vertices::limited::binary ConservedWriter
variables const = 19
time = 0.0
repeat = 0.00166667
output = ./vtk-output/conserved
end plot
...
```
This would go well with the specfile syntax. In order to implement, we need
* Such a section with any key-value pairs added to the grammar, so the toolkit does not complain
* Support in `Parser.cpp` (which is not too hard to add)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/152: Dynamic plotter registration (2019-03-07, Ghost User)

Users shall be able to add, comment out or remove plotters at runtime without recompiling. The plotter ordering should not be fixed.
This is not too hard to achieve; it just requires changes to `KernelCalls.cpp`, adding a generated plotter registration function (semi-pseudocode):
```c++
Writer* kernels::getNamedWriter(const std::string& name, Solver& solver) {
  if(name == "ConservedWriter") return new GRMHD::ConservedWriter(static_cast<exahype::solvers::LimitingADERDGSolver&>(solver));
  if(name == "SomeOtherWriter") return new GRMHD::SomeOtherWriter(static_cast<exahype::solvers::ADERDGSolver&>(solver));
  if(name == ...
  // failure: don't know this plotter type
  return nullptr;
}
```
We can then replace the currently generated section
```c++
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,0,parser,new GRMHD::ConservedWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,1,parser,new GRMHD::ConservedWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,2,parser,new GRMHD::ConservedWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,3,parser,new GRMHD::ConservedWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,4,parser,new GRMHD::IntegralsWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
```
with a non-generated section (pseudocode)
```c++
for(const int& solvernum : Parser->getSolvers()) {
  int plotternum = 0;
  for(const std::string& plottername : Parser->getPlotterNamesForSolver(solvernum)) {
    exahype::plotters::RegisteredPlotters.push_back(new exahype::plotters::Plotter(
      solvernum, plotternum++, parser,
      getNamedWriter(plottername, *exahype::solvers::RegisteredSolvers[solvernum])));
  }
}
```
This is something I can do on my own, if the Parser provides the necessary data...

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/92: Multi-solvers / Parameter Studies / Sensitivity Analysis (2019-09-20, Ghost User)

### Progress:
* Multi-solver infrastructure is implemented for all solvers (ADER-DG,Godunov FV,Limiting ADER-DG).
Further tests and more debugging are necessary for MPI and TBB - especially with respect to the limiting ADER-DG so...### Progress:
* Multi-solver infrastructure is implemented for all solvers (ADER-DG,Godunov FV,Limiting ADER-DG).
Further tests and more debugging are necessary for MPI and TBB - especially with respect to the limiting ADER-DG solver.
### Issues/Features
* 30/12/16: Parameter studies require using the same solver with different initial conditions and/or source terms.
The problem and discretisation (order,variables,plotters,...) of the solver does not change in these studies.
I plan to enable such studies with an annotation {<studyNumber>}, e.g., "solver SolverType MySolver{10}", that is interpreted by the Toolkit,
which then creates a constructor that passes the number of the study (0 to 9 in the above example), and
further adds the required number of solvers to the registry.
The current Plotter infrastructure does not support this idea yet because of its dependence on exahype::Parser.
* ~~30/12/16: Limiting-ADER-DG: Limiter domain seems to be required to often while the actual limiter domain does not change per solver.~~ Fixed.
* ~~30/12/16: MPI crashes for multiple ADER-DG/Limiting ADER-DG solvers (seg-fault).~~ Fixed.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/151: Single-node MPI strong scaling differences between 2d and 3d (2019-09-20, Ghost User)

I am investigating the strong scaling behaviour of the 2d and 3d versions of ExaHyPE.
While the 2d version shows reasonable scalability, the 3d version does not.
* Experiments are performed on a single-node of SuperMUC Phase 2.
* All plotters are turned off
* To exclude interconnect effects, all experiments are performed on a single-node.
In my experiments, I switch the master-worker communication (M/W)
on or off as well as the neighbour communication (N).
## Only Peano communication (M/W=off,N=off)
ranks | adapter name | iterations | total CPU time [t]=s | average CPU time [t]=s | total user time [t]=s | average user time [t]=s |
----------|-----------------------|-------------------|-----------------------------------|---------------------------------------|----------------------------------|---------------------------------------|
2 | ADERDGTimeStep | 29 | 37.07 | 1.27828 | 316.768 | 10.923 |
3 | ADERDGTimeStep | 29 | 36.25 | 1.25 | 305.752 | 10.5432 |
12 | ADERDGTimeStep | 29 | 27.04 | 0.932414 | 206.142 | 7.10834 |
28 | ADERDGTimeStep | 29 | 10.27 | 0.354138 | 24.4012 | 0.841421 |
## M/W=on, N=off
ranks | adapter name | iterations | total CPU time [t]=s | average CPU time [t]=s | total user time [t]=s | average user time [t]=s |
----------|-----------------------|-------------------|-----------------------------------|---------------------------------------|----------------------------------|---------------------------------------|
2 |ADERDGTimeStep | 29 | 37.58 | 1.29586 | 316.044 | 10.8981 |
3 | ADERDGTimeStep | 29 | 36.3 | 1.25172 | 306.977 | 10.5854 |
12 | ADERDGTimeStep | 29 | 27.21 | 0.938276 | 207.078 | 7.14064 |
28 | ADERDGTimeStep | 29 | 10.27 | 0.354138 | 24.5317 | 0.845921 |
## M/W=off, N=on
ranks | adapter name | iterations | total CPU time [t]=s | average CPU time [t]=s | total user time [t]=s | average user time [t]=s |
----------|-----------------------|-------------------|-----------------------------------|---------------------------------------|----------------------------------|---------------------------------------|
2 | ADERDGTimeStep | 29 | 39.52 | 1.36276 | 337.709 | 11.6451 |
3 | ADERDGTimeStep | 29 | 99.04 | 3.41517 | 995.378 | 34.3234 |
12 | ADERDGTimeStep | 29 | 121.76 | 4.19862 | 1106.45 | 38.1534 |
28 | ADERDGTimeStep | 29 | 18.24 | 0.628966 | 105.858 | 3.65027 |
## M/W=on, N=on
Slightly worse than M/W=off,N=on.
# Insights:
* The rank-3 and rank-12 results are load balancing issues.
* For the 28-rank run, the load balancing only deploys 10 ranks. This is actually a well-balanced setup for 10 ranks. (If we set 10 ranks, we have a load balancing issue again.)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/69: Do memory precision tests (2019-03-07, Ghost User)

Tobias writes:
>
> I need your help (yes, of course quickly, because it should go into the presentation, but you probably guessed that ;-). The new ExaHyPE version from last night can work with number representations below the IEEE double standard. That is expensive (first tests suggest a factor of 5 in runtime, but that could also be due to the chip, i.e. I have to test the Russian machine), but it reduces the memory footprint of a simulation somewhat. I already get 10% for the simple Euler 2d with order 3 - but I hope this is significantly larger for other setups. The feature is switched on by setting
>
> double-compression = 0.000001
>
> to something between 0 and 1. You can forget about the other flag, spawn-double-compression-as-background-thread, that is my construction site. My question/request now is: do you have a sensible benchmark from astrophysics at hand, and could you try
>
> double-compression = 0.0
> double-compression = 0.0001
> double-compression = 0.000001
> double-compression = 0.000000000001
>
> with it and tell me whether values greater than 0 qualitatively degrade the solution? I have no idea what you typically look at (arrival times/amplitudes/...?), which is why I need a qualified eye there. If you also keep an eye on what it reports as memoryUsage (it does this automatically now when compiling without MPI), that would help me a lot to get an assessment.
>
> Addendum: an optimisation section would then look like this:
```
optimisation
fuse-algorithmic-steps = on
fuse-algorithmic-steps-factor = 0.99
timestep-batch-factor = 0.0
skip-reduction-in-batched-time-steps = on
disable-amr-if-grid-has-been-stationary-in-previous-iteration = off
double-compression = 0.000001
spawn-double-compression-as-background-thread = off
end optimisation
```
@svenk: Do it.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/58: Reduce memory footprint (2019-09-20, Ghost User)

# Open issues
* We have consecutive heap indices for volume data and face data. We thus need to store one index for each
and can get the others by incrementing the index up to the bound we know of as developers.
We distinguish between cell and face data since we have helper cells that do not allocate cell data but face data.
* We can remove prediction and volumeFlux fields completely if we perform also the time integration
in the volume integral and the boundary extrapolation routines.
This would further make it easier to switch between global and local time stepping.
Here, we would just load a different kernel for the boundary extrapolation and allocate space-time face data
if the user switches local time stepping on.
* Allocate all the temporary arrays like the rhs, lQi_old, etc. only once per thread and not dynamically
during the kernel calls. **(This could be done easily now in each solver!)**
* Create one "big" ADERDGTimeStep function in kernels/solver. This might help the compiler and is more cache-friendly.
# Done
I think the following has been Tobias' idea originally;
It is not necessary to store temporary data on the heap for every cell description.
We need to analyse which ADER-DG fields are temporary and which
need to be stored persistently on the heap.
From my point of view, the following variables are temporary:
* spaceTimePredictor
* predictor
* spaceTimeVolumeFlux (includes sources)
* volumeFlux (includes sources)
The spacetime fields have a massive memory footprint.
They scale with (N+1)^{d+1} and d*(N+1)^{d+1}.
I thus propose that we assign each thread its own spaceTimePredictor spaceTimeVolumeFlux, predictor, and volume flux fields
and remove the fields from the heap cell descriptions.
This would reduce the memory footprint of the ADER-DG method dramatically (and might further lead to more cache-friendly code ?).
In a second step, we should kick out the volumeFlux field completely, don't do the time integration of the spaceTimeVolumeFlux,
and directly perform the volume integral with the spaceTimeVolumeFlux.
Implementation details
* Allocate arrays in the Prediction mapping for each thread

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/148: LimitingADERDGSolver also performs limiter status spreading for user refinement (2019-09-20, Ghost User)

Potential optimisation:
* LimitingADERDGSolver alsos performs limiter status spreading for user refinement.
This can potentially be turned off. Requires an enum return type instead of a bool.Potential optimisation:
* LimitingADERDGSolver alsos performs limiter status spreading for user refinement.
This can potentially be turned off. Requires an enum return type instead of a bool.https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/57Finite volumes solver / Limiter2019-09-20T15:46:56+02:00Ghost UserFinite volumes solver / Limiter# Treating NaNs in the update vector
If NaNs occur in the update vector, a rollback
using the update vector is not possible anymore.
We thus need to introduce a "previousSolution" field
to the ADER-DG solution.
We might be able to get rid of the "update" field if we directly
add a weighted (by dt and quad weight) update to the "solution" field
from the volume integral and surface integral evaluation.
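The direct weighted add could look like this sketch (array layout and names are assumptions):

```cpp
#include <cstddef>

// Sketch: instead of storing a separate "update" vector and applying it
// later (which a NaN would make irreversible without a previousSolution
// backup), add each contribution to the solution right away, weighted by
// the time step size and the quadrature weight.
void addWeightedUpdate(double* solution, const double* contribution,
                       std::size_t n, double dt, double quadWeight) {
  for (std::size_t i = 0; i < n; ++i) {
    solution[i] += dt * quadWeight * contribution[i];
  }
}
```

The volume-integral and surface-integral evaluations would each call this with their respective contributions, eliminating the intermediate "update" field.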
# Status of the implementation of the limiting ADER-DG scheme (unordered notes)
* The limiter workflow works now for uniform meshes with Intel's TBBs.
[sod-shock-tube_limiting-aderdg-P_3_Godunov-N_7_dmp-only.avi](/uploads/b71d24c7b82f805c604f31c1b322fcae/sod-shock-tube_limiting-aderdg-P_3_Godunov-N_7_dmp-only.avi)
* Found and fixed some more bugs. Now everything looks fine. The limiter acts locally and only
requires a rollback/reallocation of memory in roughly every fifth time step (49 out of 255) in the experiment
shown in [sod-shock-tube_limiting-aderdg-P_3_Godunov-N_7_dmp-only.avi](/uploads/1e065c7c5c2a52aefb551ef0a03f0ac5/sod-shock-tube_limiting-aderdg-P_3_Godunov-N_7_dmp-only.avi), where we used a P=3 ADER-DG approximation, \delta_0=1e-2, \epsilon=1e-3.
As long as the limiter domain does not change, we do not need to reallocate new memory and do a
recomputation of certain cells in our novel implementation.
* You can find another simulation using a 9th-order ADER-DG approximation and parameters \delta_0=10^-4, \epsilon=10^-3 here:
[sod-shock-tube_limiting-aderdg-P_9_Godunov-N_19_dmp_pad.avi](/uploads/fb466322f7df5339790aac4afca9ce6c/sod-shock-tube_limiting-aderdg-P_9_Godunov-N_19_dmp_pad.avi)
~~We observe a strange x velocity in this benchmark. It can't be seen in the video. The profile of the shock tube differs significantly
in the vicinity of the contact discontinuity.~~ This is actually the momentum density. Everything is fine.
* Explosion problem with P=5 and delta_0=1e-4 and epsilon=1e-3:
[explosion_limiting-aderdg-P_5_Godunov-N_11_dmp_pad.avi](/uploads/4ace50cdcbfcbc1706432d0cc180fcea/explosion_limiting-aderdg-P_5_Godunov-N_11_dmp_pad.avi). Here the limiter domain must be adjusted after most of the time steps.
* I performed some further optimisations of the DMP evaluation. I found that we can easily perform a
"loop" fusion of the min and max computation and the DMP.
* Implemented the physical admissibility detection (PAD) now as well and use it to
detect non-physical oscillations in the initial conditions. Here, I found that we can directly
pass it the solution min and max we computed as part of the DMP calculation. No need
to loop over the whole degrees of freedom and the Lobatto nodes again.
* ~~There is still an issue with the discrete maximum principle which detects too many
cells to be troubled.~~ Was resolved by tuning the DMP parameters.
* AMR and Limiting need to be combined. This is in principle "a simple" wiring of FV patches
along the ADER-DG tree. We further need spatial averaging and interpolation operators.
Then, we can follow the AMR timestepping implementation of the ADER-DG scheme.
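The relaxed DMP with the fused min/max pass mentioned above might be sketched as follows (`delta0` and `eps` correspond to the \delta_0 and \epsilon above; the exact relaxation formula used in ExaHyPE is an assumption here):

```cpp
#include <algorithm>
#include <cstddef>

// Sketch of the relaxed discrete maximum principle (DMP) troubled-cell
// check. In one pass over the candidate solution we compute its min/max
// ("loop fusion" with the DMP) and compare against the extrema of the
// previous solution in the neighbourhood, relaxed by
// delta = max(delta0, eps * (neighMax - neighMin)).
bool dmpHolds(const double* candidate, std::size_t n,
              double neighMin, double neighMax,
              double delta0, double eps) {
  double cMin = candidate[0], cMax = candidate[0];
  for (std::size_t i = 1; i < n; ++i) {  // fused min/max pass
    cMin = std::min(cMin, candidate[i]);
    cMax = std::max(cMax, candidate[i]);
  }
  const double delta = std::max(delta0, eps * (neighMax - neighMin));
  return cMin >= neighMin - delta && cMax <= neighMax + delta;
}
```

A cell whose candidate extrema leave the relaxed bounds would be marked troubled and handed to the FV limiter. The computed `cMin`/`cMax` can then also be reused for the PAD check, as described above.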
# Writing a Godunov type first order method
Due to the issues I encountered while [rewriting](#issueswithrewritingthefinitevolumessolver) the
pseudo-TVD donor-cell-type finite volumes solver, Tobias and I decided
to write a simple first-order Godunov-type finite volumes solver that only exchanges volume averages/fluxes
between direct (face) neighbours.
Replacement of this simple method by a more complex one will be tackled in later stages of the project -- if necessary.
# Issues with higher order finite volumes solvers
* Higher-order methods have a reconstruction stencil which is larger than the
Riemann solver stencil (simple star).
* Reconstruction and time evolution of boundary extrapolated values at one face of a patch can only be performed
after all volume averages from the direct neighbour and corner neighbour patches are available.
I have identified the following phases of the FVM solver now:
* Gather: Just get the neighbour values (arithmetic intensity = 0). This will move into the Merging mapping.
* Spatial reconstruction: Compute spatial slopes in the cells or something similar. This will move into the SolutionUpdate mapping.
* Temporal evolution of boundary-extrapolated values: This will move into the SolutionUpdate mapping.
* Riemann solve: This will move into the SolutionUpdate mapping.
* Solution update: This will move into the SolutionUpdate mapping.
* Extrapolation of volume averages: This will send layers of the volume averages to the neighbours.
We send layers instead of a single layer in order to only have one data exchange instead of two.
With the above strategy, we can merge FVM solvers into the current framework. The difference to the ADER-DG solver
is that the computational intensity of the neighbour merging operation is zero.
We will further need to consider neighbour data exchange over edges (3-d) and corners in our
merging scheme. Neighbour data exchange over edges in 3-d will require additional flags.
Exchange over corners does not require additional flags.
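For illustration, the listed phases for a toy 1D scalar advection patch with one ghost layer per side (a Rusanov-flux sketch, not the ExaHyPE kernel):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Toy sketch of the Godunov-type phases above for 1D scalar advection
// u_t + a u_x = 0: after the ghost cells u[0] and u[n+1] have been
// merged from the neighbours, each face gets a Rusanov flux and the
// cell averages are updated.
std::vector<double> godunovStep(std::vector<double> u, double a,
                                double dt, double dx) {
  const std::size_t n = u.size() - 2;  // interior cells; u[0], u[n+1] are ghosts
  std::vector<double> flux(n + 1);
  for (std::size_t f = 0; f <= n; ++f) {  // Riemann solve per face
    const double uL = u[f], uR = u[f + 1];
    const double smax = std::abs(a);      // max signal speed
    flux[f] = 0.5 * (a * uL + a * uR) - 0.5 * smax * (uR - uL);
  }
  for (std::size_t i = 1; i <= n; ++i) {  // solution update
    u[i] -= dt / dx * (flux[i] - flux[i - 1]);
  }
  return u;
}
```

Note the arithmetic-intensity-zero "gather" phase is implicit here: the ghost cells are assumed to be filled by the neighbour merge before the step.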
# Issues with rewriting the finite volumes solver
I tried to decompose our pseudo-TVD donor cell type finite volumes
solver into a Riemann solve and a solution update part,
and ran into the following issues.
## Open issues with the donor cell type pseudo-TVD finite volumes solver implementation:
* The current solver uses updated solution values in the solution update.
We have to distinguish between old values and new values or
have to introduce an update.
* We need to consider two layers of the neighbour to compute all extrapolated boundary
values (wLx,wRx,wLy,wRy). Currently only one layer is considered. This means
that the boundary extrapolated values and thus the face fluxes at the outermost faces
are computed wrongly.
* To tackle the above problem, we either need to exchange a single cell of
the diagonal neighbours or we need to rely on two data exchanges.
Tobias told me there exists however a trick to circumvent this:
Reconstructing the diagonal neighbours' contributions with values from
the direct neighbours.
## Limitations of the donor cell type pseudo-TVD finite volumes solver:
* The solver is not TVD in multiple dimensions.
* We do neither perform corner correction nor (dimensional) operator splitting. We should
thus observe loss of mass conservation, large dispersion errors, and a reduced CFL stability limit.
The reduced CFL stability limit was taken into account in our implementation.
This is also a problem of the multi-dim. Godunov method in the unsplit form.
## Follow up: Low-order Euler-DG patch with direct coupling to ADER-DG method?
## Dissemination
For the following benchmarks I always used copy boundary conditions which are equal to
outflow/inflow boundary conditions as long as everything flows out of the domain.
* Euler (Compressible Hydrodynamics)
* Explosion Problem with 27^2 grid, P=3, and N=2*3+1 FV subgrid:
![lim-aderdg_explosion](/uploads/6459f9b4a775a482cdfec24e129c56ae/lim-aderdg_explosion.png)
[lim-aderdg_explosion.avi](/uploads/6a38fa00a6bf41fe3aefbee380a03396/lim-aderdg_explosion.avi)
* Sod Shock Tube with 27^2 grid, P=3 and N=2*3+1 FV subgrid:
![lim-aderdg_euler_sod-shock-tube](/uploads/eb2a5e57ca8f37665c2dffe48dbf1e22/lim-aderdg_euler_sod-shock-tube.png)
[lim-aderdg_euler_sod-shock-tube.avi](/uploads/61dfedf7db4ead996d7fd6b9796ba5c0/lim-aderdg_euler_sod-shock-tube.avi)
![hires_antialias](/uploads/26843a0f50e52ded919cfc12b777360b/draft2_hires_antialias.png)
* SRMHD (Special Relativistic Magnetohydrodynamics; basically: Special Relativistic Euler+Maxwell)
* Blast Wave setup as in 10.1016/j.cpc.2014.03.018 with 27^2 grid, P=5, and N=2*5+1 FV subgrid:
![lim-aderg_mhd__blast-wave](/uploads/feb026167c7c2d882c132f3b775c37ac/lim-aderg_mhd__blast-wave.png)
[lim-aderdg_mhd_blast-wave.avi](/uploads/4d715c47a03bd3850d3c0398e39da58c/lim-aderdg_mhd_blast-wave.avi)
* Rotor setup as in http://adsabs.harvard.edu/abs/2004MSAIS...4...36D with 9^2 grid, P=9, and N=2*9+1 FV subgrid:
![lim-aderdg_mhd_rotor_9x9_P9](/uploads/b35e7dc2dc137068e7416a1952d02a1a/lim-aderdg_mhd_rotor_9x9_P9.png)
[lim-aderdg_mhd_rotor_9x9_P9.avi](/uploads/c5ee8ff29f18a7201be66a1a26b664dc/lim-aderdg_mhd_rotor_9x9_P9.avi)
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/145 SegFault occurs if I turn conservative flux off

Project: DIM_LimitingADERDG

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/37 Precision issues

I observed that we can only apply the restriction and prolongation operations
a certain number of times until the projected constant values in my test are not equal anymore
to the reference values (with respect to a tolerance of 1e-12).
For more info, see the todos in the 2D and 3D definitions of
GenericEulerKernelTest::testFaceUnknownsProjection(),
and
GenericEulerKernelTest::testVolumeUnknownsProjection()

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/143 New template-less pure virtual API

TODO: Implement this new API.
![usersolver_layout](/uploads/b314b4b3a6196966cec01c80918fb0c9/usersolver_layout.png)
Tobias favours it. I don't know how it goes with the optimized kernels. Small patch for the picture from Tobias:
```
/* NEW: */
kernels::aderdg::generic::c::spaceTimePredictorLinear(BasisSolverAPI& solver, other parameters ... );
```
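A minimal sketch of the template-less interface (only `BasisSolverAPI` and `spaceTimePredictorLinear` come from the snippet above; all member signatures are assumptions):

```cpp
// Sketch: the kernel receives the user solver through a pure virtual
// interface instead of a template parameter, so one compiled kernel
// serves all solvers. The member signatures are assumptions.
class BasisSolverAPI {
 public:
  virtual ~BasisSolverAPI() = default;
  virtual void flux(const double* Q, double** F) const = 0;
  virtual void eigenvalues(const double* Q, int direction,
                           double* lambda) const = 0;
};

namespace kernels { namespace aderdg { namespace generic { namespace c {
// The kernel only sees the abstract interface.
void spaceTimePredictorLinear(const BasisSolverAPI& solver /*, ... */) {
  (void)solver;  // would call solver.flux(...) / solver.eigenvalues(...) here
}
}}}}

// Hypothetical user solver for scalar advection with speed 1.
class DemoSolver : public BasisSolverAPI {
 public:
  void flux(const double* Q, double** F) const override { F[0][0] = Q[0]; }
  void eigenvalues(const double* Q, int, double* lambda) const override {
    (void)Q;
    lambda[0] = 1.0;
  }
};
```

The trade-off is a virtual call per PDE evaluation instead of inlined template code, which is presumably the concern about the optimized kernels.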
In textual form: [Sketch.h](/uploads/8bcfc8c2920e7532f73e1dac5f8f5a07/Sketch.h)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/142 ADERDG, inverseDX

* Problem: most kernels use 1/dx instead of dx. This is a slow operation that could easily be optimized away.
* Solution:
- Peano implements a cellDescription.getInverseSize()
- ADERDGSolver uses it
- ADER-DG kernels adapted
* Already done:
- Optimized kernels adapted; the inverseDx is generated in the ADERDGSolver (code isolated with the preprocessor)
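The optimisation boils down to computing 1/dx once per cell description instead of dividing in every kernel call; a hypothetical sketch (the real cellDescription is generated Peano code):

```cpp
// Sketch: store the inverse cell size alongside the size so kernels can
// multiply by inverseDx instead of dividing by dx.
// (Hypothetical struct; the real cellDescription is generated code.)
struct CellDescriptionSketch {
  double size;
  double inverseSize;

  explicit CellDescriptionSketch(double dx)
      : size(dx), inverseSize(1.0 / dx) {}  // one division, at setup time

  double getSize() const { return size; }
  double getInverseSize() const { return inverseSize; }  // no division per call
};
```

Kernels then write `u * cell.getInverseSize()` instead of `u / cell.getSize()`, replacing a division in the hot loop with a multiplication.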
@di25cox

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/140 Intel compiler bug on certain large C++ files ("exahype/repositories/Repository...")

With Intel 17 (Intel 15/16 apparently not affected), aka `COMPILER=Intel, SHAREDMEM=TBB, MODE=Debug, DISTRIBUTEDMEM=None`, there is a bug when compiling files like
./ExaHyPE/exahype/repositories/RepositoryExplicitGridTemplateInstantiation4LimiterStatusMergingAndSpreadingMPI.o
and
./ExaHyPE/exahype/repositories/RepositoryExplicitGridTemplateInstantiation4GridErasing.o
and similar. This is known to Tobias and should be reported to Nicolay Hammer at LRZ or similar.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/136 2D / 3D build check is not performed correctly

We have a startup check in the code that makes sure that the specfile has the same `const` build constants as the build, for instance the ADER-DG polynomial order. However, this test fails to check that, for instance, a 2D build is also run with a 2D grid specification. Thus we get strange errors as in https://gitlab.lrz.de/exahype/ExaHyPE-Engine/commit/10b1261b34e2e239f9a4fe1e2c42f0e38d85a207:
```
0.0459649 error Invalid simulation end-time: notoken
(file:/home/koeppel/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/Parser.cpp,line:377)
```
And this happens not right at the beginning but only after one time step.
There should be a clear, well-worded check at the beginning instead.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/130 Guidebook: Fix recommendation for MPI Tags for Reductions

When implementing one's own MPI reductions plotter, the guidebook tells us we should write something like
```
void finishRow() {
#ifdef Parallel
// Question: Do we really reserve a free tag and release on each function call?
const int reductionTag = tarch::parallel::Node::getInstance().reserveFreeTag(
std::string("TimeSeriesReductions(") + filename + ")::finishRow()" );
if(master) {
double received[LEN];
for (int rank=1; rank<tarch::parallel::Node::getInstance().getNumberOfNodes(); rank++) {
if(!tarch::parallel::NodePool::getInstance().isIdleNode(rank)) {
MPI_Recv( &received[0], LEN, MPI_DOUBLE, rank, reductionTag, tarch::parallel::Node::getInstance().getCommunicator(), MPI_STATUS_IGNORE );
addValues(received);
}
}
} else {
for (int rank=1; rank<tarch::parallel::Node::getInstance().getNumberOfNodes(); rank++) {
if(!tarch::parallel::NodePool::getInstance().isIdleNode(rank)) {
MPI_Send( &data[0], LEN, MPI_DOUBLE, tarch::parallel::Node::getGlobalMasterRank(), reductionTag, tarch::parallel::Node::getInstance().getCommunicator());
}
}
}
tarch::parallel::Node::getInstance().releaseTag(reductionTag);
#endif
if(master) {
TimeSeriesReductions::finishRow();
writeRow();
}
}
```
However, I assume this is not that good, looking at the output:
![mpi-reservefreetags](/uploads/de93f1e5c48d10c62101531611e044cd/mpi-reservefreetags.png)
*First question*: Does every plotter need its own tag? In principle, only if they were called in parallel, right?
In any case, I think the tags shouldn't be created so frequently; instead, only once at startup?
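Reserving the tag only once, as suggested, could be done with a function-local static; a sketch with a stub in place of `tarch::parallel::Node::reserveFreeTag` (whether this fits the tarch tag bookkeeping is an assumption):

```cpp
#include <string>

// Stub standing in for tarch::parallel::Node::reserveFreeTag(); it
// counts how often a tag is actually reserved.
int g_reservations = 0;
int reserveFreeTagStub(const std::string& /*name*/) { return ++g_reservations; }

// The tag is reserved exactly once, at first use, and reused on every
// subsequent call, instead of reserve/release per finishRow() call.
int getReductionTag() {
  static const int tag =
      reserveFreeTagStub("TimeSeriesReductions::finishRow()");
  return tag;
}
```

Every call to `getReductionTag()` then returns the same tag, and the reservation happens only on the first call.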
This should be fixed in the guidebook and my code :D

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/129 Access to neighbours for which kernels?

Finite Volumes Solver:
- source; ex: bathymetry in SWE

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/127 Revise time stepping notation

I have to write down carefully how I handle the time step sizes for
* fused time stepping
* standard time stepping
and what I do during mesh refinement for
* fused time stepping
* standard time stepping
This is getting a little complicated, and the docs give only a partial overview.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/126 CFL factor per solver as const int the specfile

This reduces inconsistencies. It can now easily be generated as a constexpr double into the AbstractMySolver class.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/124 Toolkit - Move from replaceAll to true template engine

The templates used by the Toolkit start to require some logic. For example, the NamingScheme tag requires expanding a Set to multiple lines.
Currently we put that logic in the Java class and use a replaceAll, but it would be better to put that logic in the template itself using a true template engine. For example, the NamingScheme tag would be replaced with a foreach loop that expands the Set in the template. This would also help factorize the code by putting common tags and their values into a more global context set for the template engine.
Switching should be simple as we are already using standard template tags for our variables.
Of course, the template engine needs to be lightweight and fully available in a jar or any format requiring no installation, to avoid adding new dependencies. For example https://github.com/HubSpot/jinjava or http://jtwig.org/
This is a "nice to have" feature, so we should do it when we have more time.
Concerned: @di25cox, @gi26det, @svenk, @ga96nuv

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/117 Support material parameters

TODO:
* ~~Make Generic C/C++ kernels support parameters x {FV,ADERDG}~~
* ~~Make Generic fortran kernels support parameters.
( solutionUpdate is here the problem.)~~
* ~~In all kernels using the solver template argument switch to the constexpr for variables,parameters,and
order/basisSize.~~
* ~~MyElasticWaveSolver: Realised that we have to rethink our material approach
a little. Material parameters are not available from Q during the space-time predictor
computation which uses the space-time predictor or predictor DoF.
We need either:~~
* store the material values in solution(+previousSolution),
predictor,and space-time predictor
* ~~or pass the material values as separate argument
to flux,eigenvalues,ncp,matrixb etc.
We should then also pass them as separate arguments
to adjustedSolutionValues.~~
~~Follow up:
It might be wise then to split the material parameters from the variables.~~ We have chosen the first option.
* ~~The Nonlinear 2D ADER-DG C/C++ kernels do not support materials yet.~~
* ~~The Nonlinear 3D ADER-DG C/C++ kernels do not support materials yet.~~
* ~~The Linear 2D ADER-DG C/C++ kernels do not support materials yet.~~
* ~~ The Linear3D ADER-DG C/C++ kernels do not support materials yet.~~
* The Fortran ADER-DG kernels do not support materials yet.
* The 2D FV kernels do not support materials yet.
* The 3D FV kernels do not support materials yet.
* The Linear 2D+3D ADER-DG C/C++ kernels need to consider material parameters in the AMR operators.
* (Optional) Transform all generic kernels to templates and use constexpr for variables,parameters and order/basisSize.
Questions:
* What happens at the boundary for the Finite Volumes method? Currently, we hand no material parameters in at the boundary.
* Are there no time averaged source terms in the spaceTimePredictorLinear2d/3d? Yes, I consider them now at least in the striding.
* The linear ADER-DG kernels work with time derivatives. Does it make sense to let the user set the nodal values?
* Why wasn't tempPointForceSources merged into tempSpaceTimeUnknowns? The API change wasn't necessary.
Follow ups:
* Add a boolean to the AbstractMySolver class indicating if we consider Linear or Nonlinear kernels.
* Distinguish more clearly between temporary array sizes and dof array sizes.
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/110 Compression

## TODO
* ~~add "hint-size" to the relevant compression fields.~~
* support the Finite Volumes solver degrees of freedom

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/98 Scalability Tests (MPI,TBB,MPI+TBB)

This is my spreadsheet.
## Tests
Testing scalability with parameters
* dim=2,3,
* p=3,5,7,9
* h=???
## Systems
| system | processor | cores per processor | cores per node | memory per node|
|----------|--------------|-------------------------|-------------------|---------------------- |
| phi1 | 2 Intel Xeon E5-2650 2.0 GHz | 8 cores per processor | 16 cores per node | 64 GByte RAM |
### ADER-DG
Additional parameters:
* ts=fused,standard,
Runs:
* **TBB**
* Strong scaling
* [EulerFlow,2D,phi1,Gnu](/uploads/c4c454dc3c41fe7ee0295a07320db9c7/EulerFlow_TBB_Gnu.tar.gz)
* [Z4,3D,phi1,Gnu](/uploads/4bac6f1c44aa732c4c6c06116f540754/Z4_TBB_Gnu.tar.gz)
* Weak scaling
* **MPI**
* Strong scaling
* Weak scaling
* **MPI+TBB**
* Strong scaling
* Weak scaling
### Godunov FV
Additional parameters:
Comment: Fused time stepping results here in a single sweep scheme.
* ts=fused,standard
Runs:
* **TBB**
* Strong scaling
* Weak scaling
* **MPI**
* Strong scaling
* Weak scaling
* **MPI+TBB**
* Strong scaling
* Weak scaling
### Limiting ADER-DG (ADER-DG + FV)
Comment: Fused time stepping is not supported yet for the
limiting ADER-DG solver.
Additional parameters:
* ts=standard,fused
Runs:
* **TBB**
* Strong scaling
* [EulerFlow,Explosion(t=0.4),2D,phi1,Gnu](/uploads/27c9c32a34c46c75e7d918578972cc7b/EulerFlow_LimitingADERDG_TBB_Gnu.tar.gz)
* Weak scaling
* **MPI**
* Strong scaling
* Weak scaling
* **MPI+TBB**
* Strong scaling
* Weak scaling
## Setting up experiments on Hamilton (Durham's HPC)
Here I will document a (more or less) efficient approach to run the above experiments
on Hamilton.
### Installing the latest ExaHyPE and Peano snapshots
* **Installing ExaHyPE**: Hamilton has git version 1.7.1 installed. Unfortunately, I had login issues while I was trying to clone the ExaHyPE-Engine repository.
I am thus required to perform a two-stage process to get ExaHyPE onto my login node.
1. Check ExaHyPE out on my laptop.
2. Optional: scp to mira. Login to mira.
3. scp to hamilton.
* **Installing Peano**: Hamilton has svn version 1.6.11 installed.
* Obtaining Peano is very convenient. It is done via:
``svn checkout http://svn.code.sf.net/p/peano/code/trunk peano``
### Building ExaHyPE
...
### Job scripts
We have currently three different job script "frameworks" available in the ExaHyPE-Engine
repository.
While Dominic's job scripts are only suited for TBB on Hamilton and SuperMUC,
Tobias' job script suite is suited to perform MPI and MPI+TBB benchmarks on Hamilton (DUR).
Vasco's python based job script framework can basically run anything on SuperMUC.
I will try to merge the three approaches into a python version with different backends based on the
supercomputer.
* **Tobias'** bash job scripts can be found at ``~/dev/codes/c/ExaHyPE/ExaHyPE-Engine/Miscellaneous/JobScripts/hamilton``.
* **Dominic's** bash job scripts can be found at ``~/dev/codes/c/ExaHyPE/ExaHyPE-Engine/Miscellaneous/JobScripts/scaling``.
* **Vasco's** python job scripts can be found at ``~/dev/codes/c/ExaHyPE/ExaHyPE-Engine/Miscellaneous/JobScripts/``.
### Performance analysis scripts
* **Analysing domain decomposition**: ...
* **Plotting speedups**: ...
### Benchmark March 2017
* Dominic's MPI and MPI+TBB job scripts for Hamilton (SLURM) and SuperMUC (LoadLeveler): [Benchmark_Mar2017.tar.gz](/uploads/1966ff2a9b7d28d12037444fca09e213/Benchmark_Mar2017.tar.gz)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/137 The GRMHD, SRMHD, and EulerFlow error plotters seem to not work correctly (The ADERDG correctness is broken (GRMHD AccretionDisk 2D/3D))

The errors in the AccretionDisk3D, GRMHD application, looked very good (`max mesh ref = 0.5` on a domain `0.0 .. 2.0` in x,y,z) one week ago:
```
sven@nils:~/numrel/exahype/Engine-ExaHyPE/ApplicationExamples/GRMHD$ cat output/error-rho.asc
plotindex time l1norm l2norm max min avg
1 0.000000e+00 5.669120e-12 7.324244e-11 3.750836e-09 1.676159e-13 8.555365e-13
2 1.333333e-02 1.852645e-06 2.236011e-06 3.977681e-05 2.178258e-13 2.698931e-07
3 2.000000e-02 2.860199e-06 3.615806e-06 6.110038e-05 2.300382e-13 4.156812e-07
4 2.666667e-02 3.469631e-06 4.545670e-06 7.338015e-05 2.300382e-13 5.023163e-07
5 3.333333e-02 3.877169e-06 5.212832e-06 8.077556e-05 2.300382e-13 5.578673e-07
6 4.000000e-02 4.180962e-06 5.713450e-06 8.532972e-05 2.300382e-13 5.966567e-07
7 4.666667e-02 4.433994e-06 6.101790e-06 8.813653e-05 2.300382e-13 6.268654e-07
8 5.333333e-02 4.662388e-06 6.411162e-06 8.981804e-05 2.300382e-13 6.524981e-07
9 6.000000e-02 4.880226e-06 6.663393e-06 9.074876e-05 2.300382e-13 6.756477e-07
```
The solution was stationary and the errors should always stay in the same order of magnitude. They did.
Now, we experience something different *both in 2D and 3D ADERDG* and probably also in the FV scheme (but this could have another source).
This ticket shall track these problems.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/160 Support a Tecplot-compatible output format

This is really low priority, but just so we don't forget about it:
TecPlot360 is a proprietary but amazing visualization software which feels (in my hands) much better than VisIt or ParaView. However, it does not (even) load the smallest common denominator, VTK. Instead, a number of file formats are supported in-house:
```
• CGNS Loader
• DEM Loader
• DXF Loader
• EnSight Loader
• Excel Loader
• FEA Loader
• FLOW-3D Loader
• FLUENT Loader
• General Text Loader
• HDF Loader
• HDF5 Loader
• Kiva Loader
• PLOT3D Loader
• PLY Loader
• Tecplot-Format Loader
• Text Spreadsheet Loader
```
See also: http://home.ustc.edu.cn/~cbq/360_data_format_guide.pdf with meaningful pictures as this one:
![beautiful-tecplot-nodal-values](/uploads/c4ea3adb5281c5c2f713199792c50f53/beautiful-tecplot-nodal-values.png)
Unlike Trento's code, we don't want to implement a writer that writes only the Tecplot-specific format; instead, we want to use some of these widespread formats, for instance "FLUENT" or so.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/154 Min and max is not send to master/worker

* Currently, the min and max are not sent between master and worker ranks. This issue only affects certain AMR+MPI builds
where a cell is at a master/worker boundary but at the same time at a remote neighbour boundary.
* Furthermore, I have to discuss the whole forking process performed by Peano with Tobias.