ExaHyPE issues
https://gitlab.lrz.de/groups/exahype/-/issues
2017-10-04T17:11:38+02:00

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/179
Split TimeStepSizeComputation and merge parts with SolutionUpdate and LocalRecomputation
2017-10-04T17:11:38+02:00
Ghost User

- TimeStepSizeComputation will be reduced to a simple
time step size computation function.
SolutionUpdate and LocalRecomputation will also
compute a time step size and will further advance in time.
This will reduce logic.
- Next step will be a fusion of multiple algorithmic phases of the
ADER-DG and Limiting ADER-DG schemes in a single solver function.
This will hopefully make it easier for the compiler to optimise
and easier for the processor cache to decide what to hold or drop.
- We might be able to get rid of mapping FusedTimeSteppingInitialisation
with the above split.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/176
Bug in optimised kernel generation for limiting ader-dg solver
2017-09-06T13:58:22+02:00
Ghost User

The include file is wrong. It is <MySolver>_ADERDG.h for the Limiting-ADER-DG solver.
```
/ddn/home/jdmd33/dev/ExaHyPE-Engine/./Benchmarks/hamilton/Euler/kernels/EulerSolver/stableTimeStepSize.cpp(7): catastrophic error: cannot open source file "EulerSolver.h"
#include "EulerSolver.h"
^
```
Jean-Matthieu Gallard

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/174
Generic kernels: Select fluxes,ncp similar as for optimised kernels?
2017-10-04T16:06:01+02:00
Ghost User

This issue is open for discussion.
Issue
-----
Working with new ExaHyPE users has revealed that they are often not familiar
with the object-oriented programming concept of inheritance.
In particular, they do not know how to override virtual functions
of the AbstractMySolver class.
They are also not familiar with the keywords "virtual" and "override".
This makes it difficult for them to select the
right PDE kernels (flux, ncp, ...) for their application.
Even worse: the code might compile and run, but it will not perform
the expected calculations.
Such an error is very hard to detect in practice,
especially for a new ExaHyPE user.
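
The failure mode described above can be reproduced in a few lines. The following is a minimal sketch, not ExaHyPE code: the class below is a hypothetical stand-in for a generated AbstractMySolver, and the single-unknown `flux` signature is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for a toolkit-generated base class; the real
// AbstractMySolver has a different, larger interface.
struct AbstractMySolver {
  virtual ~AbstractMySolver() = default;
  // Default kernel: no flux contribution.
  virtual void flux(const double* /*Q*/, double* F) { F[0] = 0.0; }
};

// The signature differs from the base (missing const), so this declares a
// new, unrelated function. It compiles, but calls made through the base
// class never reach it.
struct SilentlyWrongSolver : AbstractMySolver {
  void flux(double* Q, double* F) { F[0] = 2.0 * Q[0]; }
};

// Matching signature plus "override": the compiler verifies the override.
struct CorrectSolver : AbstractMySolver {
  void flux(const double* Q, double* F) override { F[0] = 2.0 * Q[0]; }
};

// Calls the kernel the way an engine would: through the base class.
double fluxThroughBase(AbstractMySolver& solver, double q) {
  double F[1] = {-1.0};
  solver.flux(&q, F);
  assert(F[0] == F[0]);  // the kernel result must not be NaN
  return F[0];
}
```

For `q = 3.0`, `fluxThroughBase` yields `0.0` for `SilentlyWrongSolver` (the base default runs) and `6.0` for `CorrectSolver`. Adding `override` to the mistyped function would turn the silent error into a compile error.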
Toolkit-based solution (open for discussion)
--------------------------------------------
JM has moved the selection of the
PDE kernels to the toolkit by requiring the
user to specify the kernels in the following way:
```
kernels const = optimised::fluxes::nonlinear // flux only
```
or
```
kernels const = optimised::fluxes::ncp::nonlinear // flux and ncp
```
or
```
kernels const = optimised::fluxes::ncp::source::nonlinear // flux and ncp, source
```
In my opinion, this is the better approach.
The "const" modifier of "kernels" indicates that the user has to
rerun the toolkit every time they select different PDE kernels.
The toolkit will then update the AbstractSolver header file.
The compiler will deal with any inconsistencies between
the files:
* The compiler will tell you if you have not implemented a
kernel you have specified. Right now, this is only caught by
an assertion.
(Users often do not even know about the Asserts mode.)
* The compiler will tell you if you have implemented
a method which is not called. You should then
comment out the implementation or remove it.
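
A minimal sketch of how the two compiler checks listed above could work, assuming the toolkit regenerates the abstract header with pure virtual functions for exactly the selected kernels and the user marks implementations with `override`. The class layout and the single-unknown signatures are hypothetical, not the toolkit's actual output.

```cpp
#include <cassert>

// Hypothetical sketch of a header the toolkit could regenerate for
// "kernels const = optimised::fluxes::nonlinear" (flux only).
class AbstractMySolver {
public:
  virtual ~AbstractMySolver() = default;
  // Selected kernel: pure virtual, so a solver that does not implement it
  // cannot be instantiated (compile-time error, not a runtime assertion).
  virtual void flux(const double* Q, double* F) = 0;
  // No nonConservativeProduct(...) is declared because ncp was not selected.
};

class MySolver : public AbstractMySolver {
public:
  void flux(const double* Q, double* F) override { F[0] = Q[0] * Q[0]; }

  // Uncommenting the next line makes compilation fail, because "override"
  // finds no matching virtual function in the regenerated base class:
  // void nonConservativeProduct(const double* Q, double* BgradQ) override {}
};

// Evaluates the selected flux kernel for a single scalar unknown.
double evaluateFlux(AbstractMySolver& solver, double q) {
  double F[1] = {0.0};
  solver.flux(&q, F);
  assert(F[0] == F[0]);  // the kernel result must not be NaN
  return F[0];
}
```

Note that the second check only fires if the user-side implementation carries `override`; without it, a stale method would again compile silently.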
What is your opinion?
---------------------
Please comment below.
Jean-Matthieu Gallard

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/171
Bug in LimitingADERDGSolver MPI implementation
2018-03-20T16:14:23+01:00
Ghost User

Min and max are not sent correctly to the neighbour if Heap neighbour communication is
configured as non-blocking (CreateCopiesOfSentData=false).

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/170
Clear out the private member variables in the solvers
2017-11-21T13:40:26+01:00
Ghost User

Many of the private member variables in the ADERDGSolver, FiniteVolumesSolver, and
LimitingADERDGSolver can be computed. This includes cardinalities.
Only store the order of approximation, number of variables and number of parameters.
Compute all other cardinalities; provide getter functions.
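
A minimal sketch of the idea, assuming a tensor-product nodal basis with `order + 1` nodes per coordinate axis. The class name, the getter names, and the `dimensions` constructor argument are hypothetical; they only loosely follow the "getUnknowns..." and "getData.." naming used in this issue.

```cpp
#include <cassert>

// Hypothetical sketch: only the basic quantities are stored, every other
// cardinality is computed on demand by a getter.
class SolverCardinalities {
public:
  SolverCardinalities(int order, int numberOfVariables, int numberOfParameters, int dimensions)
      : _order(order), _numberOfVariables(numberOfVariables),
        _numberOfParameters(numberOfParameters), _dimensions(dimensions) {
    assert(order >= 0 && numberOfVariables > 0 && numberOfParameters >= 0);
    assert(dimensions == 2 || dimensions == 3);
  }

  int getNodesPerCoordinateAxis() const { return _order + 1; }
  int getNodesPerCell() const { return power(getNodesPerCoordinateAxis(), _dimensions); }
  int getNodesPerFace() const { return power(getNodesPerCoordinateAxis(), _dimensions - 1); }
  // Unknowns count the evolved variables only.
  int getUnknownsPerCell() const { return _numberOfVariables * getNodesPerCell(); }
  int getUnknownsPerFace() const { return _numberOfVariables * getNodesPerFace(); }
  // Data counts variables plus material parameters.
  int getDataPerCell() const { return (_numberOfVariables + _numberOfParameters) * getNodesPerCell(); }

private:
  // Integer power for small, non-negative exponents.
  static int power(int base, int exponent) {
    int result = 1;
    for (int i = 0; i < exponent; ++i) result *= base;
    return result;
  }
  const int _order, _numberOfVariables, _numberOfParameters, _dimensions;
};
```

For order 5 with 5 variables and 2 parameters in 2D, this gives 36 nodes, 180 unknowns and 252 data entries per cell.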
- It would be better if the optimised solver dropped the "getBnd.." functions and
overrode the existing, now virtual, "getUnknowns..." and "getData.." functions.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/168
Metadata-merging currently not done properly
2017-08-14T12:53:04+02:00
Ghost User

My latest changes introduced a bug into codes using AMR or LimitingADERDGSolver:
There is only a merge of local metadata performed in the first iteration.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/167
Grid cells are not erased properly
2017-08-16T20:03:29+02:00
Ghost User

... only the cell description is deleted. An empty cell remains in the grid.
Furthermore, we do not erase all cells and cell descriptions at the end of the simulation.
We rely on the OS's process manager to free the application memory.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/166
Peano Heaps and Intel MPI do not work properly together
2017-08-25T11:26:38+02:00
Ghost User

(Open MPI seems to work.)
Single-node MPI tests show:
* MPI is currently not working correctly.
* TBB seems to work correctly.
It's always fun to debug MPI...
# 1 MPI+TBB
## 1.1
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-TBB-Intel-n1-t1-c24.out
8.68453 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
8.68455 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
8.68459 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
9.77685 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
9.77688 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
9.7769 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
10.7874 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000540765
10.7875 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269703
10.7875 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
11.8094 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000810468
11.8094 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269363
11.8095 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
12.8234 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.00107983
12.8234 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269194
12.8235 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
13.8676 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 5 t_min =0.00134902
13.8676 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269109
13.8676 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
14.8835 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 6 t_min =0.00161813
14.8835 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269066
14.8836 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
15.8789 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 7 t_min =0.0018872
15.879 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269045
15.879 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
16.9168 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 8 t_min =0.00215624
16.9168 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269034
16.9168 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
17.9286 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 9 t_min =0.00242528
17.9286 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269029
17.9286 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
18.9628 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 10 t_min =0.00269431
18.9629 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269026
18.9629 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
19.9945 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 11 t_min =0.00296333
19.9945 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269025
19.9955 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
21.0106 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 12 t_min =0.00323236
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269024
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
21.0107 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 0
```
## 1.2
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-TBB-Intel-n1-t24-c1.out
5.84491 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::createGrid(Repository) finished grid setup after 17 iterations
6.1414 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runAsMaster(...) initialised all data and computed first time step size
8.1216 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runAsMaster(...) plotted initial solution (if specified) and computed first predictor
8.12165 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
8.12167 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
8.12171 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
11.0395 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
13.2524 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
13.2525 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000251415 !!! DIFFERENCE !!!
13.2525 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
16.445 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
18.9004 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000521798
18.9005 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000242749
18.9005 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
22.8658 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
25.8135 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000764547
25.8136 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000235312
25.8136 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
30.957 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.000999859
30.9571 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000235312
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 13 t_min =0.00315762
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000242922
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
156.472 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 6
```
## 1.3
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-TBB-Intel-n1-t4-c6.out
16.3129 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
16.3129 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
16.313 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
25.2424 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
31.3638 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
31.3638 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000255681 !!! DIFFERENCE !!!
31.3639 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
40.3852 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000526063
40.3853 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000255681
40.3853 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
49.7392 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
57.1134 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000781745
57.1135 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000248662
57.1135 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
67.3553 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.00103041
67.3554 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000248662
...
...
...
175.134 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 12 t_min =0.00311021
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.00026858
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
193.899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 2
```
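
The logs above are compared by looking for the first step where `dt_min` differs from the reference run. A minimal sketch of that comparison follows; the function name and the relative tolerance are assumptions, and the `dt_min` values have to be extracted from the log files by hand or by a separate script.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the index of the first time step whose dt_min differs between two
// runs by more than relativeTolerance, or -1 if the common prefix agrees.
int firstDivergingStep(const std::vector<double>& reference,
                       const std::vector<double>& candidate,
                       double relativeTolerance = 1e-6) {
  assert(relativeTolerance >= 0.0);
  const std::size_t n =
      reference.size() < candidate.size() ? reference.size() : candidate.size();
  for (std::size_t i = 0; i < n; ++i) {
    const double scale = std::fabs(reference[i]);
    if (std::fabs(reference[i] - candidate[i]) > relativeTolerance * scale) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

With the first three `dt_min` values of run 1.1 (`0.000270382, 0.000270382, 0.000269703`) as reference and those of run 1.2 (`0.000270382, 0.000251415, 0.000242749`) as candidate, the function returns 1, which matches the line marked `!!! DIFFERENCE !!!`.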
# 2 MPI+None
## 2.1
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-None-Intel-n1-t1-c24.out
17.7097 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
17.7097 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
17.7098 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
28.1517 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
28.1518 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000270382
28.1518 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
38.5223 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000540765
38.5223 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269703
38.5223 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
48.7899 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000810468
48.79 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269363
48.79 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
59.1023 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.00107983
59.1023 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269194
...
...
...
130.907 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 12 t_min =0.00323236
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000269024
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
141.16 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 0
```
## 2.2
```
Euler_ADERDG-no-output-gen-fused-regular-0-p5-None-Intel-n1-t24-c1.out
14.0067 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
16.5843 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) recompute space-time predictor
19.2036 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 1 t_min =0.000270382
19.2036 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000212189
19.2037 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
21.8836 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 2 t_min =0.000482572
21.8836 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000212189
21.8836 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
25.2896 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 3 t_min =0.000694761
25.2896 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000223607
25.2897 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
29.067 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 4 t_min =0.000918368
29.0671 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000229512
...
...
...
109.009 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::runOneTimeStepWithFusedAlgorithmicSteps(...) run 1 iterations with fused algorithmic steps
125.061 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 13 t_min =0.00303507
125.061 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =0.000243652
125.061 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of mesh refinements = 0
125.062 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of local recomputations = 0
125.062 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of global recomputations = 0
125.062 [cn7063.hpc.dur.ac.uk],rank:0 info exahype::runners::Runner::printStatistics(...) number of predictor reruns = 1
```

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/162
Combine adapters SolutionUpdate and TimeStepSizeComputation
2017-08-16T13:06:55+02:00
Ghost User

...

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/161
AMR+Limiting leads to dispersion errors
2017-07-13T12:35:21+02:00
Ghost User

![amr_limiter_error](/uploads/4f1591cfd8d0db8a2fbc76675deec3a6/amr_limiter_error.png)
* The shock is not at the correct position.
* The problem appears for both the Godunov and MUSCL-Hancock methods.
* The problem affects MUSCL-Hancock more.

The problem is not solved yet.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/160
Support a Tecplot-compatible output format
2018-06-15T15:13:42+02:00
Ghost User

This is really low priority, but noted here so that it is not forgotten:
TecPlot360 is a proprietary but amazing visualization software which feels (in my hands) much better than VisIt or ParaView. However, it does not (even) load the smallest common denominator, VTK. Instead, a number of file formats are supported in-house:
```
• CGNS Loader
• DEM Loader
• DXF Loader
• EnSight Loader
• Excel Loader
• FEA Loader
• FLOW-3D Loader
• FLUENT Loader
• General Text Loader
• HDF Loader
• HDF5 Loader
• Kiva Loader
• PLOT3D Loader
• PLY Loader
• Tecplot-Format Loader
• Text Spreadsheet Loader
```
See also: http://home.ustc.edu.cn/~cbq/360_data_format_guide.pdf with meaningful pictures such as this one:
![beautiful-tecplot-nodal-values](/uploads/c4ea3adb5281c5c2f713199792c50f53/beautiful-tecplot-nodal-values.png)
Unlike Trento's code, we do not want to implement a writer which only writes the Tecplot-specific format, but instead use one of these widespread formats, for instance "FLUENT".

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/159
DMP is illegally called for ordinary ADERDG-Solver
2017-11-02T18:15:15+01:00
Ghost User

These are actually two problems:
1. For an ordinary ADERDG-Solver (not Limiter) where the variable `dmp-observables` is not given, the default value is **not** `0` but an arbitrary number (NaN, max, or whatever an uninitialised int holds). This is very bad, but I can fix it in the Parser.
2. The method `mapDiscreteMaximumPrincipleObservables` in the abstract ADERDG solver is called. This should not happen at all! Why does it happen?
![dmp](/uploads/fd5837af72fe6f92e871426712399023/dmp.png)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/158
AMR+LimitingADERDGSolver crashes for certain limiter status changes
2017-11-02T18:15:15+01:00
Ghost User

# 1. Issue:
There is another issue with the MUSCL-Hancock solver which crashes in the min and max
determination after a global recomputation.
```
147.062 info exahype::runners::Runner::updateMeshFusedTimeStepping(...) recompute solution locally (if applicable) and compute new time step size
assertion in file /home/dominic/dev/codes/c/ExaHyPE/ExaHyPE-Engine/./ExaHyPE/exahype/solvers/LimitingADERDGSolver.cpp, line 1111 failed: *(observablesMin+i)<std::numeric_limits<double>::max()
parameter i: 1
parameter solverPatch.toString(): (solverNumber:0,neighbourMergePerformed:[1,1,1,1],isInside:[1,1,0,1],parentIndex:69,isAugmented:0,newlyCreated:0,type:Cell,refinementEvent:None,level:5,offset:[0.518519,0],size:[0.0123457,0.0123457],previousCorrectorTimeStamp:1.79769e+308,previousCorrectorTimeStepSize:1.79769e+308,correctorTimeStepSize:0.000497671,correctorTimeStamp:0.0909671,predictorTimeStepSize:0,predictorTimeStamp:0.0914648,solution:2838,solutionAverages:2843,solutionCompressed:-1,previousSolution:2839,previousSolutionAverages:2841,previousSolutionCompressed:-1,update:2840,updateAverages:2842,updateCompressed:-1,extrapolatedPredictor:2844,extrapolatedPredictorAverages:2846,extrapolatedPredictorCompressed:-1,fluctuation:2845,fluctuationAverages:2847,fluctuationCompressed:-1,solutionMin:2848,solutionMax:2849,facewiseAugmentationStatus:[0,0,0,0],augmentationStatus:0,facewiseHelperStatus:[2,2,2,2],helperStatus:2,facewiseLimiterStatus:[0,0,0,0],limiterStatus:0,previousLimiterStatus:3,iterationsToCureTroubledCell:10,compressionState:Uncompressed,bytesPerDoFInPreviousSolution:305,bytesPerDoFInSolution:-296662368,bytesPerDoFInUpdate:32765,bytesPerDoFInExtrapolatedPredictor:0,bytesPerDoFInFluctuation:0)
ExaHyPE-Euler: /home/dominic/dev/codes/c/ExaHyPE/ExaHyPE-Engine/./ExaHyPE/exahype/solvers/LimitingADERDGSolver.cpp:1111: void exahype::solvers::LimitingADERDGSolver::determineSolverMinAndMax(exahype::solvers::LimitingADERDGSolver::SolverPatch&): Assertion `false' failed.
```
I have to gather more information about this first.
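
To illustrate what the failed assertion checks (not what causes the crash, which is still open): the patch dump prints `1.79769e+308` for unset time stamps, which is the value of `std::numeric_limits<double>::max()`. The sketch below assumes the observable minimum starts from the same sentinel; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <limits>
#include <vector>

struct MinMax {
  double min;
  double max;
};

// Reduces one observable over the degrees of freedom of a patch.
// The sentinel initialisation is an assumption made for this sketch.
MinMax computeMinAndMax(const std::vector<double>& observable) {
  MinMax result{std::numeric_limits<double>::max(),
                -std::numeric_limits<double>::max()};
  for (double value : observable) {
    if (value < result.min) result.min = value;
    if (value > result.max) result.max = value;
  }
  assert(observable.empty() || result.min <= result.max);
  return result;
}

// Mirrors the failed check: true only if a minimum was actually written.
bool minimumWasDetermined(const MinMax& m) {
  return m.min < std::numeric_limits<double>::max();
}
```

Under this assumption, a patch whose observables were never reduced (for example an empty or not yet reinitialised patch) keeps the sentinel and would trip the check.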
# 2. Issue
Cell of type 3 or 4 (FV->DG) changes to Troubled.
Neighbour is of Type 1 or 2 (DG->FV).
Example:
Before:
![before](/uploads/0a30bada69ece5db04146246ff72c5c1/before.png)
After:
![after](/uploads/fcf89aaaa40453a52ead9622bd283033/after.png)
* Fix 1: Need to stop iterations if the situation is detected in neighbour merging
-> write stable values, i.e. own values, to the ghost layer
-> perform rollback in affected cells (=> irregular
limiter domain change; requires local recomputation)
* Fix 2: Use less dissipative FV methods or higher-order
ADERDG => finer subcell resolution.
Or set the parameter for the steps to cure troubled cells to a higher value.
# 3. Found bugs:
* The whole global recomputation thing is more sophisticated than previously thought. I
have to be careful how I go back in time. This is based on the previous limiter status.
I am only allowed to delete patches after the recomputation.
# 4. Ways to increase stability
* Do not change Local Recomputation to Global Recomputation if limiter based mesh refinement is also necessary.
# 5. Algorithms:
(Stuff above is outdated. Keep it for reference.)
## Local Recomputation
1. limiter status spreading
2. local reinitialisation
3. local recomputation + local predictor computation
## Global Recomputation
1. limiter status spreading
2. global rollback (keep new limiter status) <-ensures we adjust the previous solution during mesh refinement
3. mesh refinement according to new limiter status
4. overwrite new limiter status with previous values
5. recompute time step size
6. reinitialise fused time stepping and recompute predictor
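
Steps 2 to 4 of the global recomputation can be sketched on a heavily reduced patch. This is only an illustration of the order in which the limiter status is kept, used, and restored: `limiterStatus` and `previousLimiterStatus` follow the patch dump in section 1, while the scalar solution fields and the threshold-based refinement criterion are assumptions.

```cpp
#include <cassert>

// Hypothetical, heavily reduced patch.
struct Patch {
  double solution;
  double previousSolution;
  int limiterStatus;
  int previousLimiterStatus;
  bool refined;
};

// Step 2: roll back the solution but keep the new limiter status.
void globalRollback(Patch& p) { p.solution = p.previousSolution; }

// Step 3: refine where the new limiter status reaches a threshold.
// The threshold criterion is an assumption made for this sketch.
void refineAccordingToLimiterStatus(Patch& p, int threshold) {
  assert(threshold > 0);
  p.refined = p.limiterStatus >= threshold;
}

// Step 4: overwrite the new limiter status with the previous value.
void restorePreviousLimiterStatus(Patch& p) {
  p.limiterStatus = p.previousLimiterStatus;
}

// Runs steps 2 to 4 in the order listed above.
void rollbackRefineRestore(Patch& p, int threshold) {
  globalRollback(p);
  refineAccordingToLimiterStatus(p, threshold);
  restorePreviousLimiterStatus(p);
}
```

The order matters: restoring the previous limiter status before the refinement step would make the mesh refinement ignore the newly detected troubled cells.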
Problems:
* TimeStepSizeComputation updates the time stamps (solved)
* Need an additional adapter for the global rollback (global reinitialisation)
* I am not allowed to deallocate limiter patches during limiter status spreading
* I must remember to overwrite the limiter status when finalising the mesh refinement.
## Mesh Refinement
1. mesh refinement according to the refinement criterion
2. recompute time step size
3. reinitialise fused time stepping and recompute predictor

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/157
Plotter variables are not constants
2018-09-10T19:41:33+02:00
Ghost User

Actually, the ExaHyPE runtime treats the number of writtenUnknowns per plotter as a runtime variable, i.e. in
```
plot hdf5::flash ConservedWriter
variables = 19
time = 0.0
repeat = 0.00001
output = ./hdf5-flash/conserved
end plot
```
one can change `variables = x` without recompiling. However, the toolkit wants this to be a constant:
```
plot hdf5::flash ConservedWriter
variables const = 19
time = 0.0
repeat = 0.00001
output = ./hdf5-flash/conserved
end plot
```
otherwise it says `ERROR: eu.exahype.parser.ParserException: [71,17] expecting: 'const'`
This should not happen, i.e. the toolkit grammar should accept the line without `const`.
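As a rough illustration of what "accept without `const`" could mean, here is a minimal sketch of a line parser that treats the `const` token as optional. The function `parsePlotterVariables` is hypothetical and is not part of the toolkit grammar or of `Parser.cpp`; it also assumes that the tokens are separated by whitespace.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the ExaHyPE toolkit or Parser.cpp.
// Reads a line such as "variables = 19" or "variables const = 19" and
// returns the integer value. The "const" token is accepted but not required.
// Assumption: tokens are separated by whitespace.
std::optional<int> parsePlotterVariables(const std::string& line) {
  std::istringstream in(line);
  std::string key, token;
  if (!(in >> key) || key != "variables") return std::nullopt;
  if (!(in >> token)) return std::nullopt;
  // Skip the optional "const" qualifier and read the next token instead.
  if (token == "const" && !(in >> token)) return std::nullopt;
  if (token != "=") return std::nullopt;
  int value = 0;
  if (!(in >> value)) return std::nullopt;
  return value;
}
```

With such a rule, both spellings of the plotter line yield the same value, and the runtime can keep treating the number as a runtime variable.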
As always, there is presumably no case where somebody wants to do this **except benchmarking plotter file formats, which is exactly what I'm doing now**. The code typically generated by the toolkit is not aware of a non-constexpr number of variables, but all the `ExaHyPE/exahype/plotters/` code actually treats the number as a runtime constant, and I see no reason to artificially introduce something constexpr here.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/156
LimitingADERDGSolver currently crashing with TBB and AMR
2017-11-02T18:15:15+01:00
Ghost User

* ~~Uniform grids: LimitingADERDGSolver fails assertion if TBB is switched on.
Probably something with the status spreading with TBB.~~
Seems to be solved now.
* Adaptive grids: LimitingADERDGSolver mesh update iterates forever
* Adaptive grids: LimitingADERDGSolver fails assertion.
Probably something because of the new domain boundary treatment.
(Not sure if this is still an issue.)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/155
Bounding Box Scaling (virtually-expand-domain) Does Not Work
2017-11-02T18:15:15+01:00
Ghost User

Virtually expanding the bounding box around the computational domain
is a Peano trick to shut off neighbour communication with the global master (rank 0).
Virtually expanding the bounding box currently leads to problems in ExaHyPE:
* It enables scenarios where a coarse grid vertex is on the boundary or outside of the domain,
but a fine grid vertex (and its h-environment) is inside of the domain.
* It confuses the isFaceInside function in class Cell.
# Virtually Expanding the Domain
Some background on the virtually expand domain flag:
* Peano only considers inside and boundary vertices for the (MPI) neighbour
merging. Outside vertices are ignored for this purpose.
* Since rank 1 is placed into the centre of the 3^d child cells belonging to rank 0,
it will perform neighbour merging with rank 0 as long as those vertices are
either inside or directly at the boundary of the domain.
* Switching off neighbour merging directly at the domain boundary (``vertex.isBoundary()``) does not
make sense. The reason is that refinement at the boundary will introduce hanging nodes.
Boundary nodes should however be persistent. (Tobias' reasoning. I have to ask further why this is bad.)
* Virtually expanding the domain places the nodes located at the remote boundary to
rank 0 outside of the domain.
* From the above points, it is clear that virtually expanding the bounding box is mandatory for reasonable MPI
scalability, especially in 3d. This is exactly what we have observed in our 3D MPI experiments.
* This will be a little inconvenient for people who prescribe initial conditions at certain boundaries by means of (x,t).
(Seismic people know about this. Leonhard knows how to deal with it.)
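
To make the geometry a bit more concrete, here is a minimal sketch under a strong assumption of mine that is not taken from Peano's code: along one coordinate axis the bounding box is covered by 3^level equally sized cells, and exactly one layer of cells at each end lies outside of the domain. The names `Interval` and `insideDomain` are hypothetical.

```cpp
// Hypothetical helper, not part of ExaHyPE or Peano.
struct Interval {
  double lower;
  double upper;
};

// Assumption: along one coordinate axis the bounding box
// [offset, offset + width] is covered by 3^level equally sized cells,
// and exactly one layer of cells at each end is outside of the domain.
// Returns the interval covered by the remaining inside cells.
Interval insideDomain(double offset, double width, int level) {
  double cells = 1.0;
  for (int i = 0; i < level; ++i) cells *= 3.0;  // 3^level cells per axis
  const double h = width / cells;                // mesh size on this level
  return Interval{offset + h, offset + width - h};
}
```

For a box starting at 0 with width 9 on level 2, this gives nine cells of size 1 and an inside interval from 1 to 8. Users who prescribe conditions at fixed coordinates would need this interval.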
# Remarks
* Virtually expanding the bounding box usually leads to a shrinking of the actual computational domain since
only inside cells are considered as within the computational domain.
ExaHyPE should thus tell the user what the shrunk domain looks like.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/154
Min and max is not send to master/worker
2018-06-15T15:13:43+02:00
Ghost User

* Currently, the min and max are not sent between master and worker ranks. This issue only affects certain AMR+MPI builds
where a cell is at a master/worker boundary but at the same time at a remote neighbour boundary.
* Furthermore, I have to discuss the whole forking process performed by Peano with Tobias.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/153
Constants in specfiles
2018-09-10T19:42:06+02:00
Ghost User

I have too many constants; I doubt that a specfile syntax like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = initial_data:tovstar,boundary_x_lower:reflective,boundary_y_lower:reflective,boundary_z_upper:outgoing,tovstar-mass:1.234,tovstar-rl-ratio:2.345
```
would make sense. Tobias thinks that in such a case a user would do something like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = configuration-file:foobar.txt
```
and outsource the configuration to a language the user likes. However, this breaks with the idea of a single specfile for a single run. Therefore, I see two options:
## Add a DATA section after the specfile
This is what many scripting languages do, for instance Perl ([the DATA syntax in perl](https://stackoverflow.com/questions/13463509/the-data-syntax-in-perl)). The idea is just that the parsers ignore whatever comes after the end of the specfile (i.e. after the line containing `end exahype-project`). Users could dump any content there in their favourite language. I would vote for this as it is super easy to implement and allows file concatenation and flexibility.
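
A minimal sketch of how such a split could look, assuming the whole file has already been read into a string. The function `splitSpecfile` is hypothetical and only illustrates the idea; it is not how the toolkit or `Parser.cpp` currently work.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: splits a specfile into the part the parser should
// read (first) and the free-form data that follows the line containing
// "end exahype-project" (second). If the marker is missing, or nothing
// follows its line, the data part is empty.
std::pair<std::string, std::string> splitSpecfile(const std::string& text) {
  const std::string marker = "end exahype-project";
  const std::size_t pos = text.find(marker);
  if (pos == std::string::npos) return {text, ""};
  const std::size_t lineEnd = text.find('\n', pos);
  if (lineEnd == std::string::npos) return {text, ""};
  return {text.substr(0, lineEnd + 1), text.substr(lineEnd + 1)};
}
```

The parsers would only ever see the first part, while user code is free to interpret the second part in whatever format it likes.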
## Allow user constant section
We could also just allow users to do something like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = parameters:appended
limiter-kernel const = generic::musclhancock
limiter-language const = C
dmp-observables = 2
dmp-relaxation-parameter = 1e-2
dmp-difference-scaling = 1e-3
steps-till-cured = 0
simulation-parameters
foo = bar
baz = bar
blo = bar
blu = bar
etc.
end simulation-parameters
plot vtk::Cartesian::vertices::limited::binary ConservedWriter
variables const = 19
time = 0.0
repeat = 0.00166667
output = ./vtk-output/conserved
end plot
...
```
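
Reading the `simulation-parameters` block from the example above could be sketched as follows. This is only an illustration under the assumption that each entry has the form `key = value` with whitespace around the `=`; the function `readSimulationParameters` is hypothetical and is not the actual `Parser.cpp` code.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not the actual Parser.cpp implementation.
// Collects "key = value" lines between "simulation-parameters" and
// "end simulation-parameters". All other lines are ignored.
std::map<std::string, std::string> readSimulationParameters(const std::string& text) {
  std::map<std::string, std::string> parameters;
  std::istringstream in(text);
  bool inside = false;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream tokens(line);
    std::string key, equals, value;
    tokens >> key;
    if (key == "simulation-parameters") { inside = true; continue; }
    if (key == "end") {
      std::string section;
      tokens >> section;
      if (section == "simulation-parameters") inside = false;
      continue;
    }
    // Only well-formed "key = value" lines inside the section are stored.
    if (inside && (tokens >> equals >> value) && equals == "=") {
      parameters[key] = value;
    }
  }
  return parameters;
}
```

Lines that do not match the `key = value` pattern, such as the `etc.` placeholder in the example, are skipped silently in this sketch.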
Such a section would go well with the specfile syntax. To implement it, we need
* Such a section with any key-value pairs added to the grammar, so the toolkit does not complain
* Support in the `Parser.cpp` (which is not too hard to add)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/152
Dynamic plotter registration
2019-03-07T19:39:00+01:00
Ghost User

Users shall be able to add, comment out or remove plotters at runtime without recompiling. The plotter ordering should not be fixed.
This is not too hard to achieve; it just requires changes in `KernelCalls.cpp`, namely a generated plotter registration function (semi-pseudocode):
```c++
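// Semi-pseudocode: map a plotter name from the specfile to a writer instance.
// RegisteredSolvers holds pointers, so the solver is passed as a pointer here.
Writer* kernels::getNamedWriter(const std::string& name, exahype::solvers::Solver* solver) {
  if (name == "ConservedWriter") return new GRMHD::ConservedWriter(*static_cast<exahype::solvers::LimitingADERDGSolver*>(solver));
  // "usw" (German for "etc.") is a placeholder for any further writer name.
  if (name == "usw") return new GRMHD::SomeOtherWriter(*static_cast<exahype::solvers::ADERDGSolver*>(solver));
  // ... one generated branch per user-defined writer ...
  // No branch matched: we don't know this plotter type.
  throw std::invalid_argument("Unknown plotter type: " + name);  // needs <stdexcept>
}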
```
We can then replace the currently generated section
```c++
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,0,parser,new GRMHD::ConservedWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,1,parser,new GRMHD::ConservedWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,2,parser,new GRMHD::ConservedWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,3,parser,new GRMHD::ConservedWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
exahype::plotters::RegisteredPlotters.push_back( new exahype::plotters::Plotter(0,4,parser,new GRMHD::IntegralsWriter( *static_cast<exahype::solvers::LimitingADERDGSolver*>(exahype::solvers::RegisteredSolvers[0])) ));
```
with a non-generated section (pseudocode)
```c++
// Pseudocode: getSolvers() and getPlotterNamesForSolver() are accessors
// that the parser would have to provide.
for (const int solvernum : parser.getSolvers()) {
  int plotternum = 0;
  for (const std::string& plottername : parser.getPlotterNamesForSolver(solvernum)) {
    exahype::plotters::RegisteredPlotters.push_back(new exahype::plotters::Plotter(solvernum, plotternum++, parser, getNamedWriter(plottername, exahype::solvers::RegisteredSolvers[solvernum])));
  }
}
```
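
As an alternative to the generated if-chain, the name lookup could also be table-driven. The following self-contained sketch uses stand-in classes (`Solver`, `Writer`, `ConservedWriter`, `IntegralsWriter`) that only mimic the real ExaHyPE types, and smart pointers instead of raw `new`; all of this is an assumption for illustration, not the existing `KernelCalls.cpp` code.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Stand-ins for the real ExaHyPE classes; all names here are hypothetical.
struct Solver { virtual ~Solver() = default; };
struct Writer {
  virtual ~Writer() = default;
  virtual std::string name() const = 0;
};
struct ConservedWriter : Writer {
  explicit ConservedWriter(Solver&) {}
  std::string name() const override { return "ConservedWriter"; }
};
struct IntegralsWriter : Writer {
  explicit IntegralsWriter(Solver&) {}
  std::string name() const override { return "IntegralsWriter"; }
};

using WriterFactory = std::function<std::unique_ptr<Writer>(Solver&)>;

// The table replaces the if-chain; a generator would emit one entry per writer.
const std::map<std::string, WriterFactory>& writerFactories() {
  static const std::map<std::string, WriterFactory> factories = {
    {"ConservedWriter", [](Solver& s) { return std::unique_ptr<Writer>(new ConservedWriter(s)); }},
    {"IntegralsWriter", [](Solver& s) { return std::unique_ptr<Writer>(new IntegralsWriter(s)); }},
  };
  return factories;
}

// Looks up the factory by name and throws for unknown plotter types.
std::unique_ptr<Writer> getNamedWriter(const std::string& name, Solver& solver) {
  const auto it = writerFactories().find(name);
  if (it == writerFactories().end()) {
    throw std::invalid_argument("Unknown plotter type: " + name);
  }
  return it->second(solver);
}
```

The registration loop then only depends on the names listed in the specfile, so plotters can be reordered or commented out without regenerating code.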
This is something I can do on my own, if the Parser provides the necessary data...

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/151
Single-node MPI strong scaling differences between 2d and 3d
2019-09-20T15:46:36+02:00
Ghost User

I am investigating the strong scaling behaviour of the 2d and 3d versions of ExaHyPE.
While the 2d version shows reasonable scalability, the 3d version does not.
* Experiments are performed on a single node of SuperMUC Phase 2.
* All plotters are turned off.
* To exclude interconnect effects, all experiments are performed on a single node.

In my experiments, I switch the master-worker communication (M/W)
on or off as well as the neighbour communication (N).
## Only Peano communication (M/W=off,N=off)
| ranks | adapter name | iterations | total CPU time [s] | average CPU time [s] | total user time [s] | average user time [s] |
|-------|--------------|------------|--------------------|----------------------|---------------------|-----------------------|
| 2 | ADERDGTimeStep | 29 | 37.07 | 1.27828 | 316.768 | 10.923 |
| 3 | ADERDGTimeStep | 29 | 36.25 | 1.25 | 305.752 | 10.5432 |
| 12 | ADERDGTimeStep | 29 | 27.04 | 0.932414 | 206.142 | 7.10834 |
| 28 | ADERDGTimeStep | 29 | 10.27 | 0.354138 | 24.4012 | 0.841421 |
## M/W=on, N=off
| ranks | adapter name | iterations | total CPU time [s] | average CPU time [s] | total user time [s] | average user time [s] |
|-------|--------------|------------|--------------------|----------------------|---------------------|-----------------------|
| 2 | ADERDGTimeStep | 29 | 37.58 | 1.29586 | 316.044 | 10.8981 |
| 3 | ADERDGTimeStep | 29 | 36.3 | 1.25172 | 306.977 | 10.5854 |
| 12 | ADERDGTimeStep | 29 | 27.21 | 0.938276 | 207.078 | 7.14064 |
| 28 | ADERDGTimeStep | 29 | 10.27 | 0.354138 | 24.5317 | 0.845921 |
## M/W=off, N=on
| ranks | adapter name | iterations | total CPU time [s] | average CPU time [s] | total user time [s] | average user time [s] |
|-------|--------------|------------|--------------------|----------------------|---------------------|-----------------------|
| 2 | ADERDGTimeStep | 29 | 39.52 | 1.36276 | 337.709 | 11.6451 |
| 3 | ADERDGTimeStep | 29 | 99.04 | 3.41517 | 995.378 | |
| 12 | ADERDGTimeStep | 29 | 121.76 | 4.19862 | 1106.45 | 38.1534 |
| 28 | ADERDGTimeStep | 29 | 18.24 | 0.628966 | 105.858 | 3.65027 |
## M/W=on, N=on
Slightly worse than M/W=off,N=on.
# Insights:
* The poor performance of the 3-rank and 12-rank runs is due to load balancing issues.
* For the 28-rank run, the load balancer (LB) only deploys 10 ranks. This is actually a well-balanced setup for 10 ranks. (If we set 10 ranks, we have a load balancing issue again.)