ExaHyPE issues
https://gitlab.lrz.de/groups/exahype/-/issues
2018-03-02T13:23:47+01:00

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/210
Efficient output format for large simulations
2018-03-02T13:23:47+01:00
Leonhard Rannabauer

As soon as the output reaches a certain size, plotting is our major bottleneck. (Writing pvtu for a mesh of ~40 million DOFs with 14 unknowns takes 1 h per plot on SuperMUC.)
We should look into different approaches like the ASYNC lib https://github.com/TUM-I5/ASYNC, which sacrifices a single thread per rank for output.
For SuperMUC: We should also start to look into parallel file systems like Lustre https://en.wikipedia.org/wiki/Lustre_(file_system).
Leonhard Rannabauer

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/211
Multicore-ise LoadBalancing mapping
2019-09-20T15:46:28+02:00
Ghost User

This affects the concurrency level within the grid setup iterations.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/212
Number of Plotters is not detected
2018-09-10T19:43:27+02:00
Ghost User

When you compile an ExaHyPE application with a specfile with *n* plotters but run it with *m* plotters (say the types of the first *m-n* plotters are the same), there is no detection that `m != n`. This leads to weird errors which are hard to understand even in DEBUG mode:
```
0.0310583 15:50:15 [nils]rank:0 debug exahype::parser::Parser::getIdentifierForPlotter() found token notoken (file:/home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp,line:999)
assertion in file /home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp, line 1001 failed: token.compare(_noTokenFound) != 0
parameter token: notoken
parameter solverNumber: 0
parameter plotterNumber: 3
ExaHyPE-GRMHD: /home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp:1001: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> exahype::parser::Parser::getIdentifierForPlotter(int, int) const: Assertion `false' failed.
Abgebrochen
```
and completely incomprehensible in Release mode:
```
0.00535548 15:32:20 [nils]rank:0 error exahype::parser::Parser::getFirstSnapshotTimeForPlotter() 'GRMHDSolver_FV' - plotter 3: 'time' value must be a float. (file:/home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp,line:1046)
0.00538875 15:32:20 [nils]rank:0 error exahype::parser::Parser::getRepeatTimeForPlotter() 'GRMHDSolver_FV' - plotter 3: 'repeat' value must be a float. (file:/home/sven/numrel/exahype/Engine-ExaHyPE/ExaHyPE/exahype/parser/Parser.cpp,line:1067)
```
Note that in this example `n=4` and `m=3`, so plotter 3 itself looked all right; the error message was actually trying to complain about the nonexistent plotter 4.
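A minimal sketch of the kind of check that is missing, assuming the number of plotters known at compile time and the number found in the specfile at runtime are both available as integers. The function name `checkPlotterCount` and its parameters are hypothetical and not part of the existing parser:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of exahype::parser::Parser): fails early with
// a readable message if the specfile used at runtime does not define as many
// plotters as the application was compiled with.
inline void checkPlotterCount(const std::string& solverName,
                              int compiledPlotters,
                              int specfilePlotters) {
  assert(compiledPlotters >= 0 && specfilePlotters >= 0);
  if (compiledPlotters != specfilePlotters) {
    throw std::runtime_error(
        "Solver '" + solverName + "' was compiled with " +
        std::to_string(compiledPlotters) + " plotters, but the specfile defines " +
        std::to_string(specfilePlotters) + ".");
  }
}
```

Called once per solver right after the specfile is read, such a check would report the mismatch directly instead of a failed token lookup for a plotter that does not exist.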
This is very bad. We need some better enforcement of this rule.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/213
Named variables in VTK plots
2018-09-10T19:38:52+02:00
Ghost User

I want to have them. I will implement this tonight via the plotter mapping class as an optional virtual function (isn't this already there? HDF5 uses it) and pass the data straight through to VTK. This will result in a patch for Peano.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/214
Faster mesh refinement
2018-03-02T11:23:39+01:00
Ghost User

Problem 1 (Costly operation is performed while no parallelism available)
------------------------------------------------------------------------
Peano's shared-memory parallelism is based on identifying regular subgrids in the
tree. If a new cell is introduced to the tree, it might not yet be identified
as part of a regular subgrid, and the operations performed on the cell
are thus not parallelised.
Solution: If imposing initial conditions or evaluating a refinement criterion is too
expensive, we could let the user choose to perform it as a background task.
This might lead to more mesh setup iterations but potentially to a better
exploitation of the available cores. It should also help to hide MPI communication
during the mesh setup.
Problem 2 (Overall concurrency)
-------------------------------
ExaHyPE's regular shared-memory parallelism during the mesh setup is currently further limited as
multiple cells might write to the same vertex in order to set refinement events.
Solution: We should be able to solve this by inverting the control. The
vertex checks in touchVertexLastTime if any cell has set a refinement event,
and refines if that is the case. This would increase the concurrency of the
enterCell operations.
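A minimal sketch of this inversion of control, under the assumption that every vertex can reach its adjacent cells. The types `Cell` and `Vertex` below are simplified, hypothetical stand-ins and not Peano's or ExaHyPE's actual classes:

```cpp
#include <array>
#include <cassert>

constexpr int Dimensions    = 2;
constexpr int AdjacentCells = 1 << Dimensions;  // 2^d cells share a vertex

// Hypothetical stand-in for a cell: it only records its own refinement event.
struct Cell {
  bool refinementEvent = false;
};

// Hypothetical stand-in for a vertex: it knows its adjacent cells
// (nullptr where there is no cell, e.g. outside the domain).
struct Vertex {
  std::array<const Cell*, AdjacentCells> adjacent{};
  bool refine = false;
};

// enterCell writes only to the cell's own data, so several cells sharing a
// vertex can be processed concurrently without a write conflict.
inline void enterCell(Cell& cell, bool criterionRequestsRefinement) {
  cell.refinementEvent = criterionRequestsRefinement;
}

// The vertex pulls the information: it refines if any adjacent cell has set
// a refinement event.
inline void touchVertexLastTime(Vertex& vertex) {
  for (const Cell* cell : vertex.adjacent) {
    if (cell != nullptr && cell->refinementEvent) {
      vertex.refine = true;
      return;
    }
  }
}
```

The sketch relies on `touchVertexLastTime` being called only after all adjacent cells have been visited, which is what the name suggests but is stated here as an assumption.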
Problem 3 (Memory)
-------------------------------
ExaHyPE's initial mesh setup is performed at the beginning by a single rank.
Gradually, more and more ranks are added. To prevent
any of the ranks from running out of memory during the initial mesh setup,
it might make sense to allocate memory only temporarily, impose initial
conditions, evaluate the refinement criterion, and then free the memory again (better:
recycle it).
After the initial mesh setup, we would then allocate memory on all ranks
and impose initial conditions.
Problem 4 (Load Balancing)
-------------------------------
The load balancing currently counts only the number of cells.
It does not take the different cell types in ExaHyPE's grid into account.
The helper cell types Descendant and Ancestor have far less work to do than
the compute cells of type Cell.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/215
User Defined plotters API: Pass the information from the specfile
2019-04-15T12:36:16+02:00
Ghost User

How can we access:
* Name of output file
* Full information about cell (Limiting status, etc.)
I think the plotting API is hiding too much information. The UserDefinedADERDG plotter should pass more information to the user.
This is something I can do :)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/216
Issue during the compilation
2018-04-10T18:36:23+02:00
Ghost User

Hi, I get the following error during compilation with the latest updates:
/WORK/maurizio_exa/ExaHyPE-Engine/ExaHyPE/exahype/solvers/ADERDGSolver.cpp: In member function ‘virtual void exahype::solvers::ADERDGSolver::mergeWithNeighbourData(int, const HeapEntries&, int, int, const tarch::la::Vector<2, int>&, const tarch::la::Vector<2, int>&, const tarch::la::Vector<2, double>&, int)’:
/WORK/maurizio_exa/ExaHyPE-Engine/ExaHyPE/exahype/solvers/ADERDGSolver.cpp:3566:27: error: ‘s’ was not declared in this scope
lFhbnd,dofPerFace,s
 ^

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/217
MPI bug
2019-09-20T15:43:08+02:00
Ghost User

Hi, with the latest updates I'm not able to run the code using MPI; in particular, I get the message reported below. This can be reproduced with the GPR application that is in the repository, even if I turn off all the plotters. The serial version seems to work.
```
0.643946 [CERVINO],rank:0 info exahype::runners::Runner::startNewTimeStep(...) step 0 t_min =0
0.643966 [CERVINO],rank:0 info exahype::runners::Runner::startNewTimeStep(...) dt_min =6.5714e-05
0.643983 [CERVINO],rank:0 info exahype::runners::Runner::runTimeStepsWithFusedAlgorithmicSteps(...) plot
[CERVINO:07241] *** Process received signal ***
[CERVINO:07241] Signal: Segmentation fault (11)
[CERVINO:07241] Signal code: Address not mapped (1)
[CERVINO:07241] Failing at address: (nil)
[CERVINO:07241] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x13150)[0x7f24fae22150]
[CERVINO:07241] [ 1] ./ExaHyPE-GPR(+0x48e949)[0x557f27611949]
[CERVINO:07241] [ 2] ./ExaHyPE-GPR(+0x494e77)[0x557f27617e77]
[CERVINO:07241] [ 3] ./ExaHyPE-GPR(+0x4a7170)[0x557f2762a170]
[CERVINO:07241] [ 4] ./ExaHyPE-GPR(+0x4a741e)[0x557f2762a41e]
[CERVINO:07241] [ 5] ./ExaHyPE-GPR(+0x2bbd7d)[0x557f2743ed7d]
[CERVINO:07241] [ 6] ./ExaHyPE-GPR(+0x2beaa6)[0x557f27441aa6]
[CERVINO:07241] [ 7] ./ExaHyPE-GPR(+0x32f84b)[0x557f274b284b]
[CERVINO:07241] [ 8] ./ExaHyPE-GPR(+0x365f95)[0x557f274e8f95]
[CERVINO:07241] [ 9] ./ExaHyPE-GPR(+0x3a67f6)[0x557f275297f6]
[CERVINO:07241] [10] ./ExaHyPE-GPR(+0x3a76f7)[0x557f2752a6f7]
[CERVINO:07241] [11] ./ExaHyPE-GPR(+0x3a68bd)[0x557f275298bd]
[CERVINO:07241] [12] ./ExaHyPE-GPR(+0x3a76f7)[0x557f2752a6f7]
[CERVINO:07241] [13] ./ExaHyPE-GPR(+0x3a68bd)[0x557f275298bd]
[CERVINO:07241] [14] ./ExaHyPE-GPR(+0x3a76f7)[0x557f2752a6f7]
[CERVINO:07241] [15] ./ExaHyPE-GPR(+0x3ca149)[0x557f2754d149]
[CERVINO:07241] [16] ./ExaHyPE-GPR(+0x3cc690)[0x557f2754f690]
[CERVINO:07241] [17] ./ExaHyPE-GPR(+0x306521)[0x557f27489521]
[CERVINO:07241] [18] ./ExaHyPE-GPR(+0x25419c)[0x557f273d719c]
[CERVINO:07241] [19] ./ExaHyPE-GPR(+0x25474e)[0x557f273d774e]
[CERVINO:07241] [20] ./ExaHyPE-GPR(+0x25671e)[0x557f273d971e]
[CERVINO:07241] [21] ./ExaHyPE-GPR(+0x375b0)[0x557f271ba5b0]
[CERVINO:07241] [22] ./ExaHyPE-GPR(+0x37b09)[0x557f271bab09]
[CERVINO:07241] [23] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f24faa501c1]
[CERVINO:07241] [24] ./ExaHyPE-GPR(+0x45a5a)[0x557f271c8a5a]
[CERVINO:07241] *** End of error message ***
```

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/218
Horizontal detection of insufficiently refined mesh for LimitingADERDGSolver
2018-03-07T11:58:59+01:00
Ghost User

Currently, we restrict the limiter status up to the next coarser parent
in every iteration of the time stepping. We then evaluate on the coarser grids
if the limiter status is such that we need to refine. This then triggers refinement
requests which force the time stepping to stop.
Restricting to the next coarser parent implies non-global master-worker communication
in MPI builds. This is not good.
To get rid of this master-worker communication during the time-stepping,
I propose to extend the limiter status range so that it contains several OK statuses instead of just one.
If such an OK status is then propagating into a virtual child cell (Descendant),
we know that the mesh is not sufficiently refined and we halt the time stepping.
Some philosophy:
From the updates of the flags, we should further be able to predict in which direction a shock
propagates. We can then select more carefully which cell to refine next.
I should further rethink my whole limiter-based mesh refinement. Maybe it is more advantageous
to do some bottom-up flagging for refinement instead of the current top-down approach,
where I use halo refinement around the limited regions.
With MPI switched on, I wonder however how well or badly this will interplay with the
load balancing during the initial mesh refinement.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/219
Cancel all predictor background jobs when a predictor rerun is necessary
2018-03-26T11:15:21+02:00
Ghost User

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/221
Deeply check the public repository for CCZ4
2019-03-07T19:41:06+01:00
Ghost User

It is at https://github.com/exahype/exahype and we don't want to have the CCZ4 system in the code or in any old commits. Make sure of this by inspecting the code.
Cannot do it now since the repo is 70 MB in size and I'm on a train with bad wifi.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/222
Fix the M/h computation in the CCZ4/Writers/TimingStatisticsWriter.cpph
2019-04-15T12:41:12+02:00
Ghost User

From a mail to Luke:
>
I just noticed there is something wrong in the M/h determination:
>
As explained in my last mail to Tobias and you, I just divide these
two numbers. However, the time in the first column of stdout measures
the time since program start.
>
In ExaHyPE, the grid setup sometimes takes a considerable amount of time,
like 10 minutes. If you measure the M/h straight after the first
timesteps after these 10 minutes, you of course get totally wrong
numbers. However, if you measure after 1000 minutes of runtime, the 10
minutes of grid setup do not change the result much.
>
It is not hard to subtract the time the grid setup needs in order to
improve the correctness of the number. You can do this either by hand
(just look up when the first time step started) or we can do this in
code (CCZ4/Writers/TimingStatisticsWriter.cpph).

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/223
Implement new Space time predictor without (variable number of) Picard loops
2019-04-15T12:34:30+02:00
Ghost User

The task is relatively easy: Diff Dumbser's Fortran prototype (in the repository) and check out the changes in the generic kernels.
From Dumbser, 11 March 2018, 11:20:
> It would of course be great if you could translate my Fortran code to C.
> If there are problems,
> we will have another Skype session.
> The format of the loops and of the required computations is essentially the same as
> for all other computations
> in the space-time predictor, i.e. a lot can be reused. The only difference is that
> one no longer has to carry along the time index
> and can work in space only. The only kernel that
> has to be changed is the
> space-time predictor (in 2D and 3D).
>
> I would implement only the second and third order initial guess, see
> the code in
> SpaceTimePredictor.f90 under
>
> #ifdef SECOND_ORDER_INITIAL_GUESS
>
> #ifdef THIRD_ORDER_INITIAL_GUESS
>
> This week I concentrated on scaling and on the comparison of Runge-Kutta DG /
> ADER-DG,
> i.e. I have not continued working on the 2D FO-fCCZ4. I first want to
> get the GRMHD paper off
> my desk.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/224
Numerical details how to evolve CCZ4 with FD
2019-04-15T12:34:55+02:00
Ghost User

Collection of what Dumbser wrote in various e-mails, in order to proceed with the Runge-Kutta finite differencing code (Cactus/Antelope/Okapi):
### Dumbser, 3 April 2018, 11:03: FO-CCZ4 with finite differencing works
> since Sven has reported some difficulties with the implementation of FO-CCZ4 in the Einstein toolkit last week, and since I wanted to understand the potential problems in depth, I have simply written my own finite difference code for the Einstein equations, based on central finite differences in space and Runge-Kutta time integration. According to Sven's and Elias' description, this is exactly what you are also doing in Cactus, right? The implementation is straightforward, since FD schemes are extremely simple.
>
>
>
> To do my tests, I have just copy-pasted my Fortran subroutine PDEFusedSrcNCP into the finite difference code, and I then insert FD point values of Q and central FD approximations for the first spatial derivatives. To save CPU time, I have done all computations in 1D so far.
>
>
>
> Please find attached the results that I have obtained for the Gauge wave with amplitude A=0.01 and A=0.1 until a final time of t=1000. I have used sixth order central FD in space and a classical third order Runge-Kutta scheme in time. For FO-CCZ4, I have set all damping coefficients to zero (kappa1=kappa2=kappa3=0), and I use c=0 with e=2. Zero shift (sk=0) and harmonic lapse. CFL number based on e is set to CFL=0.5 at the moment. Now the important points:
>
>
>
> 1. Runge-Kutta O3 is for now preferable over Runge-Kutta O4, since it is intrinsically dissipative. The reason is that the fourth order time derivative term in the Taylor series on the left hand side remains with RK3, while it cancels with RK4. When moving the term q_tttt to the right hand side and applying the Cauchy-Kowalevsky procedure, it becomes a fourth order spatial derivative term with negative sign, which is good for stability (second spatial derivatives must have positive sign on the right hand side, fourth spatial derivatives must have negative sign for stability; this is easy to check via Fourier series and the dispersion relation of the PDE).
>
>
>
> 2. The RK3 alone is not enough to stabilize the scheme for the larger amplitude A=0.1 of the Gauge Wave, but it is sufficient for A=0.01. I therefore explicitly needed to subtract a term of the type - dt/dx*u_xxxx, which is essentially a fourth order Kreiss-Oliger-type dissipation with appropriately chosen viscosity coefficient.
>
>
>
> I will now replace the Kreiss-Oliger dissipation which I do not like with the numerical dissipation that you would have obtained with a Rusanov flux in a fourth order accurate finite volume scheme. In the end, the dissipation operator will again be written as a finite difference, but there I know at least exactly what is going on and we will exactly know the amount of dissipation to be put (it can only be a function of the largest eigenvalue). So there will be NO parameter to be tuned. I will keep you updated on this.
>
>
>
> From my own results I can conclude that everything is working as expected, i.e. you must have at least one bug in your implementation of FO-CCZ4 in Cactus; or you have run the tests with the wrong parameters (please use CFL=0.5 for the moment, Kreiss-Oliger dissipation with a viscosity factor so that you get -dt/dx*u_xxxx on the right hand side in the end, please set kappa1=kappa2=kappa3=0 and set e=2 and c=0 in FO-CCZ4). If your code still does not run, I can send you my finite difference Fortran code to help you with the debugging.
>
>
>
> While running and implementing my FD code for FO-CCZ4 I have also been working on the vectorization of FO-CCZ4 in ADER-DG. The interesting news: the finite difference code requires more than 20 microseconds per FD point update, and ADER-DG with the good new initial guess for the space-time polynomial and proper vectorization also needs about 20 microseconds per DOF update for FO-CCZ4, i.e. ADER-DG is indeed becoming competitive with FD. Who would have ever believed this last year :-) On the new 512bit vector machines of RSC in Moscow, we expect the PDE to run even twice as fast, since the vector registers are twice as large as the current state of the art. We are aiming at a time per DOF update of about 10 microseconds. I will keep you informed.
### Dumbser, 4 April 2018, 10:51:
>
However, my latest experiments show that you can also use RK4 together with a finite-volume type dissipative operator, which is very simple to
implement and which does not require any parameters to be tuned. It will just replace the Kreiss-Oliger dissipation. And by the way: in this setting,
the scheme can be run with CFL=0.9, which is what we want. I will send around more details later.
### Dumbser, 4 April 2018, 18:23
> there is again good news from the finite-difference FO-CCZ4 front. Instead of your classical Kreiss-Oliger dissipation, I suggest using the following dissipation operator, which should simply be "added" to the time derivatives of all quantities on the right hand side, i.e.:
>
>
> ```
> dudt(:,i,j,k) = dudt(:,i,j,k) - 1.0/dx(1)* 3.0/256.0* smax * ( -15.0*u(:,i+1,j,k)-u(:,i+3,j,k)-15.0*u(:,i-1,j,k)+6.0*u(:,i+2,j,k)+20.0*u(:,i,j,k)+6.0*u(:,i-2,j,k)-u(:,i-3,j,k) )
> ```
>
>
> where dudt(:,i,j,k) is the time derivative of the discrete solution computed by the existing Fortran function PDEFusedSrcNCP and smax is the maximum eigenvalue in absolute value. This operator derives from
>
>
> ```
> - 1/dx(1)*( fp - fm ),
> ```
>
>
> where the dissipative flux fp is defined as
>
>
> ```
> fp = - 1/2 * smax * ( uR - uL ),
> ```
>
>
> and uR and uL are the central high order polynomial reconstructions of u evaluated at the cell interface x_i+1/2. The flux fm is the same, but on the left interface x_i-1/2.
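As a cross-check of the stencil quoted above, here is a minimal C++ sketch of the same operator in 1D for a single unknown. The coefficients are copied from the Fortran line in the mail; the function name `addDissipation`, the restriction to one dimension, and the omitted boundary treatment are assumptions made only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds the dissipation term from the mail to dudt on all interior points of a
// 1D grid. Sketch only: one unknown, one dimension, and the first and last
// three points are left untouched (no boundary treatment).
inline void addDissipation(const std::vector<double>& u,
                           std::vector<double>& dudt,
                           double dx, double smax) {
  assert(u.size() == dudt.size());
  const std::size_t n = u.size();
  for (std::size_t i = 3; i + 3 < n; ++i) {
    const double stencil = -u[i + 3] + 6.0 * u[i + 2] - 15.0 * u[i + 1]
                         + 20.0 * u[i]
                         - 15.0 * u[i - 1] + 6.0 * u[i - 2] - u[i - 3];
    dudt[i] -= 1.0 / dx * 3.0 / 256.0 * smax * stencil;
  }
}
```

The stencil weights sum to zero, so a constant state is not damped, while the grid-scale mode u_i = (-1)^i is damped with rate 0.75*smax/dx (the stencil evaluates to 64*u_i, and 3/256*64 = 0.75).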
>
>
>
> Please find attached the new results for the Gauge Wave with A=0.1 amplitude. Everything looks fine, i.e., the ADM constraints as well as the waveform at the final time. Note that this simulation was now run with the fourth order
> Runge-Kutta scheme in time and using a CFL number of CFL=0.9 based on the maximum eigenvalue.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/225
Provide vectorized user functions in the optimized kernels
2019-03-07T19:41:41+01:00
Ghost User

This is something Jean Matthieu should do.
Then we can immediately test a couple of PDEs, such as Euler, GRMHD or CCZ4. Dumbser also shared his vectorized code somewhere.
Jean-Matthieu Gallard

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/226
Use sensible Fortran flags
2018-04-24T11:58:40+02:00
Ghost User

At the moment the Fortran code is compiled with minimal flags, and therefore the compiler has its hands tied. However, the code is written in a way that should be easily vectorisable by the compiler.
Ekatherine from RSC is currently testing:
`-xCORE-AVX512 -fma -align array64byte`
and has mentioned that `-qopt-prefetch=3` worked well with the prototype code.
To be updated...

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/227
Reflection for the parameter system
2019-03-07T20:43:51+01:00
Ghost User

Currently, using runtime parameters requires parsing them somewhere, leading to code like
```c++
void GeometricBallLimiting::readParameters(const mexa::mexafile& para) {
  radius = para["radius"].as_double();
  std::string where = para["where"].as_string();
  toLower(where);
  if (where == "inside") limit_inside = true;
  else if (where == "outside") limit_inside = false;
  else {
    logError("readParameters()", "Valid values for where are 'inside' and 'outside'. Keeping default.");
  }
  logInfo("readParameters()", "Limiting " << (limit_inside ? "within" : "outside of") << " a ball with radius=" << radius);
}
```
associated to a structure where the parameters are stored,
```c++
struct GeometricBallLimiting : public LimitingCriterionCode {
  double radius;      ///< Radius of the ball (default -1 = no limiting)
  bool limit_inside;  ///< Whether to limit inside or outside (default inside)
  GeometricBallLimiting() : radius(-1), limit_inside(true) {}
  bool isPhysicallyAdmissible(IS_PHYSICALLY_ADMISSIBLE_SIGNATURE) const override;
  void readParameters(const mexa::mexafile& parameters) override;
};
```
This is a lot of overhead and really redundant.
In Cactus, the user can declare parameters *including their description/meaning and valid values* in a nice language. They are then made available as a structure by the glue code, and all the parsing is abstracted away. Example of a Cactus parameter file (CCL file):
```
real eta "Damping coefficient for the Gamma Driver" STEERABLE=always
{
0:* :: "should be 1-2/M"
}0.2
KEYWORD evol_type "Which set of equations to evolve"
{
"BSSN" :: "traditional BSSN"
"Z4c" :: "Z4c"
"CCZ4" :: "(Covariant) and conformal Z4"
"FOCCZ4" :: "First order formulation of the CCZ4"
}"Z4c"
boolean include_theta_source "Only FO-CCZ4: set to false to remove the algebraic source terms of the type -2*Theta" STEERABLE=always
{
} yes
```
In the code, one then just has something like
```c++
struct parameters {
  double eta;
  std::string evol_type;
  bool include_theta_source;
};
```
which is already filled nicely with values.
While I certainly don't want to write such glue code for ExaHyPE, with minimal effort we can get something much better than the current MEXA system. In fact, it would be nice to use *OOP reflection* to automatically register class attributes for common data types (int/double/bool/string). That means I would be fine with writing
```c++
struct GeometricBallLimiting {
  double radius;
  enum class limit_at { inside, outside };
  REGISTER_PARAM(radius, DEFAULT(-1), "Radius of the ball");
  REGISTER_PARAM(limit_at, DEFAULT(limit_at::inside), "Where to limit");
};
```
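A minimal sketch of how such a registration could look without real reflection. Everything here is hypothetical: `ParamRegistry` is not part of ExaHyPE or MEXA, the macro takes the registry as an extra argument (unlike the wish-list syntax above), the enum is replaced by a plain `bool`, and only `double` and `bool` are handled:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical registry: maps a parameter name to its description and to a
// setter that parses a string into the registered member.
class ParamRegistry {
public:
  void add(const std::string& name, double& target, double def, const std::string& doc) {
    target = def;
    store(name, doc, [&target](const std::string& v) { target = std::stod(v); });
  }
  void add(const std::string& name, bool& target, bool def, const std::string& doc) {
    target = def;
    store(name, doc, [&target](const std::string& v) {
      target = (v == "true" || v == "yes" || v == "1");
    });
  }
  // Sets a registered parameter from its textual value.
  void set(const std::string& name, const std::string& value) {
    const auto it = setters_.find(name);
    if (it == setters_.end()) throw std::invalid_argument("unknown parameter: " + name);
    it->second(value);
  }
  const std::string& doc(const std::string& name) const { return docs_.at(name); }

private:
  using Setter = std::function<void(const std::string&)>;
  void store(const std::string& name, const std::string& doc, Setter setter) {
    assert(setters_.count(name) == 0);  // registering a name twice is a bug
    docs_[name]    = doc;
    setters_[name] = std::move(setter);
  }
  std::map<std::string, std::string> docs_;
  std::map<std::string, Setter>      setters_;
};

// The stringified member name becomes the parameter key.
#define REGISTER_PARAM(registry, member, def, doc) (registry).add(#member, member, def, doc)

struct GeometricBallLimiting {
  double radius;
  bool limit_inside;
  ParamRegistry params;
  GeometricBallLimiting() {
    REGISTER_PARAM(params, radius, -1.0, "Radius of the ball");
    REGISTER_PARAM(params, limit_inside, true, "Whether to limit inside the ball");
  }
  // The setters hold references to the members, so copying must be disabled.
  GeometricBallLimiting(const GeometricBallLimiting&) = delete;
  GeometricBallLimiting& operator=(const GeometricBallLimiting&) = delete;
};
```

With this, a generic loop over the key-value pairs of a parameter file could call `params.set(key, value)` for every entry, and no per-class `readParameters` would be needed.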
In fact, something like this is possible with some macro magic.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/228
stableTimeStepSize kernels should return maximum eigenvalue for (classical) local time stepping
2018-06-14T16:10:38+02:00
Ghost User

In general, we cannot associate the smallest time step size with the finest mesh
level as there might be larger eigenvalues present on coarser mesh levels.
This is an issue for local time stepping where you scale the smallest time
step size by a factor k (k=3 in Peano) with decreasing mesh level.
The kernels should thus return the maximum eigenvalue, too, or only the maximum eigenvalue.
The minimum local time step size would then be computed according to:
```
dt_min = CFL * min_{over all cells} cellSize / max_{over all cells} lambda
```
which is different from what we are currently doing for global time stepping:
```
dt_min = CFL * min_{over all cells} ( cellSize / lambda )
```
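The difference between the two minimisations can be sketched as follows, assuming the usual CFL-type restriction dt <= CFL * cellSize / lambda per cell. The struct `CellInfo` and both function names are hypothetical and only illustrate the two reductions:

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical per-cell data: maximum eigenvalue and cell size.
struct CellInfo {
  double lambdaMax;
  double cellSize;
};

// Global time stepping: minimise the per-cell admissible step directly.
inline double globalTimeStepSize(const std::vector<CellInfo>& cells, double cfl) {
  assert(!cells.empty());
  double dt = std::numeric_limits<double>::max();
  for (const CellInfo& c : cells) dt = std::min(dt, cfl * c.cellSize / c.lambdaMax);
  return dt;
}

// Local time stepping: combine the globally largest eigenvalue with the
// globally smallest cell size, even if they occur in different cells.
inline double finestLocalTimeStepSize(const std::vector<CellInfo>& cells, double cfl) {
  assert(!cells.empty());
  double lambda = 0.0;
  double h      = std::numeric_limits<double>::max();
  for (const CellInfo& c : cells) {
    lambda = std::max(lambda, c.lambdaMax);
    h      = std::min(h, c.cellSize);
  }
  return cfl * h / lambda;
}
```

For a fine cell with lambda=1, cellSize=1 and a coarse cell with lambda=4, cellSize=3 (with CFL=1), the global variant yields 0.75 while the local variant yields 0.25, because the large eigenvalue sits on the coarse level.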
The ADERDGSolver superclass would then decide which minimisation to
perform.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/157
Plotter variables are not constants
2018-09-10T19:41:33+02:00
Ghost User

Actually, the ExaHyPE runtime treats the number of writtenUnknowns per plotter as a runtime variable, i.e.
```
plot hdf5::flash ConservedWriter
variables = 19
time = 0.0
repeat = 0.00001
output = ./hdf5-flash/conserved
end plot
```
one can change `variables = x` without recompiling. However, the toolkit wants this to be a constant:
```
plot hdf5::flash ConservedWriter
variables const = 19
time = 0.0
repeat = 0.00001
output = ./hdf5-flash/conserved
end plot
```
otherwise it says `ERROR: eu.exahype.parser.ParserException: [71,17] expecting: 'const'`
This should not happen, i.e. the toolkit grammar should accept the line without `const`.
As always, there is presumably no case in which somebody wants to do this **except benchmarking plotter file formats, which is exactly what I'm doing now**. The typical code generated by the toolkit is not aware of a non-constexpr number of variables, but all the `ExaHyPE/exahype/plotters/` code actually treats the number as a runtime constant, and I see no reason to artificially introduce something constexpr here.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/153
Constants in specfiles
2018-09-10T19:42:06+02:00
Ghost User

I have so many constants that I doubt a specfile syntax like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = initial_data:tovstar,boundary_x_lower:reflective,boundary_y_lower:reflective,boundary_z_upper:outgoing,tovstar-mass:1.234,tovstar-rl-ratio:2.345
```
would make sense. Tobias thinks in such a case a user would do something like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = configuration-file:foobar.txt
```
and outsource the configuration in a language the user likes. However, this breaks with the idea of a single specfile for a single run. Therefore, I see two options
## Add a DATA section after the specfile
This is what many scripting languages do, for instance Perl ([the DATA syntax in perl](https://stackoverflow.com/questions/13463509/the-data-syntax-in-perl)). The idea is just that the parsers ignore whatever comes after the end of the specfile (i.e. the line containing `end exahype-project`). Users could dump any content there in their favourite language. I would vote for this as it is super easy to implement and allows file concatenation and flexibility.
## Allow user constant section
We could also just allow users to do something like
```
solver Limiting-ADER-DG GRMHDSolver
variables const = rho:1,vel:3,E:1,B:3,psi:1,lapse:1,shift:3,gij:6
order const = 4
maximum-mesh-size = 6.0009
maximum-mesh-depth = 0
time-stepping = global
kernel const = generic::fluxes::nonlinear
language const = C
constants = parameters:appended
limiter-kernel const = generic::musclhancock
limiter-language const = C
dmp-observables = 2
dmp-relaxation-parameter = 1e-2
dmp-difference-scaling = 1e-3
steps-till-cured = 0
simulation-parameters
foo = bar
baz = bar
blo = bar
blu = bar
etc.
end simulation-parameters
plot vtk::Cartesian::vertices::limited::binary ConservedWriter
variables const = 19
time = 0.0
repeat = 0.00166667
output = ./vtk-output/conserved
end plot
...
```
This would go well with the specfile syntax. To implement this, we need
* Such a section with any key-value pairs added to the grammar, so the toolkit does not complain
* Support in the `Parser.cpp` (which is not too hard to add)
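A minimal sketch of what the parser side could look like, assuming the section is delimited by the lines `simulation-parameters` and `end simulation-parameters` as in the example above and contains one `key = value` pair per line. The function names are hypothetical and independent of the existing `Parser.cpp`:

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Strips leading and trailing blanks, tabs and carriage returns.
inline std::string trim(const std::string& s) {
  const auto begin = s.find_first_not_of(" \t\r");
  if (begin == std::string::npos) return "";
  const auto end = s.find_last_not_of(" \t\r");
  return s.substr(begin, end - begin + 1);
}

// Hypothetical helper: collects all "key = value" lines found between
// "simulation-parameters" and "end simulation-parameters".
// Lines without '=' inside the section are ignored.
inline std::map<std::string, std::string> readSimulationParameters(std::istream& in) {
  std::map<std::string, std::string> params;
  std::string line;
  bool inside = false;
  while (std::getline(in, line)) {
    const std::string t = trim(line);
    if (t == "simulation-parameters")     { inside = true;  continue; }
    if (t == "end simulation-parameters") { inside = false; continue; }
    if (!inside) continue;
    const auto eq = t.find('=');
    if (eq == std::string::npos) continue;
    params[trim(t.substr(0, eq))] = trim(t.substr(eq + 1));
  }
  return params;
}
```

All values are kept as strings here; converting them to numbers or enums would be left to the user code that consumes the map.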