ExaHyPE issues (https://gitlab.lrz.de/groups/exahype/-/issues)

Issue 177: Batching required for LTS and ExaHyPE's limiter-based refinement
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/177 (Ghost User, 2017-09-04)

Limiting ADER-DG in ExaHyPE
---------------------------
In ExaHyPE, we do not consider adaptive refinement for the Finite Volumes (FV)
solver.
We thus only use FV on the finest level of the adaptive mesh
if we employ a limiting ADER-DG solver.
If a limiting ADER-DG solver detects a shock, i.e. a cell where we need to
limit non-physical oscillations, we refine that cell down to
the finest adaptive mesh level.
We refine its neighbours down to the finest mesh level as well.
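A minimal sketch of this rule, with illustrative stand-in types (Cell, Mesh and their members are assumptions, not ExaHyPE's actual interfaces):
```
#include <vector>

struct Cell { bool troubled = false; };  // stand-in for the real cell description
struct Mesh {                            // stand-in exposing two assumed helpers
  void refineToFinestLevel(Cell&) {}
  std::vector<Cell*> neighboursOf(Cell&) { return {}; }
};

// A troubled cell and its neighbours are refined down to the finest
// adaptive mesh level, where the FV limiter operates.
void refineAroundTroubledCell(Cell& cell, Mesh& mesh) {
  if (!cell.troubled) return;
  mesh.refineToFinestLevel(cell);
  for (Cell* neighbour : mesh.neighboursOf(cell)) {
    mesh.refineToFinestLevel(*neighbour);
  }
}
```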
Local Time Stepping
-------------------
TBC
Local Time Stepping plus limiting ADER-DG in ExaHyPE
-----------------------------------------------------
It is possible that a cell is marked as troubled during the
local time stepping. This potentially requires more refinement
around the cell, or refining the cell itself.
The newly refined cells might not have done any time stepping
at all during the current LTS batch since
their parents stem from a coarser level.
They lag behind in time.
The question is what to do in those scenarios.
Batching
--------
One idea is to treat the number of local time steps to run as a batch.
We remember the solution before a batch starts.
In case (limiter-based) refinement is triggered, we memorise the cells to refine
and perform a rollback to the memorised solution.
Afterwards, we refine the memorised cells and run the batch again.
It can happen multiple times that we have to rerun a batch.
On the other hand, we ensure that our grid is always tracking a shock correctly.
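A minimal sketch of this batching scheme, assuming illustrative helper names (Solver and its members are stand-ins, not ExaHyPE's actual API):
```
struct Solver {                        // illustrative stand-in, not ExaHyPE's API
  void storeSolutionSnapshot() {}      // remember the solution before a batch
  void rollbackToSnapshot() {}         // restore the remembered solution
  bool performLocalTimeStep() { return false; }  // true if a cell became troubled
  void refineMemorisedCells() {}       // refine around the memorised troubled cells
};

void runBatch(Solver& solver, int batchSize) {
  solver.storeSolutionSnapshot();
  bool batchComplete = false;
  while (!batchComplete) {             // a batch may have to be rerun several times
    batchComplete = true;
    for (int step = 0; step < batchSize; ++step) {
      if (solver.performLocalTimeStep()) {  // a cell was marked troubled mid-batch
        solver.rollbackToSnapshot();        // go back to the memorised solution,
        solver.refineMemorisedCells();      // refine the memorised cells, and
        batchComplete = false;              // rerun the whole batch
        break;
      }
    }
  }
}
```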
Further techniques
------------------
At the expense of additional computational cost, we could refine the mesh generously (according to the restricted limiter status).
This would reduce the number of batch reruns.

Issue 178: Limiting ADER-DG: Solution min and max computation is very expensive
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/178 (Ghost User, 2017-09-04)

We found that the min/max computation (in ExaHyPE) can currently be a factor of 10 more
expensive than the space-time predictor computation.
The test case was the Euler equations in 3D with p=5 polynomials and a Sod shock tube scenario.

Issue 179: Split TimeStepSizeComputation and merge parts with SolutionUpdate and LocalRecomputation
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/179 (Ghost User, 2017-10-04)

- TimeStepSizeComputation will be reduced to a simple
time step size computation function.
SolutionUpdate and LocalRecomputation will also
compute a time step size and will additionally advance the solution in time.
This will reduce logic.
- Next step will be a fusion of multiple algorithmic phases of the
ADER-DG and Limiting ADER-DG schemes into a single solver function.
This will hopefully make it easier for the compiler to optimise
and easier for the processor cache to decide what to hold or drop.
- We might be able to get rid of mapping FusedTimeSteppingInitialisation
with the above split (see the sketch below).
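A minimal sketch of the intended fusion, with illustrative names (not ExaHyPE's actual solver interface):
```
struct CellDescription { double dt = 0.0; };  // stand-in for the real type

double updateSolution(CellDescription&) { return 0.1; }  // stub: advances the
                                                         // solution, returns dt

// SolutionUpdate computes the next time step size itself, so no separate
// TimeStepSizeComputation sweep over the mesh is needed afterwards.
void fusedSolutionUpdate(CellDescription& cell) {
  cell.dt = updateSolution(cell);
}
```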

Issue 180: Compression - take ghostLayers and padding into account
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/180 (Ghost User, 2017-09-26)

- FiniteVolumesSolver - Compression should take ghostLayers + padding (alignment) into account.
- ADERDGSolver - Compression should take padding (alignment) into account.

Issue 181: Get rid of TemporaryVariables
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/181 (Ghost User, 2018-03-20)

- We still have to debate if this makes sense.

Issue 182: Aligned heaps throw segmentation fault upon deleteData invocation
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/182 (Ghost User, 2017-10-13)

- This is especially an issue for adaptive simulations.
- Adaptive simulations are currently also not possible with the optimised kernels
since there are no optimised variants of the prolongation and restriction
kernels available yet.
TODO
- I will add a test for the heap allocation. Hopefully this will minimise the complexity
of the problem.

Issue 183: MPI FV doesn't compile
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/183 (Ghost User, 2017-10-02)

The problem occurs in a line with TODO(Dominic), so @di25cox :
The FiniteVolumesCellDescription does not contain a method getAdjacentToRemoteRank(), so building fails:
```
/home/sven/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp: In member function ‘virtual void exahype::solvers::FiniteVolumesSolver::preProcess(int, int) const’:
/home/sven/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp:2152:22: error: ‘exahype::solvers::FiniteVolumesSolver::CellDescription {aka class exahype::records::FiniteVolumesCellDescription}’ has no member named ‘getAdjacentToRemoteRank’
!cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here?
^
/home/sven/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp: In member function ‘virtual void exahype::solvers::FiniteVolumesSolver::postProcess(int, int)’:
/home/sven/numrel/exahype/Engine-ExaHyPE/./ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp:2168:24: error: ‘exahype::solvers::FiniteVolumesSolver::CellDescription {aka class exahype::records::FiniteVolumesCellDescription}’ has no member named ‘getAdjacentToRemoteRank’
!cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here?
^
```
This workaround works for me:
```
diff --git a/ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp b/ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp
index ddba006..949ba38 100644
--- a/ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp
+++ b/ExaHyPE/exahype/solvers/FiniteVolumesSolver.cpp
@@ -2149,7 +2149,7 @@ void exahype::solvers::FiniteVolumesSolver::preProcess(
cellDescription.getType()==CellDescription::Type::Cell
#ifdef Parallel
&&
- !cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here?
+ 1 // !cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here? // TODO FIX THIS LINE
#endif
) {
uncompress(cellDescription);
@@ -2165,7 +2165,7 @@ void exahype::solvers::FiniteVolumesSolver::postProcess(
cellDescription.getType()==CellDescription::Type::Cell
#ifdef Parallel
&&
- !cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here?
+ 1 // !cellDescription.getAdjacentToRemoteRank() // TODO(Dominic): What is going on here? // TODO FIX THIS LINE
#endif
&&
CompressionAccuracy>0.0
```
I pushed this to https://gitlab.lrz.de/exahype/ExaHyPE-Engine/commit/1d435a25b782e5cc30d10b27db30d8b8e6609c7d on master. I should have used a merge request instead. It is only a workaround anyway. Please fix it.

Issue 184: Decompose mapping Merging into two mappings
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/184 (Ghost User, 2017-11-21)

Mapping Merging currently performs neighbour merges (touchVertexFirstTime etc.) as well as the
merging of the time step data from the master with the worker.
Often only a merging of time step data is necessary. The touchVertexFirstTime merges
then reduce to a nop. In a shared memory context, this means that we add unnecessary overhead.

Issue 185: [Toolkit] Makefile generation with template engine + remove ARCHITECTURE
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/185 (Jean-Matthieu Gallard, 2017-10-06)

TODO
* Generate the Makefile using the template engine
* Remove the ARCHITECTURE parameter export; use the one from the spec file

Issue 186: Turn mapping events off on coarse levels
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/186 (Ghost User, 2019-09-20)

* We basically will go from here:
```
peano::MappingSpecification
exahype::mappings::XYZ::enterCellSpecification(int level) const {
return peano::MappingSpecification(
peano::MappingSpecification::WholeTree,
peano::MappingSpecification::RunConcurrentlyOnFineGrid,true);
}
```
To here:
```
peano::MappingSpecification
exahype::mappings::XYZ::enterCellSpecification(int level) const {
if (level < exahype::solvers::getCoarsestMeshLevelOfAllSolvers()) {
return peano::MappingSpecification(
peano::MappingSpecification::Nop,
peano::MappingSpecification::RunConcurrentlyOnFineGrid,true);
}
return peano::MappingSpecification(
peano::MappingSpecification::WholeTree,
peano::MappingSpecification::RunConcurrentlyOnFineGrid,true);
}
```
* Second, we should set the alterState bool to true only in mappings where this is necessary, i.e.,
where we perform reductions or use temporary variables (see the sketch below).
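A sketch of what that could look like, assuming the trailing bool of the MappingSpecification constructor in the snippets above is the alterState flag (my assumption):
```
peano::MappingSpecification
exahype::mappings::XYZ::enterCellSpecification(int level) const {
  // This mapping performs no reductions and uses no temporary variables,
  // so (assuming the trailing bool is the alterState flag) it reports false:
  return peano::MappingSpecification(
      peano::MappingSpecification::WholeTree,
      peano::MappingSpecification::RunConcurrentlyOnFineGrid,false);
}
```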

Issue 187: No time averaging and kernel's signature
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/187 (Jean-Matthieu Gallard, 2017-10-24)

Hi,
Small design decision: the no-time-averaging (NTA) option has the effect that the SpaceTimePredictor and VolumeIntegral kernels don't take the same arguments with and without the option.
In the case of the SpaceTimePredictor, it's a few data buffers that aren't required (the ones storing the time-averaged data), so it's not a big deal as I can just pass the pointers, which happen to be nullptr if NTA is enabled.
In the case of the VolumeIntegral, however, it's a bit dirtier: it needs either lFhi (the time-averaged flux) or lFi. Currently I have chosen to always give the kernel both buffers and let it use the right one, ignoring the other (lFhi might be nullptr; lFi is always valid). Another possibility, since both lFi and lFhi are double*, would be to use the same signature and just pass the correct one, which I could detect by checking whether lFhi is nullptr. But that would be a bit unclean, because it would mean assuming lFhi has to be nullptr when using NTA (which it currently is, but I don't like having such a hidden relationship in the code).
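For illustration, a minimal sketch of that design in template form (names, signatures and the parameter n are mine, not the actual kernel interfaces):
```
// The caller always hands over both buffers; the kernel selects the right one
// at compile time via its template parameter. The unused pointer may be nullptr.
template <bool noTimeAveraging>
void volumeIntegral(double* lduh, const double* lFi, const double* lFhi, int n) {
  const double* flux = noTimeAveraging ? lFi : lFhi;
  for (int i = 0; i < n; ++i) {
    lduh[i] += flux[i];  // placeholder for the actual quadrature/update loop
  }
}
```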
Do you agree with the current design (giving everything to the kernel, possibly nullptr, and letting it make the correct choice based on its template parameters), or do you have another design preference?
Best,
JM
@svenk @di25cox @gi26det

Issue 188: Global Reduction of Time Stepping Data
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/188 (Ghost User, 2017-10-11)

I currently do the following to reduce and broadcast global
values, like e.g. the minimum time step size:
* I broadcast time step data from master to worker
all the way down to the "lowest" worker.
* I reduce time step data from worker to master
all the way up to the global master rank.
~~I could use a simple MPI_Reduce and a simple MPI_Gather to
perform the above steps.~~
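For reference, a sketch of the struck-out idea, using MPI_Allreduce, which combines the reduction and the broadcast in one collective (illustrative, not what ExaHyPE currently does):
```
#include <mpi.h>

// Compute the global minimum admissible time step size over all ranks.
double globalMinTimeStepSize(double localDt) {
  double globalDt = 0.0;
  MPI_Allreduce(&localDt, &globalDt, 1, MPI_DOUBLE, MPI_MIN, MPI_COMM_WORLD);
  return globalDt;
}
```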
Postponed.

Issue 189: Bug in Solver mesh level computation if bounding box is virtually expanded
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/189 (Ghost User, 2017-10-11)

There is a bug in the mesh level computation if the bounding box is virtually expanded:
```
_coarsestMeshLevel =
exahype::solvers::Solver::computeMeshLevel(_maximumMeshSize,domainSize[0]);
```
Then, ``_domainSize != _boundingBoxSize``.
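A sketch of one possible fix, assuming the coarsest mesh level must be computed from the (expanded) bounding box rather than the domain size; ``_boundingBoxSize`` is taken from the sentence above and is not a verified field name:
```
_coarsestMeshLevel =
    exahype::solvers::Solver::computeMeshLevel(_maximumMeshSize,_boundingBoxSize[0]);
```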

Issue 190: Seg Fault in MPI Build if distributed-memory section is missing
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/190 (Ghost User, 2019-09-20)

This occurs in Runner::initHPCEnvironment(...) and has
nothing to do with other seg faults occurring in builds using Alignment.

Issue 191: Pass std::vectors to patchwise functions
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/191 (Ghost User, 2017-10-13)

- It's safer: the size is available, for example.
- Signatures are clearer. User kernels still get the raw pointers.
- We can just call vector.data() to pass a raw pointer to the existing kernels (see the sketch below).
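A minimal sketch, with illustrative names (not the actual ExaHyPE interfaces):
```
#include <vector>

void userKernel(double* luh, int numberOfDof) {  // existing kernel keeps raw pointers
  for (int i = 0; i < numberOfDof; ++i) { luh[i] = 0.0; }  // stub body
}

void patchwiseUpdate(std::vector<double>& luh) {
  // The size travels with the data; the kernel still receives a raw pointer.
  userKernel(luh.data(), static_cast<int>(luh.size()));
}
```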

Issue 192: [Peano] Number of persistent subgrids is changing from iteration to iteration
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/192 (Ghost User, 2017-10-16)

We observed that if a rank has distributed all its local work to workers,
the number of cells of this rank which are stored in regular subgrids
might change every second iteration.

Issue 193: Inconsistencies with the solver constructors (ParserView, cmd line passing)
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/193 (Ghost User, 2017-10-18)

Note: This is a _minor_ long-term issue.
I noticed (once again) that we have inconsistencies in the handling of the user solver constructor, the abstract solver constructor and the user `init` function, all this for the FV, ADERDG and LimitingADERDG solvers. While passing the command line options (`std::vector<std::string>& cmdlineargs`) is mandatory in all solvers, they only get the `exahype::Parser::ParserView& constants` when there *are* constants in the specfile. This creates a lot of trouble for users, for instance when they decide to introduce constants in an existing application, because they then have to introduce the new signature on their own.
Even worse: Currently, the `ParserView&` works for FV solvers, but not for ADERDG solvers (at some point, a `ParserView` instead of a `ParserView&` is needed) and therefore also not for coupled LimitingADERDG solvers.
I know that Tobias wants to keep the first code the user sees in his solver as small as possible. So what about always passing the `cmdlineargs` and a `constants` reference, storing everything in some class
```
struct SpecfileOptions { // or so
std::vector<std::string>& cmdlineargs;
exahype::Parser::ParserView& constants;
// could store even more, for instance general access to the parser,
// static build information, paths, etc.
};
```
and just passing a `SpecfileOptions& options` to the constructors and the `init` function.
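The user-facing constructor could then look like this (an illustrative sketch, not current generated code; it assumes the SpecfileOptions struct from above, and AbstractMySolver is a stand-in name for the generated base class):
```
class MySolver : public AbstractMySolver {
 public:
  MySolver(SpecfileOptions& options) : AbstractMySolver(options) {
    // options.cmdlineargs and options.constants are always available here,
    // whether or not the spec file declares constants.
  }
};
```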
As an alternative or additionally, we can make the `init` function virtual and put an empty implementation into the `Abstract*Solver.{cpp,h}`. Then the user could introduce it into his solver only when he needs it.
We are talking about startup code which is only called once, so this should really be as comfortable as possible; performance does not matter at all.

Issue 194: Erasing of Peano grid vertices
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/194 (Ghost User, 2018-03-20)

Currently, I remove only my data from the heap when I erase a cell.
The Peano grid structure still persists.
I tried a variety of tricks to remove the grid structure on the fly during
the mesh refinement but always ran into problems:
- Inconsistent adjacency indices
- Spurious erases of neighbours
- Data races with TBB
- ...
I now think it makes more sense to erase Peano vertices after the
mesh refinement. At that point, the refined grid is fixed. Refinement does not
interfere with erasing.
I think of either:
- Erase Peano vertices on the fly during the time stepping iterations;
however, I do not know the overhead and the impact on the parallelisation.
- Run a few iterations of a dedicated mesh erasing adapter after the
refinement.

Issue 195: Additional refinement along MPI boundaries
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/195 (Ghost User, 2018-03-20)

Peano's spacetree traversal is inverted every second iteration.
This poses a problem for ExaHyPE's master-worker communication
if the master rank has local subtrees.
Example:
--------
Imagine a uniform grid where a fork was performed on level 2. 3^d - 1 workers of our
master rank have been introduced. They and the master rank each hold 1/3^d of the computational domain.
We call the portion belonging to the master rank, the local subtree of the master (rank).
In the first iteration, we correctly kick off all workers on the coarse grid before we descend into
the local subtree of the master.
In the second iteration, we start within the local subtree of the master and perform computations.
As soon as we reach the coarse grid, we kick off all workers.
The workers had to wait while we performed our computations on the master's local subtree.
Now finally, the workers will start their computations.
The whole tree traversal might take twice as long as assumed.
This becomes even worse if there are more Master-Worker boundaries.
The red bars in the plot below indicate such scenarios.
Tobias proposes to add additional refinement along the MPI boundary to prevent the need for vertical
communication, at least from the master to the worker. This might work.
I wonder, however, if it is really beneficial for ExaHyPE to invert the traversal order in every second iteration.
In an MPI setting where we perform asynchronous communication, an inversion of the
traversal order is especially ill-suited, since we then have to wait until the last message has been received
by the neighbour. Instead, we would need to wait (and block) until all messages for the currently
touched vertex are received.
**Update:** The traversal order is hardwired into Peano. It is necessary to run it forward and backward.
![master_worker_synchronisation](/uploads/e1fc83f445841cb3763e0bfd7d8ec701/master_worker_synchronisation.jpg)
Additional Refinement along the MPI boundary
----------------------------------------------
This can be accomplished by continuously refining the topmost parent patch (which is of type Cell) of every cell
of type Descendant which is at a Master-Worker boundary.
This has to be done until we end up with a patch of type Cell at the Master-Worker boundary.
At this point, we then need to send the solution values of the Master cell to the worker.
We might further need to impose initial conditions.
We further need to perform status flag merges in prepareSendToWorker.
**Problems with this approach:** It might introduce rippling refinements around
the artificially refined cell.
Introduce a no-operation traversal
----------------------------
We could further introduce a no-operation traversal before we perform
reductions and broadcasts, which would rewind Peano's streams
but would not perform any computations or communication.
In this case, we would always follow the top-down traversal.
The observed Master-Worker synchronisation would not appear.
**Problems with this approach:**
- ~~Batching is currently not possible with~~
~~multiple adapters. We could maybe perform no operation in every second iteration.~~
~~However, we would then have still a Master Worker synchronisation. Or would we not?~~
We could have a single empty traversal in front of a batch. That would
work.
- The BoundaryDataExchanger of the heaps always assumes an inversion of the traversal in
every iteration.
To alter this behaviour, we would need to change the receive methods
in Peano's AbstractHeap, DoubleHeap, and BoundaryDataExchanger classes.
We are required to add a bool "assumeForwardTraversal" (defaulting to false) to the signature (see the sketch after this list).
We are further required to update ExaHyPE's solver implementations:
Whenever we receive boundary data after we have run the no-operation traversal,
we need to set this new flag to true when calling receiveData.
If we run a batch of iterations, this must be done only
in the first iteration of the batch.
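A sketch of the proposed signature change; the parameter list is illustrative (Peano's actual receiveData has more parameters), only the added flag is the point:
```
#include <vector>

// Receive method of the BoundaryDataExchanger/DoubleHeap with the proposed flag.
// If assumeForwardTraversal is true, the exchanger must not assume that the
// traversal was inverted since the data was sent.
std::vector<double> receiveData(
    int  fromRank,
    int  level,
    bool assumeForwardTraversal = false);  // proposed addition, defaults to false
```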
Have both
----------------------------
For optimal performance, it might be useful to employ both techniques.
We might use "loop padding", i.e. insert empty traversals, in order to end up with the forward traversal
any time we need to broadcast / reduce something.
In general, it might be useful to handle broadcasts and reductions outside of the mappings.
For that, however, we would need to plug into both runAsMaster and runAsWorker.

Issue 196: Fix musclhancock scheme for certain patch sizes
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/196 (Ghost User, 2018-02-22)

For me, for certain patch sizes, our 2nd-order FV scheme in ExaHyPE (musclhancock) fails. Godunov runs fine. This has to be debugged (it certainly depends on the patch size: some work, some do not).
2nd order is crucial for some applications (GRMHD, CCZ4).
I will do this; the ticket is just for bookkeeping.