ExaHyPE-Engine issues
https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues
2019-03-21T10:56:35+01:00

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/175
Symbolic flux calculations reduce speed significantly
2019-03-21T10:56:35+01:00 Ghost User

I just completed some Likwid performance measurements for the generic and optimised kernels.
For the optimised kernels, I tested a variant using "symbolic variables" in the
flux and eigenvalue computation and another variant using
classic array indexing (optimised-nonsymbolic).
The Likwid result files carry the suffix ".likwid.csv".
I have also attached the measured Peano adapter times.
Those files carry the suffix ".csv".
Setup
--------
* Compressible Euler equations (Euler_Flow)
* pure ADER-DG scheme (no limiter)
* polynomial orders p=3,5,7,9
* regular 27^3 grid (3D)
* TBB threads=1,12,24.
* Intel icpc17 (USE_IPO=on).
* nonfused (3 algorithmic phases) vs. fused (a single pipelined algorithmic phase) ADER-DG implementation
* no predictor reruns occurred for the fused implementation
Preliminary Results
----------------------------
* Optimised kernels are faster than the generic ones (I kind of expected this 😉)
* Raw array access (optimised-nonsymbolic) is significantly faster than using the "symbolic variables" (optimised).
* Fused scheme pays off (as long as number of reruns is low; very interesting for linear PDEs (no reruns here))
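
To make the two optimised variants concrete, here is a minimal C++ sketch of an x-direction Euler flux, written once with classic array indexing and once through a thin named accessor class. `EulerView`, `fluxXRaw`, `fluxXNamed` and the constant `kGamma` are hypothetical names chosen for illustration; they are not the ExaHyPE kernel API, and the variable ordering (rho, j_x, j_y, j_z, E) is an assumption.

```cpp
#include <cassert>

// Assumed ideal-gas adiabatic index (hypothetical constant for this sketch).
constexpr double kGamma = 1.4;

// Variant 1: classic array indexing, assuming Q = (rho, j_x, j_y, j_z, E).
inline void fluxXRaw(const double* Q, double* F) {
  const double irho = 1.0 / Q[0];
  const double p =
      (kGamma - 1.0) * (Q[4] - 0.5 * irho * (Q[1] * Q[1] + Q[2] * Q[2] + Q[3] * Q[3]));
  F[0] = Q[1];
  F[1] = irho * Q[1] * Q[1] + p;
  F[2] = irho * Q[1] * Q[2];
  F[3] = irho * Q[1] * Q[3];
  F[4] = irho * Q[1] * (Q[4] + p);
}

// Variant 2: a hypothetical read-only view that gives the entries names.
class EulerView {
 public:
  explicit EulerView(const double* Q) : Q_(Q) { assert(Q != nullptr); }
  double rho() const { return Q_[0]; }
  double j(int d) const {
    assert(d >= 0 && d < 3);
    return Q_[1 + d];
  }
  double E() const { return Q_[4]; }

 private:
  const double* Q_;
};

// Same flux as fluxXRaw, but every access goes through the named view.
inline void fluxXNamed(const double* Q, double* F) {
  const EulerView v(Q);
  const double irho = 1.0 / v.rho();
  const double p = (kGamma - 1.0) *
      (v.E() - 0.5 * irho * (v.j(0) * v.j(0) + v.j(1) * v.j(1) + v.j(2) * v.j(2)));
  F[0] = v.j(0);
  F[1] = irho * v.j(0) * v.j(0) + p;
  F[2] = irho * v.j(0) * v.j(1);
  F[3] = irho * v.j(0) * v.j(2);
  F[4] = irho * v.j(0) * (v.E() + p);
}
```

Both functions compute the same flux. Whether a compiler reduces the accessor variant to the same machine code as the indexed one depends on inlining and optimisation settings, so a sketch like this is only a starting point for isolating the slowdown, not an explanation of it.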
Files
-----
[Euler_ADERDG-no-output-generic.csv](/uploads/f671c93a33e70c9549853f1f51518c9d/Euler_ADERDG-no-output-generic.csv)
[Euler_ADERDG-no-output-generic.likwid.csv](/uploads/0d52ad214f0b1d6cfae4d658a9997cb5/Euler_ADERDG-no-output-generic.likwid.csv)
[Euler_ADERDG-no-output-optimised.csv](/uploads/422288cdc222b1b250b3aad9e2ad73a1/Euler_ADERDG-no-output-optimised.csv)
[Euler_ADERDG-no-output-optimised-nonsymbolic.csv](/uploads/0f84240941230d4bc9f06c76b154396b/Euler_ADERDG-no-output-optimised-nonsymbolic.csv)
[Euler_ADERDG-no-output-optimised.likwid.csv](/uploads/135bd71c668ca0c4c4ee5b7988ea2f17/Euler_ADERDG-no-output-optimised.likwid.csv)
[Euler_ADERDG-no-output-optimised-nonsymbolic.likwid.csv](/uploads/7606eefe559499a1f6de62b5f33905d0/Euler_ADERDG-no-output-optimised-nonsymbolic.likwid.csv)

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/111
Delete Logo.h from repository and include minimal PNG decoding library
2018-06-15T15:13:43+02:00 Ghost User

The ExaHyPE repository has more than 2000 commits and weighs only 10MB. Now in January somebody added a **10MB C++ header file** containing an uncompressed picture. Not only is it no longer possible to open this picture in an image viewer, but the repository also just blew up to twice its size, and nobody really wants to deal with such files (try loading it in a text editor).
First, the commit introducing this picture should be amended and the monster file removed from the repository.
Second, it should be replaced by an actual picture in GIF, JPG, PNG or a similarly efficient format (it will be smaller than **20kB**!), and a suitable decompression routine should be added to the Demonstrator. This mimics how actual offline initial data generators work. We do _not_ need to introduce a dependency on an external library but can embed a minimal open-source decoding library. There are a lot of them, for instance one of these:
* https://github.com/nothings/stb/blob/master/stb_image.h - single file header only PNG decompressor (*public domain*, 200kB code)
* https://github.com/elanthis/upng - another micro PNG decompressor single file library (*as-is* license)
* http://lodev.org/lodepng/ - another pico PNG decompressor, 500 lines (*as-is* license)
* https://github.com/hidefromkgb/gif_load/blob/master/gif_load.h - a 400 lines GIF reader (*public domain*, 20kB code)
This is really the way to go as it allows users to exchange the picture with another one _at runtime_.
I (@sven) volunteer to implement this.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/112
Documentation for user interfaces (generic kernels)
2018-06-15T15:13:43+02:00 Ghost User

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/106
Named Variables in toolkit also as strings
2018-06-15T15:13:43+02:00 Ghost User

Dominic's new named variables feature is nice (there is an ongoing email discussion). @Svenk, please implement some stringification afterwards so that we can also easily get the names of the variables and loop over them in the plotters and elsewhere.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/107
An idea to solve both the toolkit overwriting and file location problem
2018-06-15T15:13:43+02:00 Ghost User

The Java toolkit creates C++ files in the application directory ("output Directory"). This creates (amongst others) two problems:
1. In large applications, users cannot freely structure their code in subdirectories
2. Users cannot understand which files are overwritten and which are not
To address problem (1), I recently introduced a heuristic approach, [FileSearch.java](https://gitlab.lrz.de/exahype/ExaHyPE-Engine/blob/6e7e9552fb3d65b4c5feb5c70e36038be06499ad/Toolkit/src/eu/exahype/FileSearch.java). It allows a file *with a similar name* to be placed in any subdirectory of the output directory. Thus, for instance, an application can be structured in folders like
```
SRMHD
├── C2P-MHD.f90
├── C2P-MHD.h
├── C2PRoutines.f90 -> ../SRHD/C2PRoutines.f90
├── ExaHyPE-MHDSolver-p2
├── extended.log
├── InitialDataAdapter.cpp
├── InitialDataAdapter.h
├── InitialData.f90
├── KernelCalls.cpp
├── Makefile
├── make-p2.log
├── MHDSolver.cpp
├── MHDSolver_generated.cpp
├── MHDSolver.h
├── Parameters.f90
├── parameters.mod
├── PDE.f90
├── PDE.h
├── run_withEnv.sh
└── Writers
    ├── ConservedWriter.cpp
    ├── ConservedWriter.h
    ├── ErrorWriter.cpp
    ├── ErrorWriter.h
    ├── ExactPrimitivesWriter.cpp
    ├── ExactPrimitivesWriter.h
    ├── IntegralsWriter.cpp
    ├── IntegralsWriter.h
    ├── PrimitivesWriter.cpp
    ├── PrimitivesWriter.h
    ├── RelativeErrorWriter.cpp
    ├── RelativeErrorWriter.h
    └── TimeSeriesReductions.h -> ../../EulerFlow/TimeSeriesReductions.h
```
Note the `Writers` subdirectory.
This ticket proposes an idea that solves problems (1) and (2) together.
## Defining a header
My idea is to place a magic line anywhere (e.g. in the header or footer) in the generated `.h` or `.cpp` files, containing something like:
```
/** ExaHyPE.jar: autogenerated at Fr 3. Feb 19:08:51 CET 2017 **/
/** ExaHyPE.jar: File identifier: Plotter[i].HeaderCode **/ <- this is the identifier line
/** ExaHyPE.jar: Rebuild options: [x] Rebuild on every toolkit call [ ] Manually disable rebuild [ ] Rebuild on structural change **/ <- This can be changed by user
```
The identifier line solves the problem of finding the appropriate file. It allows users to rename files as they wish. The toolkit has to inspect all candidate files with a `grep`-like search, but this is not too bad, since we don't really expect applications to be very large.
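
As a rough illustration of the `grep`-like search, the following C++ sketch scans the text of a file for the identifier line shown above and returns the identifier. The toolkit itself is written in Java, so this is not toolkit code; the function name `findFileIdentifier` and the exact line prefix are assumptions taken from the example header, not an agreed syntax.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns the identifier found in a line such as
//   /** ExaHyPE.jar: File identifier: Plotter[i].HeaderCode **/
// or an empty string if the file contents contain no such line.
// The prefix and suffix are assumptions based on the proposed header.
std::string findFileIdentifier(const std::string& contents) {
  static const std::string prefix = "/** ExaHyPE.jar: File identifier:";
  static const std::string suffix = "**/";

  std::istringstream in(contents);
  std::string line;
  while (std::getline(in, line)) {
    const std::string::size_type start = line.find(prefix);
    if (start == std::string::npos) continue;

    const std::string::size_type from = start + prefix.size();
    const std::string::size_type end = line.find(suffix, from);
    if (end == std::string::npos) continue;

    // Trim the blanks around the identifier itself.
    const std::string id = line.substr(from, end - from);
    const std::string::size_type first = id.find_first_not_of(" \t");
    if (first == std::string::npos) continue;  // identifier is blank
    const std::string::size_type last = id.find_last_not_of(" \t");
    assert(last != std::string::npos);  // holds because a non-blank exists
    return id.substr(first, last - first + 1);
  }
  return "";
}
```

A toolkit run would call something like this on every candidate file in the output directory and match the result against the identifier it is about to generate, which makes the file name itself irrelevant.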
The options line allows the user both to see the default setting and to enforce their own rules. This is very helpful when working on ExaHyPE internals, and it would end the long-standing conflict over the toolkit overwriting files. It also allows users to request that a file be overwritten. The specific syntax is open to discussion; one could also choose something more machine-readable here.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/108
Toolkit: Parser errors with faulty specfile
2018-06-15T15:13:43+02:00 Ghost User

The specfile format gained new attributes (properties, or whatever you call them), and the first consequence is that the toolkit no longer runs with old specfiles. Today I tried to parse [SRMHD.exahype](/uploads/02379c4f0485dad9d1a69748d0ca7445/SRMHD.exahype) as it is in the [current](https://gitlab.lrz.de/exahype/ExaHyPE-Engine/commit/3c8a41ad41a55cab990b0b452fa3cc6e0369b8da) repository:
```
ERROR: eu.exahype.parser.ParserException: [66,5] expecting: 'buffer-size'
```
I don't understand this error message. If you open [SRMHD.exahype](/uploads/02379c4f0485dad9d1a69748d0ca7445/SRMHD.exahype), there is nothing of note at this line.
So I tried out:
* Inserting the new `log-file = mylogfile.log` - didn't change a thing
* Inserting `variables = rho:1,j:3,E:1,B:3,damping:1` - didn't change a thing
* Commenting out `/* constants = {initialdata:alfven} */` gave me this error: `ERROR: eu.exahype.lexer.LexerException: [61,41] Unknown token: *` **WHY THE HELL CAN'T I JUST COMMENT OUT STUFF**??
* It turned out that I had to **REMOVE** the constants to get the toolkit to run through. **COMMENTING OUT IS NOT ENOUGH. WHY?**

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/139
Kernel activation functions
2018-06-15T15:13:43+02:00 Ghost User

Some notes we might want to pick up later on:
* ``useNCP`` and ``useMatrixB`` are always used as a pair since the Matrix B is fed into the ncp kernel.
* ``useNCP``, ``useMatrixB``, and ``useFlux`` are set for all cells globally. It does not make
sense to turn them off or on locally, in contrast to source terms and point sources.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/109
An idea to solve the double parsing and compile/runtime configuration problem
2018-06-15T15:13:43+02:00 Ghost User

Similar to #107, this ticket proposes an idea which solves two problems we currently have in ExaHyPE.
## Problem 1: Double Parsing problem
The ExaHyPE specification files are described by the [exahype.grammar](https://gitlab.lrz.de/exahype/ExaHyPE-Engine/blob/master/Toolkit/exahype.grammar). For the Java toolkit, there is a parser generator (SableCC) which provides a neat way to create the application *while* parsing the specification file. But since the specification file also holds runtime constants, the ExaHyPE binary expects a valid and compatible spec file as its first argument, i.e. it is called with
```cmd> ExaHyPE-Euler.exe ../EulerFlow.exahype```
In order to parse the specfile at (C++) runtime, there is the [Parser.h](https://gitlab.lrz.de/exahype/ExaHyPE-Engine/blob/master/ExaHyPE/exahype/Parser.h) code. It works with regular expressions and knows nothing about `exahype.grammar`. Thus we have to expect that the two parsers will sometimes not parse the same spec file identically, or will report different error messages when there is a problem.
## Problem 2: No clear separation of compile-time (ie. computing) and runtime (ie. physics) parameters
The current specfile format combines two semantically different concepts of parameters:
* *Computing-related* parameters, ie. project names, source code paths, architecture and compiler specific stuff, optimization parameters, all the PDE properties, all the basic properties about generated code (ie. number and names of classes of plotters)
* *Run-related* parameters, ie. the grid extent, parallelization parameters (number of cores, hybrid setups), plotting properties (frequencies, slicing, ...), physics/application/user parameters (ie. Initial Data, Boundary Conditions)
The user has absolutely no idea which parameter is a compile-time constant (and probably even needs a rerun of the toolkit after a change) and which parameter is a runtime constant, ie. can be chosen freely when submitting a job to an HPC queue.
Actually this problem is even deeper: There are further constants which can be chosen when running `make`, ie. *after* code generation by the toolkit. These are especially options about the parallelization, ie. whether to build in TBB and MPI support or not. However, these options are also specified in the toolkit.
## A separation approach
I suggest keeping the specification files for the toolkit runs. I will demonstrate this by splitting the [current EulerFlow.exahype](https://gitlab.lrz.de/exahype/ExaHyPE-Engine/blob/3c8a41ad41a55cab990b0b452fa3cc6e0369b8da/ApplicationExamples/EulerFlow.exahype) into two files. The toolkit specfile would look like:
```
exahype-project Euler
peano-kernel-path = ./Peano
exahype-path = ./ExaHyPE
output-directory = ./ApplicationExamples/EulerFlow
architecture = noarch
computational-domain
dimension = 2
end computational-domain
shared-memory
identifier = dummy
end shared-memory
distributed-memory
configure = {whatever-needs-to-be-generated-here?}
end distributed-memory
solver ADER-DG MyEulerSolver
variables = rho:1,j:3,E:1
order = 3
kernel = generic::fluxes::nonlinear
language = C
plot vtk::Cartesian::cells::ascii ConservedQuantitiesWriter
variables = 5
end plot
plot vtk::Cartesian::vertices::ascii ComputeGlobalIntegrals
variables = 0
end plot
plot vtk::Cartesian::vertices::ascii PrimitivesWriter
variables = 5
end plot
plot vtk::Cartesian::vertices::ascii ExactPrimitivesWriter
variables = 5
end plot
plot probe::ascii ProbeWriter
variables = 5
end plot
end solver
end exahype-project
```
In contrast, we could now introduce a very simple parameter file format for which an accurate parser is easy to write in C++, and which is passed as a command-line argument to the ExaHyPE runs.
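
To give an idea of how small such a parser could be, here is a minimal C++ sketch that reads `key = value` lines with `#` comments into a string map. It is not the existing `Parser.h`; the names `ParameterMap`, `parseParameters` and `parseParameterString` are hypothetical, and the rules it applies (one assignment per line, `#` starts a comment unless it is inside double quotes, values kept as raw strings) are assumptions about the proposed format.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

using ParameterMap = std::map<std::string, std::string>;

// Strips leading and trailing blanks; returns "" for blank input.
static std::string trim(const std::string& s) {
  const std::string::size_type first = s.find_first_not_of(" \t\r");
  if (first == std::string::npos) return "";
  const std::string::size_type last = s.find_last_not_of(" \t\r");
  assert(last != std::string::npos);  // holds because a non-blank exists
  return s.substr(first, last - first + 1);
}

// Cuts off a '#' comment, ignoring any '#' inside double quotes.
static std::string stripComment(const std::string& line) {
  bool quoted = false;
  for (std::size_t i = 0; i < line.size(); ++i) {
    if (line[i] == '"') {
      quoted = !quoted;
    } else if (line[i] == '#' && !quoted) {
      return line.substr(0, i);
    }
  }
  return line;
}

// Parses "key = value" lines; keys such as "shared-memory::cores" are
// stored verbatim, values are stored as unparsed strings.
ParameterMap parseParameters(std::istream& in) {
  ParameterMap params;
  std::string line;
  std::size_t lineNumber = 0;
  while (std::getline(in, line)) {
    ++lineNumber;
    const std::string content = trim(stripComment(line));
    if (content.empty()) continue;  // blank line or pure comment

    const std::string::size_type eq = content.find('=');
    const std::string key =
        (eq == std::string::npos) ? "" : trim(content.substr(0, eq));
    if (key.empty()) {
      throw std::runtime_error("line " + std::to_string(lineNumber) +
                               ": expected 'key = value'");
    }
    params[key] = trim(content.substr(eq + 1));
  }
  return params;
}

// Convenience wrapper for parsing text that is already in memory.
ParameterMap parseParameterString(const std::string& text) {
  std::istringstream in(text);
  return parseParameters(in);
}
```

Converting values to numbers, lists or booleans is deliberately left to the code that consumes a parameter, so the parser itself stays independent of the set of known keys. A parameter file in this format could look like: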
```
# Only single line comments like this
log-file = mylogfile.log
computational-domain::width = 1.0, 1.0
computational-domain::offset = 0.0, 0.0
computational-domain::end-time = 0.12
shared-memory::identifier = dummy
shared-memory::cores = 1 # NB: This should REALLY be driven by an ENV parameter like OMP_NUM_THREADS
distributed-memory::buffer-size = 64
distributed-memory::timeout = 60
optimisation::fuse-algorithmic-steps = off
# 0.0 and 0.8 are already two factors
optimisation::fuse-algorithmic-steps-factor = 0.99
optimisation::timestep-batch-factor = 0.0
optimisation::skip-reduction-in-batched-time-steps = on
optimisation::disable-amr-if-grid-has-been-stationary-in-previous-iteration = off
optimisation::double-compression = 0.0
optimisation::spawn-double-compression-as-background-thread = off
# when this is a runtime constant...
MyEulerSolver::order = 3
# I would love to specify the maximum-mesh-level = 3 instead
MyEulerSolver::maximum-mesh-size = 0.05
# Plotter example
ConservedQuantitiesWriter::time = 0.0
ConservedQuantitiesWriter::repeat = 0.05
ConservedQuantitiesWriter::output = ./conserved
ConservedQuantitiesWriter::select = x:0.0,y:0.0
ComputeGlobalIntegrals::time = 0.0
ComputeGlobalIntegrals::repeat = 0.05
ComputeGlobalIntegrals::output = ./output/these-files-should-not-be-there
ComputeGlobalIntegrals::select = x:0.0,y:0.0
# etc.
# own parameters simple as hell
ML_CCZ4::timelevels = 3
ML_CCZ4::harmonicN = 1.0 # 1+log
ML_CCZ4::harmonicF = 2.0 # 1+log
ML_CCZ4::BetaDriver = 0.4 # ~1/M (\eta)
ML_CCZ4::advectLapse = 1
ML_CCZ4::advectShift = 1
ML_CCZ4::evolveA = 0
ML_CCZ4::evolveB = 1
ML_CCZ4::shiftGammaCoeff = 0.75
ML_CCZ4::shiftFormulation = 0 # Gamma driver
ML_CCZ4::fixAdvectionTerms = 1
ML_CCZ4::dampk1 = 0.05
ML_CCZ4::dampk2 = 0.0
ML_CCZ4::GammaShift = 0.5
ML_CCZ4::MinimumLapse = 1.0e-8
ML_CCZ4::conformalMethod = 1 # 1 for W
ML_CCZ4::dt_lapse_shift_method = "noLapseShiftAdvection"
ML_CCZ4::initial_boundary_condition = "extrapolate-gammas"
ML_CCZ4::rhs_boundary_condition = "NewRad"
Boundary::radpower = 2
ML_CCZ4::ML_log_confac_bound = "none"
ML_CCZ4::ML_metric_bound = "none"
ML_CCZ4::ML_Gamma_bound = "none"
ML_CCZ4::ML_trace_curv_bound = "none"
ML_CCZ4::ML_curv_bound = "none"
ML_CCZ4::ML_lapse_bound = "none"
ML_CCZ4::ML_dtlapse_bound = "none"
ML_CCZ4::ML_shift_bound = "none"
ML_CCZ4::ML_dtshift_bound = "none"
AHFinderDirect::N_horizons = 1
AHFinderDirect::find_every = 64
AHFinderDirect::output_h_every = 0
AHFinderDirect::max_Newton_iterations__initial = 50
AHFinderDirect::max_Newton_iterations__subsequent = 50
AHFinderDirect::max_allowable_Theta_growth_iterations = 10
AHFinderDirect::max_allowable_Theta_nonshrink_iterations = 10
AHFinderDirect::geometry_interpolator_name = "Lagrange polynomial interpolation"
AHFinderDirect::geometry_interpolator_pars = "order=4"
AHFinderDirect::surface_interpolator_name = "Lagrange polynomial interpolation"
AHFinderDirect::surface_interpolator_pars = "order=4"
AHFinderDirect::verbose_level = "physics details"
AHFinderDirect::move_origins = "yes"
AHFinderDirect::origin_x[1] = 0.0
AHFinderDirect::initial_guess__coord_sphere__x_center[1] = 0.0
AHFinderDirect::initial_guess__coord_sphere__y_center[1] = 0.0
AHFinderDirect::initial_guess__coord_sphere__z_center[1] = 0.0
AHFinderDirect::initial_guess__coord_sphere__radius[1] = 3.0
AHFinderDirect::which_surface_to_store_info[1] = 0
AHFinderDirect::set_mask_for_individual_horizon[1] = "no"
AHFinderDirect::reset_horizon_after_not_finding[1] = "no"
AHFinderDirect::find_after_individual_time[1] = 2000.0
AHFinderDirect::max_allowable_horizon_radius[1] = 5.0
# ...
```

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/191
Pass std::vectors to patchwise functions
2017-10-13T20:55:05+02:00 Ghost User

- It's safer, e.g. the size is available.
- Signatures are clearer. User kernels still get the raw pointers.
- We can just do vector.data() to pass a pointer to the existing kernels.

https://gitlab.lrz.de/exahype/ExaHyPE-Engine/-/issues/188
Global Reduction of Time Stepping Data
2017-10-11T11:50:37+02:00 Ghost User

I currently do the following to reduce and broadcast global
values, such as the minimum time step size:
* I broadcast time step data from master to worker
all the way down to the "lowest" worker.
* I reduce time step data from worker to master
all the way up to the global master rank.
I could use a simple MPI_Reduce and a simple MPI_Gather to
perform the above steps.
Postponed.