Efficient output format for large simulations
As soon as output reaches a certain size, plotting becomes our major bottleneck: a mesh of ~40 million DoFs with 14 unknowns, written as pvtu on SuperMUC, takes about 1 h per plot. We should look into different approaches such as the ASYNC library https://github.com/TUM-I5/ASYNC, which sacrifices a single thread per rank for output. For SuperMUC, we should also start looking into parallel file systems such as Lustre https://en.wikipedia.org/wiki/Lustre_(file_system).
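
To make the "one thread per rank for output" idea concrete, here is a minimal sketch of the general pattern in Python. It is not the ASYNC API: `BackgroundWriter`, `run_simulation`, `write_fn` and `max_pending` are hypothetical names chosen for illustration, and a real implementation would live in our C++ code and write pvtu (or another format) instead of calling a callback. The solver thread hands a copy of its data to a bounded queue and continues computing, while a single dedicated thread does the slow write.

```python
import queue
import threading


class BackgroundWriter:
    """Hypothetical helper: one dedicated thread drains a queue of snapshots."""

    def __init__(self, write_fn, max_pending=2):
        # write_fn(step, data) stands in for the real (slow) plotter call.
        self._write_fn = write_fn
        # Bounded queue: if the writer falls behind, submit() blocks instead
        # of letting buffered snapshots grow without limit.
        self._queue = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: no more snapshots will arrive
                break
            step, data = item
            self._write_fn(step, data)

    def submit(self, step, data):
        # Copy the data so the solver may overwrite its buffer immediately.
        self._queue.put((step, list(data)))

    def close(self):
        # Flush all pending snapshots, then stop the writer thread.
        self._queue.put(None)
        self._thread.join()


def run_simulation(num_steps, writer):
    """Toy 'solver' (assumption): updates a small state and plots every step."""
    state = [0.0] * 4
    for step in range(num_steps):
        for i in range(len(state)):
            state[i] += 1.0
        writer.submit(step, state)  # returns as soon as the copy is queued
```

A usage example would be `w = BackgroundWriter(my_plot_fn); run_simulation(10, w); w.close()`, where `my_plot_fn` is whatever writes one snapshot. The cost of this design is one extra copy of the output data per pending snapshot plus the thread itself, which is the trade-off mentioned above.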