We have a CFD code “MGLET” that solves incompressible flow and acoustics on staggered, Cartesian grids using immersed boundary methods. The simulation domain is built up of small Cartesian grids in an octree structure to refine details of the geometry and flow. Each grid is typically 24x24x24 cells - we have found that this is a sweet spot between performance and the ability to do efficient local grid refinement.
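To make the structure concrete, here is a minimal sketch of one grid block in the hierarchy, assuming dyadic (factor-2) refinement between octree levels; the class and names are made up for illustration and are not MGLET's actual data structures:

```python
# Illustrative only: every grid has the same 24^3 cell count, and each
# octree level halves the cell spacing (assumed factor-2 refinement).
from dataclasses import dataclass

NCELL = 24  # cells per direction on every grid

@dataclass
class GridBlock:
    level: int      # octree refinement level, 0 = coarsest
    origin: tuple   # physical corner of the grid
    h0: float       # cell spacing on level 0

    @property
    def spacing(self) -> float:
        # each refinement level halves the cell spacing
        return self.h0 / 2 ** self.level
```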
Internally, MGLET stores the data with 2 ghost layers on each grid, so a 24^3 grid is stored as 28^3 and so on. This is written out to an HDF5 file in a special way. This file is complete in the sense that the flow solver can completely restart from a saved state, but it is not very useful for postprocessing directly.
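For illustration, stripping the two ghost layers from one such grid could look like this; the file name, group path and variable name ("restart.h5", "/grids/0042/p") are assumptions for the sketch, not MGLET's actual HDF5 layout:

```python
# A minimal sketch of stripping the two ghost layers from one grid
# before visualization. File/dataset names are made up for illustration.
import h5py

NG = 2  # ghost layers on each side

with h5py.File("restart.h5", "r") as f:
    raw = f["/grids/0042/p"][...]       # shape (28, 28, 28), incl. ghosts

interior = raw[NG:-NG, NG:-NG, NG:-NG]  # shape (24, 24, 24), ghosts gone
```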
We have a postprocessing tool that takes this restart file and fills an OverlappingAMR structure depending on the user's needs. This tool also handles things such as the staggered velocities that are defined on the faces of the cells, properly interpolating them onto cell centers or vertices so that VTK can visualize them. It also strips off the ghost cells and can compute certain derived quantities that require deep knowledge of solver details (often related to staggering) and that the VTK library therefore cannot compute itself. Already at this stage, users can choose to process only certain regions in space or only certain levels to reduce the amount of data produced.
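As a concrete example of the staggering issue: the x-velocity lives on x-normal cell faces, so the cell-center value is the average of the two bounding faces. A minimal numpy sketch, assuming a (nx+1, ny, nz) face array with ghost cells already stripped (MGLET's real ghost-layer indexing differs):

```python
# Face-to-center interpolation for a staggered x-velocity, in numpy
# for clarity. Shapes are assumptions for the sketch.
import numpy as np

def u_face_to_center(u_face):
    # each cell center gets the average of its two bounding x-faces
    return 0.5 * (u_face[:-1, :, :] + u_face[1:, :, :])

u_face = np.random.rand(25, 24, 24)  # 25 x-faces bound 24 cells in x
u_center = u_face_to_center(u_face)  # shape (24, 24, 24)
```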
The outcome of this postprocessing tool is currently a VTH dataset (a .vth file plus a folder with lots of small .vti's) that the user can open in ParaView. The tool can additionally generate simple 2-D and 3-D images such as cutplanes and isosurfaces, but that is not interesting for this discussion since it does not rely on any datafiles on disk (no intermediate datasets are written - only the resulting image).
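For reference, here is roughly how such a VTH dataset can be assembled and written with VTK's Python bindings. This is a sketch with made-up spacing and a single 24^3 block per level, not our actual tool; note that the writer emits one .vti file per block, which is the root of the problem below:

```python
# Rough sketch: two-level vtkOverlappingAMR, one 24^3 block per level,
# unit domain. All geometric values are made up for illustration.
import vtk

NCELL = 24

def make_grid(origin, h):
    grid = vtk.vtkUniformGrid()
    grid.SetOrigin(origin)
    grid.SetSpacing(h, h, h)
    grid.SetDimensions(NCELL + 1, NCELL + 1, NCELL + 1)  # point dimensions
    return grid

amr = vtk.vtkOverlappingAMR()
amr.Initialize(2, [1, 1])                 # 2 levels, 1 block on each
amr.SetOrigin([0.0, 0.0, 0.0])
amr.SetGridDescription(vtk.VTK_XYZ_GRID)

h0 = 1.0 / NCELL                          # level-0 spacing; level 1 halves it
for level in range(2):
    h = h0 / 2 ** level
    amr.SetSpacing(level, [h, h, h])
    box = vtk.vtkAMRBox([0, 0, 0], [NCELL - 1, NCELL - 1, NCELL - 1])
    amr.SetAMRBox(level, 0, box)
    amr.SetDataSet(level, 0, make_grid([0.0, 0.0, 0.0], h))

writer = vtk.vtkXMLUniformGridAMRWriter()
writer.SetInputData(amr)
writer.SetFileName("flow.vth")            # also creates a flow/ folder of .vti pieces
writer.Write()
```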
Our problem is that, due to the small grids used in the code, we end up with tens of thousands of vti's, as you comment in your last paragraph. These are a pain to have lying around. The biggest case I have computed has 250 000 grids - and therefore results in 250 000 .vti files. This is not desirable. On typical classic supercomputers we often face quotas on the number of files, which can be around 1 million. A few datasets like this and we are out of quota… Therefore it would be very useful to have an alternative AMR format that overcomes the limitations of the XML-based one when the number of blocks/grids becomes too large.
I will soon follow up with an example.