The OverlappingAMR data structure was recently added to the VTKHDF specification; now it is time for its cousin, the Hyper Tree Grid (HTG for short), to be specified as well.
HTG is a compact, memory-efficient, tree-based AMR data structure that has been in VTK for more than a decade. Many HTG-specific algorithms have been written; they browse the structure using a cursor mechanism coupled with tree iteration. HTGs can be stored in the VTK XML format using the .htg extension. The parallel .phtg implementation is currently non-functional.
This post proposes an HTG specification for the up-and-coming VTKHDF format, which will allow efficient distributed writing and reading, temporal support, and composite structures, stored in a single file or across multiple files.
The general structure of the file would look like this:
GROUP "VTKHDF"
    ATTRIBUTE "Version" // Updated to (2,4)
    ATTRIBUTE "Type" // "HyperTreeGrid" In this case
    // HTG-specific properties
    ATTRIBUTE "BranchFactor" // 2 or 3, cell division factor
    ATTRIBUTE "Dimensions" // 3-Vector
    ATTRIBUTE "InterfaceInterceptsName" // String referencing a cell data array
    ATTRIBUTE "InterfaceNormalsName" // String referencing a cell data array
    ATTRIBUTE "TransposedRootIndexing" // Bool, true if the indexing mode of the grid is inverted
    // Grid point coordinates
    DATASET "XCoordinates"
    DATASET "YCoordinates"
    DATASET "ZCoordinates"
    // HTG Specific fields
    DATASET "NumberOfTrees" // One entry for each distributed part
    DATASET "DescriptorsSize" // One entry for each distributed part
    DATASET "DepthPerTree" // size = sum(NumberOfTrees), maximum depth for each tree
    DATASET "Descriptors" // Packed bit array, size = sum(DescriptorsSize)
    DATASET "NumberOfCellsPerDepth" // size = sum(DepthPerTree), number of cells for each depth of each tree
    DATASET "TreeIds" // size = sum(NumberOfTrees)
    DATASET "Mask" // Packed bit array, size = sum(NumberOfCellsPerDepth)
    GROUP "FieldData"
        ...
    GROUP "CellData"
        ...
If you’re familiar with the .htg v2 format, this is mostly a mapping of its content, plus a few offset fields to support efficient distributed reading.
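Since “Descriptors” and “Mask” are both described as packed bit arrays, here is a rough Python sketch of what packing one bit per entry into bytes could look like. The helper names are made up for illustration, and the bit order (most significant bit first) is an assumption; the proposal above does not fix it.

```python
def pack_bits(bits):
    """Pack 0/1 values into bytes, most significant bit first (assumed order)."""
    bits = list(bits)
    packed = bytearray((len(bits) + 7) // 8)  # round up to whole bytes
    for i, bit in enumerate(bits):
        if bit:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


def unpack_bits(packed, count):
    """Recover the first `count` bits from a packed byte string."""
    return [(packed[i // 8] >> (7 - i % 8)) & 1 for i in range(count)]


# Hypothetical 9-entry mask: entries 0 and 3 are set.
mask = [1, 0, 0, 1, 0, 0, 0, 0, 0]
packed_mask = pack_bits(mask)

print(packed_mask.hex())  # 9000
print(unpack_bits(packed_mask, len(mask)) == mask)  # True
```

Because the last byte is padded, the number of meaningful bits has to come from somewhere else, which is one reason the size fields matter.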
Misc. Notes:
- “NumberOfTrees” has one value for each part of the distributed dataset, and is optional for non-distributed data or if the number of trees in each part is the same.
- “DescriptorsSize” is required for distributed datasets and used as an offset array in the “Descriptors” bit set. Without it, each rank would need to process all of the previous trees to calculate the read offset for descriptors.
- “TreeIds” must contain all values 0->N, in any order. This allows trees to be distributed across parts.
- “Mask” is optional, and omitted if no cell is masked.
- “NumberOfCellsPerDepth” is renamed from “NumberOfVerticesPerDepth” in the XML format. It allows for faster reading when limiting reading depth. The reader can compute the number of cells to offset for a given tree using this value and “DepthPerTree”.
- There is no “PointData” for the HTG structure, because we only consider cells.
- For temporal data, offsets for NumberOfTrees, Descriptors, NumberOfCellsPerDepth and NumberOfCells will be required in the “Steps” group, as for the other types of VTKHDF datasets. This way, the reader can easily pick up any time step without any pre-computation.
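To make the offset notes above a bit more concrete, here is a small Python sketch of how a reader might derive its read positions. It is only an illustration under a few assumptions: the function names are hypothetical, the toy values (in particular the descriptor sizes) are made up, and cell data is assumed to be stored tree by tree, level by level.

```python
from itertools import accumulate


def exclusive_scan(sizes):
    """Start offset of each entry: [0, s0, s0 + s1, ...]."""
    return [0, *accumulate(sizes)][:-1]


def tree_cell_ranges(depth_per_tree, cells_per_depth, depth_limit=None):
    """Return (cell_offset, cells_to_read) for each tree, in storage order."""
    ranges = []
    level_start = 0  # index into cells_per_depth
    cell_start = 0   # index into the cell arrays
    for depth in depth_per_tree:
        levels = cells_per_depth[level_start:level_start + depth]
        kept = levels if depth_limit is None else levels[:depth_limit]
        ranges.append((cell_start, sum(kept)))
        level_start += depth
        cell_start += sum(levels)  # skip the whole tree, even if truncated
    return ranges


# Toy dataset with two parts: part 0 holds two trees, part 1 holds one.
number_of_trees = [2, 1]
descriptors_size = [3, 6]  # made-up sizes, one per part
depth_per_tree = [2, 1, 3]
number_of_cells_per_depth = [1, 4, 1, 1, 4, 4]

tree_offsets = exclusive_scan(number_of_trees)         # [0, 2]
descriptor_offsets = exclusive_scan(descriptors_size)  # [0, 3]

print(tree_cell_ranges(depth_per_tree, number_of_cells_per_depth))
# [(0, 5), (5, 1), (6, 9)]
print(tree_cell_ranges(depth_per_tree, number_of_cells_per_depth, depth_limit=2))
# [(0, 5), (5, 1), (6, 5)]
```

With these offsets, the rank reading part 1 can start at tree index 2 and descriptor offset 3 directly, without walking through the trees of part 0. The depth-limited call shows how “NumberOfCellsPerDepth” lets a reader skip the deeper levels of a tree while still landing on the right offset for the next one.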
Any comment or suggestion is welcome!