How does the VTKHDF reader handle HDF5 hardlink datasets?

nncarlson · December 17, 2025, 7:40pm

I’d like to know how the VTKHDF reader handles a dataset that is an HDF5 hardlink to another dataset. Is it smart enough to only load the dataset into memory once and reuse that reference each time it encounters a hardlink to it, or does it load another copy?
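To make "hardlink" concrete, here is a minimal h5py sketch of the situation. It is not the VTKHDF reader's code and not a complete VTKHDF file: the group names `Block0` and `Block1` and the file name are hypothetical, and the file lives only in memory.

```python
import h5py
import numpy as np

# In-memory file (core driver, no backing store) so nothing is written to disk.
with h5py.File("hardlink_demo.h5", "w", driver="core", backing_store=False) as f:
    # Hypothetical group names; this is not a complete VTKHDF layout.
    block0 = f.create_group("Block0")
    block1 = f.create_group("Block1")

    points = np.arange(12, dtype=np.float64).reshape(4, 3)
    block0.create_dataset("Points", data=points)

    # Assigning an existing dataset to a new name creates a hard link:
    # no data is copied, and both paths refer to one HDF5 object.
    block1["Points"] = block0["Points"]

    # h5py compares objects by identity within the file, so a reader could
    # in principle notice that both paths resolve to the same dataset.
    same_object = block0["Points"] == block1["Points"]

    # A write through one path is visible through the other.
    block1["Points"][0, 0] = 99.0
    value_via_block0 = float(block0["Points"][0, 0])
```

Whether a reader takes advantage of that identity, or simply reads each path into a fresh array, is exactly what I'm asking about.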

Here’s the context for my question. Our computational mesh is effectively an UnstructuredGrid, but with an additional cell-based ID array that identifies the “block” a cell belongs to. The latter info is used solely for viz purposes. I’m working on redoing our viz output to use the VTKHDF MultiBlockDataSet format with UnstructuredGrid blocks corresponding to our notion of “blocks”. However, I want to avoid the work of actually decomposing our mesh into independent block meshes if at all possible. What works (and this appears to be what the TRUCHAS reader is doing) is to write the full mesh node coordinate array to the Points dataset for one block, and make the Points dataset for all the other blocks a hardlink to it. The Connectivity dataset for each block naturally must consist of only those cells in the block, but the values don’t have to be modified; they can point (sparsely) into the full Points dataset. (A similar thing works for PointData, but CellData must be reorganized.) My concern is that this approach might not be viable in the long run if each block has its own copy of the complete mesh points (and PointData).
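The per-block bookkeeping described above can be sketched in plain Python. This is only an illustration under assumed conventions, not our actual output code: the function name `split_cells_by_block` and the `CellOrder` key are hypothetical, and cells are assumed to be stored as an offsets array with one more entry than there are cells.

```python
from collections import defaultdict


def split_cells_by_block(offsets, connectivity, block_ids):
    """Group cells by block ID while keeping global point indices.

    Cell i uses connectivity[offsets[i]:offsets[i + 1]].
    Returns {block_id: {"Offsets", "Connectivity", "CellOrder"}}.
    """
    blocks = defaultdict(
        lambda: {"Offsets": [0], "Connectivity": [], "CellOrder": []}
    )
    for cell, block_id in enumerate(block_ids):
        block = blocks[block_id]
        # Point indices are copied unchanged, so they still index the
        # full (shared) Points array.
        block["Connectivity"].extend(connectivity[offsets[cell]:offsets[cell + 1]])
        block["Offsets"].append(len(block["Connectivity"]))
        # Remember the original cell index to reorder CellData later.
        block["CellOrder"].append(cell)
    return dict(blocks)


# Two triangles and one quad over six shared points; cells 0 and 2 are in block 10.
offsets = [0, 3, 6, 10]
connectivity = [0, 1, 2, 1, 3, 2, 2, 3, 5, 4]
block_ids = [10, 20, 10]
pressure = [1.5, 2.5, 3.5]  # hypothetical CellData array

blocks = split_cells_by_block(offsets, connectivity, block_ids)
block10_pressure = [pressure[c] for c in blocks[10]["CellOrder"]]
```

The connectivity values pass through untouched, which is why Points and PointData can be shared between blocks, while any per-cell array has to be gathered through `CellOrder`.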

mwestphal · December 18, 2025, 8:25am

@Louis_Gombert @lgivord

lgivord · December 18, 2025, 1:15pm

Hello @nncarlson ,

Is it smart enough to only load the dataset into memory once and reuse that reference each time it encounters a hardlink to it, or does it load another copy?

I believe it currently loads another copy, but I can double-check that if you want.

Our computational mesh is effectively an UnstructuredGrid, but with an additional cell-based ID array that identifies the “block” a cell belongs to. The latter info is used solely for viz purposes. I’m working on redoing our viz output to use the VTKHDF MultiBlockDataSet format with UnstructuredGrid blocks corresponding to our notion of “blocks”.

Regarding your use case, I believe the best approach is not a VTKHDF multiblock dataset but a single VTKHDF UnstructuredGrid, with your “additional cell-based ID array” added as a CellData array.

We recently added the ExplodeDataSet filter, which lets you “explode” your unstructured grid into N blocks based on a selected cell array. You can check the doc here: VTK: vtkExplodeDataSet Class Reference

nncarlson · December 18, 2025, 3:48pm

Thanks for that pointer! I tried it and it works well. Is there some way to feed in additional metadata that gives the labels to use for the blocks instead of the autogenerated labels based on the numeric values of the array? That would be perfect; it’s often difficult to remember which block id is which part when there are many blocks.

lgivord · December 19, 2025, 8:03am

Glad to see it helps you.

I don’t think we have a filter to rename blocks. I guess it could be an interesting new feature for the explode dataset filter: since it already explodes by block ID, we could give it another string array to rename these blocks. wdyt @mwestphal?

mwestphal · December 19, 2025, 8:17am

Definitely, let’s open an issue.

lgivord · December 19, 2025, 9:47am

Done: https://gitlab.kitware.com/vtk/vtk/-/issues/19887