VTKHDF roadmap

The aim of this post is to summarize what the VTKHDF file format currently handles, indicate where supporting resources can be found, and provide a roadmap of what is planned for 2024 and beyond.

Additionally, a master issue has been opened here to track any improvements and clarifications of the VTKHDF specification itself.

This post will be updated as we augment the format specification and improve the implementation.

In short, VTKHDF is a file format for storing VTK data types in an HDF5 file. The up-to-date specification of this file format is here¹.
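
As a rough illustration of what "VTK data in an HDF5 file" means in practice, here is a minimal sketch that lays out a single tetrahedron as a VTKHDF UnstructuredGrid. It assumes the third-party h5py package and format version 2.0. The helper names `tetra_layout` and `write_vtkhdf` are made up for this example, and the attribute-data groups (PointData, CellData, FieldData) are left out. The specification¹ remains the reference for dataset names and types.

```python
"""Sketch: one tetrahedron stored as a VTKHDF UnstructuredGrid (assumed v2.0)."""


def tetra_layout():
    """Return (attributes, datasets) for the /VTKHDF group of a one-cell grid."""
    attrs = {"Version": (2, 0), "Type": b"UnstructuredGrid"}
    datasets = {
        # name: (dtype, values)
        "NumberOfPoints": ("i8", [4]),
        "NumberOfCells": ("i8", [1]),
        "NumberOfConnectivityIds": ("i8", [4]),
        "Points": ("f8", [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]),
        "Connectivity": ("i8", [0, 1, 2, 3]),
        "Offsets": ("i8", [0, 4]),  # NumberOfCells + 1 entries
        "Types": ("u1", [10]),      # 10 is VTK_TETRA
    }
    return attrs, datasets


def write_vtkhdf(path):
    """Write the layout above to `path` using h5py."""
    import h5py  # third-party; imported here so the layout can be inspected without it

    attrs, datasets = tetra_layout()
    with h5py.File(path, "w") as f:
        root = f.create_group("VTKHDF")
        root.attrs["Version"] = attrs["Version"]
        kind = attrs["Type"]
        # The Type attribute is stored as a fixed-length ASCII string.
        root.attrs.create("Type", kind, dtype=h5py.string_dtype("ascii", len(kind)))
        for name, (dtype, values) in datasets.items():
            root.create_dataset(name, data=values, dtype=dtype)
```

Calling `write_vtkhdf("tetra.vtkhdf")` is intended to produce a file that vtkHDFReader can open, although this sketch has not been checked against every VTK version.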

What is currently supported with VTKHDF

In the VTKHDF file format.

This format¹ specifies how to store the following VTK data types:

*: not all cell types are supported

The VTKHDF format also specifies how to store distributed and temporal data for most of these VTK data types.

In VTK

VTK has a dedicated module⁶, VTK::IOHDF, to read and write the VTKHDF file format. It contains a reader, vtkHDFReader, and a writer, vtkHDFWriter.

| Type | Serial | Distributed | Temporal (in one file) |
|---|---|---|---|
| vtkPolyData | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| vtkUnstructuredGrid | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| vtkImageData | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| vtkPartitionedDataSet | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| vtkPartitionedDataSetCollection³ | :white_check_mark: | :white_check_mark: | :x: |
| vtkMultiBlockDataSet³ | :white_check_mark: | :x: | :x: |
| vtkOverlappingAMR² | :white_check_mark: | :white_check_mark: | :white_check_mark: |

vtkHDFReader supports

Some types are specified and can be read but cannot yet be written using vtkHDFWriter. The types not listed here need to be written externally, using any library that supports writing HDF5 files.


| Type | Serial | Distributed | Temporal (in one file) |
|---|---|---|---|
| vtkPolyData | :white_check_mark: | :x: | :white_check_mark: |
| vtkUnstructuredGrid | :white_check_mark: | :x: | :white_check_mark: |
| vtkPartitionedDataSet | :x: | :x: | :x: |
| vtkPartitionedDataSetCollection³ | :white_check_mark: | :x: | :x: |
| vtkMultiBlockDataSet³ | :white_check_mark: | :x: | :x: |

vtkHDFWriter supports

Note that part of vtkHDFReader is available in VTK 9.3.0, but most of these features are available on VTK master only.

In ParaView

This VTK::IOHDF module is fully exposed in ParaView and an extractor has been added for the vtkHDFWriter, which is essential for in situ use cases with Catalyst.

Roadmap

A few other features are already under development, under discussion, or planned. Some of them depend on the data type, as shown in this table:

| Feature | PolyData | UnstructuredGrid | RectilinearGrid | ImageData | OverlappingAMR | PartitionedDataSet | PartitionedDataSetCollection | MultiBlockDataSet |
|---|---|---|---|---|---|---|---|---|
| vtkHDFReader: Distributed support | :large_blue_circle: | :large_blue_circle: | :white_circle: | :large_blue_circle: | :large_blue_circle: | :large_blue_circle: | :orange_circle: | :orange_circle: |
| vtkHDFReader: Static mesh support** | :green_circle: | :green_circle: | :white_circle: | :white_circle: | :white_circle: | :green_circle: | :white_circle: | :white_circle: |
| vtkHDFWriter: Non-distributed support | :large_blue_circle: | :large_blue_circle: | :white_circle: | :white_circle: | :large_blue_circle: | :orange_circle: | :large_blue_circle: | :large_blue_circle: |
| vtkHDFWriter: Distributed support | :orange_circle: | :orange_circle: | :white_circle: | :white_circle: | :orange_circle: | :orange_circle: | :orange_circle: | :orange_circle: |
| vtkHDFWriter: Iterative temporal writing | :green_circle: | :green_circle: | :white_circle: | :white_circle: | :orange_circle: | :orange_circle: | :orange_circle: | :orange_circle: |
| vtkHDFWriter: Static mesh support | :large_blue_circle: | :large_blue_circle: | :white_circle: | :white_circle: | :white_circle: | :white_circle: | :white_circle: | :white_circle: |

VTKHDF ongoing efforts in 2024, by data type supported by the file format

Legend:

:white_circle: Future work
:orange_circle: Looking for funding
:yellow_circle: To be implemented in 2024
:green_circle: Currently being implemented
:large_blue_circle: Already done

**: The VTK pipeline can reuse the mesh from another timestep if the correct key information is passed through and a special cache system is used with vtkDataObjectMeshCache. For more details about this feature, see the following blog post about this topic⁷.


A few other features, not related to any specific data type, will be implemented this year:

  • Add a Misc group under the VTKHDF group to hold user data that the VTK implementation of the file format ignores.
  • Support writing a VTKHDF file as one file per block.

There are other features which may be implemented this year:

  • Iterative temporal writing in vtkHDFWriter, i.e. appending all requested timesteps to the same file.
  • vtkPolyhedron support for VTKHDF UnstructuredGrid⁵.
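
To make "appending timesteps to the same file" more concrete, here is a toy sketch of the bookkeeping involved: the values of every step are concatenated into one flat array, and a per-step offset records where each step begins. The class and method names are hypothetical and this is not the vtkHDFWriter implementation; it is only loosely modelled on the offset-based layout the specification uses for temporal data.

```python
from dataclasses import dataclass, field


@dataclass
class TemporalArray:
    """Toy stand-in for one data array stored across several timesteps."""

    values: list = field(default_factory=list)   # all steps, concatenated
    offsets: list = field(default_factory=list)  # start index of each step
    times: list = field(default_factory=list)    # time value of each step

    def append_step(self, time, step_values):
        """Append one timestep without touching the steps already stored."""
        self.offsets.append(len(self.values))
        self.times.append(time)
        self.values.extend(step_values)

    def read_step(self, index):
        """Return the values of step `index` using the recorded offsets."""
        start = self.offsets[index]
        if index + 1 < len(self.offsets):
            end = self.offsets[index + 1]
        else:
            end = len(self.values)
        return self.values[start:end]


# Usage: two steps of different sizes appended one after the other.
pressure = TemporalArray()
pressure.append_step(0.0, [1.0, 2.0, 3.0])
pressure.append_step(0.5, [4.0, 5.0])
```

Because earlier steps are never rewritten, a writer organised this way can add one step per pipeline execution instead of needing all timesteps up front.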

Last, we aim to create a benchmark this year to evaluate performance of the VTKHDF file format in an HPC environment.

How you can help

If you are interested in the VTKHDF file format or the current VTK implementation, please don’t hesitate to try it out. Feedback is always welcome!

Finally, if any roadmap features that currently lack funding interest you, please reach out to us via the Kitware contact page.

Resources

Official up-to-date documentation¹ : VTK File Formats - VTK documentation

Related discussions:

  • About vtkOverlappingAMR²: link
  • About composite support³ : link
  • About OverlappingAMR with temporal data⁴ : link
  • About Polyhedron cell support in VTKHDF⁵: link
  • About temporal field data: link
  • About split HDF5 files: link
  • About rectilinear grid support: link

HDF Module in the vtk repository⁶ : link

Kitware blog post about the StaticMeshPlugin⁷: link

Repository containing Python scripts to generate VTKHDF files or to convert VTKXML files to VTKHDF files: https://gitlab.kitware.com/keu-public/vtkhdf/vtkhdf-scripts


cc @mwestphal @Francois_Mazen @berk.geveci @danlipsa @Louis_Gombert @Charles_Gueunet

Great roadmap, this is very useful. I have tried to follow the development of this format closely, but the pace of recent development has been high and it’s not necessarily easy to keep up as an outsider. This overview is great, and I hope you manage to keep it updated. Maybe consider converting it to some “real” documentation together with the file format spec (under docs.vtk.org)?

Keep up the good work!


Like this? VTK File Formats - VTK documentation

Hey there Lucas,

really great work! I love seeing the activity in VTK around HDF5 support, especially for AMR, and I am looking forward to using it for postprocessing our code.

From a user perspective, do you know how your changes will be included in releases further downstream? Changes in VTK master will probably ship with the next release (9.3.1 or 9.4.0, hopefully soon?). Will ParaView receive an updated VTKHDF reader after the VTK release, or already with its own upcoming release?

For the future work I would love to see a hdf implementation of Treegrid / Hypertreegrid Data. I believe that coupling that with a hierarchical functionality for uniform grids as blocks within the Treegrid would greatly benefit many Block-based AMR codes for their visualization.

Greetings,
Julius


Hello @Arcadia,

Thanks! It’s always a pleasure to have user feedback.

Currently, a few things in the roadmap are already merged in VTK and will be available in the next releases of VTK (9.4) and ParaView (5.13):

  • static mesh support in the reader/writer for polydata and unstructured grid
  • temporal OverlappingAMR (merged today in VTK!): we added support for temporal OverlappingAMR to the file format; the updated spec is here: VTK File Formats - VTK documentation. The vtkHDFReader can now open this type of data.

For the future work I would love to see a hdf implementation of Treegrid / Hypertreegrid Data

Definitely! It would be nice, but for now we don’t have funding to add HTG support to the file format and to the reader/writer in VTK.

Best,
Lucas


Added information about distributed and partitioned dataset support in the writer implementation.

Really great work. HDF5 is very useful and portable when working with large datasets.

I noticed that structured and rectilinear grids as implemented in legacy VTK are missing from the roadmap? Are they difficult to implement or not a priority?

I don’t see any particular difficulty in supporting them. As you said, it is because, for now, adding other VTK types has low priority.

However, as the VTKHDF file format aims to support all VTK data objects, like the XML format does, we should support them in the future (as well as vtkHyperTreeGrid, vtkTable, vtkMolecule and so on).

I want to add that any dataset type not mentioned here is simply supported by neither the specification nor the implementation. Any of them could be added and implemented though! Feel free to reach out to Kitware!

Should the roadmap be updated post ParaView 5.13 release?

The roadmap is kept up to date with the status of VTK master and is not tied to ParaView releases, although some features are present, or not, in ParaView 5.13.0.

Can you clarify your suggestion?

Added rectilinear grid support discussion link.

Maybe this is a little late, but can we please choose a file extension for VTKHDF files that is not .hdf? It’s extremely frustrating when the extension for a general container format like hdf5 (or netCDF or bp or xml or json) is used over and over for conventions that are incompatible with each other. It wreaks havoc on applications like ParaView that make it difficult to determine which reader to use.

There is a reason why we use extensions like .vti, .vtp, .vtu, etc. instead of just naming them all .xml, which would be a mess. Let’s choose an appropriate VTKHDF extension now before it’s too late. I propose using .vtkhdf, which is simple enough.
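
The ambiguity can be illustrated with a small lookup sketch. The mapping and the function name are hypothetical and are not how ParaView actually selects readers; the point is only that a dedicated extension identifies the convention, while a container extension does not.

```python
from pathlib import Path

# Extensions that name one specific convention.
SPECIFIC = {
    ".vti": "XML ImageData",
    ".vtp": "XML PolyData",
    ".vtu": "XML UnstructuredGrid",
    ".vtkhdf": "VTKHDF",
}


def candidate_format(filename):
    """Return the format implied by the extension, or None if it is ambiguous.

    Container extensions such as .hdf or .xml are absent from SPECIFIC, so
    they yield None: the file content has to be inspected to pick a reader.
    """
    suffix = Path(filename).suffix.lower()
    return SPECIFIC.get(suffix)
```

With this table, `candidate_format("mesh.vtkhdf")` resolves immediately, whereas `candidate_format("mesh.hdf")` returns `None` and leaves the choice to content sniffing or to the user.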


yes, this is what we recommend https://docs.vtk.org/en/latest/design_documents/VTKFileFormats.html#extension :slightly_smiling_face:

Perfect. I guess I just looked at the wrong file in the example scripts repository pointed to at the beginning of this post: https://gitlab.kitware.com/keu-public/vtkhdf/vtkhdf-scripts/-/blob/main/vtkxml-to-vtkhdf/test.sh. It might be worth updating that.