Purpose of UnstructuredGrid partitioning in VTKHDF

nncarlson · March 15, 2024, 5:26pm

I’m exploring using VTKHDF export as a replacement for our code’s unsupported VTK reader. So far I’ve managed to generate a vtkhdf file with an example UnstructuredGrid mesh and point and cell data that displays correctly in ParaView 5.12. (This is a static file; transient is my next step.)

In my example I’m using a single “partition”, and I’m confused about what the purpose of the partitioning is. The documentation says the number of partitions is typically the number of MPI ranks.
But if I understand correctly, there’s a single vtkhdf file, not 1 per MPI rank, and the mesh data needs to be a serial description of the entire mesh (I’m thinking of the connectivity array and global point IDs vs rank-local IDs). So I don’t understand what MPI has to do with anything here. Is this simply an optional user-defined partitioning that is exposed in the ParaView interface? I’m wondering now if it might be similar to how our existing VTK reader partitions the mesh into “parts” that we can easily enable/disable via a checkbox list (how that’s done I don’t know).

nncarlson · March 16, 2024, 8:29pm

I’ve set up a tiny example in order to figure out how the partitioning works. The UnstructuredGrid mesh consists of 2 disconnected tets and 8 points. I’ve put 1 tet (and its 4 points) in each partition. The h5dump of the vtkhdf file follows. I’m able to load it into ParaView 5.12 without error. It seems to know about all points and cells (via the spreadsheet view, for example). However only the cell from the first partition is rendered. Is there someplace in the GUI to tell it to render all partitions? I’m not finding one.

HDF5 "partition-test.vtkhdf" {
GROUP "/" {
   GROUP "VTKHDF" {
      ATTRIBUTE "Type" {
         DATATYPE  H5T_STRING {
            STRSIZE 16;
            STRPAD H5T_STR_SPACEPAD;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE  SCALAR
         DATA {
         (0): "UnstructuredGrid"
         }
      }
      ATTRIBUTE "Version" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): 2, 1
         }
      }
      DATASET "Connectivity" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 8 ) / ( 8 ) }
         DATA {
         (0): 0, 1, 2, 3, 4, 5, 6, 7
         }
      }
      DATASET "NumberOfCells" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): 1, 1
         }
      }
      DATASET "NumberOfConnectivityIds" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): 4, 4
         }
      }
      DATASET "NumberOfPoints" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): 4, 4
         }
      }
      DATASET "Offsets" {
         DATATYPE  H5T_STD_I32LE
         DATASPACE  SIMPLE { ( 4 ) / ( 4 ) }
         DATA {
         (0): 0, 4, 4, 8
         }
      }
      DATASET "Points" {
         DATATYPE  H5T_IEEE_F64LE
         DATASPACE  SIMPLE { ( 8, 3 ) / ( 8, 3 ) }
         DATA {
         (0,0): 0, 0, 0,
         (1,0): 1, 0, 0,
         (2,0): 0, 1, 0,
         (3,0): 0, 0, 1,
         (4,0): 1, 1, 1,
         (5,0): 2, 1, 1,
         (6,0): 1, 2, 1,
         (7,0): 1, 1, 2
         }
      }
      DATASET "Types" {
         DATATYPE  H5T_STD_U8LE
         DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
         DATA {
         (0): 10, 10
         }
      }
   }
}
}
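
To make the layout in the dump easier to reason about, here is a minimal sketch in plain Python of how the flat datasets could be cut into per-partition pieces using the NumberOfPoints, NumberOfCells and NumberOfConnectivityIds arrays. The values are copied from the h5dump above. The helper names (`partition_slices`, `describe`) are hypothetical, and the sketch assumes each partition contributes NumberOfCells + 1 entries to Offsets, which is consistent with the 4 entries in the dump but is my reading of the layout, not something confirmed in this thread.

```python
# Per-partition sizes, copied from the h5dump above.
number_of_points = [4, 4]
number_of_cells = [1, 1]
number_of_connectivity_ids = [4, 4]

# Flat datasets, copied from the h5dump above.
connectivity = [0, 1, 2, 3, 4, 5, 6, 7]
offsets = [0, 4, 4, 8]
types = [10, 10]


def partition_slices(part):
    """Return slices into the flat datasets for partition `part`.

    Assumption: each partition stores NumberOfCells[part] + 1 values in
    Offsets, so preceding partitions shift the start by cells + 1 each.
    """
    pt0 = sum(number_of_points[:part])
    cell0 = sum(number_of_cells[:part])
    conn0 = sum(number_of_connectivity_ids[:part])
    off0 = cell0 + part  # one extra Offsets entry per preceding partition
    return {
        "points": slice(pt0, pt0 + number_of_points[part]),
        "types": slice(cell0, cell0 + number_of_cells[part]),
        "connectivity": slice(conn0, conn0 + number_of_connectivity_ids[part]),
        "offsets": slice(off0, off0 + number_of_cells[part] + 1),
    }


def describe(part):
    """Collect the values one partition would see under the assumption above."""
    s = partition_slices(part)
    return {
        "point_rows": (s["points"].start, s["points"].stop),
        "types": types[s["types"]],
        "offsets": offsets[s["offsets"]],
        "connectivity": connectivity[s["connectivity"]],
    }


if __name__ == "__main__":
    for part in range(len(number_of_cells)):
        print(part, describe(part))
```

Under this assumed slicing, each partition gets its own rows of Points, one entry of Types, two entries of Offsets and four entries of Connectivity. The dump alone does not settle whether the values inside those slices are meant to be global or local to the partition.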

mwestphal · March 18, 2024, 8:37am

@lgivord

jorgensd · May 14, 2024, 2:26pm

Did you ever resolve this? I’m interested in the same thing.

nncarlson · May 14, 2024, 3:56pm

No, unfortunately. I wish someone from Kitware would respond.

mwestphal · May 21, 2024, 7:38am

@Louis_Gombert @lgivord

lgivord · May 27, 2024, 8:15am

Hello @nncarlson,

Sorry for the delay.

In my example I’m using a single “partition”, and I’m confused about what the purpose of the partitioning is

A partitioned dataset is useful when we want to read data in a distributed context, so when you load such data in ParaView you get a vtkPartitionedDataSet instead of a vtkUnstructuredGrid/vtkPolyData…

The documentation says the number of partitions is typically the number of MPI ranks.
But if I understand correctly, there’s a single vtkhdf file, not 1 per MPI rank

Indeed, with other VTK file formats, such as the XML VTK file format, you need 1 file per rank, but that is not the case here because HDF5 lets us store everything in a single file. In the future, we should be able to read a partitioned dataset stored as 1 file per rank; you can check the VTKHDF roadmap on Discourse for more details.

However only the cell from the first partition is rendered. Is there someplace in the GUI to tell it to render all partitions? I’m not finding one.

There may be a filter for that. I can try it if you can share your example data.