file format recommendation?

I have a new project with a collection of files containing timesteps from a fluid simulation. The data is face-centered, and the geometry is a collection of triangles and quadrilaterals (so, basically vtkPolyData). I have been asked to extract variable values on cut planes, and to reproject the variable values onto different meshes (i.e., coarser, or perhaps with nodes aligned in some special way).

I’d like to be able to save the files in a format that is aware of time (so that I can interpolate between timesteps as well). I don’t think that vtkPolyData can handle that directly, and I was made aware that vtkMultiBlockDataSet is being deprecated. I also gather that vtkTemporalDataSet is a thing of the past …

I know that picking the right format in the VTK API can save me a lot of work later. Does anyone have any suggestions? Right now, I am thinking of vtkPartitionedDataSet.

Hi @rexthor

You are talking about file formats but mention data model classes.

Are you choosing between data model types or file formats here?

I’ll admit, the conflation may be due to my ignorance … I think that originally my intent was to take the input files, which are some weird mishmash of HDF5 and NetCDF4 that does not conform to either standard, and turn them into a VTK format so that it would be more convenient for me.

I did try the VTK NetCDF and HDF5 readers, and they didn’t work; then again, the author of the software that wrote my input files does not have a good understanding of either format.

I think that what really sinks it for me is that I need to store what amounts to key->value data that isn’t associated with geometry (upstream pressure of simulation, and other assumptions that went into the model), and I am not sure that any of the VTK formats that I am familiar with store this sort of information - but again - could be that I am just ignorant of existing work.

What is the dimensionality of that data?

I think that originally my intent was to take the input files, which are some weird mishmash of HDF5 and NetCDF4 that does not conform to either standard, and turn them into a VTK format so that it would be more convenient for me.

This is reasonable, although it may be a lot of work.

Data is 3D + time.

  • The cell topology does not change each timestep, but the node locations do.
  • There are scalars and vectors associated with each cell for each timestep. I assume they meant for the cells to be linear Lagrange elements.
  • Also for each timestep is what amounts to a key-value pair, I assume meant to describe changing conditions global to the simulation, among those, the actual value of simulation time, and various other things like alpha, beta, phi … where they appear to be string → float64.

The cell topology does not change each timestep, but the node locations do.

Supported by all dataset types

There are scalars and vectors associated with each cell for each timestep

This is just cell data

Also for each timestep is what amounts to a key-value pair,

This is just field data

To me it looks like you should be able to read your data into a vtkPolyData if you use 2D cells or a vtkUnstructuredGrid if you use 3D cells and just use a classic temporal pipeline approach.
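For the interpolation between timesteps, VTK has vtkTemporalInterpolator, which is meant to sit in such a temporal pipeline. As a rough, library-free sketch of what linear interpolation amounts to when the topology is fixed (so values at matching indices can simply be blended), something like the following could work. The names `interpolate_timestep`, `times` and `values` are made up for illustration and are not VTK API:

```python
from bisect import bisect_right


def interpolate_timestep(times, values, t):
    """Linearly interpolate per-cell (or per-node) values at time t.

    times  : sorted list of stored solution times (hypothetical input)
    values : values[i] is the list of values stored at times[i]
    """
    if len(times) < 2:
        raise ValueError("need at least two stored timesteps")
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the stored time range")

    # Index of the stored timestep at or just before t, clamped so that
    # lo + 1 is always a valid index (this handles t == times[-1]).
    lo = min(bisect_right(times, t) - 1, len(times) - 2)
    hi = lo + 1

    # Blend weight: 0 at times[lo], 1 at times[hi].
    w = (t - times[lo]) / (times[hi] - times[lo])
    return [(1.0 - w) * a + w * b for a, b in zip(values[lo], values[hi])]


# Three stored timesteps, two cells each (made-up numbers).
times = [0.0, 1.0, 3.0]
values = [[0.0, 10.0], [2.0, 20.0], [6.0, 40.0]]
halfway = interpolate_timestep(times, values, 2.0)
```

The same blend applies to the node coordinates, since only the locations change between timesteps and not the connectivity.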

Field data … great … I’ll just have to figure that out real quick.
This is currently what I’m doing to read in the file:

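import os

import netCDF4
import numpy as np
import vtk
from vtk.util.numpy_support import numpy_to_vtk

def read_input_data(fname):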
    with netCDF4.Dataset(fname, disk_format="HDF5") as m:
        ca = vtk.vtkCellArray()
        cell_sizes = m.CellSizes # can be triangles or quadrilaterals
        node_ids = m.NodeIDs
        offset = 0
        for sz in cell_sizes:  # one entry per cell: 3 for a triangle, 4 for a quad
            ca.InsertNextCell(sz, node_ids[offset:offset + sz])
            offset += sz

        face_data = m["faceData"] # indexed as [time, variable, face]
        node_locations = m["nodeLocations"] # indexed as [time, flattened x/y/z per node]
        variable_names = m.varNames

        for t_idx, t_val in enumerate(m.solutionTime):
            pd = vtk.vtkPolyData()
            pd.SetPolys(ca) # cell connectivity

            cd = pd.GetCellData()
            for v_idx, v_name in enumerate(variable_names):
                a = numpy_to_vtk(face_data[t_idx, v_idx, :])
                a.SetName(v_name)
                cd.AddArray(a) # face values

            locations = np.reshape(node_locations[t_idx, :], (-1, 3))
            pts = vtk.vtkPoints()
            for record in locations:
                x, y, z = record
                pts.InsertNextPoint(x, y, z)
            pd.SetPoints(pts) # node locations

            #----------#
            # Aux Data #
            #----------#
            fd = pd.GetFieldData()
            for a_idx, a_name in enumerate(m.auxNames):
                # yes, a single value; numpy_to_vtk needs at least a 1-D array
                a = numpy_to_vtk(np.atleast_1d(m["auxData"][t_idx, a_idx]))
                a.SetName(a_name)
                fd.AddArray(a)

            # Right now, I just save as polydata for each timestep - but sometimes there are THOUSANDS of timesteps.
            w = vtk.vtkXMLPolyDataWriter()
            outname = os.path.splitext(os.path.basename(fname))[0] \
                      + f"_{t_idx:02d}." + w.GetDefaultFileExtension()
            w.SetFileName(outname)
            w.SetInputData(pd) # pd is a data object, so pass it directly rather than through an output port
            w.Write()

This should get you going :slight_smile:
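If you do keep one .vtp file per timestep, one common way to attach the time values is a ParaView .pvd collection file, which is a small XML index of (time, file) pairs. Note that this is read by ParaView rather than by the core VTK readers. A minimal sketch using only the standard library, where `build_pvd` and the file names are made up for illustration:

```python
import xml.etree.ElementTree as ET


def build_pvd(entries):
    """Return the text of a .pvd collection for (time, file name) pairs."""
    root = ET.Element(
        "VTKFile", type="Collection", version="0.1", byte_order="LittleEndian"
    )
    collection = ET.SubElement(root, "Collection")
    for time_value, file_name in entries:
        # One DataSet element per timestep, pointing at its .vtp file.
        ET.SubElement(
            collection,
            "DataSet",
            timestep=repr(float(time_value)),
            group="",
            part="0",
            file=file_name,
        )
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")


# Hypothetical solution times and per-timestep file names.
entries = [(0.0, "run_0000.vtp"), (0.5, "run_0001.vtp")]
pvd_text = build_pvd(entries)
```

Writing `pvd_text` to a file such as `run.pvd` next to the .vtp files should let ParaView load the whole series with its time values.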
