Can I avoid adding duplicate cells to a vtkPolyData?

rexthor · October 13, 2023, 1:05pm

I have a vtkUnstructuredGrid composed of vtkHexahedrons (originally it was a vtkPartitionedDataSet composed of vtkStructuredGrids - it made my life easier to just have one partition). The next processing stage needs to store a file that is a list of unique faces present in the volume (I didn’t design the program needing this data - I am just constructing the file).

Originally I was just going to read in the data, append all of the partitions together, create a new vtkPolyData object so that I could run vtkRemoveDuplicatePolys. The problem was that I had 12 million hexahedrons which turn into roughly 72 million quads, and then running the algorithm ate up all of memory and the process gets killed by the OOM reaper.

One possibility I wanted to explore was using a set of some sort to only add unique faces - I just didn’t know what class to use in the VTK library?

Presently the code (snippet) looks like this:

def polygon_soup(ds):
    "Convert vtkHexahedron (3D cells) in the dataset to faces composed of vtkTriangles."
    logging.debug(f"merging {ds.GetNumberOfPartitions()} partitions in vtkPartitionedDataSet to vtkUnstructuredGrid")
    af = vtk.vtkAppendDataSets()
    af.MergePointsOn()
    for ind in range(ds.GetNumberOfPartitions()):
        af.AddInputData(ds.GetPartition(ind))
    af.Update()
    data = af.GetOutputDataObject(0)

    pd = vtk.vtkPolyData()
    logging.debug(f"saving {data.GetNumberOfPoints()} node locations into new vtkPolyData")
    pd.SetPoints(data.GetPoints())

    logging.debug(f"saving {data.GetPointData().GetNumberOfArrays()} scalar arrays into new vtkPolyData")
    for a_ind in range(data.GetPointData().GetNumberOfArrays()):
        pd.GetPointData().AddArray(data.GetPointData().GetArray(a_ind))

    ca = vtk.vtkCellArray()
    for c_idx in range(ds.GetNumberOfCells()):
        cell = data.GetCell(c_idx)
        for f_idx in range(cell.GetNumberOfFaces()):
            face = cell.GetFace(f_idx)
            node_count = face.GetNumberOfPoints()
            ca.InsertNextCell(node_count)
            for n_idx in range(node_count):
                ca.InsertCellPoint(face.GetPointId(n_idx))

    logging.debug(f"converted {data.GetNumberOfCells()} vtkHexahedrons into {ca.GetNumberOfCells()} vtkQuads")
    pd.SetPolys(ca)

    logging.debug(f"removing duplicate vtkQuads and converting to vtkTriangles")
    cpd = vtk.vtkCleanPolyData()
    cpd.SetInputData(pd)
    rdp = vtk.vtkRemoveDuplicatePolys()
    rdp.SetInputConnection(cpd.GetOutputPort())
    tf = vtk.vtkTriangleFilter()
    tf.SetInputConnection(rdp.GetOutputPort())
    tf.Update()
    return tf.GetOutputDataObject(0)

spyridon97 · October 13, 2023, 8:20pm

As of now, this is not possible unless you parse all the existing faces already present.

I have implemented a new data structure in VTK named vtkStaticFaceHashLinksTemplate, which vtkGeometryFilter uses to get the external faces. I am envisioning that it could also be used for extracting the internal faces of a mesh, along with internal and external together, which is what you need. I don’t know when I will have some time to work on it, but I hope it will be soon.

rexthor · October 18, 2023, 12:15pm

Thanks for the reply, I wound up with a crude solution that seems to work (I haven’t done a very detailed check to make sure), but for the benefit of anyone else who stumbles upon this thread, here is how I handled it:

    # ... "data" is a vtkUnstructuredGrid that comes from elided code above.
    NODE_COUNT = data.GetCell(0).GetFace(0).GetNumberOfPoints()
    unique_quads = set()
    for c_idx in range(data.GetNumberOfCells()):
        for f_idx in range(data.GetCell(0).GetNumberOfFaces()):
            face = data.GetCell(c_idx).GetFace(f_idx)
            unique_quads.add(tuple(sorted(face.GetPointId(x) for x in range(NODE_COUNT))))

    ca = vtk.vtkCellArray()
    for q in unique_quads:
        ca.InsertNextCell(NODE_COUNT)
        for n_idx in q:
            ca.InsertCellPoint(n_idx)

    # The cell normals are likely pointing in all sorts of inconvenient
    # directions at this point, but my particular application doesn't care.
    # If your application does care, you will need to fix them.
    data.SetCells(vtk.VTK_QUAD, ca)

dgobbi · October 18, 2023, 7:36pm

One thing to watch out for is that sorting the IDs might do more than just flip the normal. It might change the quad from a rectangle into a Z-shape or N-shape.

rexthor · October 18, 2023, 8:00pm

Good point … instead of sorted I could try:

for f_idx in range(data.GetCell(0).GetNumberOfFaces()):
    face = data.GetCell(c_idx).GetFace(f_idx)
    indices = np.array([face.GetPointId(x) for x in range(FACE_NODE_COUNT)])
    unique_quads.add(tuple(np.roll(indices, np.argmin(indices))))

but I don’t think that handles the inverted node order (by which, I mean: [1, 2, 3, 4] and [4, 3, 2, 1], which I would only want one of those).

dgobbi · October 18, 2023, 8:26pm

You can use the “set” just for checking for duplicates, and put the original face into your dataset. This also collapses the two loops into just one loop.

    ca = vtk.vtkCellArray()
    unique_quads = set()
    for c_idx in range(data.GetNumberOfCells()):
        for f_idx in range(data.GetCell(0).GetNumberOfFaces()):
            face = data.GetCell(c_idx).GetFace(f_idx)
            faceIDs = [face.GetPointId(x) for x in range(NODE_COUNT)]
            sortedIDs = tuple(sorted(faceIDs))
            if sortedIDs not in unique_quads:
                unique_quads.add(sortedIDs)
                ca.InsertNextCell(len(faceIDs), faceIDs)

The ca.InsertNextCell(len(faceIDs), faceIDs) is a faster method for inserting cells.