vtkGeometryFilter and its evil spawn vtkDataSetSurfaceFilter

will.schroeder · August 27, 2020, 10:34am

Summary:
Soon I will push a revamped version of vtkGeometryFilter. Since it is used heavily in VTK, I wouldn’t be surprised to see some issues pop up (despite passing all tests). Please let me know if you see anything untoward or have any comments/questions.

The major improvement relates to performance, especially for unstructured grids. Due to threading and the use of more efficient operators for determining whether a face is boundary, I’m seeing increases of >20x on my 20-core desktop system, and with the new FastMode feature, as high as >70x. This means that it is now much faster than the evil vtkDataSetSurfaceFilter; typically I see 5x-10x faster (in fast mode ~20x). Serial performance is also faster. Finally, vtkGeometryFilter uses less memory (about 50% of vtkDataSetSurfaceFilter). (YMMV)

Gory Details:
History. First, a brief history lesson (as I remember it). Back in the days of yore, vtkGeometryFilter was written to extract the boundaries of non-polygonal datasets (typically used internally with vtkDataSetMapper). It used the abstract vtkDataSet API, and had some nice features in its ability to extract portions of the dataset using cell and point id ranges, and using x-y-z extents. However, it was really slow, mostly due to its use of GetCellNeighbors() to determine whether a face was boundary, and lots of virtual GetCell() methods. Behind the scenes, this filter also invoked BuildLinks() to expand the topological information of a grid data structure (here I am mostly focusing on unstructured grids); and it used a vtkPointLocator to merge coincident points.

Due to its early, slow performance, vtkGeometryFilter was a significant application bottleneck. As a result, the filter spawned its evil progeny (tongue firmly in cheek) vtkDataSetSurfaceFilter. This filter was much faster as it used a face hash table in which cell faces were inserted. Basically, if a face was inserted only once it was a boundary face, if twice an interior face. The filter also dropped the bells and whistles related to cell and pt id and extent clipping. Other work was done to speed up performance for other vtkDataSet types etc.

Now to current day: much has changed. Threading in various forms is available in VTK. Building cell links is threaded and way faster. We’re a lot smarter about and focused on writing performant algorithms. Voilà, it’s time for a vtkGeometryFilter makeover.

vtkDataSetSurfaceFilter. This filter has some issues. As mentioned, it uses a centralized hash table to hold every face in a dataset (that’s a lot of memory resources). Being centralized, the hash table is an inherent parallelization bottleneck. It produces different output than vtkGeometryFilter (for example in some cases it introduces “seams” or discontinuities in its output) so it can’t be used as a drop in replacement for vtkGeometryFilter (and vice versa - for example in the test QuadricDecimation.py replace vtkGeometryFilter with vtkDataSetSurfaceFilter and the test will fail). It has a nasty complex implementation: While its face extraction has grown over the many years (which is a good thing) to handle all cell types, including non-linear types, there is embedded tessellation logic in the filter (which I would argue belongs elsewhere, probably in the cell class itself).

vtkGeometryFilter. As is now implemented, it is threaded with vtkSMPTools. A new, faster method vtkUnstructuredGrid::IsCellBoundary() is introduced. Behind the scenes it uses threaded BuildLinks(). It introduces a FastMode, which examines the degree of each point in a cell (the degree of a point is the number of uses of a point by cells). If the degree is less than a specified threshold, the cell is a candidate for boundary face extraction, otherwise the cell is skipped. (It turns out that points on a boundary of a dataset in general have a lower degree than interior points - this is not 100% reliable but can be used to quickly extract an approximate boundary, and works better than you might think.) The usage of a point locator is eliminated. In the implementation of the filter, a threaded framework is used to process all dataset types. Also, vtkGeometryFilter now supports functionality like PassThroughCellIds and PassThroughPointIds. However, while vtkGeometryFilter supports all linear cell types, some non-linear types (e.g., Bezier related) are not supported (this could be added). Instead, if such unsupported types are discovered, vtkGeometryFilter will automatically delegate to vtkDataSetSurfaceFilter (if delegation is enabled).

Timing. I’ve tested on large datasets: a synthetic grid of 26.7 million hexes; a synthetic grid of 133 million tets; and a customer provided mesh of >509 million tets. Call these datasets: (A,B,C). On my computer the timings are: (this is informal, typical output)

Old vtkGeometryFilter: (19.41secs, 108.5secs, 571.5secs)
New vtkGeometryFilter: (1.351, 4.781, 25.27)
New vtkGeometryFilter (FastMode enabled): (0.8877, 1.454, 7.888)
vtkDataSetSurfaceFilter: (6.984, 38.36, 189.5)

Next Steps and Issues. I originally thought about unifying vtkGeometryFilter and vtkDataSetSurfaceFilter, but because they produce different output in some cases, and due to limitations of time, I gave up on this. Both filters are able to delegate to each other (there are DelegateOn/Off methods which enable automated delegation when needed - e.g., if vtkDataSetSurfaceFilter::DelegationOn is enabled, then the vtkDataSetSurfaceFilter will invoke vtkGeometryFilter for linear 3D unstructured grids).

vtkGeometryFilter could be expanded to support all cell types - I’d prefer to do this after rethinking the cell tessellation logic so it’s not all embedded in this filter. There are also some differences between the filters in how pieces are processed, this could be made consistent I believe. For now, delegation to vtkDataSetSurfaceFilter works fine in the rare case when nonlinear cells are detected.

Finally, the threaded algorithm in vtkGeometryFilter can generate output faces in a different order between runs. Typically this is not a problem because the rendering doesn’t change, and when picking the user queries the PassThroughCellIds, PassThroughPointIds, or some other data field to obtain information related to the underlying data. (In the many VTK tests I ran, I saw one instance where this affected downstream results - triangle stripping the output of vtkGeometryFilter produced a different set of triangle strips since there are more than one way to “correctly” triangle strip a mesh.) This could be fixed with some work.

amaclean · August 28, 2020, 10:44am

Impressive performance improvement! When committed, I’ll see if there are any issues with the VTK Examples.

Martin3d · September 3, 2021, 11:04am

Many thanks for this very instructive post!

I wonder if the Java wrapper of vtkGeometryFilter is up to date as I don’t have the method FastModeOn() neither SetDegree(). I built the wrapper with VTK-9.0.3.

I am trying to reduce the number of polygons of a vtkUnstructuredGrid containing hexahedrons which are actually lego like cubes which sizes are multiples of the smallest cubes in the geometry (see attached image and also a VTK file to reproduce the below outputs).
out0006.000.vtu (1.8 MB)

I think the degree setting of vtkGeometryFilter is ideal in my case : outer cube points share up to 8 quads, interior cube points share up to 12 quads.

So I tried to go parameter-less from Java but had no reduction at all for both filter. Is there something I should fix in the below code?

With vtkDataSetSurfaceFilter

vtkXMLUnstructuredGridReader reader = new vtkXMLUnstructuredGridReader();
reader.SetFileName(folder + file);
reader.Update();

System.out.println("#cells in file : " + reader.GetOutput().GetNumberOfCells());

vtkDataSetSurfaceFilter outerSurfFilter = new vtkDataSetSurfaceFilter();
outerSurfFilter.SetInputConnection(reader.GetOutputPort());
outerSurfFilter.Update();

System.out.println("#cells after filtering : " + outerSurfFilter.GetOutput().GetNumberOfCells());

Prints

#cells in file : 13767
#cells after filtering : 82602

Cells after filtering is exactly the number of cells in the input file x 6, which is the number of face of an hexahedron. The expansion is not surprising (I presume one hexahedron is replaced by 6 polygons) but no reduction at all is surprising!

With vtkGeometryFilter

vtkXMLUnstructuredGridReader reader = new vtkXMLUnstructuredGridReader();
reader.SetFileName(folder + file);
reader.Update();

System.out.println("#cells in file : " + reader.GetOutput().GetNumberOfCells());

vtkGeometryFilter outerSurfFilter = new vtkGeometryFilter();
outerSurfFilter.SetInputConnection(reader.GetOutputPort());
outerSurfFilter.Update();

System.out.println("#cells after filtering : " + outerSurfFilter.GetOutput().GetNumberOfCells());

The output is exactly the same as before.

Would you suggest anything to make it work better?

Thanks in advance for your help!

Martin3d · September 28, 2021, 4:15pm

I found that the file I am using has repeated vertices. The cubes of my shape share no point with other cubes. As vtkGeometryFilter works with point IDs and not with the underlying point coordinates, it can not filter anything

I haven’t found yet a way to tell VTK to merge repeated points in a dataset. I will confirm the above as soon as I find a way to do so.

Applying vtkGeometryFilter on a collection of hexahedron having shared vertices works.

will.schroeder · September 28, 2021, 4:55pm

vtkCleanPolyData and vtkStaticCleanPolyData obviously process polydata.

For other subclasses of vtkPointSet such as unstructured grids similar filters could be written. However, we need to define a bunch of degeneracies: e.g., what happens when the points of a hexahedron are merged, depending on which points/how many, you could end up with a whole host of frankenstein cells. And it gets worse with higher order cells and such. If you want to create a specialized filter that just merges points and doesn’t change cell type, this is easy to do. We’d love to see an MR appear

Martin3d · September 28, 2021, 5:27pm

Thanks for the suggestion.

I can start by applying the vtkGeometryFilter on my repeated-vertices-hexahedron which gives me a repeated-vertices-quad without merging anything (that’s what I found to convert hexahedron to quads but maybe there is something better)
Then I may apply vtkCleanPolyData to merge repeated vertices.
Then I may apply vtkGeometryFilter again on the shared-vertices-quads.

That should work. Output geometry would not be hexahedron but in most of the case there is a single visible face in the hexahedron, so I am fine