Soon I will push a revamped version of vtkGeometryFilter. Since it is used heavily in VTK, I wouldn’t be surprised to see some issues pop up (despite passing all tests). Please let me know if you see anything untoward or have any comments/questions.
The major improvement relates to performance, especially for unstructured grids. Due to threading and the use of more efficient operators for determining whether a face is boundary, I’m seeing increases of >20x on my 20-core desktop system, and with the new FastMode feature, as high as >70x. This means that it is now much faster than the evil vtkDataSetSurfaceFilter; typically I see 5x-10x faster (in fast mode ~20x). Serial performance is also faster. Finally, vtkGeometryFilter uses less memory (about 50% of vtkDataSetSurfaceFilter). (YMMV)
History. First, a brief history lesson (as I remember it). Back in the days of yore, vtkGeometryFilter was written to extract the boundaries of non-polygonal datasets (typically used internally with vtkDataSetMapper). It used the abstract vtkDataSet API, and had some nice features in its ability to extract portions of the dataset using cell and point id ranges, and using x-y-z extents. However, it was really slow, mostly due to its use of GetCellNeighbors() to determine whether a face was boundary, and lots of virtual GetCell() methods. Behind the scenes, this filter also invoked BuildLinks() to expand the topological information of a grid data structure (here I am mostly focusing on unstructured grids); and it used a vtkPointLocator to merge coincident points.
Due to its early, slow performance, vtkGeometryFilter was a significant application bottleneck. As a result, the filter spawned its evil progeny (tongue firmly in cheek) vtkDataSetSurfaceFilter. This filter was much faster as it used a face hash table in which cell faces were inserted. Basically, if a face was inserted only once it was a boundary face, if twice an interior face. The filter also dropped the bells and whistles related to cell and pt id and extent clipping. Other work was done to speed up performance for other vtkDataSet types etc.
Now to current day: much has changed. Threading in various forms is available in VTK. Building cell links is threaded and way faster. We’re a lot smarter about and focused on writing performant algorithms. Voilà, it’s time for a vtkGeometryFilter makeover.
vtkDataSetSurfaceFilter. This filter has some issues. As mentioned, it uses a centralized hash table to hold every face in a dataset (that’s a lot of memory resources). Being centralized, the hash table is an inherent parallelization bottleneck. It produces different output than vtkGeometryFilter (for example in some cases it introduces “seams” or discontinuities in its output) so it can’t be used as a drop in replacement for vtkGeometryFilter (and vice versa - for example in the test QuadricDecimation.py replace vtkGeometryFilter with vtkDataSetSurfaceFilter and the test will fail). It has a nasty complex implementation: While its face extraction has grown over the many years (which is a good thing) to handle all cell types, including non-linear types, there is embedded tessellation logic in the filter (which I would argue belongs elsewhere, probably in the cell class itself).
vtkGeometryFilter. As is now implemented, it is threaded with vtkSMPTools. A new, faster method vtkUnstructuredGrid::IsCellBoundary() is introduced. Behind the scenes it uses threaded BuildLinks(). It introduces a FastMode, which examines the degree of each point in a cell (the degree of a point is the number of uses of a point by cells). If the degree is less than a specified threshold, the cell is a candidate for boundary face extraction, otherwise the cell is skipped. (It turns out that points on a boundary of a dataset in general have a lower degree than interior points - this is not 100% reliable but can be used to quickly extract an approximate boundary, and works better than you might think.) The usage of a point locator is eliminated. In the implementation of the filter, a threaded framework is used to process all dataset types. Also, vtkGeometryFilter now supports functionality like PassThroughCellIds and PassThroughPointIds. However, while vtkGeometryFilter supports all linear cell types, some non-linear types (e.g., Bezier related) are not supported (this could be added). Instead, if such unsupported types are discovered, vtkGeometryFilter will automatically delegate to vtkDataSetSurfaceFilter (if delegation is enabled).
Timing. I’ve tested on large datasets: a synthetic grid of 26.7 million hexes; a synthetic grid of 133 million tets; and a customer provided mesh of >509 million tets. Call these datasets: (A,B,C). On my computer the timings are: (this is informal, typical output)
- Old vtkGeometryFilter: (19.41secs, 108.5secs, 571.5secs)
- New vtkGeometryFilter: (1.351, 4.781, 25.27)
- New vtkGeometryFilter (FastMode enabled): (0.8877, 1.454, 7.888)
- vtkDataSetSurfaceFilter: (6.984, 38.36, 189.5)
Next Steps and Issues. I originally thought about unifying vtkGeometryFilter and vtkDataSetSurfaceFilter, but because they produce different output in some cases, and due to limitations of time, I gave up on this. Both filters are able to delegate to each other (there are DelegateOn/Off methods which enable automated delegation when needed - e.g., if vtkDataSetSurfaceFilter::DelegationOn is enabled, then the vtkDataSetSurfaceFilter will invoke vtkGeometryFilter for linear 3D unstructured grids).
vtkGeometryFilter could be expanded to support all cell types - I’d prefer to do this after rethinking the cell tessellation logic so it’s not all embedded in this filter. There are also some differences between the filters in how pieces are processed, this could be made consistent I believe. For now, delegation to vtkDataSetSurfaceFilter works fine in the rare case when nonlinear cells are detected.
Finally, the threaded algorithm in vtkGeometryFilter can generate output faces in a different order between runs. Typically this is not a problem because the rendering doesn’t change, and when picking the user queries the PassThroughCellIds, PassThroughPointIds, or some other data field to obtain information related to the underlying data. (In the many VTK tests I ran, I saw one instance where this affected downstream results - triangle stripping the output of vtkGeometryFilter produced a different set of triangle strips since there are more than one way to “correctly” triangle strip a mesh.) This could be fixed with some work.