2D Surface cutter development

It appears the vtkCookieCutter doesn’t yet have an InsideOut option. vtkClipDataSet is not exactly precise for 2D surfaces.

This MR has certain issues. After tinkering with vtkCookieCutter, I felt it needed to be redesigned completely, hence SurfaceCutter.

I’ve written an implementation https://github.com/jaswantp/SurfaceCutter. It already works great and is efficient (Performance slightly better than vtkClipDataSet, way better than vtkCookieCutter).
I had to redesign the algorithm from the ground up to support inside out features in 2D and be precise. At present, the implementation only supports open surfaces and loops that are either +/-Z oriented. There are tests and a benchmark executable with realtime controls that move/rotate the surface, so feel free to play around.

It is derived from vtkDataSetAlgorithm, so it should support vtkPolyData/vtkUnstructuredGrid surfaces.
I would love to receive feedback on how it can be improved. Specifically, want to discuss the array dispatch functors and how they can be simplified.
My understanding of vtkDataArray, dispatch related tools comes from here, here, here and here.

I took a quick look at it, and the dispatch usage looks good to me.

My only suggestion would be to replace the vtkDataArrayAccessor usage with the vtk::DataArrayValueRange and/or vtk::DataArrayTupleRange utilities described in the last link you mentioned. Again, this is just a suggestion – there’s nothing wrong with the accessor code, the ranges just offer some convenience and allow idiomatic STL-style access to the array data.

Let me know if you have any more specific questions about the dispatch / range stuff.

Hmm, Thanks for your reply! I decided to not incorporate them yet since I was trying to maintain 8.2.0 compatibility. I use this for an internal project which has that dependency. :confused:

Besides, I want to know if it’s possible to get SoA access to the points, instead of the tuples x0,y0,z0|x1,y1,z1| so on… Is there a way to get a pointer to all the X’s, Y’s, Z’s? I’m aware that I’m asking for an SoA layout in a construct that was meant to abstract away such layouts! This might help reduce the start up time where I cache a triangle’s x, y coordinates (SoA and AoS), scalars into STL vectors. For large meshes (>100k verts) this takes significant time.

Speaking about start up costs and caching triangles inside STL vectors, here’s another question. What happens when I resize the data array within the functor. So, I’d want the existing points to remain and some extra space to be allocated/reserved(?). Does this deallocate, allocate and copy over all the points? I’m looking for something similar to STL vector’s reserve.

If there’s no copy, I could directly write to the points, scalars data arrays. At the moment, I reserve a big enough STL vector of templated TriMeta somewhere in the beginning to cache existing triangles and simultaneously fill up new ones. So, apart from the start up performance hit with this approach, it’s a bit of a mess to keep track of a specific triangle with its data, indices. I’ve a hunch the situation might be better if I could directly work with the data array accessors or ranges.

Is there a way to get a pointer to all the X’s, Y’s, Z’s

If your data is originally in SoA format, it can be stored in a vtkSoADataArrayTemplate and pointers to the individual components arrays are available through that class’s API. This can be detected via the dispatchers and added as a fast-path optimization when SoA arrays are detected, and is used in vtkDataArray::DeepCopy to optimize when both the input and output arrays are SoA layout (link).

If you’re using the vtkDataArrayAccessor interface, there is no way to get a pointer to any value in the array, since the accessor may be using the vtkDataArray double API, in which case the double values are computed on the fly – they do not exist in memory and thus are unaddressable. That said, you can just pull in the components you need with the APIType vtkDataArrayAccessor::Get(tuple, component) overload (link). This will affect performance differently depending on what the accessor’s ArrayT type is:

  • If ArrayT=vtkSoADataArrayTemplate<...>, this may be faster since you can avoid loading any data from the z component array. But you’ll only notice this if the algorithm is memory bound.
  • If ArrayT=vtkAoSDataArrayTemplate<...>, it may be slightly faster due to eliminating the copy. But the unused components will still need to be loaded into the cache, so this really only helps if the algorithm is compute-bound and the copy into the stack buffer is actually significant.
  • If ArrayT=vtkDataArray (e.g. the fallback case), it will be slower compared to the “tuple-at-once” accessor methods, since looking up individual components will require more virtual calls through the vtkDataArray API. You can see this effect in this benchmark result (taken from this blog post). But this may be acceptable, since the fallback paths are usually slow anyway.

So ultimately, if you’ve identified this step as a bottleneck, it’s worth trying both APIs and benchmarking for your particular algorithm and a typical input to see which is better for you.

What happens when I resize the data array

There are a few different resize methods, all of which have different behavior:

  • SetNumberOfTuples: Allocates and resizes, does not preserve data. Does not initialize new memory.
  • Resize: Allocates and resizes, does preserve data. Does not initialize new memory.
  • Allocate does similar stuff too, but is poorly documented and you’d have to dig around in the source code to figure out if it preserves data…
1 Like

Thanks for such a detailed reply. I’ll run diagnostics and profile it. I suppose it’s only worth refactoring if the data caches done once per Update() are really a major bottleneck!

Fwiw, I’ve fine tuned every other aspect of the algorithm. If any, I suppose the point-in-polygon tests/vtkDelaunay2D::Update()/vtkTriangleFilter::Update() might be suspects, since these are updated for each triangle everytime a loop’s edge intersects it. vtkPolygon::PointInPolygon() uses ray-cast with a static random number generator which isn’t thread-safe, so I use a custom pnpoly function.

Alright, I ran a few diagnostics. The reports are in this repository. They can be viewed directly in VS2019 without the need to compile/build the project from source.

Built in release config with

  target_compile_options(${PROJECT_NAME} PRIVATE "$<$<CONFIG:Release>:/Z7>")
  target_link_options(${PROJECT_NAME} PRIVATE "$<$<CONFIG:Release>:/DEBUG>")
  target_link_options(${PROJECT_NAME} PRIVATE "$<$<CONFIG:Release>:/OPT:REF>")
  target_link_options(${PROJECT_NAME} PRIVATE "$<$<CONFIG:Release>:/OPT:ICF>")

to generate embedded debug info in *.lib and -Zi to produce a .pdb file for the benchmark executable.

./benchmark.exe -m data/BigSurface.vtp -l data/TestPolys2.vtp
Mesh:-
 Cells : 131072
 Points: 66049
Loops:-
 Cells : 1
 Points: 145
...
and rotate the mesh about z-axis for about 50 seconds to collect profiling info.

Source code annotations:

  1. Cache information of triangles


    The caching process is NOT a bottleneck! (~6%)

  2. Let the surface acquire loop’s points, apply vtkDelaunay2D.


    Nothing unusual here either…

  3. Throw edges from loop onto the surface


    Nothing unusual here, the bounding box test above (14.29%) saves unnecessary segment-segment intersection calls.

    Nothing unusual here either.

  4. vtkTriangleFilter on the sub-mesh and subsequent array dispatch with ApplyConstraintsImpl.


    vtkTriangleFilter::Update() is unusual (~14%), perhaps I could create a vtkPolygon instance and call poly->Triangulate() rather than use the filter for each polygon…

  5. This is the last step.

Auxillary:

  1. createMesh
  2. createTriMesh
  3. extractTris
  4. pnpoly

Edit: added compiler flags used to generate debug info in release config.
Note: Let me know if it is preferred to have the images hosted elsewhere and post links here.

Great work! I briefly looked at the code and repository, and see that you are using a MIT license, so I intend to use some of what you’ve done. A question, what are your long term contribution plans (e.g., contributing this class, or pieces of this code) to VTK ?

Thanks! I planned to contribute.

I got busy with something else, but these are the changes I had in mind. I plan to do them some time soon (2-3 weeks).

  1. Adhere to general coding conventions in the VTK software process document. Most of the source should be fine, but I need to do a stringent check. If that document is outdated, please let me know and I’ll be glad to follow it.

  2. Integrate the current tests with VTK’s testing framework and follow the ritual described in that document.

  3. Contribute this as a new class; vtkSurfaceCutter. It cannot replace vtkCookieCutter since the former strictly works with and generates triangulated surfaces whilst the latter can handle polygons too.

The license was a “temporary”. I planned to contribute anyway so I should’ve used BSD or left it empty.

PS: Let me know if you’ve a better name for this class.

We might even consider replacing vtkCookieCutter with your work (there may be some reasons not to, we’d have to look into this).

Also, I’d love to talk to you about working at Kitware :-). EMail will.schroeder@kitware.com if you want to follow up.

Update: Added a second port which gives the loops as projected upon the surface. Pretty cool now. :nerd_face:

1 Like

Update:
In recent commits,

  • fixed a couple of memory leaks with vtkCellIterator. I learned about this when tests from my MR to vtkStripper failed :disappointed:
  • vtkNew when vtkSmartPointer was just too long!
  • In perf-critical sections; delaunay triangulations/edge flips/cell traversal, manually manage object lifetimes.

As it stands, SurfaceCutter is ready for integration into VTK. Should it be a separate class or replace vtkCookieCutter? @will.schroeder, @Yohann_Bearzi

Further improvements:
It’s already efficient and fast (in realtime). From my benchmarks:

  • ~1k verts, ~1k cells ~15ms /frame
  • ~10k verts, ~20k cells. ~80ms /frame
  • ~70k verts, ~140k cells ~350ms /frame

~2x faster than vtkClipDataSet

I tried vtkOBBTree / vtkStaticCellLocator / vtkModifiedBSPTree to accelerate certain stages but it did not always work as expected. I’ll have to explore this option later as it might involve some amount of tinkering with the source.

1 Like

My preference would be to replace vtkCookieCutter, unless there are some incompatible behaviors.

  • Allocate does similar stuff too, but is poorly documented and you’d have to dig around in the source code to figure out if it preserves data…

I think it’s safe to assume that Allocate always releases existing data. Happens here and for vtkIdList too.

My preference would be to replace vtkCookieCutter, unless there are some incompatible behaviors.

SurfaceCutter

  1. cannot cut polygons. (only triangles)
  2. does not copy/interpolate dataset attributes.

It would be tricky to implement 1. Working on 2 now …