Slow voxelisation with vtkPolyDataToImageStencil

Hello everybody,
I am currently writing code for a numerical simulation and would like to use VTK to create voxel geometries from given STL and OBJ files. I am new to the VTK library, so I have tried to work with the wide range of examples, in particular by adapting the vtkPolyDataToImageData example, but I am running into problems with more complex geometries: they take several minutes to voxelise.
My code starts by importing the corresponding files as described in these two examples (STL and OBJ), and then code inspired by the vtkPolyDataToImageData example (see below) creates a voxel representation in the form of image data. While it works quite fast for simple geometries (e.g. 15 seconds for this skull), I had trouble with several larger files (e.g. this building, which takes several minutes). Most of the time is spent in the vtkPolyDataToImageStencil part. Is there a better way to do this, or do I simply have to live with it taking so long?
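For reference, the import step looks roughly like this (a simplified sketch of the linked examples; "geometry.stl" is just a placeholder):

#include <vtkSTLReader.h>

vtkSmartPointer<vtkSTLReader> reader {vtkSmartPointer<vtkSTLReader>::New()};
reader->SetFileName("geometry.stl"); // placeholder file name
reader->Update();
vtkSmartPointer<vtkPolyData> poly_data {reader->GetOutput()};
// (vtkOBJReader is used in exactly the same way for .obj files)

The voxelisation function then looks as follows: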

#include <iostream>

#include <vtkImageData.h>
#include <vtkImageStencilToImage.h>
#include <vtkPointData.h>
#include <vtkPolyData.h>
#include <vtkPolyDataToImageStencil.h>
#include <vtkSmartPointer.h>

vtkSmartPointer<vtkImageData> VoxelisePolyData(vtkSmartPointer<vtkPolyData>& poly_data, int const resolution_x) {
  // Colour values for voxels
  constexpr unsigned char background_colour {0};
  constexpr unsigned char foreground_colour {1};

  // Get geometry bounding box and calculate domain size and resolution
  double bounds[6] = {};
  poly_data->GetBounds(bounds);
  double const domain_size[3] = {bounds[1] - bounds[0], bounds[3] - bounds[2], bounds[5] - bounds[4]};
  int const resolution[3] = {resolution_x, static_cast<int>(domain_size[1]/domain_size[0]*resolution_x), static_cast<int>(domain_size[2]/domain_size[0]*resolution_x)};

  // Output information about domain size and resolution
  std::cout << "Domain size (x,y,z): " << domain_size[0] << ", " << domain_size[1] << ", " << domain_size[2] << std::endl;
  std::cout << "Resolution  (x,y,z): " << resolution[0]  << ", " << resolution[1]  << ", " << resolution[2]  << std::endl;
      
  // Prepare voxel image: set resolution, spacing, extent, origin, image data type and background colour
  vtkSmartPointer<vtkImageData> image_data {vtkSmartPointer<vtkImageData>::New()};
  image_data->SetDimensions(resolution);
  double const spacing[3] = {domain_size[0]/resolution[0], domain_size[1]/resolution[1], domain_size[2]/resolution[2]};
  image_data->SetSpacing(spacing);
  double const origin[3] = {bounds[0] + spacing[0]/2.0, bounds[2] + spacing[1]/2.0,  bounds[4] + spacing[2]/2.0};
  image_data->SetOrigin(origin);
  image_data->AllocateScalars(VTK_UNSIGNED_CHAR, 1);
  image_data->GetPointData()->GetScalars()->Fill(background_colour);

  // Convert poly-data to image stencil
  vtkSmartPointer<vtkPolyDataToImageStencil> pd_to_img_stencil {vtkSmartPointer<vtkPolyDataToImageStencil>::New()};
  pd_to_img_stencil->SetInputData(poly_data);
  pd_to_img_stencil->SetOutputSpacing(image_data->GetSpacing());
  pd_to_img_stencil->SetOutputOrigin(image_data->GetOrigin());
  pd_to_img_stencil->SetOutputWholeExtent(image_data->GetExtent());
  pd_to_img_stencil->Update();
  // Convert image stencil to voxel image
  vtkSmartPointer<vtkImageStencilToImage> img_stencil_to_img {vtkSmartPointer<vtkImageStencilToImage>::New()};
  img_stencil_to_img->SetInputConnection(pd_to_img_stencil->GetOutputPort());
  img_stencil_to_img->SetOutsideValue(background_colour);
  img_stencil_to_img->SetInsideValue(foreground_colour);
  img_stencil_to_img->Update();
  image_data->DeepCopy(img_stencil_to_img->GetOutput());

  return image_data;
}

Thanks a lot!

Would it be possible for you to run your samples in a profiler? If you can find out exactly what parts of the code in vtkPolyDataToImageStencil cause the bottleneck, then there might be some simple fixes that I can suggest.

I wrote this class many years ago, and originally I had planned to parallelize it and do other optimizations, but for my own applications the optimizations turned out to be unnecessary (my polydata only had a few thousand polygons, 100000 polygons at the very most).

Hello David,
Thanks a lot for the reply and for your involvement in VTK in general. :slight_smile:
I already noticed that it uses only one thread, but I think I read in a comment of yours that you actually tried to design it in a parallelisation-friendly way. I will try to profile it with Valgrind and share more details then!

Thanks, I’m interested to hear what you discover. If you have looked at the code, you can see why it is slow: for every slice of the image, it visits every polygon in the polydata.

It would run a lot faster if it looped through the polygons first, to find out which slices are touched by each polygon. If it did this first, then when it loops through the slices (which can be done in parallel) it would only have to visit a small list of polygons for each slice.
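Roughly, the bucketing step would look something like this (just an untested sketch of the idea, not the actual VTK code; it assumes the image is sliced along z and that origin and spacing match the output stencil):

#include <vtkIdList.h>
#include <vtkPolyData.h>
#include <vtkSmartPointer.h>

#include <algorithm>
#include <cmath>
#include <vector>

// For each z-slice, collect the ids of the polygons whose z-range touches it,
// so the per-slice loop only visits a short list instead of every polygon.
std::vector<std::vector<vtkIdType>> BucketPolygonsBySlice(
    vtkPolyData* poly_data, double origin_z, double spacing_z, int num_slices) {
  std::vector<std::vector<vtkIdType>> polys_per_slice(num_slices);
  poly_data->BuildCells(); // needed for random access to the cells
  vtkSmartPointer<vtkIdList> point_ids {vtkSmartPointer<vtkIdList>::New()};
  for (vtkIdType cell_id = 0; cell_id < poly_data->GetNumberOfCells(); ++cell_id) {
    // Find the z-extent of this cell from its points
    poly_data->GetCellPoints(cell_id, point_ids);
    double z_min {1e300};
    double z_max {-1e300};
    for (vtkIdType i = 0; i < point_ids->GetNumberOfIds(); ++i) {
      double p[3];
      poly_data->GetPoint(point_ids->GetId(i), p);
      z_min = std::min(z_min, p[2]);
      z_max = std::max(z_max, p[2]);
    }
    // Convert the z-extent to a range of slice planes and register the cell
    int const first {std::max(0, static_cast<int>(std::ceil((z_min - origin_z)/spacing_z)))};
    int const last {std::min(num_slices - 1, static_cast<int>(std::floor((z_max - origin_z)/spacing_z)))};
    for (int k = first; k <= last; ++k) {
      polys_per_slice[k].push_back(cell_id);
    }
  }
  return polys_per_slice;
}

Each slice could then be processed in parallel with only its own list of candidate polygons.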

I just ran perf on the code, for both the simple skull and the more complex building geometry, on my very low-power notebook. The perf overheads for the two scenarios are given below:

Skull
Domain size (x,y,z): 19.8403, 29.452, 27.5257
Number of polygons: 40728
Runtime: 14 seconds
35% vtkAOSDataArrayTemplate::GetTuple (libvtkCommonCore)
29% vtkPolyDataToImageStencil::ThreadedExecute (libvtkImagingStencil)
10% vtkPoints::GetPoint (libvtkCommonDataModel)
6% vtkPointSet::GetPoint (libvtkCommonDataModel)

Building
Domain size (x,y,z): 6076.34, 2917.93, 5934.75
Number of polygons: 16234
Runtime: 357 seconds
38% vtkAOSDataArrayTemplate::GetTuple (libvtkCommonCore)
35% vtkPolyDataToImageStencil::ThreadedExecute (libvtkImagingStencil)
13% vtkPoints::GetPoint (libvtkCommonDataModel)
7% vtkPointSet::GetPoint (libvtkCommonDataModel)

What perplexes me is that the number of polygons (poly_data->GetNumberOfPolys()) is lower for the second geometry, yet it takes a lot longer to complete. Maybe this has to do with the domain size? Have I used the library in the wrong way? Any clue what could be causing this, or is there another factor that influences the computational complexity?

Thanks for the profiling. It definitely looks like the algorithm spends more time looking through the polygons (GetPoint() and GetTuple()) than anything else.

The number of slices is what your code calls resolution[2]. The way the code is currently written, the time complexity of the algorithm is O(n*m) where n is the number of slices and m is the number of polygons. Decrease the resolution and you will see a proportional decrease in the time.

Thanks again, David, for your quick reply.
Indeed, O(n·m) is the complexity I would have expected. But I used an x-resolution of 200 for both tests. This means the skull geometry has a resolution of (x,y,z) 200, 296, 277 and 40728 polygons, while the building geometry has a resolution of (x,y,z) 200, 96, 195 and 16234 polygons. I have also started the voxelisation of another geometry (domain size 158.64, 157.526, 51.3832) with 138932 polygons and a resolution of 200, 198, 64, and after 1 hour and 30 minutes the process is still not complete. I have no idea how that can be.
From my understanding, setting the dimensions should be like setting the dimensions of an image: each entry of resolution should define the resolution along the corresponding axis. So why should the second geometry take 25x longer to complete with a resolution that is 4.4x smaller and 2.5x fewer polygons than the first one? Am I misunderstanding the parameters of the SetDimensions function?
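If I do the arithmetic with the O(n·m) estimate from above: the skull is 277 slices × 40728 polygons ≈ 11.3 million slice-polygon visits in 14 seconds, while the building is only 195 slices × 16234 polygons ≈ 3.2 million visits in 357 seconds. That would make each visit roughly 90 times more expensive for the building, so the slicing loop itself cannot be the whole story.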

That’s an important bit of information to include with the benchmark results.

The SetDimensions() method does exactly what you would expect. With the large time discrepancy, there might be something funny going on with the points.

For example, usually when two polygons share an edge, the two points that form the edge will only be stored in memory once. If they are stored twice (once per polygon), then we would say that they are duplicated. Sometimes the point duplication is intentional. But if present, point duplication can mess up algorithms that make assumptions about the topology of a model.

Try running the model through vtkCleanPolyData before you voxelise it; this will merge any duplicate points. Also check the number of points before and after cleaning to see whether it changes.
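Something like this should do it (a sketch using the names from your function):

#include <vtkCleanPolyData.h>

vtkSmartPointer<vtkCleanPolyData> cleaner {vtkSmartPointer<vtkCleanPolyData>::New()};
cleaner->SetInputData(poly_data);
cleaner->Update();
std::cout << "Points before cleaning: " << poly_data->GetNumberOfPoints()
          << ", after: " << cleaner->GetOutput()->GetNumberOfPoints() << std::endl;
// Voxelise the cleaned copy instead of the original
vtkSmartPointer<vtkPolyData> cleaned {cleaner->GetOutput()};
vtkSmartPointer<vtkImageData> image_data {VoxelisePolyData(cleaned, 200)};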

That would make sense. I will have a look at it and update you! Thanks a lot for the detailed information!

Also, call GetLines() on the data to see if it contains polylines in addition to polygons. There is code in vtkPolyDataToImageStencil for handling polylines, and it is less efficient than the code for handling polygons.
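For example:

std::cout << "Polys: " << poly_data->GetNumberOfPolys()
          << ", lines: " << poly_data->GetNumberOfLines() << std::endl;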

Interesting… Cleaning it really solved my issues with the Building model. It still outputs the same number of polygons and lines, but vtkPolyDataToImageStencil now processes the model in a fraction of a second. Thanks a lot for the tip!
But I could not get the largest geometry to work, an .stl model with 138932 polygons. I also tried applying the vtkDecimatePro filter with a target reduction of 95% and cleaned it up afterwards, but vtkPolyDataToImageStencil literally seems to be stuck processing it. Do you have another idea what might be causing this?

Make sure you clean the model before decimation, not after.
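For example (a sketch, reusing the cleaner from above):

#include <vtkDecimatePro.h>

// Clean first so the decimation sees the correct connectivity
vtkSmartPointer<vtkDecimatePro> decimator {vtkSmartPointer<vtkDecimatePro>::New()};
decimator->SetInputConnection(cleaner->GetOutputPort());
decimator->SetTargetReduction(0.95); // remove 95% of the triangles
decimator->Update();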

The vtkFeatureEdges class can check the topology (it's best to clean the data beforehand). This class essentially shows the edges that are not shared between two polygons; you will want to use BoundaryEdgesOn() and NonManifoldEdgesOn(), but make sure to also use FeatureEdgesOff() and ManifoldEdgesOff(). The usual workflow is to render the output of vtkFeatureEdges with vtkPolyDataMapper (or vtkDataSetMapper), which will show you where the problem edges are.
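Something along these lines (a sketch; the render part follows the usual mapper/actor pattern):

#include <vtkFeatureEdges.h>
#include <vtkPolyDataMapper.h>

vtkSmartPointer<vtkFeatureEdges> edges {vtkSmartPointer<vtkFeatureEdges>::New()};
edges->SetInputConnection(cleaner->GetOutputPort());
edges->BoundaryEdgesOn();     // edges used by only one polygon
edges->NonManifoldEdgesOn();  // edges used by three or more polygons
edges->FeatureEdgesOff();
edges->ManifoldEdgesOff();

vtkSmartPointer<vtkPolyDataMapper> mapper {vtkSmartPointer<vtkPolyDataMapper>::New()};
mapper->SetInputConnection(edges->GetOutputPort());
// ... then put the mapper on a vtkActor and render it over the model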

The vtkPolyDataToImageStencil algorithm expects every edge to be shared by exactly two polygons. If this is not the case, then instead of generating a contiguous contour when it slices the data (like it should), it generates a whole bunch of separate line segments and has to waste time sorting and searching to figure out how the line segments fit together.

A lot of great tips, thanks a lot! I will see what I can do.
I have tried a few larger models with a lot more polygons now, and they all work (with some artifacts, though). The particular one I mentioned still does not work, but I guess the preprocessing is very important there. I have already seen a few posts about this and will try to follow the tips given there as well!
Thanks a lot for your time and sharing your knowledge!
Greetings from the other side of the Atlantic!