Build Python wheels with SMP tools for runtime backend selection

According to the build settings:

Users can set:

  • VTK_SMP_IMPLEMENTATION_TYPE (default Sequential): Set which SMPTools will be implemented by default. Must be either Sequential, STDThread, OpenMP or TBB. The backend can be changed at runtime if the desired backend has its option VTK_SMP_ENABLE_<backend_name> set to ON.

I don’t think this is enabled for the Python wheel builds that VTK releases. I suspect that Sequential is used by default for compatibility reasons, which is OK, but would it then make sense to build the wheels with the other backends enabled as well, so that users can select one at runtime?

  • VTK_SMP_ENABLE_STDThread: ON
  • VTK_SMP_ENABLE_OpenMP: ON
  • VTK_SMP_ENABLE_TBB: ON
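
For reference, here is a minimal sketch of what runtime selection could look like from Python if a wheel were built with those options. It assumes `vtkSMPTools.SetBackend` returns a false value when the requested backend was not compiled in. The helper names `pick_backend` and `select_vtk_backend` are hypothetical and only for illustration.

```python
def pick_backend(candidates, try_set):
    """Return the first name for which try_set(name) is truthy, else None."""
    for name in candidates:
        if try_set(name):
            return name
    return None


def select_vtk_backend(preferred=("TBB", "OpenMP", "STDThread", "Sequential")):
    """Try each backend in order on the installed VTK and report what is in use."""
    # Imported lazily so pick_backend can be used without VTK installed.
    from vtkmodules.vtkCommonCore import vtkSMPTools

    # Assumption: SetBackend returns False for a backend that is not
    # available in this build, so the first available one wins.
    pick_backend(preferred, vtkSMPTools.SetBackend)
    return vtkSMPTools.GetBackend(), vtkSMPTools.GetEstimatedNumberOfThreads()
```

Calling `select_vtk_backend()` on a given wheel would then show which of the preferred backends that build actually accepts.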

There seem to be technical reasons why this is not done (see the link below), but it would be nice for Python users in particular to be able to benefit from the performance boost when using the VTK wheels published on PyPI.

https://gitlab.kitware.com/vtk/vtk/-/issues/18668

If this should be tracked as a feature request, let me know and I’ll create an issue in GitLab.

BTW, I built manylinux2014 compatible wheels with TBB enabled (by default) by following:

This script was run in a manylinux2014 docker image. You have to use an older version of TBB (as VTK currently does) as the modern one doesn’t support the older toolchain used in manylinux2014.

TBH I don’t think this was considered specifically. I think you could just try adding VTK_SMP_ENABLE_TBB to the wheel CI configuration and see how it goes.

I have no exact numbers immediately at hand, but I experimented with the STDThread/TBB/OpenMP SMP implementations in VTK a while ago. My results did not show big differences between the implementations, more or less in line with the numbers shown here: https://www.kitware.com/vtk-shared-memory-parallelism-tools-2021-updates/

I see the most recent 9.6.0rc3 has the STDThread backend available, but not enabled by default. The two others are unavailable.

If there are no significant performance gains from the OpenMP or TBB backends, I see no reason to have them in the Python wheels. Adding them will increase the total size, and you will need to bundle the supporting runtime libraries.

Very good point @hakostra, I forgot that STDThread is enabled by default, which means it is already available to use.

For simplicity, I would just use std::thread for now.

However, note that whenever the workload of worklets varies significantly, so that load balancing is important, significant performance differences are often seen between TBB and std::thread. I am currently seeing 50-75% speed increases (using TBB vs. std::thread) on some Voronoi-based meshing applications.

I’m seeing similar gains (~50%) using TBB vs stdthread for vtkGeometryFilter.
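
For anyone who wants to check this on their own build, here is a rough timing sketch. It is not the benchmark behind the numbers above: the synthetic `vtkRTAnalyticSource` volume is an assumption chosen only to give `vtkGeometryFilter` some work, and it may not exercise the same code path as a real dataset. `time_backends` and `geometry_filter_workload` are hypothetical helper names.

```python
import time


def time_backends(backends, set_backend, work, repeats=3):
    """Return {backend: best wall time in seconds} for backends that could be set."""
    results = {}
    for name in backends:
        if not set_backend(name):
            continue  # backend not available in this build, skip it
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            work()
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


def geometry_filter_workload(size=50):
    """Build a callable that re-runs vtkGeometryFilter on a synthetic volume."""
    # Imported lazily so time_backends can be used without VTK installed.
    from vtkmodules.vtkFiltersGeometry import vtkGeometryFilter
    from vtkmodules.vtkImagingCore import vtkRTAnalyticSource

    source = vtkRTAnalyticSource()
    source.SetWholeExtent(-size, size, -size, size, -size, size)
    source.Update()  # generate the input once, outside the timed region

    surface = vtkGeometryFilter()
    surface.SetInputConnection(source.GetOutputPort())

    def run():
        surface.Modified()  # mark the filter dirty so Update() re-executes it
        surface.Update()

    return run
```

With VTK installed, something like `time_backends(["Sequential", "STDThread", "TBB"], vtkSMPTools.SetBackend, geometry_filter_workload())` would report timings only for the backends that the build accepts.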

I tried adding this here:
https://gitlab.kitware.com/vtk/vtk/-/merge_requests/12884

Based on discussion in that MR, enabling TBB by default doesn’t seem like it’s currently viable, and users will need to build from source themselves to use it.

Sorry for being late to this party and missing that thread (no pun intended!). Note that there is a TBB wheel for Linux on PyPI (tbb).
So it is entirely possible to use non-mangled TBB on Linux. There also seems to be a TBB package that shows up on Homebrew on the Mac; I am not sure where it comes from.
Using TBB should be possible as long as we are willing to create and maintain TBB wheels for Apple Silicon Macs and Windows. I suggest that we do that.

There is no wheel/build for ARM (aarch64), even on Linux. I guess this is a strategic/political choice by Intel, which seems to be the publisher of this package, to build only for x86. I see that as a big disadvantage, as ARM is becoming more and more important. If one sets out to maintain repos/libraries/packages for Mac and Windows, one should also do it for ARM.

That is fine; the tbb wheel should ship the TBB libraries. The question is how other wheels that also want to use TBB’s C++ API should get access to the same libraries as that wheel.