As part of an internship at Kitware, I’m going to work on improving vtk SMP tools. And we would like to engage the broader VTK community for ideas and feedback.
The main goal of this work is to support VTK-m as a new backend of the SMP tools.
Some intermediate tasks will be to integrate more utility functions (inspired by VTK-m algorithms) such as Transform and Fill in vtkSmpTools.
As a potential thing to do we could also refactor SMP tools current architecture to support multiple backends at compile and select one at runtime.
Other non-exhaustive features could be: the support of lambdas, an environment variable to control the number of threads…
We are open to any suggestions you may have.
Let me know what your thoughts are about this, and if you have any questions.
Thanks for your attention,
I think this is a great idea and it will let developpers use VTK-m much more simply, even if not to it’s full capacity.
There is a pending MR for changing the (maximum) number of threads at runtime, using a
VTK_SMP_MAX_THREADS environment variable.
Please let me know if you have any feedback about this changes.
Great work, customers have asked for this feature, thanks
Why is this needed when the Initialize method passes the number of threads as a parameter?
Because this is a control that is important to expose at high-level but modifying every single filter using vtkSMPTools with a dedicated parameter would not be practical.
Where I work we’ve recently looked/tried to start using TBB with VTK. We did measure performance improvements using it, but recently gave up for a few reasons, which may interest you.
The newest TBB uses CMake and is thus easier to work with, but…
VTK doesn’t support the newest TBB https://gitlab.kitware.com/vtk/vtk/-/issues/18107
TBB doesn’t play nice with Thread Sanitizer (TSan) https://github.com/oneapi-src/oneTBB/issues/358 It’s quite valuable to us that our whole test suite passes with TSan, so introducing something that does not is disagreeable.
Given these issues, I’ve wondered about the feasibility of a GCD (Grand Central Dispatch) backend to vtkSMPTools. See: https://en.wikipedia.org/wiki/Grand_Central_Dispatch
This is a great point. When vtkSMPTools was first introduced, it included a backend based on vtkMultiThreader. It was a “batteries included” sort of backend, since it didn’t add any extra dependencies that might block its use in certain applications. Are C++11 threads mature enough that they could be used to create a similar batteries-included backend? I’m not asking anyone at Kitware to actually do the implementation, I’m just wondering if people think it’s feasible and whether it might be accepted into the VTK codebase.
I know it wouldn’t provide the same performance as TBB or other backends, but it would be close, and there is tremendous value in having something that can be part of the default VTK build. For example, people who grab VTK from PyPi don’t get any SMP at all (except for the vtkMultiThreader stuff in the imaging pipeline). There are huge parts of the VTK community who aren’t getting any benefit from the current SMP back ends. And I say this as a member of the community, who contributes to VTK for the benefit of the community.
I can only find one place in VTK where vtkSMPTools::Initialize() is actually called.
Indeed, because Initialize is automatically called when using vtkSMPTools::For.
Edit: Looks like I was wrong, I’ll let @Charles_Gueunet and @Timothee_Couble answer here.
The vtkSMPTools::Initialize() was not called before vtkSMPTools::For but after some discussion with @mwestphal and @Charles_Gueunet, we change this behavior in the above MR (7794)
So now, if a developer calls vtkSMPTools::Initialize(x) in their application, your changes are going to override/alter their specified thread count.
Good point, it would be good if VTK would defer to the application if necessary. Can we add a vtkSMPTools::IsInitialized() method to check if initialization has already occurred? I suppose it depends on the backend, evidently in TBB
tbb::this_task_arena::current_thread_index() function returns
tbb::task_arena::not_initialized if the thread has not yet initialized the task scheduler.
Yes indeed, we removed this behavior in the MR.
Multiple solution are still being discussed internally:
- one of them would be to have the possibility to create a local instance (of
vtkSMPTools or of a
vtkSMPToolsLocalParameters class) that would drive the number of thread locally.
- or we could also consider that Initialize should be called only once at the beginning of the pipeline.
The vtkSMPTools::IsInitialized would be especially useful for this second proposition.
I’ve been using vtkSMPTools a bit lately and have some comments. They may be tangential to what you are aiming to do, but here they are anyway:
- Sometimes, I want to pass numbers other than vtkIdType to
vtkSMPTools::For(). For example, I have a sparse set or map whose keys (perhaps hashes for unordered sets/maps) are
uint64_t and want to pass a range of keys to consider. Casting to vtkIdType won’t work since
vtkSMPTools::For() does math on its arguments.
- Sometimes I want to invoke a “for” loop on every thread without providing work. Is the correct thing to do to call
vtkSMPTools::For(0, vtkSMPTools::GetEstimatedNumberOfThreads(), functor)? If so, that should be documented. If not, can we add a new way to invoke a worklet? An example use-case: I have a functor that does flood-filling and want threads to steal seeds from and insert seeds into a single queue until it is empty.
- For the use case above, it would also be nice if there was a way for a thread to ask for its rank and the pool size (similar to
MPI_Comm_size). This would allow worklets to do fancier things like lockless writes to a shared queue.
[edit: removed a comment because I confused
vtkSMPTools::Initialize() with the functor’s
Initialize() method in the discussion above.]
I don’t know how doable it is, but I think it would be nice to be able to use SMP tools on iterators / ranges. This is especially useful for containers that are not indexed like arrays (
std::set for instance).
Here’s an example on how it could work with iterators in the worker:
void operator()(IteratorT begin, IteratorT end)
for(IteratorT it = begin; it != end; ++it)
// Do stuff
This could be called using
vtkSMPTools::For(container.begin(), container.end(), worker);
With ranges (maybe more optional, as iterators could do the same thing). Note that the range would likely be a proxy range that internally uses iterators from the input container:
void operator()(RangeT& range)
for(auto& element : range)
// Do stuff
This could be called using
Iterators should work on containers without random-access iterators as long as they have an O(1)
container::size() method and an efficient
std::advance() specialization (which is certainly possible with red-black trees used for
map – though implementations may vary).