Do we want to support GPU backends in the SMPTools?

Hi everyone,

@Charles_Gueunet and I are currently trying to integrate VTKm as a new SMPTools backend (MR-7823).

We realized we couldn’t use the “standard” build of VTKm (with GPU backends) for this work. The GPU backends can’t use mechanisms such as pointer dereferencing and access through arbitrary iterators, which some SMPTools functions need (e.g. vtkSMPTools::Transform and vtkSMPTools::Fill support non-indexed containers such as std::map and std::set).
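To make the constraint concrete, here is a minimal sketch of what an iterator-based transform has to do. SketchTransform and DoubleAll are hypothetical names, and this is a sequential, standard-library-only stand-in, not the actual vtkSMPTools::Transform implementation. The point is only that a std::set has no index: the loop can do nothing but increment and dereference iterators.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for an iterator-based transform (NOT the VTK code).
// The only operations available on the input are ++it and *it, because
// std::set iterators are bidirectional and offer no operator[].
template <typename InputIt, typename OutputIt, typename Functor>
void SketchTransform(InputIt inBegin, InputIt inEnd, OutputIt outBegin, Functor op)
{
  for (InputIt it = inBegin; it != inEnd; ++it, ++outBegin)
  {
    *outBegin = op(*it); // dereference on both the input and the output side
  }
}

// Usage: double every element of a std::set into a std::vector.
inline std::vector<int> DoubleAll(const std::set<int>& values)
{
  std::vector<int> out(values.size());
  assert(out.size() == values.size()); // output is sized before it is written
  SketchTransform(values.begin(), values.end(), out.begin(),
                  [](int v) { return 2 * v; });
  return out;
}
```

A std::set iterates in sorted order, so DoubleAll on {3, 1, 2} fills the output with 2, 4, 6.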

After some reflection, we wondered why the SMPTools don’t support GPUs when the name SMP would suggest they do. Moreover, we could technically support GPUs through VTKm (the idea here is not to write a new GPU backend), which would let us use the full potential of the VTKm GPU backends.

However, GPU support may be incompatible with the current implementations of Transform and Fill, and with some recent suggestions (e.g. adding std::map and std::set support to vtkSMPTools::For(), as proposed by @Yohann_Bearzi).

Finally, if we don’t want to support the GPU backends, we need to find a way to use only the CPU version of VTKm. One option would be to compile VTKm twice: once for the VTK algorithms and once for the SMPTools (with CPU backends only). But since VTKm takes quite a long time to compile, that would be far from ideal.

So what is the future of the SMP tools: should they become vtkMultiThreadTools, or should they handle GPUs with the corresponding constraints?

Yes, most definitely, ideally using VTKm.

When you say that the GPU backend can’t do pointer dereferencing, does it mean that if you have a VTK pointer attribute (a vtkDataArray* for instance) in your functor, you will not be able to use VTKm?

If there is a reliable way to know at compile time, with some metaprogramming, which functors / SMP calls are unable to use VTKm as a backend, you might be able to turn the backend on or off when VTK is compiled with VTKm?
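As a rough sketch of what such a compile-time switch could look like: everything below is hypothetical (IsDeviceSafe, the DeviceSafe tag, SelectBackend and the Backend enum are made-up names, not VTK or VTKm API). It also assumes functors opt in explicitly, because I don’t know of a way for the compiler to inspect a functor body and decide on its own that it is GPU-safe.

```cpp
#include <cassert>
#include <type_traits>

enum class Backend { Host, Device };

// C++11-compatible helper that maps any well-formed type list to void.
template <typename...>
struct MakeVoid { using type = void; };

// Hypothetical opt-in trait: a functor is assumed host-only unless it
// declares a nested type named DeviceSafe.
template <typename F, typename = void>
struct IsDeviceSafe : std::false_type {};

template <typename F>
struct IsDeviceSafe<F, typename MakeVoid<typename F::DeviceSafe>::type>
  : std::true_type {};

// Chooses the backend at compile time from the functor type alone.
template <typename F>
constexpr Backend SelectBackend()
{
  return IsDeviceSafe<F>::value ? Backend::Device : Backend::Host;
}

// A functor with no tag: stays on the host backend.
struct HostOnlyFunctor
{
  void operator()(int, int) const {}
};

// A functor whose author promises it avoids host-only constructs.
struct DeviceReadyFunctor
{
  using DeviceSafe = void;
  void operator()(int, int) const {}
};

static_assert(SelectBackend<HostOnlyFunctor>() == Backend::Host,
              "untagged functors fall back to the host");
static_assert(SelectBackend<DeviceReadyFunctor>() == Backend::Device,
              "tagged functors may use the device");
```

With an opt-in tag, every existing functor keeps running on the CPU by default, and only functors written with the GPU constraints in mind are routed elsewhere.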

Your remarks just made me think about something else. What happens if a functor uses standard library containers internally, such as std::vector? I know that CUDA doesn’t support those yet, though they will at some point. Is there a mechanism in VTKm to translate those containers into something the GPU can understand, or is this another constraint?

If there is a reliable way to know at compile time, with some metaprogramming, which functors / SMP calls are unable to use VTKm as a backend, you might be able to turn the backend on or off when VTK is compiled with VTKm?

Oh, I just realized that this makes the default backend choice a problem. We don’t want to end up with std::thread as the default backend whenever VTKm fails to use the GPU. And tbh, compiling VTKm twice (once for the CPU and once for the GPU) seems horrific compile-time-wise…

Can this please be done after supporting runtime SMP backend selection? Then the new SMP backend can live in a new module, without making everything wait for VTK-m to compile in such a build. Right now (unless I’m remembering incorrectly), any SMP implementation must live in VTK::CommonCore. If we’re adding GPU support in this way, it’s going to lose its “core”-ness really quickly. One could argue it already has many things which aren’t “core” anyway, but I’d like to avoid piling even more in.

The statement that the name SMP would suggest GPU support confuses me. How does SMP imply GPU?

I usually think of parallel computing as being one of the following architectures:

  1. traditional job submission systems like SLURM and SGE (grid computing)
  2. message-passing interfaces, like VTK’s MPI-based parallelism across nodes
  3. symmetric multi-processing, which strongly implies shared memory
  4. GPU processing, where the processing does not occur in system memory

My understanding is that SMP and GPU are different architectures for parallelism. When I do SMP, I expect that it will at least appear as if the processing occurs in system memory.

You are right, SMP does not imply GPU; I might have misled Timothée here. So the argument about the name does not hold.

I’m still unclear on the whole concept of a GPU back-end for vtkSMPTools. When you pass a functor to vtkSMPTools::For(), the functor will in general contain compiled C++ code (compiled for the CPU, not for the GPU). Machine code that has been compiled for the CPU cannot run on the GPU.
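To illustrate what I mean, here is a minimal sketch of a CPU-only parallel for. SketchFor and SquareFunctor are hypothetical names, and this is not how vtkSMPTools is actually implemented (it naively spawns one std::thread per chunk, where a real backend would reuse a pool). It only shows the shape of the contract: the functor is an ordinary host object holding a host pointer, and the backend simply calls it on sub-ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical CPU-only parallel for: splits [first, last) into chunks of at
// most `grain` elements and runs the functor on each chunk in its own thread.
template <typename Functor>
void SketchFor(std::size_t first, std::size_t last, std::size_t grain, Functor& functor)
{
  assert(grain > 0);
  std::vector<std::thread> workers;
  for (std::size_t begin = first; begin < last; begin += grain)
  {
    const std::size_t end = std::min(begin + grain, last);
    // The functor is compiled host code; each thread just calls it.
    workers.emplace_back([&functor, begin, end]() { functor(begin, end); });
  }
  for (std::thread& worker : workers)
  {
    worker.join();
  }
}

// A typical functor shape: it stores a plain host pointer and dereferences it.
struct SquareFunctor
{
  std::vector<int>* Data; // points into system memory

  void operator()(std::size_t begin, std::size_t end) const
  {
    for (std::size_t i = begin; i < end; ++i)
    {
      (*this->Data)[i] = static_cast<int>(i * i);
    }
  }
};
```

Chunks never overlap, so the threads write to disjoint elements and no locking is needed. Nothing in this functor is specific to a backend, but both the machine code and the pointer it holds only make sense on the CPU side.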

I agree that it is possible to write functors for the GPU, but that is a special case, not the general case. All the filters in VTK that use vtkSMPTools use functors that are compiled for the CPU. Adding a GPU back-end won’t magically make CPU code run on the GPU.