Is it possible to implement vtkImageReslice on the GPU?

Hello, everyone. I’d like to ask the community for help with the following question: is it possible to implement vtkImageReslice on the GPU?

I have a proposed solution to this problem, but I may be missing something.

The algorithm is as follows:

Let’s assume there is an algorithm that takes a 3D voxel matrix and two parallel cutting planes as input. The goal is to obtain a 2D surface by interpolating voxel values between the planes.

The solution is as follows:

  1. Calculate the width and height of the resulting 2D surface, as well as the vertical and horizontal spacing of this surface.
  2. With a step equal to the spacing, create rays perpendicular to the cutting planes that will cover the entire 2D surface.
  3. Using Bresenham’s algorithm, calculate which voxels each ray intersects between the cutting planes, and, using a specified interpolation method, compute the resulting value (e.g., the average of all intersections).

Bresenham’s algorithm is well-suited for GPU architecture, as each ray can be calculated in parallel.
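To make the idea concrete, here is a minimal CPU sketch of this per-ray scheme (all names are hypothetical: the volume is assumed to be a dense array with unit spacing, `normal` a unit vector pointing from the front plane to the back plane, and `slabThickness` given in voxel units). On the GPU, the body of the double loop would become one thread per output pixel:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

static Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 mul(Vec3 a, double s) { return {a.x * s, a.y * s, a.z * s}; }

// Nearest-neighbor lookup; a trilinear interpolation could be used instead.
static double Sample(const std::vector<float>& vol, const int dims[3], Vec3 p)
{
  int i = std::clamp(int(std::lround(p.x)), 0, dims[0] - 1);
  int j = std::clamp(int(std::lround(p.y)), 0, dims[1] - 1);
  int k = std::clamp(int(std::lround(p.z)), 0, dims[2] - 1);
  return vol[(k * dims[1] + j) * dims[0] + i];
}

// One ray per output pixel, marching from the front plane to the back plane
// and keeping the maximum sample (the SetSlabModeToMax analogue).
void SlabResliceMax(const std::vector<float>& vol, const int dims[3],
                    Vec3 origin, Vec3 right, Vec3 down, Vec3 normal,
                    double slabThickness, int width, int height,
                    std::vector<float>& output)
{
  const int nSteps = std::max(1, int(slabThickness * 2)); // ~2 samples/voxel
  output.assign(size_t(width) * height, 0.0f);
  // Each iteration of the double loop is independent: on the GPU this maps
  // directly to one thread (or one fragment) per output pixel.
  for (int y = 0; y < height; ++y)
  {
    for (int x = 0; x < width; ++x)
    {
      Vec3 start = add(add(origin, mul(right, x)), mul(down, y));
      double maxVal = Sample(vol, dims, start);
      for (int i = 1; i <= nSteps; ++i)
      {
        Vec3 p = add(start, mul(normal, slabThickness * i / nSteps));
        maxVal = std::max(maxVal, Sample(vol, dims, p));
      }
      output[size_t(y) * width + x] = float(maxVal);
    }
  }
}
```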

How correct would it be to use such an algorithm for reslicing? Are there any disadvantages to such an approach?


A nice way to do this is to use the volume rendering infrastructure with the near clipping plane as your slice plane. You can set the transfer function to match what the window/leveled image reslice would have given you. If the volume is fully opaque, the rays terminate at the first voxel they hit and the technique is very efficient. This has the advantage of cleanly generalizing to thick slab reconstruction, e.g. with MIP, and provides a nice foundation for integrating the display of other 3D content in slice planes.

This is what we’ve done in cornerstone3D which is based on vtk.js, but I’m sure there are other examples too.

Thanks for the response! It’s a good point.
But I need a fast solution and don’t want to render the whole volume; I only need interpolation of the voxels between two slice planes, as in the case of vtkImageReslice. So I actually have not only a near slice plane but also a far one, and I take into account all voxels between the planes. I’m just looking for a solution that will be faster than vtkImageReslice.

Hi Anton, in your third step, I wasn’t sure why you wanted “the average among all intersections” between the two planes. Is your intention to do thick-slice (or slab) rendering, like vtkImageReslice::SetSlabModeToMean()?

Hi David. Actually, I wrote that only as an example; for now I use SetSlabModeToMax.

Hi Anton, then my only comment is that there is no advantage to using Bresenham on the GPU, since you can efficiently shoot the ray using floating-point and do either 3D interpolation or (via shear warp) 2D interpolation. Using Bresenham would be more-or-less equivalent to nearest-neighbor interpolation unless you assigned weights to the voxels, but if you do that, you might as well just use the GPU’s built-in interpolation capabilities.
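For illustration, here is a minimal trilinear interpolation sketch, i.e. the kind of 3D interpolation that GPU texture hardware performs for you in a single fetch (`vol` and `dims` are hypothetical: a dense array with unit spacing; the caller must keep the coordinates at least one voxel inside the bounds):

```cpp
#include <cmath>
#include <vector>

double Trilinear(const std::vector<float>& vol, const int dims[3],
                 double x, double y, double z)
{
  int i = int(std::floor(x)), j = int(std::floor(y)), k = int(std::floor(z));
  double fx = x - i, fy = y - j, fz = z - k;
  auto at = [&](int a, int b, int c) {
    return double(vol[(c * dims[1] + b) * dims[0] + a]);
  };
  // Blend the 8 surrounding samples; the weights sum to 1.
  double c00 = at(i, j,     k    ) * (1 - fx) + at(i + 1, j,     k    ) * fx;
  double c10 = at(i, j + 1, k    ) * (1 - fx) + at(i + 1, j + 1, k    ) * fx;
  double c01 = at(i, j,     k + 1) * (1 - fx) + at(i + 1, j,     k + 1) * fx;
  double c11 = at(i, j + 1, k + 1) * (1 - fx) + at(i + 1, j + 1, k + 1) * fx;
  double c0 = c00 * (1 - fy) + c10 * fy;
  double c1 = c01 * (1 - fy) + c11 * fy;
  return c0 * (1 - fz) + c1 * fz;
}
```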

Thanks, David. I’m just thinking that if we have enough rays to cover the entire projection surface, then we only need to decide which traversal method to use in order to find all the voxels intersected by each ray. If we know all the intersected voxels, we can choose the one with maximum intensity (similar to SetSlabModeToMax) and use it as the output pixel. And if we have one ray per pixel, we can compute all pixels in parallel on the GPU. Maybe there is a better traversal method than Bresenham’s algorithm, or a modification of it for floating-point. The bottom line is that such an approach should be faster than any method implemented on the CPU. Is that correct?

Yes, it could easily be more than 100x faster on the GPU than on the CPU.

David, could you suggest a suitable ray traversal algorithm for such an approach? At first glance, it seems that Bresenham’s algorithm could be adapted to this problem (by modifying it for floating-point).
I was also surprised that there is no ready-made solution: I checked VTK, ITK, and some other frameworks and didn’t find one.

Like you, I would define the “ray” as a line segment from a pixel on the front plane to the corresponding pixel on the back plane. Then I would choose a step size that is at most 1/2 the voxel spacing, and step along the ray, doing interpolation and compositing (using “max” for compositing).

I usually conceptualize ray traversal as follows. Let ‘A’ and ‘B’ be the end points. All points on the line are weighted averages of the ends. So if you want to do the traversal in n+1 steps (i = 0, 1, ... n) then:

P_i = (1.0 - i / n) * A + (i / n) * B

Or instead, use an increment vector ‘V’ and do the traversal with

P_(i+1) = P_i + V

which is more efficient but roundoff error increases with each step.
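A minimal sketch of both traversal forms (the endpoints and step count are placeholders, and the sampling/compositing is left as a comment):

```cpp
// Sketch of the two traversal forms: n steps from endpoint A to endpoint B.
struct Vec3 { double x, y, z; };

void Traverse(Vec3 A, Vec3 B, int n)
{
  // Form 1: each sample is a weighted average of the endpoints.
  // Recomputing the point from t avoids accumulating roundoff.
  for (int i = 0; i <= n; ++i)
  {
    double t = double(i) / double(n);
    Vec3 P = { (1.0 - t) * A.x + t * B.x,
               (1.0 - t) * A.y + t * B.y,
               (1.0 - t) * A.z + t * B.z };
    // interpolate the volume at P and composite with max() here
  }

  // Form 2: incremental traversal with step vector V = (B - A) / n.
  // One add per axis per step, but roundoff error grows with each step.
  Vec3 V = { (B.x - A.x) / n, (B.y - A.y) / n, (B.z - A.z) / n };
  Vec3 P = A;
  for (int i = 0; i <= n; ++i)
  {
    // sample and composite at P here
    P.x += V.x; P.y += V.y; P.z += V.z;
  }
}
```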

Regarding Bresenham, I suspect that you like the idea that you know exactly what voxels you hit. As for myself, I don’t find the idea of “hitting voxels” to be a useful concept. I conceptualize the “voxels” as infinitely small sample points, therefore the ray never “hits” them, it always passes between them. So some kind of interpolation is always necessary (even if it’s just nearest-neighbor interpolation).

In any case, creating a new implementation should not be necessary. The VTK volume mappers already support this, though I didn’t find an example:

vtkVolumeMapper::SetBlendModeToMaximumIntensity()

By using a volume mapper in MIP mode, and adding clipping planes for your front/back slice planes, you might be able to get VTK to do what you need. You will probably also want to set the VTK camera to ParallelProjectionOn() so that you get parallel rays through the volume, rather than diverging rays.
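A minimal sketch of that setup, assuming an existing vtkImageData and vtkRenderer (the plane origins/normals and the transfer-function range are placeholders that you would compute from your slice geometry and data):

```cpp
#include <vtkCamera.h>
#include <vtkColorTransferFunction.h>
#include <vtkGPUVolumeRayCastMapper.h>
#include <vtkImageData.h>
#include <vtkPiecewiseFunction.h>
#include <vtkPlane.h>
#include <vtkRenderer.h>
#include <vtkSmartPointer.h>
#include <vtkVolume.h>
#include <vtkVolumeProperty.h>

void SetupMipSlab(vtkImageData* imageData, vtkRenderer* renderer)
{
  auto mapper = vtkSmartPointer<vtkGPUVolumeRayCastMapper>::New();
  mapper->SetInputData(imageData);
  mapper->SetBlendModeToMaximumIntensity(); // MIP compositing

  // Front and back slab planes; each normal points toward the kept region.
  auto front = vtkSmartPointer<vtkPlane>::New();
  front->SetOrigin(0.0, 0.0, 10.0); // placeholder slice geometry
  front->SetNormal(0.0, 0.0, 1.0);
  auto back = vtkSmartPointer<vtkPlane>::New();
  back->SetOrigin(0.0, 0.0, 15.0);
  back->SetNormal(0.0, 0.0, -1.0);
  mapper->AddClippingPlane(front);
  mapper->AddClippingPlane(back);

  // Grayscale ramp; adjust the scalar range to your data.
  auto color = vtkSmartPointer<vtkColorTransferFunction>::New();
  color->AddRGBPoint(0.0, 0.0, 0.0, 0.0);
  color->AddRGBPoint(1000.0, 1.0, 1.0, 1.0);
  auto opacity = vtkSmartPointer<vtkPiecewiseFunction>::New();
  opacity->AddPoint(0.0, 0.0);
  opacity->AddPoint(1000.0, 1.0);
  auto property = vtkSmartPointer<vtkVolumeProperty>::New();
  property->SetColor(color);
  property->SetScalarOpacity(opacity);
  property->SetInterpolationTypeToLinear();

  auto volume = vtkSmartPointer<vtkVolume>::New();
  volume->SetMapper(mapper);
  volume->SetProperty(property);
  renderer->AddVolume(volume);

  // Parallel projection gives parallel rays through the volume.
  renderer->GetActiveCamera()->ParallelProjectionOn();
}
```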

Hi, David. Correct me if I’m mistaken. In the case of vtkVolumeMapper, the final result will be a 3D volume that is cut by the camera planes at the final stage. So the pipeline is as follows:

  1. Create the whole volume with vtkGPUVolumeRayCastMapper (this is done once)
  2. Set the blend mode to maximum intensity
  3. Set up a grayscale color table
  4. Set up the camera transform and the camera near and far planes.

And the result will be the same as with vtkImageReslice?

For the two planes, I was actually thinking of the mapper’s AddClippingPlane(vtkPlane) method. But setting the camera near/far planes might also work.

The result will be rendered on the screen, not output as a vtkImageData, so it isn’t really a drop-in replacement for vtkImageReslice.

Thanks. Will it be fast, given that we create the 3D volume once and then only change the camera transforms and slice planes?

In theory, yes. I’ve never tried it myself, though.

Thanks David, I will try

Using clipping planes should work well. You can set the clipping planes to a fixed position (where you want your image slice to appear) and shift/rotate the volume actor to reslice.
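A minimal sketch of this approach, assuming the clipping planes have already been set up as above (the rotation angle and shift are placeholders):

```cpp
#include <vtkSmartPointer.h>
#include <vtkTransform.h>
#include <vtkVolume.h>

// Keep the clipping planes fixed where the slice should appear, and reslice
// by transforming the volume actor underneath them instead.
void ResliceByMovingActor(vtkVolume* volume, double angleDeg, double shift)
{
  auto transform = vtkSmartPointer<vtkTransform>::New();
  transform->RotateX(angleDeg);          // rotate the volume under the fixed planes
  transform->Translate(0.0, 0.0, shift); // step through the volume
  volume->SetUserTransform(transform);   // vtkProp3D applies this at render time
}
```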

The main drawback (and the reason we haven’t implemented this kind of reslicing in 3D Slicer) is that we still need to maintain the entire CPU-based pipeline for cases where the volume is large or the GPU capabilities are limited. It also requires you to develop and maintain low-level GPU code if you want higher-order texture interpolation, fusion with additional layers, etc. Nowadays it is not clear what that “GPU code” should be implemented in: it may not be worth investing in OpenGL shaders, CUDA is only for NVIDIA hardware, OpenCL’s future is uncertain, and VTK does not support WebGPU yet.

So, using the GPU for image reslicing can make sense for a project where you have control over what hardware and operating system your users have and what kind of data they can load, and where you can afford to reimplement all your custom GPU code using the WebGPU API in a couple of years.

Thanks, Andras, for the explanation. I have suitable hardware for such an approach. I will try the vtkVolumeMapper-based solution with camera transformations and slice planes; I expect it to give the same result as the CPU-based vtkImageReslice, but with much higher performance.

Hi Andras, I noticed that if I use more than one vtkOpenGLGPUVolumeRayCastMapper, memory consumption increases. I guess this happens because the image data is loaded into GPU memory each time I use a new vtkOpenGLGPUVolumeRayCastMapper, but in my case the data is the same for each mapper. Is there an option to share memory between vtkOpenGLGPUVolumeRayCastMapper instances?

Sharing the volume’s texture buffer between OpenGL contexts is a must if you have multiple views. A couple of years ago this required some VTK changes, but things might have changed since then.

It seems that you want to implement a complex 3D viewer application that uses multiple views. I would recommend not redeveloping this from scratch (based on pure VTK) but customizing/extending an open-source application that already does this. For example, the VTK-based CustusX reslices images using shaders.

Hi, Andras, thank you for the answer. You are right, I’m working on a viewer, and actually everything is ready in the application; I just need to speed up the reslicing process, and vtkOpenGLGPUVolumeRayCastMapper is a great alternative to vtkImageReslice. But I see that there is no way to share a 3D texture between different mappers. How does CustusX handle this problem? It seems to be outdated. We use one vtkExternalOpenGLRenderWindow per view (Axial, Sagittal, Coronal, etc.), and these windows should share the same OpenGL context.