Volume rendering significantly slower in C++ compared to Python

I am trying to render a volume read from an NRRD file, using VTK 8.2. The file is about 700 MB, and simply loading and displaying it in Python happens quite quickly, with the first frame rendering in under a second. However, when I run what I believe to be the equivalent C++ code, the time to render that first frame is closer to 20 seconds.
My guess is that the C++ code is doing some conversions or casts that the Python is not, but I'm not sure where, or how I could suppress them. Does anyone have any ideas?

The fast Python:

#!/usr/bin/env python

import vtk

reader = vtk.vtkNrrdReader()

data_alpha_range  = [0.0, 28.0]
data_colour_range = [-25.0, 40.0]

opacity_transfer_function = vtk.vtkPiecewiseFunction()
opacity_transfer_function.AddPoint(data_alpha_range[0], 1.0)
opacity_transfer_function.AddPoint(data_alpha_range[1], 1.0)

colour_transfer_function = vtk.vtkColorTransferFunction()
colour_transfer_function.AddRGBPoint(data_colour_range[0], 0.0, 0.0, 0.0)
colour_transfer_function.AddRGBPoint(data_colour_range[1], 1.0, 1.0, 1.0)

volume_property = vtk.vtkVolumeProperty()

volume_mapper = vtk.vtkGPUVolumeRayCastMapper()

volume = vtk.vtkVolume()

ren1 = vtk.vtkRenderer()
ren1.SetBackground(0.0, 0.0, 0.0)

renWin = vtk.vtkRenderWindow()
renWin.SetSize(800, 800)

iren = vtk.vtkRenderWindowInteractor()


The slow C++:

#include <vtkSmartPointer.h>
#include <vtkNrrdReader.h>
#include <vtkPiecewiseFunction.h>
#include <vtkColorTransferFunction.h>
#include <vtkVolumeProperty.h>
#include <vtkGPUVolumeRayCastMapper.h>
#include <vtkVolume.h>
#include <vtkRenderer.h>
#include <vtkRenderWindow.h>
#include <vtkRenderWindowInteractor.h>

#include <cstdlib> // for EXIT_SUCCESS

int main(int, char *[])
{
    auto reader = vtkSmartPointer<vtkNrrdReader>::New();

    double data_alpha_range[2] = { 0.0, 28.0 };
    double data_colour_range[2] = { -25.0, 40.0 };

    auto opacity_transfer_function = vtkSmartPointer<vtkPiecewiseFunction>::New();
    opacity_transfer_function->AddPoint(data_alpha_range[0], 1.0);
    opacity_transfer_function->AddPoint(data_alpha_range[1], 1.0);

    auto colour_transfer_function = vtkSmartPointer<vtkColorTransferFunction>::New();
    colour_transfer_function->AddRGBPoint(data_colour_range[0], 0.0, 0.0, 0.0);
    colour_transfer_function->AddRGBPoint(data_colour_range[1], 1.0, 1.0, 1.0);

    auto volume_property = vtkSmartPointer<vtkVolumeProperty>::New();

    auto volume_mapper = vtkSmartPointer<vtkGPUVolumeRayCastMapper>::New();

    auto volume = vtkSmartPointer<vtkVolume>::New();

    auto ren1 = vtkSmartPointer<vtkRenderer>::New();
    ren1->SetBackground(0.0, 0.0, 0.0);

    auto renWin = vtkSmartPointer<vtkRenderWindow>::New();
    renWin->SetSize(800, 800);

    auto iren = vtkSmartPointer<vtkRenderWindowInteractor>::New();


    return EXIT_SUCCESS;
}

Any help would be appreciated.

Have you built VTK in release mode?

What takes a long time: starting up the application (loading all the DLLs), loading data, or starting the rendering? After the first frame is rendered, is the speed the same in the two environments?

Does your computer have multiple graphics cards? Have you set up your executable to use the same graphics card as in Python?

Yes, VTK has been built in release mode, though I'll admit I had a brief scare that it could have been something so simple.

The part of the application I am timing is just the first render call. So, if the bottom of the C++ example above were expanded to be


auto last = std::chrono::steady_clock::now();

renWin->Render(); // the first render call is what is being timed

auto now = std::chrono::steady_clock::now();
std::cout << "Render time:\t" << std::chrono::duration_cast<std::chrono::microseconds>(now - last).count() << std::endl;


then a sample output on my machine is

Render time:    22879419

Once the first render has happened, however, the speed of both versions seems about the same.

My computer does have multiple graphics cards (it's a laptop), and yes, this is where the problem lies. The default is supposed to be the high-performance card, and when I run the .exe directly, it does load quickly (on par with the Python!). It turns out Visual Studio was defaulting to the integrated graphics, so launching the code from there inherited that setting. If I instead start VS specifically with my high-performance card, I get the expected speed as well.

My next question is this: while trying to debug the graphics card as a possible cause, I ran

std::cout << renWin->ReportCapabilities() << std::endl;

and got the output

OpenGL vendor string:  NVIDIA Corporation
OpenGL renderer string:  GeForce RTX 2080 with Max-Q Design/PCIe/SSE2
OpenGL version string:  4.5.0 NVIDIA 442.23

Why would ReportCapabilities() report a different card to the one being used to do the actual rendering?

Since you have everything ready to investigate this problem, it would be great if you could give it a try and debug this (and come up with a fix, if it turns out to be a VTK bug).

Just a follow-up on this: it turns out it was indeed a Debug/Release problem. When I went through the list of recommendations from @lassoan, the first thing I did was change which VTK DLLs were being loaded. Of course, I had just been setting my PATH to pick which DLLs were used (a mistake I have vowed never to repeat). When I updated that, I didn't re-open Visual Studio, so runs launched from there were still using the Debug DLLs, while every run after that used the Release DLLs. My copy of vtkpython was, of course, in the Release directory, which is why things loaded quickly there. VTK is also handily using the better graphics card all the time, and I'm not even sure I can force it to use the integrated graphics.

Thankfully, this means there isn’t a bug in VTK, either. Thank you for the help Andras!
