Update of vtkExtractVOI slow in sagittal direction

Hello everyone!

I am developing a DICOM viewer/segmentation software in C++, using vtk and vtk-dicom to load the images and OpenCV to process them within an Qt GUI.

For the initial loading process, I let a vtkDICOMDirectory browse and sort files in the given input directory, and pass the resulting filenames then to a function where a vtkDICOMReader reads the datasets. The resulting vtkImageDatas are then processed by three function, each of them handling a different axis where the volume is sliced along. The result is then converted to Vectors of cv::Mats and then further processed.

Everything works great and as intended. The functions can even be executed parallel via the Qt concurrent framework. The results are always as expected.

But in comparison to slicing for example in coronal and transverse plane, where the whole volume takes 0.3s and 0.9s respectively, the update call for the sagittal direction takes around 9.6s. Am I doing something wrong by design, or is this expected behavior?

The code for one of my slicing functions is the following:

cv::Mat DICOMReaderInterface::sliceAndConvertVTKVolumeX(const vtkSmartPointer<vtkImageData> &vtkVolume, int slice) {

int imageDimensions[3] = {0, 0, 0};
vtkVolume->GetDimensions(imageDimensions);
int imageHeight = imageDimensions[1];
int imageSlices = imageDimensions[2];

int numberOfImageChannels = vtkVolume->GetNumberOfScalarComponents();
int cvType = CV_16SC1;
if(numberOfImageChannels == 1){
    cvType = CV_16SC1;
}else if (numberOfImageChannels == 3){
    cvType = CV_16SC3;
}else{
    cvType = CV_16SC4;
}

vtkSmartPointer<vtkExtractVOI> extractVOI = vtkSmartPointer<vtkExtractVOI>::New();
extractVOI->SetInputData(vtkVolume);
extractVOI->ReleaseDataFlagOn();
extractVOI->SetVOI(slice,slice,0,imageHeight-1,0,imageSlices-1);
extractVOI->Update();

cv::Mat resultingCVMat = cv::Mat(imageSlices,imageHeight , cvType, extractVOI->GetOutput()->GetScalarPointer());

cv::Mat allignedMat;
cv::transpose(resultingCVMat,resultingCVMat);
cv::flip(resultingCVMat,allignedMat, 0);
convertCVSlice(allignedMat);

return allignedMat;
}

Thank you in advance!
Greetings

Significantly slower performance is expected along the axis where you cannot retrieve arrays of voxels at once, but 10x longer seems a bit too much. Maybe you run your code in debug mode? Or maybe there is something in vtkExtractVOI that is extremely inefficient - you may try vtkImageReslice instead.

Note that starting a Qt/VTK desktop application for medical image segmentation from scratch made a lot of sense 10-20 years ago, when desktop was the king and no good application platforms existed for medical image computing. Today, if you want a lightweight application then your users will demand it to run in the web browser (see for example what the OHIF viewer does). Your only way to justify a desktop application today is to offer high-end features that web viewers just cannot match. But it takes significant amount of time to develop that minimum feature set. To avoid reimplementing tons of basic features, I would suggest to leverage the huge amount of preexisting work by building on and improving free open-source Qt/VTK-based medical image computing application platforms such as 3D Slicer or MITK.

1 Like

Firstly, thank you so much for your quick response!

The measurements were retrieved from release unfortunately. I maybe need to correct myself here as well. It’s not a single update call which takes 9.6s, its the whole volume, consisting of 512 slices in x direction, sliced along this axis, which takes that amount of time.
For vtkImageReslice - i stumbled upon that as well, but the documentation warns about the following: “This filter is very inefficient if the output X dimension is 1.”
Which would probably lead to an comparable result, and makes absolutely sense considering your explanation. Would you suggest trying it anyways?
Overall everything works perfectly, it’s just the time/speed that sometimes puts me off.

Also thank you for your additional input. I read a lot about MITK while researching libraries, etc. but thought it would be a little bit too overwhelming to start right from there. Maybe i give it another shot.

Thank you again!

One solution to this problem is to throw memory at it. After you finish processing the data, you can generate three copies in three different orientations (vtkImageReslice can be used to change the orientation, or you could do it in OpenCV or numpy). That way the slice extraction will be fast for all orientations.

2 Likes

Thank you for your response!

That’s a clever approach I haven’t thought about at all. To clarify, you mean after receiving the vtkImageData from the reader, make two copies of it, each one pointing in a different direction where slicing in z-direction would lead to a faster result?

And, may I ask you a off topic question? While we are at times and speed:
For flat a folder with 845 *.dcm files, laying on a regular HDD, having around 425mb all together, the vtkDICOMDirectory needs around 14s for the update call. Is this duration as roughly expected?

Thank you in advance!

On my system, using vtkDICOMDirectory on 500 files takes 5s on a 7200rpm HDD, 1s on an SDD, and 0.05s on a RAM disk. So 14s for a 845 files on a 4200rpm or 5400rpm HDD seems reasonable.

1 Like

I managed to try what you were suggesting. I was able to cut down the slicing time for 512 slices to around 1s along the x-axis. y-axis stays around at 1s and z 0.3s Definitely a success! Aligning the volumes for slicing in x and y takes around 1.3s

Thank you again!

I’ll share my code how i rotated the volumes if someone else needs it in the future. Please let me know if I can improve the code and maybe did something not according to vtk best practices.

// read series with vtkDICOMReader
reader->SetFileNames(dicomdir->GetFileNamesForSeries(series));
reader->SortingOff();
reader->ReleaseDataFlagOn();
reader->Update();

vtkSmartPointer<vtkImageData> currentSeriesVTKZ = reader->GetOutput();

double center[3];
currentSeriesVTKZ->GetCenter(center);
int extent[6];
currentSeriesVTKZ->GetExtent(extent);
double origin[3];
currentSeriesVTKZ->GetOrigin(origin);
double spacing[3];
currentSeriesVTKZ->GetSpacing(spacing);

// Rotation around the y-axis
vtkSmartPointer<vtkTransform> transformY = vtkSmartPointer<vtkTransform>::New();
transformY->Translate(center[0], center[1], center[2]);
transformY->RotateWXYZ(-90, 0, 1, 0);
transformY->Translate(-center[2], -center[1], -center[0]);

// Rotation around the x-axis
vtkSmartPointer<vtkTransform> transformX = vtkSmartPointer<vtkTransform>::New();
transformX->Translate(center[0], center[1], center[2]);
transformX->RotateWXYZ(-90, 1, 0, 0);
transformX->Translate(-center[0], -center[2], -center[1]);

// execute translation and rotation around Y
vtkSmartPointer<vtkImageReslice> resliceY = vtkSmartPointer<vtkImageReslice>::New();
resliceY->SetInputData(currentSeriesVTKZ);
resliceY->SetResliceTransform(transformY);
resliceY->SetOutputSpacing(spacing[2],spacing[1],spacing[0]);
resliceY->SetOutputOrigin(origin[2],origin[1],origin[0]);
resliceY->SetOutputExtent(extent[4],extent[5],extent[2],extent[3],extent[0],extent[1]);
resliceY->Update();
vtkSmartPointer<vtkImageData> currentSeriesVTKY = resliceY->GetOutput();

// execute translation and rotation around X
vtkSmartPointer<vtkImageReslice> resliceX = vtkSmartPointer<vtkImageReslice>::New();
resliceX->SetInputData(currentSeriesVTKZ);
resliceX->SetResliceTransform(transformX);
resliceX->SetOutputSpacing(spacing[0],spacing[2],spacing[1]);
resliceX->SetOutputOrigin(origin[0],origin[2],origin[1]);
resliceX->SetOutputExtent(extent[0],extent[1],extent[4],extent[5],extent[2],extent[3]);
resliceX->Update();
vtkSmartPointer<vtkImageData> currentSeriesVTKX = resliceX->GetOutput();

I’m not sure if its better to set input connections or set the input data directly for the reslicer.
Also, I’m not sure if you really need two vtkImageReslicer, but when updating one a second time, after providing the new rotation data, I get the same volume two times. The same goes for looping with a instantiated vtkDICOMReader through for example five series. I get the same series five times, so I tend to create a new vtkObject within a vtkSmartPointer for every iteration. I hope this is correct.

I recommend SetInputData() unless you need to make a pipelined connection. The difference is this: with SetInputData() you have to call Update() on the input source yourself. But with SetInputConnection() the pipeline will automatically update the input according to rules that you can find in the VTK textbook and elsewhere.

A VTK filter “owns” its output. So every time you update a filter, it will overwrite its output. With filters like vtkImageReslice, repeated updates generally don’t require any memory to be freed or allocated, so the memory is simply reused. This is very efficient when VTK is used to build dynamically updating pipelines (which is what VTK was originally designed for).


I recommend against using RotateWXYZ(90, 0, 1, 0) to create a permutation transformation since it’s inexact:

>>> transformY.RotateWXYZ(-90, 0, 1, 0)
>>> print(transformY.GetMatrix())
vtkMatrix4x4 (0x561682cac8e0)
  Debug: Off
  Modified Time: 112
  Reference Count: 2
  Registered Events: (none)
  Elements:
    2.22045e-16 0 -1 0 
    0 1 0 0 
    1 0 2.22045e-16 0 
    0 0 0 1 

You can build the permutation transformation directly:

const double permutationY[16] = {
  0.0, 0.0,-1.0, 0.0,
  0.0, 1.0, 0.0, 0.0,
  1.0, 0.0, 0.0, 0.0,
  0.0, 0.0, 0.0, 1.0,
};
transformY->Concatenate(permutationY);

One reason for doing this is that the vtkImageReslice code has a fast path specifically for permutation matrices.

1 Like

Thank you again for taking your time to answer and explain.

This means even when updating a filter, receiving it’s output for a vtkImageData object, updating it again, and receiving the output for another vtkImageData object again, the content for both will be same because the smart-pointer points to the address of the filters content, which will then be the same for both images, logically?

I will adjust my code according to you suggestion concerning the permutation matrix, interesting detail. Thanks!

Edit: The performance improvement using a permutation matrix is incredibly, thank you!