Working with whole slide images in VTK

Hi everyone. I am new to VTK and I’ve been trying to apply it to whole slide images (WSI) in digital pathology. This has presented some interesting challenges and I would be interested to know how some of these challenges should be approached in the context of VTK.

  1. Firstly, I needed to be able to read in data from virtual slides obtained from different whole slide scanners. This was done using VTK's Microscopy domain module in tandem with OpenSlide, and it works fine: I can read in small images in these formats and display them with no problem.

  2. The next issue I’ve encountered is rendering large, high-resolution WSIs for display and interaction. My approach is to use image pyramids. I first tried to view the whole image using vtkImageResize to drop the resolution. This worked to an extent, but it’s very slow and the lag is substantial. I’ve looked at other filters such as vtkImageResample, but this is also slow. Is there an alternative filter I should be looking at, or am I missing something else? To view regions of the image at high resolution, I want to generate tiles so that only the relevant sections of the image are read. I’m thinking of using an array of actors to break the image into tiles and display only the tiles that are in view of the window. I tried to generate a single tile using vtkImageClip. This works, but again it seems slow. Is there another approach I should take to generate tiles?


Hi Jason,

I used to do some large-image rendering in VTK. I had started working on a multi-resolution pyramid approach but unfortunately my work with microscopy ended before I could finish it.

Some pointers:

  • The vtkImageResize filter is the fastest way to resize an image in VTK, but by default it uses a windowed-sinc interpolator. To increase its speed, you can use its SetInterpolator() method to provide a vtkImageInterpolator (e.g. set to linear interpolation) rather than the default vtkImageSincInterpolator.
  • The vtkImageResize filter is also able to crop images (though its interface isn’t the best).
  • For displaying large images, the vtkImageResliceMapper (together with vtkImageSlice) works quite well. As with vtkImageResize, the best performance is achieved by tweaking the interpolator that is used. In this case, giving the mapper an interpolator that has the SlidingWindowOn() flag set can speed things up (but only if you aren’t rotating the image).

If you’re a Python user, I can dig through my old code to try to find the Python samples that I used to use for large image display. How big are the images, and how are they stored?

Thanks for the speedy response and advice, David. I will definitely give it a try. I’m currently using C++, but I’m sure I’d be able to translate the logic from Python across, so any samples would be much appreciated.

The images I’m working with at the moment are saved in the .svs format (from a Leica scanner, but I assume the sizes for the other formats will be roughly the same). From what I’ve seen, file sizes range from 600MB to 4.5GB depending on the scanning resolution, with dimensions ranging from 90,000 by 46,000 pixels to 168,000 by 78,000 pixels. It’s quite a challenge working with images this size, but it’s definitely worth exploring the image-pyramid concept for visualization. I see that ITK can create image pyramids, so I could look at how they implement it and see how that translates to the context of VTK, but this again doesn’t address the visualization of a large image.

For a single 2D image, the only interactions I want are zooming and panning, so I think I’d need to set up some sort of caching system that loads tiles pre-emptively ahead of a panning movement?

I’ve interactively viewed 85000x45000 RGB images on 32GB systems without tiling, so 168000x78000 is not much of a stretch given 64GB or more. One useful trick is to cache the image (and optionally its pyramid) as a flat file on disk and then memory-map the file. This ensures that, no matter what the file size, the program will not crash due to running out of memory since the computer will automatically unload whatever parts of the image aren’t being viewed and which don’t fit in RAM. Essentially, a memory-map operation makes it appear to your program as if the entire image is in memory, even if it isn’t.
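The memory-map trick described above can be sketched with NumPy (the cache path and the small image size here are placeholders for illustration; a real slide cache would be orders of magnitude larger):

```python
import os
import tempfile
import numpy as np

# Cache a (small, for illustration) RGB image as a flat file on disk
h, w = 1024, 1024  # a real WSI would be on the order of 100k x 50k
path = os.path.join(tempfile.mkdtemp(), "slide_cache.raw")

cache = np.memmap(path, dtype=np.uint8, mode="w+", shape=(h, w, 3))
cache[:] = 255     # pretend this is the decoded slide data
cache.flush()

# Later (or in another process): map the file read-only. The OS pages
# in only the parts that are actually touched, so even a file much
# larger than RAM can be "opened" this way without exhausting memory.
view = np.memmap(path, dtype=np.uint8, mode="r", shape=(h, w, 3))
region = view[100:200, 100:200]  # touches only a few pages of the file
print(region.shape)  # (100, 100, 3)
```

On the VTK side, something like vtkImageImport with SetImportVoidPointer could presumably wrap such an array without copying it, though that wiring is not shown here.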

Tonight I’ll clean up my sample and post it here.

I’ve created a Python sample here for viewing large images from the command line with very little overhead. It should be able to handle images of up to a gigapixel, but beyond that it might start to stutter depending on the amount of RAM. Be sure to try it on a simple JPEG photo first, before trying it on anything more exotic, and I also recommend trying the Python code as-is at least once (VTK can easily be installed for Python via “pip install vtk”).

@dgobbi This is so impressive. Would you like to add this to the vtk-examples?

If you are too busy I am happy to do it.

Thank you! I will try the Python code out!
I will probably translate it into C++ as well, because I want to give the caching a try in C++.
So, just to clarify the logic of the caching: I could use a flat file which stores the “centered tile” in the rendering window. The centered tile would also store references to the tiles surrounding it, and those tiles would be cached pre-emptively in anticipation of a panning movement. When a pan occurs and the rendering window displays a new centered tile, I would release the old cached memory and cache a new set of tiles. For this to work smoothly, the tiles would need some sort of trigger regions (based on the rendering window) to switch the display to the new centered tile, and the tiles would have overlapping regions so that a trigger region doesn’t display a clipped boundary before showing the next tile. To account for zooming, I could use a Gaussian pyramid with N layers and store each layer in its own flat file, either with a separate cache per layer or with everything lumped into a single cache. As I understand it, there is a trade-off between rendering time and cache memory size: as the cache grows, rendering time decreases, but if one moves quickly to a region that has not been cached, there will be a waiting period while the tiles are loaded and rendered. Something like that??
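The bookkeeping sketched above amounts to a least-recently-used tile cache with neighbor prefetching. A hypothetical sketch (the tile indexing scheme, capacity, and `loader` callback are all made up for illustration):

```python
from collections import OrderedDict

class TileCache:
    """Least-recently-used cache for image tiles.

    `loader(level, ix, iy)` is any callable that reads one tile from
    disk; `capacity` is the trade-off knob from the discussion above:
    a larger cache means fewer slow disk loads during fast panning.
    """

    def __init__(self, loader, capacity=64):
        self.loader = loader
        self.capacity = capacity
        self._tiles = OrderedDict()

    def get(self, level, ix, iy):
        key = (level, ix, iy)
        if key in self._tiles:
            self._tiles.move_to_end(key)     # mark as recently used
            return self._tiles[key]
        tile = self.loader(level, ix, iy)    # slow path: hit the disk
        self._tiles[key] = tile
        if len(self._tiles) > self.capacity:
            self._tiles.popitem(last=False)  # evict least recently used
        return tile

    def prefetch(self, level, ix, iy):
        """Warm the cache around a centered tile ahead of panning."""
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                self.get(level, ix + dx, iy + dy)
```

The prefetch of the eight surrounding tiles plays the role of the “trigger regions” described above, without needing overlap handling in the display logic itself.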

So I have quickly re-educated myself on the topic of whole slide images and how the scanning companies store them. It turns out the file formats already store the image levels of an image pyramid. The Domains/Microscopy module, which uses OpenSlide to read these formats, reads in the base (highest-resolution) level by default. However, OpenSlide lets you choose which level (and hence which resolution) to read. This choice is not currently exposed by the vtkOpenSlideReader class, so one may be better off incorporating OpenSlide directly. This would save time, as we wouldn’t have to resize the image at all.
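For illustration, the level-selection logic could look like the helper below; the commented openslide-python usage and the file path are assumptions, not tested against a real slide:

```python
def pick_level(level_downsamples, target_downsample):
    """Return the index of the pyramid level with the largest
    downsample factor that does not exceed the target
    (index 0 is the full-resolution base level)."""
    best = 0
    for i, downsample in enumerate(level_downsamples):
        if downsample <= target_downsample:
            best = i
    return best

# With the openslide-python bindings this could be used roughly as:
#   import openslide
#   slide = openslide.OpenSlide("example.svs")        # placeholder path
#   level = pick_level(slide.level_downsamples, 32.0)
#   width, height = slide.level_dimensions[level]
#   image = slide.read_region((0, 0), level, (width, height))

print(pick_level([1.0, 4.0, 16.0, 64.0], 32.0))  # 2
```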

I hadn’t ever looked at vtkOpenSlideReader before, but yes, it only seems to support reading the highest-resolution image and doesn’t really take advantage of the features offered by these slide files. It would be great if someone could improve vtkOpenSlideReader to make it possible to select the level (hint: that person could be you, if you’re willing).

Regarding “trigger regions” etc., I would not go in that direction if I were to implement tiling myself. I would prefer to memory-map the tiles so that the OS kernel loads each tile automatically when a memory access occurs within that tile. This takes advantage of the years of work that OS designers have put into optimizing disk caching. There is a disadvantage to this approach, though: it requires reading the whole file beforehand to create the cache.

Hi Andrew, I’ll be able to add it sometime in the next couple weeks. I’ve still got a branch with changes from the last hackathon that I need to merge into the repository.

Thanks @dgobbi , I’ll look forward to it. Please note that a few more python examples have been added since then and some have been reworked.