Proposal to add orientation to vtkImageData- Feedback Wanted

ken-martin · January 24, 2019, 4:45pm

Summary: We are considering adding orientation into vtkImageData to support rotations and inversions but not shears. We are looking for feedback on the idea and the two approaches listed below.

This document was created based on the Slicer wiki Labs page found here originally written by @lassoan with input from @jcfr and later updated based on comments from @ken-martin , @demarle , @berkgeveci , @lisa-avila and @jcfr. The intent of posting this is to get feedback on what the community feels about this idea and specific issues, ideas, etc. So please provide feedback to help direct this concept. As always remember that contributors time and effort is limited, so the goal is to get the most positive impact for the effort people put in. Consider scope and impact when suggesting directions (ha ha, directions – orientation) etc. Thanks!

Motivation

In many fields we often have to work with images that are arbitrarily oriented in space. We’ll refer to images that can have arbitrary orientation as “oriented images”, and images that have voxel coordinate axes aligned with the object coordinate axes as “axis-aligned images”.

VTK’s lack of orientation in vtkImageData requires many workarounds in applications or conversion to a much larger data representation such as vtkStructuredGrid, which increases complexity, probability of errors, and decreases performance:

Many file formats that we use (nrrd, mha, nifti, etc.) store the axis directions, but currently when we load these images, the direction information is lost.
Most VTK filters could be easily extended to accept oriented images as input and/or generate oriented images as output, but currently they only work with axis-aligned images.
When a filter takes multiple images as inputs then those cannot be used with oriented images. There is no workaround, other than resample the images or implement new filters. For example, vtkGridTransform or vtkBSplineTransform classes would need to work with oriented images (coefficients are stored in vtkImageData) because the DICOM standard requires transformation with arbitrarily oriented grids, but if the input is an oriented image as well, there is no common voxel coordinate system where all the input data could be transformed to.
Some filters have additional inputs to be able to pass axis directions (such as Filters that are often used with oriented images (such as vtkImageReslice), but collecting the direction information separately and then injecting it into these filters correctly makes these filters difficult to use.
While actors can make the volumes appear in arbitrary orientation, the orientation information has to be collected from the data reader and sent to the actors manually.
Linear transforms cannot be efficiently applied on vtkImageData because the transformation result cannot be stored in a vtkImageData (when the transformation includes rotation).

Software applications deal with this in various ways – none of them are great solutions:

3D Slicer does not use vtkImageData origin and spacing fields and just stores the VoxelToObject transform (origin, spacing, direction) separately in a 4x4 matrix. This makes implementing VTK-based image processing pipelines quite complicated to implement.
Paraview ignores this problem and assumes that all images are axis-aligned. This makes it challenging for users to use Paraview for their medical images that are often not axis-aligned.
ITK, ITK-Snap, etc. use ITK-based classes for image storage, although they use VTK for visualization.

Design

There are two main design options:

Create a new image data class, such as vtkOrientedImageData and during the transition time use that to distinguish between oriented and axis-aligned data. Once the transition time is over (probably in a few years), the old vtkImageData class can be replaced by vtkOrientedImageData (an alias may be kept for backward compatibility). ITK used this approach, but maybe because ITK objects did not have vtkInformation-type generic description that can be passed through pipelines. The disadvantage of this approach that a new class has to be added, all filters and other classes have to be changed to use this new type, and once the transition is complete (after many years) the vtkOrientedImageData would need to be retired and changed back everywhere to vtkImageData.
Keep using vtkImageData but add direction information to the class and vtkInformation object that already stores origin and spacing information. This approach would be less invasive, requiring less changes to existing code base, but probably it would require a special check in the pipeline for presence of image rotation for a couple of years, until VTK fully supports oriented image data.

To be investigated:

How to manage vtkStructuredPoints (which is a specialization of vtkImageData) and vtkUniformGrid (which is an extension of vtkImageData)?
If a new class is created, should it be child or parent of vtkImageData? All ImageData could be interpreted as a vtkOrientedImageData, so it seems to make sense to make vtkOrientedImageData the base class.

Algorithm feasibility

Implement oriented image class and filters can operate on oriented image data.

This phase is completed. We’ve found that oriented image data support greatly simplified data application code, but it was difficult to use filters that are not orientation-aware. It was not always trivial to update classes - had to spend time with completely understanding how the filter or transform worked (update how normals and derivatives, cutting plane positions and orientations are computed) - to add oriented image data support. However, the filters did not become slower or more complex by making them orientation-aware.

Related code is available here:

Simply implementing vtkOrientedImageData class did not help much because developers need to propagate direction information manually. It is important to make the pipeline accept vtkOrientedImageData.

Updated approach to implement mechanism that allows storing and passing direction data in vtkImageData. It would be nice to implement a mechanism to report error if a filter receives oriented image data as input but it can only work with axis-aligned image data.

Implementation is work in progress in this branch:

Questions

What is orientation specifically? Are we talking euler angles (3dof) or something more general that can include shears and inversions, lefthanded coords etc?

Answer:

The current thinking is that orientation would be a 3x3 matrix limited to an orthonormal basis with inversions, while it would be more general to support shears there is concern that shears would complicate some filters and that they may not be that common compared to rotations and inversions. This is what ITK supports. Clearly this is an important decision so feel free to speak up.

What should be done for vtkStructuredPoints and vtkUniformGrid ?

Currently, vtkImageData is used as base class for vtkStructuredPoints and vtkUniformGrid. If we add orientation to vtkImageData, then these classes also get it. It would be probably generally useful, but it has to be checked what is the preference of heavy users of vtkStructuredPoints and vtkUniformGrid classes (do they want to change their base class and not even have the opportunity to orient the grids, or just disable it in those filters that cannot handle it yet).

Answer: TBD but I think we should have them handle orientation as well

What about vtkRectilinearGrid ?

It is very similar to vtkImageData, should we add orientation to it as well?

Answer:

Yes. There are only a few filters that operate directly on rectilinear grids and many of the required changes will be similar to the changes made to vtkImageData.

One list of top-priority filters

See https://docs.google.com/spreadsheets/d/1cpTFnMqTymmBsWeveU2b1QjjU1d8V27f0HpBZJ-QbGE/edit#gid=1151021196

lorensen · January 24, 2019, 6:15pm

I was involved in the initial ITK implementation. We decided o start with an oriented image then transition that to the regular image.I would not do that again. I recommend adding the meta date to the current image and work from there.

See this discussion:

https://cmake.org/pipermail/insight-users/2008-October/027508.html

Also some early developer discussions:

https://itk.org/Wiki/Proposals:Oriented_Image_Registration

Also, I recommend looking at itk’s implementation and hopefully, vtk’s will be similar.

Finally, the Slicer folks have had lots of experience in using orientation. They are a valuable resource.

Bill

toddy · January 26, 2019, 2:48am

Well I hope I’m not going to completely throw a spanner in the works by suggesting a third option. I admit that I am not familiar with the VTK filter classes, so I’m just shooting the breeze. I’m also assuming that a breaking change is permitted.

In my opinion it seems convenient but illogical to embed orientation/inversion information in an image; convenient in the sense that it allows that information to be propagated through VTK, but illogical in the sense that (to me) an image exists without inherent spatial awareness.

So I would prefer the concept of an image container base class, vtkImageContainer (holding one or more images), that is passed to the filter classes for manipulation of the vtkImageData. At the same time vtkImageData should NOT know about its container, so that the same data could be placed into multiple containers. A sub-class of vtkImageContainer might be vkImageFrame which possesses a transformation matrix property. In this way the filter classes could keep the vtkImageContainer alive (via smart pointers) and allow the orientation/inversion information to be propagated through VTK. A filter that is spatially aware might then interrogate the image container to see if it is a vkImageFrame and adjust its behaviour accordingly. In some cases it might be appropriate to have the image frame perform a transformation before handing the image data to a filter.

With this proposal there would be no need to alter the vtkStructuredPoints, vtkUniformGrid or vtkRectilinearGrid classes and the filters that handle multiple images could share a single voxel coordinate system; the one associated with the vtkImageFrame.

Additionally vtkShearedImageFrame could be introduced as a a sub-class of vtkImageFrame for obvious reasons.

lassoan · January 26, 2019, 3:01am

Thanks for the feedback. Having a separate image container that can store image data with orientation was the solution initially implemented in ITK (itk::OrientedImage). It turned out to be a bad design and they ended up simply adding direction matrix to itk::Image. We could save some time by not going through this exercise again.

toddy · January 26, 2019, 3:44am

I just read the links that @Bill posted and I’m not surprised the way itk::OrientedImage was implemented caused confusion. The devil is in the detail.

Firstly, unlike itk::OrientedImage which implies another image class, the vtkImageContainer class is not an image and is named so that it wouldn’t be mistaken for one.
Secondly, I would suggest that methods like TransformIndexToPhysicalPoint() and TransformPhysicalPointToIndex() be moved to the vtkImageContainer class, where they would make sense, when vtkImageData has no spatial awareness.

lassoan · January 26, 2019, 4:23am

In my opinion it seems convenient but illogical to embed orientation/inversion information in an image

Thanks for sharing your opinion, but based on many years of experience with VTK and ITK and several applications that use them, we have come to the conclusion that image geometry is best specified by origin, spacing, direction, and extents. Less than that is very limiting (we have been suffering from missing direction in VTK for very long time) and more than that is not necessary (ITK proved this).

The additional container classes that you propose would make the design much more complicated for additional flexibility that we don’t need.

kit · January 28, 2019, 8:23am

+1. I frequently work with vtkImageData which is not axis aligned so having an orientation in VTK image data would make my life easier. Not to bothered about which of the two design options.

ken-martin · January 28, 2019, 2:21pm

Container classes definitely have their advantages and some systems make great use of them. vtk-m basically uses this approach in how they access raw data through flexible APIs which can handle all sorts of cases. I do think they work best in cases such as vtk-m where the library has been designed that way from the start. VTK has gone with an approach where the choice of data class (ImageData) heavily defines and constrains the API and data. In that context, adding an incremental feature to the existing ImageData is probably going to be easier (maybe not better) than a broader re-architecture of the data model which can be tricky.

IanL · January 28, 2019, 3:04pm

We frequently encounter data sets with non orthogonal pixel layouts in the medical imaging field. A frequent example would be a sinus CT series acquired with a tilted gantry, but there are other types. We currently have workarounds for this: incorporating the transform into the one given to vtkImageReslice or SetUserMatrix on vtkVolume, so not including skews in the pixel direction transform would not necessarily be an issue for us, but it would be neater if the image data/algorithms/mappers handled this for us behind the scenes.

dgobbi · January 28, 2019, 5:15pm

In my vtk-dicom package, I utilize VTK’s generic meta-data capabilities to attach a matrix to the vtkImageData objects.

Here’s how it works, in broad strokes. I have a header (vtkDICOMAlgorithm.h) that defines the following information key:

// A vector of 16 values that defines a 4x4 transform matrix.
static vtkInformationDoubleVectorKey *PATIENT_MATRIX();

If this info is attached to a vtkImageData, then any algorithm class can read it, use it, and pass it along with or without modification. For my particular implementation, I chose the name “PATIENT_MATRIX()” to indicate that the target coord system was the DICOM patient coord system. If something similar is brought into VTK, then the key would be defined in vtkImageData.h and would be named more generically.

One thing that I will strongly advocate for is a 4x4 matrix (rotation + translation) instead of a 3x3 matrix (rotation only). I’m sure some people will argue that the translation is redundant since vtkImageData already has an Origin, but there are two very good reasons for using 4x4:

1) Practicality of a 4x4 matrix

Both VTK and Slicer are designed to work well with 4x4 matrices. Modifying vtkImageReslice to incorporate an additional 4x4 matrix that it grabs from the pipeline would be trivial: internally, it would just have to be multiplied in with the existing ResliceAxes matrix. Also, all code that I’ve ever written to deal with oriented images in VTK has always done so by associating a vtkMatrix4x4 with the image.

2) Definition of “Origin”

When we implement oriented images in VTK, we want to define a simple transformation that takes our current (unoriented) images into an oriented coordinate system:

[unoriented image] --(transformation)--> [oriented image]

With our current (unoriented) implementation of vtkImageData, it is understood that “Origin” specifies a location in the unoriented coordinate system. Similarly, for vtkProp3D the “Origin” is a location in the original data coordinates while “Position” is a location in rotated (and possibly scaled) coordinates.

If we follow ITK’s approach (which I do not recommend), then Origin becomes a shift that occurs after rotation rather than before. In other words, if we follow ITK’s approach, it’s not a matter of taking VTK’s current image data coords and applying a rotation. Instead we would have to compute the (unoriented) image data coords without the Origin, then apply the rotation, and then add the Origin.

I think it would be easier (and certainly more flexible) if we keep our current VTK definition of Origin, but also allow a shift (in rotated coords) in the last column of the 4x4 matrix.

Considerations for DICOM

In the DICOM 2D Image coordinate system, the first pixel is defined to be at (0,0) or (0,0,0) and the spacing between pixels is given in mm. These original coordinates are rotated according to ImageOrientation(Patient) and shifted according to ImagePosition(Patient) to put the image into the Patient coordinate system. In my suggested implementation, the Orientation and Position are used to build a 4x4 matrix that provides the transformation. The Origin, as currently understood, would be set to zero.

Considerations for NIFTI

The situation with NIFTI is similar to DICOM, except that some implementations choose to incorporate the pixel spacing into the 4x4 matrix (which often works better when dealing with the NIFTI “sform”) while other implementations choose to incorporate the spacing into the untransformed image (which is more intuitive when using the NIFTI “qform”). The vtkNIFTIImageReader currently takes the latter approach.

Considerations for the VTK pipeline

If an “oriented” image is simply an unoriented image with an attached 4x4 matrix, then all that an “orientation aware” filter or mapper needs to do is read and apply the matrix. For vtkImageReslice and the VTK volume mappers this should be easy to achieve, since they already have mechanisms in place to deal with 4x4 matrices.

Summary

My proposal is to add a 4x4 matrix to vtkImageData as an information key. For most medical images, this 4x4 matrix will provide both orientation and position. For oriented images the vtkImageData Origin would most often be set to (0,0,0), but this would not be a requirement. Both the Origin and the shift in the matrix would have to be considered when computing the location of voxels in the oriented image (apply origin -> apply rotation from 4x4 matrix -> apply translation from 4x4 matrix).

ken-martin · January 28, 2019, 6:57pm

What about having the 4x4 matrix be an ivar on ImageData instead of on the information object? That way filters/mappers will know they can count on it being there and it would be a tad faster than an information lookup.

dgobbi · January 28, 2019, 7:42pm

Well, there has to be an information key for it, so that it can be carried by the pipeline the same way as e.g. SPACING. But when it’s attached to the vtkImageData object, yes, an ivar makes sense.

So within the pipeline it can use a vtkInformationDoubleVectorKey with 16 values, and within vtkImageData it can be an ivar of type vtkMatrix4x4. That would make it easy to use. We could use a generic name, e.g. “MATRIX” for the information key and “Matrix” for the ivar.

I might recommend that the key be optional in the pipeline (to indicate unknown or identity matrix), even if the vtkMatrix4x4 was always present in the image.

lassoan · January 29, 2019, 12:37am

I fully agree that conceptually we should use 4x4 homogenous matrix, which defines transformation from IJK to LPS coordinate system. This is what ITK does, too (it just stores the information in several variables).

I don’t mind if IJK to RAS matrix is stored as a single 4x4 matrix or separately (as origin vector, spacing vector, and axis directions 3x3 matrix). Adding just a direction matrix would be more similar to the current implementation, that’s why it seemed to be an obvious choice, but if we all agree that matrix format is used much more often (I would think so) then it would be a good opportunity to change it now.

image->GetMatrix() would be quite ambiguous, as it would not even tell the what is the from and what is the to coordinate system. I would choose a name that indicates the transformation direction, such as GetIJKToXYZMatrix, GetIndexToPositionMatrix(), GetVoxelToWorldMatrix(), or something similar.

It is up to us to decide, but we should make a clear decision now, so code review and update of all existing image filters are done accordingly.

Pro:

It would be convenient to allow storage of some acquisition transformations could be accommodated there
Certain simple pipelines (that all use filters that can handle non-orthogonal direction matrix) could run on sheared images, without the need to handle shearing transformation separately

Con:

We cannot store all acquisition transformations there anyway (e.g., varying slice spacing cannot be handled)
in the acquisition (because then some transformations, such as tilted-gantry acquisitions
Each filter would need to declare if it can deal with non-orthogonal, non-right-handed, maybe even non-homogenous, etc. 4x4 transforms
We would need to add consistency checks (more code, some extra computations, and potential source of errors)

Allowing non-orthogonal direction have advantages in the long term, it is just not clear if we can deal with the extra effort that requires for it in the short term. If I had to choose between having orientation matrix restricted to be right-handed orthogonal, or not having orientation matrix support at all; I would definitely go with the restricted option.

toddy · January 29, 2019, 2:01am

This discussion caused me to look back at some code I wrote years ago for handling multi-point constraints in finite element analysis. In that case I decided to follow the NASTRAN format for a grid point, which references a coordinate system for the input data. So I have reversed my earlier thinking, in the sense that I now see that a coordinate only has meaning when it is associated with a basis.

The great thing about introducing a coordinate system class (rather than just a matrix) is that it abstracts away the details of the data transformation from one coordinate system to another via its own TransformTo(CS, Point) method. Every coordinate system could expose an origin property and an orientation property, even if the underlying transformation is stored as a 4x4 matrix. Additionally non-orthogonal or even curvi-linear coordinate systems can be catered for via polymorphism. It sounds like at least some users could make use of such functionality. In the case of skewed coordinate systems, the 3x3 orientation matrix could be constructed from the first basis vector and the plane containing the second basis vector.

I still think a vtkImageFrame class would be useful for filters that handle multiple vtkImageData where the coordinate system referenced by the image frame could be used to transform each set of image coordinates from their local representations into a unified frame of reference before processing.

lassoan · February 1, 2019, 6:46pm

When we get an image in a non-linear coordinate system (variable slice spacing CT, US in spherical coordinates, etc.), then we typically use VTK transforms and resample image filter to get an image specified on a rectilinear grid. Then all further visualization and processing is done using existing filters. This is acceptable, because almost none of processing or visualization algorithms could be easily made work natively in a non-linear coordinate system (and the few that can, it can be handled as a special case).

We know quite well (based on many years of experience in many projects using multiple software toolkits) that adding image orientation will make application development much easier and further generalization and flexibility is not expected to make much difference (at least not in medical image computing). Maybe the only non-obvious high-level decision to make in VTK now is at what level shearing transforms should be supported.

dgobbi · February 1, 2019, 7:18pm

The problem with shear is that, unless the shear is very mild, it breaks a fundamental connectivity assumption that is built into many VTK algorithms. Let me explain in more detail:

Let’s represent an image voxel by its indices (I,J,K). For the purposes of interpolation, we need to define a small neighborhood, e.g. [(I,J,K), (I,J,K+1), … (I+1,J+1,K+1)], based on the assumption that we can find a voxel’s neighbors by incrementing (or decrementing) an index.

This assumption breaks down if there is a significant amount of shear. Let’s consider the situation of a tilted-gantry CT scan with 0.5 mm pixels but a 2 mm z-spacing between slices. If the gantry tilt is 20 degrees, then the shear causes a shift between adjacent slices of (2mm)*sin(20degrees) = 0.68mm, which is greater than the 0.5mm pixel spacing. So to get to the nearest neighbor in the Z direction, we would in fact have to increment not only the K index, but also the J index.

The VTK image algorithms/mappers that use neighborhoods could, potentially, check for the degree of shear and modify their neighbor definitions as necessary. But for now, they don’t, so using transformations with too much shear will produce incorrect results.

toddy · February 1, 2019, 10:29pm

I imagine this connectivity problem would also arise in orthogonal non-linear coordinate systems, such as a cylindrical coordinate system where the points are distributed too coarsely relative to the curvature/spacing.

lassoan · February 2, 2019, 1:58am

Yes, it is clear that most image processing algorithms would not work correctly with significant shear.

The main reason I would consider allowing it is the convenience of propagating this information along with the image. For example, right after the image comes out of the reader, you could store the transformation matrix, regardless of sheared or not. Returning a separate acquisition transform from the reader would be less convenient and developers might forget about checking it.

Allowing shearing transform at only some places (e.g., in readers and in some filters and mappers) would mean that we may need to add validity checks, more error handling, etc. It is unclear if the convenience of storing shearing transform justify these potential extra complications.

tim-evain · February 6, 2019, 9:43am

Hello folks,

First of all, thanks for considering this feature which is a long standing issue in my opinion

I’m mostly using VTK as the visualization tool of my ITK applications, so here my 2 cents:

Even if VTK and ITK are different tools with specific designs, I think that every effort to harmonize similar concepts would greatly improve usability and synergy.
So I would prefer having the orientation carried with the vtkImageData rather than adding a new class
I don’t mind if the spatial information is split or summarized in a 4x4 matrix, as long as we can easily access origin/spacing/orientation separately
I can’t envision the performance impact of such choice, so this is might be a bad design, but I would allow the setting of non-orthogonal direction in ImageData, then having a check on filters that can’t handle it. Maybe that might requires too much work for now
I think StructuredPoints and UniformGrid should handle orientation in the same way

Tim

estan · February 6, 2019, 9:53am

Just chiming in since I think (but correct me if I’m wrong) that not everyone picked up on what David was actually suggesting here. It was not, AFAIU, about incorporating the Origin (as currently defined) into the 4x4 matrix that will hold the new direction info, but about having an additional translation in that matrix, one that is in the rotated coord system (the also in the quote above), right @dgobbi?

Sorry if everyone is already on the same page here and I just misunderstood some of the recent responders