Upcoming changes to vtkCellArray

Over the next few days/weeks, we’ll be merging in some major changes to vtkCellArray to support new functionality. This involves some significant changes to the vtkCellArray’s internal data structures, but we’ve retained as much of the older API as possible.

The change is motivated by two goals:

  1. Split the current connectivity array into Offsets and Connectivity. This will allow random-access directly into the cell array, and ease integration with other toolkits and rendering systems.

  2. Allow the arrays to be stored as either 32-bit or 64-bit integers, regardless of VTK_USE_64BIT_IDS. This will (a) allow smaller datasets to use less memory by switching to 32-bit arrays, and (b) allow better zero-copy integration with datasets that might be using a different sized integer than what VTK was built with. The API will still use vtkIdType.

We’ve provided legacy implementations for as much of the older API as possible, but in particular, GetPointer and WritePointer had to be completely removed, since they simply cannot be supported with the new internals. New methods (ExportLegacyFormat, ImportLegacyFormat, and AppendLegacyFormat) can be used to simplify porting to the new API.

A new preprocessor definition, VTK_CELL_ARRAY_V2, is defined when the new vtkCellArray API is active to make it easier to support both the old and new APIs in external projects. It can be used to switch between legacy and modern usages of the class.

Patches are ready for VTK proper, the WikiExamples repo, VTK’s remote modules, ParaView, ParaView’s VisitBridge.

The merge request tha updates VTK is here: https://gitlab.kitware.com/vtk/vtk/merge_requests/5682

The new class documentation covers the changes, so I’ll just include that to give a summary of the changes:

vtkCellArray class documentation

vtkCellArray stores dataset topologies as an explicit connectivity table
listing the point ids that make up each cell.

Internally, the connectivity table is represented as two arrays: Offsets and
Connectivity.

Offsets is an array of [numCells+1] values indicating the index in the
Connectivity array where each cell’s points start. The last value is always
the length of the Connectivity array.

The Connectivity array stores the lists of point ids for each cell.

Thus, for a dataset consisting of 2 triangles, a quad, and a line, the
internal arrays will appear as follows:

Topology:
---------
Cell 0: Triangle | point ids: {0, 1, 2}
Cell 1: Triangle | point ids: {5, 7, 2}
Cell 2: Quad     | point ids: {3, 4, 6, 7}
Cell 4: Line     | point ids: {5, 8}

vtkCellArray (current):
-----------------------
Offsets:      {0, 3, 6, 10, 12}
Connectivity: {0, 1, 2, 5, 7, 2, 3, 4, 6, 7, 5, 8}

While this class provides traversal methods (InitTraversal, GetNextCell),
these are not thread-safe and are a bit difficult to use correctly as they
do not use the typical VTK iterator API. Prefer to use a local
vtkCellArrayIterator object, which can be obtained via:

auto iter = vtk::TakeSmartPointer(cellArray->NewIterator());
for (iter->GoToFirstCell(); !iter->IsDoneWithTraversal(); iter->GoToNextCell())
{
  // do work with iter
}

The internal arrays may store either 32- or 64-bit values, though most of the API
will prefer to use vtkIdType to refer to items in these arrays. This enables
significant memory savings when vtkIdType is 64-bit, but 32 bits are
sufficient to store all of the values in the connectivity table. Using
64-bit storage with a 32-bit vtkIdType is permitted, but values too large to
fit in a 32-bit signed integer will be truncated when accessed through the
API.

Methods for managing the storage type are:

  • bool IsStorage64Bit()
  • void Use32BitStorage()
  • void Use64BitStorage()
  • void UseDefaultStorage() // Depends on vtkIdType
  • bool CanConvertTo32BitStorage()
  • bool CanConvertTo64BitStorage()
  • bool CanConvertToDefaultStorage() // Depends on vtkIdType
  • bool ConvertTo32BitStorage()
  • bool ConvertTo64BitStorage()
  • bool ConvertToDefaultStorage() // Depends on vtkIdType
  • bool ConvertToSmallestStorage() // Depends on current values in arrays

Note that some methods are still available that reflect the previous
storage format of this data, which embedded the cell sizes into the
Connectivity array:

vtkCellArray (legacy):
----------------------
Connectivity: {3, 0, 1, 2, 3, 5, 7, 2, 4, 3, 4, 6, 7, 2, 5, 8}
               |--Cell 0--||--Cell 1--||----Cell 2---||--C3-|

The methods require an external lookup table to allow random access, which
was historically stored in the vtkCellTypes object. The following methods in
vtkCellArray still support this style of indexing for compatibility
purposes, but these are slow as they must perform some complex computations
to convert the old “location” into the new “offset” and should be avoided.
These methods (and their modern equivalents) are:

  • GetCell (Prefer GetCellAtId)
  • GetInsertLocation (Prefer GetNumberOfCells)
  • GetTraversalLocation (Prefer GetTraversalCellId, or better, NewIterator)
  • SetTraversalLocation (Prefer SetTraversalLocation, or better, NewIterator)
  • ReverseCell (Prefer ReverseCellAtId)
  • ReplaceCell (Prefer ReplaceCellAtId)
  • SetCells (Use ImportLegacyFormat, or SetData)
  • GetData (Use ExportLegacyFormat, or Get[Offsets|Connectivity]Array[|32|64])

Some other methods were completely removed, such as GetPointer /
WritePointer, since they are simply not able to be emulated under the current
design.

Hi Allison,
Just a quick question, what effect does this have if any on the usage of a vtkCellLocator? If I have a polydata with its strips set to a vtkCellArray (using the updated format i.e both offsets and connectivity are built to create a valid vtkCellArray which is in turn set as the strip of the poly data), subsequently I set the vtkCellLocator’s Dataset to the polydata and turn on AutomaticOn and call buildLocator().

what I would like to know is is there anything special that I need to do regarding the new changes to vtkCellArray if I plan to use it along with a cell locator as described above.?

The vtkCellLocators should work the same as before without any changes to your code.

Thanks for the expeditious reply.

The changes make a lot of sense. Is there anything similar to be expected regarding the handling of face/face-offfset for vtkUnstructuredGrid? I’m not really sure how it could be changed, but want to check before I start reworking my code that I have for populating vtkCellArray manually. Will need to juggle the allocation sizes accordingly, but only have vtkCellArray forward declared at that point…

The polyhedral face data is stored the same as before, just translated into the new layout: Instead of inserting a cellSize prefix before the face data, the size is used to update the offsets array.

In VTK 9.0, signature of GetNextCell has changed to require second parameter to be const, which makes it a breaking change. The following code no longer compiles:

vtkIdType * indices;
vtkIdType   numberOfPoints;
cellArray->InitTraversal();
cellArray->GetNextCell(numberOfPoints, indices);

to get it compiling with VTK 9.0, indices must be const:

vtkIdType const * indices;
vtkIdType numberOfPoints;
cellArray->InitTraversal();
cellArray->GetNextCell(numberOfPoints, indices);

but the updated code does not compile with VTK 8.2.

Is this intended? For a traversal to work with both versions, is there anything more elegant than #ifdef VTK_CELL_ARRAY_V2 ...?

For a traversal to work with both versions, is there anything more elegant than #ifdef VTK_CELL_ARRAY_V2 ... ?

That’s the only way to have this work in both versions, unfortunately.

1 Like

Generally these changes to vtkCellArray seem very positive, but there was a regression that led to a bug in SlicerDMRI. In earlier VTK versions, calling this method gave a pointer into the vtkCellArray

vtkIdTypeArray *cellArrayIds = cellArray->GetData();

but with the new API it returns a copy of the data using ExportLegacyFormat in the implementation.

This meant that our code, which operated no the cellArrayIds to add lines to the cell array, had no effect on the vtkPolyData and the lines were missing. After some investigation I was able to fix it with the patch linked below.

Realistically the change in GetData was an API regression because the behavior of the call changed. I’d argue that vtkCellArray::GetData() should have been removed along with GetPointer / WritePointer and replaced with a stub that generated a compile time error with a comment that described the replacement API and how to use it.

In addition, the current implementation of GetData() should have been renamed GetLegacyDataCopy().

It’s probably not good to make such a change to the API now that it’s been released, but I wanted to flag this in case somebody else runs into a similar issue in the future.

Perhaps the best solution would be to mark vtkCellArray::GetData() as deprecated.

1 Like

Why not updating the code in SlicerDMRI to use the new API instead to avoid an unnecessary copy? You can have access to the underlying arrays using GetOffsetsArray and GetConnectivityArray.

Of course we could do that but it’s not really the point. The point is that the API broke without warning.

Is there a plan to move from a vtkIdTypeArray to this vtkCellArray structure or something equivalent for faces in vtkUnstructuredGrid ?

Then the method
void vtkUnstructuredGrid::SetCells( vtkUnsignedCharArray * cellTypes, vtkIdTypeArray * cellLocations, vtkCellArray * cells, vtkIdTypeArray * faceLocations, vtkIdTypeArray * faces)

could be replaced by something like that:

void vtkUnstructuredGrid::SetCells( vtkUnsignedCharArray * cellTypes, vtkCellArray * cells, vtkFaceArray* faces)

It looks cleaner.

I agree with you, it would be much cleaner and provide random access to cell faces. I’m not sure the implications for backwards compatibility though …

Now that tools like the Slicer Preview Release and DiPy support reading and creating legacy VTK files that use these OFFSETS and CONNECTIVITY properties, can I suggest that the documentation be updated to describe this newly minted variation. This would help tools that attempt to support legacy VTK files without using VTK source as a dependency.

Slicer can read this new file format variant but it never writes it, as the files would be unreadable by all other software that implemented VTK file reading independently. The development version of Slicer accidentally used the new variant for a few weeks, but then we realized what changes were done in VTK and quickly reverted to writing the well-established VTK 4.2 file format.

Over the years, the .vtk file format has become a standard data exchange file format between many software. This is a great accomplishment. However, this also means that VTK library developers no longer have the freedom to change the file format.

To avoid upsetting VTK developers and users, and avoiding tarnishing VTK’s reputation in general, it would be better if .vtk file extension was only used for the 4.2 file format. If anyone needs the 5.1 format then a new file extension could be introduced for that instead that (e.g., vt5).

The file format change was unfortunate IMO and has caused no end of grief for users. Andras, to riff on what you are saying, another thought is to create a set of extensions: .vtk, vt4, vtk5, with .vtk compile-time specified (i.e., .vtk4 or .vt5, default to .vt4). I’m not sure this is worth the effort though…

Yes, it would be sufficient to make VTK file writer default file version configurable in CMake, set to 4.2 by default and add a note in the documentation that .vtk5 extension is recommended for 5.x files.

VTK-based applications (ParaView, Slicer, etc.) could add support for this newb.vtk5 file format without worrying about breaking backward compatibility.