Add vtkDataset::GetPoints function

Hello everyone

I would like to add a vtkDataSet::GetPoints function that computes the points array and return it.
I would like to do that to avoid using GetPoint() since it performs a computation at each call for vtkImagaData and vtkRectilinearGrid, while GetPoints would allow me to compute the points once and then query them more than once.

There are 2 options for doing this

  1. vtkSmartPointer vtkDataSet::GetPoints()
  2. void vtkDataSet::GetPoints(vtkPoints* points) like void vtkRectilinearGrid::GetPoints(vtkPoints* points)

The first solution would have a conflict with vtkPoints* vtkPointSet::GetPoints() but it would have 1 function for both.
The second solution would have no conflict with vtkPoints* vtkPointSet::GetPoints() but it would have 2 functions for the same thing for vtkPointSet subclasses.

Which one do you think is best?

I would avoid extending the API and introducing smart pointers. I would choose #2.

Where would you use the new GetPoints() function? Memory-wise, implicit geometry a major feature of vtkImageData and vtkRectilinearGrid, and any filter that generates vtkPoints for these dataset types is going to consume a lot of memory. The generated vtkPoints array will consume anywhere from 3 times up to more than 10 times as much memory as the vtkDataArray that holds the scalars.

I am thinking of utilize vtkImplicitArrays to reduce the memory cost and just have the x, y, z coordinates.

I would use it for in vtkExtractGeometry.

Yes, it does make more sense if you have a vtkImplicitArray that computes the coordinates on demand. For some of my own algorithms, I wrote vtkImagePointIterator to make it easier to write image-processing code that uses points. An implicit array could do pretty much the same thing, but with random-access.

I think my approach would be faster, and we could deprecate the vtkImagePointIterator in favor of that.
I plan to do the same for GetCellPoints.

Why not hosting a vtkPoints inside each subclass or vtkDataSet and update it’s implicit definition upon chaning extents / origin, etc?
GetPoints would become virtual.

That’s what i end up doing.

https://gitlab.kitware.com/vtk/vtk/-/merge_requests/10037

Yeah I think it looks better that way.

Problem now: you’re breaking backward compatibility for people who have custom vtkDataSet, so I think we should probably host a vtkPoints in vtkDataSet as well that would call GetPoint under the cover, and overwrite its behavior in subclasses.

We could just return nullptr in vtkPoints* vtkDataSet::GetPoints() and let the other classes override, right?

Oh that’s an interesting point.

What I’m afraid of is people that would assume that vtkDataSet::GetPoints returning nullptr means that there are no points, which would not be the case for custom subclasses. Similarly it would mean that we would still need to write a specific path using the GetPoint API in filters accepting vtkDataSet as input so they still work in all cases.

On the contrary, if GetPoints returns a legit implicit array returning the points that GetPoint would return, then I feel like we are getting ourselves away from the caveats I just described.

Any take on that @will.schroeder and @dgobbi ?

I agree. If you change the behavior of a pre-existing method name like GetPoints(), there are certain expectations. It shouldn’t return nullptr unless there are no points. People will not like it if vtkExtractGeometry and other filters suddenly stop working for their data.

It’s better to have a pure virtual method than a default implementation that returns nullptr. With a pure virtual method, people see the problems at compile-time and they know what they have to fix.

A default implementation that uses vtkDataSet::GetPoint() sounds like a reasonable solution.

GetPoints could also potentially throw a warning advising overriding this method. I don’t think we would spam the terminal because it doesn’t get called that often, and developers would be aware of this API enhancement and could optimize their class if they feel like it. Explicitly talking about implicit arrays could point them toward the right path easily.

That’s a good Idea.

In the same MR i will also introduce vtkAbstractCellArray so that vtkCellArray and vtkImplicitCellArray (a new class).

These are kinda disturbing changes, but i hope that performance and code simplification will justify them.

Array ranges with implicit arrays have a performance hit of ~25% if I remember correctly because they use the generic implementation. If we want performance, in addition of enhancing those core component APIs, it might be worth specializing array ranges for implicit arrays so we don’t have this performance hit anymore.

I will see what i can do. There is lots of room for improvement. We just need to be careful not breaking the API.

I have a couple of updates in my MR but they can be summarized as below:

When vtkImageData/vtkRectilinearGrid/vtkStructuredGrid call GetPoint, GetPointCells, GetCellType queries, they perform a calculation on demand. Parts of this calculation are repeated for each query. To improve the performance of these we can precompute those parts and utilize them inside implicit Points, Cells and CellTypes structures.

Using the vtkImplicitArray capability, we have added implicit Points, Cells, and CellTypes structures. vtkStructuredData now has 3 new methods to create these implicit structures:

  1. vtkSmartPointer<vtkPoints> GetPoints(vtkDataArray* x, vtkDataArray* y, vtkDataArray* z, int dim[3])
    1. vtkPoints contains an vtkImplicitArray as its data which utilizes, x, y, and z arrays and the dims to compute the implicit Points structure.
  2. vtkSmartPointer<vtkImplicitCellArray> GetCellArray(int dim[3], bool usePixelVoxelOrientation)
    1. vtkImplicitCellArray is a new class to defines an implicit Cells structure for structured data. It utilizes the vtkImplicitArray capability. vtkAbstractCellArray (also a new class) is the parent class of both vtkCellArray and vtkImplicitCellArray.
  3. vtkSmartPointer<vtkConstantArray<int>> GetCellTypesArray(int dim[3], bool usePixelVoxelOrientation)
    1. vtkConstantArray<int> is a vtkImplicitArray that is used to define an implicit CellTypes structure for structured data.

The above structures are used by:

  1. vtkImageData uses the implicit Points, Cells, and CellTypes structures to implement GetPoint, GetPointCells, and GetCellType queries.
  2. vtkRectilinearGrid uses the implicit Points, Cells, and CellTypes structures to implement GetPoint, GetPointCells, and GetCellType queries.
  3. vtkStructuredGrid uses and CellTypes structures to implement GetPointCells, and GetCellType queries.

Thanks to the addition, the highly performant and specialized code for structured data in vtkTableBasedClipDataSet can be removed since these new implicit structures implement the same performant functionality.

Moreover, there is a new vtkPoints* vtkDataSet::GetPoints() function, which for vtkImageData, and vtkRectilinearGrid return their implicit structure. If the vtkDataSet subclass does not define override this function then an internal copy is created and returned which should not be modified.

Lastly, because of this implicit points structure, void vtkRectilinearGrid::GetPoints(vtkPoints*) has been deprecated.

Isn’t dims[3] redundant, since it’s simply the sizes of x, y, and z?

I think you are right! Nice catch