Addition: stack vector

There are some places in VTK that could be improved using an array-like data structure: it would lie on the stack if its size is less than a compile-time knows size (typically twice the cache line size) otherwise lie on the heap and use the typical new/delete idom. Up to my knowledge, there is no internal structure that does this in VTK.

Following this conversation, I would like to add to VTK core such a data structure. You can find a first implementation here, currently named vtkStackVector. Although the current implementation provides methods such as reserve, resize and shrink_to_fit but also a copy constructor and a assignement operator, it should only be used as a one time scoped allocated vector. A typical usage is to replace a scoped array allocation followed by a deletion within a function.

Pros for this structure are:

  • its lightweight and easily droppable as the mr shows
  • avoid unecessary heap allocation, espacially for local scoped arrays
  • portable as it currently compiles on most compilers

Speed improvements have already been shown in this MR.

I’d like to emphasize on the fact that this is not a replacement for the std::vector: it’s not meant to be used as a dynamic vector, where it would performs poorly. It would also be bad, for performance reasons, to follow the usual vtkNew idiom, as the purpose of this new data structure is to avoid unecessary heap allocations.

I’m open to suggestions and feedback such as naming conventions, but also portability!

I am not sure we need a new construct that’s not a C++ standard for several reasons:

  • allocations should generally be not done inside performance-critical loops anyways.
  • is there any dramatic performance benefit to using vtkStackVector instead of a combination of std::array, std::vector and out-of-loop allocations in the vtkEnSightGoldBinaryReader? It’s a reader. I’d suspect the reading from disk will be the slower part anyways.
  • since it’s not something C++ developers are expecting to exist, I am not sure how many developers will actually use this in their code instead of the standard options.

Whenever possible, we need to start moving away from creating VTK-ism and using standard C++.

1 Like

I agree with this. I’ve done a review of the code as well and there are some open questions that I have related to some bits.

I think just using std::array should be considered here rather than adding new API.

@utkarshayachit
Can probably move smallish items out of loops. Makes the code slightly uglier, but doable.
If we want everything standard, I think we can also have a local wrapper for std::vector.
For example,

std::vector<int> someCache;
someCache.reserve(bigNum);  

...
someCache.resize(n);  

This makes the item dynamically resizable without multiple heap allocs each time.

@ben.boeckel - using std::array is non-starter. The sizes can be changing in various places. Not a fixed size.

If you only wanna go with std::array and std::vector, you could just do as I’ve done in my implementation and trick the compiler by using a pointer to either the beginning of the array or the beginning of the vector. Doing that would be better than what’s done in the reader (and in various places throughout VTK), but worse than the proposed data structure as you have the overhead of std::vector which implements a dynamically re-sizable array (you would potentially allocate much more than what’s needed).

If non-standard is the way then I could either:

  • keep the data structure within the reader cxx (which would not make it available for future performance improvements to other part of VTK)
  • use a mix of std::vector and std::array.

For the EnSight reader, it really looks like it would be sufficient to use a std::vector that is constructed outside of the loop and then resized as necessary within the loop. That would reduce the number of allocations from O(n) to O(1), or at worst O(log(n)), since the vector would quickly grow to the necessary size.

One problem with using a vtkStackVector here is that it doesn’t move the allocation out of the loop if the allocation size is more than the reserved stack size. In other words, the worst-case behavior is just as bad as before.

I tried to use std::vector, but for the data set I have we need to have the memory tight (simply switching to std::vector allocates 8gb more than the implemented method which already allocates ~30gb).

I think the best thing to do right now is benchmarking. I will come up with data and memory usage for both methods.

1 Like

Regarding memory use, the problem with vtkEnsightGoldBinaryReader is that it uses temporary arrays rather than simply reading the data directly into the VTK data structures, which would be much more efficient.

Current code:

  xCoords = new float[numPts];
  yCoords = new float[numPts];
  zCoords = new float[numPts];
  this->ReadFloatArray(xCoords, numPts);
  this->ReadFloatArray(yCoords, numPts);
  this->ReadFloatArray(zCoords, numPts);

  for (i = 0; i < numPts; i++)
  {
    points->InsertNextPoint(xCoords[i], yCoords[i], zCoords[i]);
  }

Since vtkPoints is just a wrapper around a VTK data array (a vtkFloatArray by default), the ReadFloatArray() method should be reading directly into vtkPoints. All that ReadFloatArray() would need is an additional argument for the stride to use for its output buffer. Then x could be written to every third element starting at 0, ‘y’ could be written to every third element starting at 1, ‘z’ to every third element starting at 2.


Edit: instead of saying "ReadFloatArray() should be reading directly into vtkPoints“, it would be clearer to say "ReadFloatArray() should be reading directly into the vtkFloatArray that vtkPoints uses to store the points.”

As a point of reference, a lot of work has gone into supporting AOS (array of structures) and SOA (structure of arrays) vtkDataArrays. See for example: https://blog.kitware.com/new-data-array-layouts-in-vtk-7-1/ and https://blog.kitware.com/working-with-vtkdataarrays-2019-edition/.