GetPointer vs ValueRange for typed data arrays

AllisonVacanti · December 16, 2020, 6:36am

tl;dr: Just storing the vtk::DataArrayValueRange<1>(array) object directly instead of the iterator (no begin()) will also provide operator[] access and is a bit cleaner. The range/iterator approaches provide additional runtime safety checks only in debugging builds. When building in release mode, identical assembly is emitted for the ranges and the pointer versions and performance is the same.

Now – since I love to get longwinded when talking about data models, a more detailed answer follows

This is what I would use:

// Deduce range type
using RangeType = decltype(
  vtk::DataArrayValueRange<1>(std::declval<vtkIdTypeArray*>()));

// Declare variable
RangeType polys;

// Initialize
polys = vtk::DataArrayValueRange<1>(m_Conn);

This looks weird . Each line makes 4 assertions and is the one suggested by docs.

Weird is subjective I suppose C++ has changed a lot over the years, and admittedly the newer syntaxes post-C++11 feel like a completely different language compared with the “C with classes” style from 1994, when most of the core VTK classes were originally written. It takes some adjustment.

The docs suggest this approach because the assertions are your friends – they catch bugs early and don’t affect performance unless you’re building in Debug mode, in which case testing and correctness are the goals of the build, not peak performance. You can also define VTK_DEBUG_RANGE_ITERATORS to get even more correctness checks to help track down trickier problems, again at the cost of performance. Alternatively, you can instead define VTK_ALWAYS_OPTIMIZE_ARRAY_ITERATORS to opt out of the checks and enable optimizations for the ranges/iterators in debug mode, which should help recover a significant amount of performance.

But performance is never impacted where it matters: using ranges and iterators in Release mode generally produces the exact same assembly as using a raw pointer. Towards the end of this blog post I summarized a series of benchmarks on the tight loop iteration performance of ranges vs raw pointers. If you’re using a typed vtkIdTypeArray to create the ranges, the relevant results will be the 3rd and 4th bar charts in that post comparing the vtkAOSDataArrayTemplate results. There’s virtually no difference between the Pointer vs the Range/Iterator benchmarks, all are within noise. Whether you write:

  const ValueType *data = array->GetPointer(0);
  ValueType sum{0};
  for (vtk::ValueIdType v = 0; v < numValues; ++v)
  {
    sum += *data++;
  }

or

  const auto range = vtk::DataArrayValueRange<TupleSize>(array);
  APIType sum{0};
  for (const auto& value : range)
  {
    sum += value;
  }

will have no impact on performance in a release build, but the second version helps catch bugs in debugging builds.

The sources are here if you’re curious what the implementations of the other benchmarks look like.

the functions Insert and Link are called in very tight loops that need to be performant. Which would be better? Is this a bad design?

I wouldn’t say it’s a bad design, but calling either of those in a loop is going to be bad for performance. Both allocate memory, and those allocations are expensive. The Insert method is especially troublesome – it’s branchy, and every call will execute a lot of extra instructions to check whether the array needs reallocation. Not to mention the fact that its behavior is completely different and surprising compared to any other container’s insert methods Use it cautiously, and read the docs for that one carefully!

Ideally, you’ll want to pre-allocate memory whenever possible and keep allocations/invalidations to an absolute minimum. There are three scenarios, in order from best performance to worst:

The exact final size of the array is known. Do a single allocation before the tight loop that exactly fits the data. Use Range APIs to make sure that your calculated size is correct and detect accesses beyond the predicted size when debugging.
The exact size is unknown, but has a known upper bound. Same as above, but preallocate the upper bound and then Resize to the final size once it is known.
Worst case: nothing is known about the final size of the array. Pre-allocate a reasonable guess at the final size with Allocate (which increases capacity without increasing the actual number of valid values) and then fill with InsertNextValue to make sure the array is grown as-needed. Finish up with Squeeze if you want to release any excess memory (this will also invalidate pointers/ranges, btw).

Will that mean GetPointer will soon be deprecated in coming releases?

No. Subclasses of vtkAOSDataArrayTemplate<T> (including vtkIdTypeArray) will always provide the typed GetPointer method. At one point I was pushing to deprecate GetVoidPointer in the generic levels of the array hierarchy, since that’s a real can of worms after the SOA-layout arrays were added. But I don’t think there’s been much interest in continuing that effort since I stopped working on VTK.

Also – unrelated, but as a heads up,

__revalidate_mem_accessors

Identifiers containing double underscores are reserved by compiler vendors. It’s best to avoid them.