Special handling of NaN values during interpolation / edge splitting

pgallo · April 12, 2024, 10:08am

Hi,
I’d like some guidance on the following problem.

I’m working with datasets that use NaN values to represent missing results for some points/cells in a vtkUnstructuredGrid. During common operations such as clipping or contouring, cell edges are split and new points are inserted in the dataset, with their attributes obtained by interpolating the two corresponding values at the original edge endpoints.
Since the interpolation code has no special handling of invalid values, if one of these two endpoints had a NaN attribute, the result of the interpolation is also going to be a NaN. Instead, in the case of interpolation between a valid value and a NaN, I need the valid value to be returned. I know that this has little to no mathematical basis but it’s how our post-processing works and I’m looking for ways to achieve this with VTK.

Do you know of any way that I could achieve such behaviour with VTK (in C++) ? If it’s not possible to alter the interpolation scheme in such a way, it would be also ok if I could get some information for the new points inserted in the dataset to know where are they coming from (in terms of which edge was split to generate them) so I could implement a routine that swaps NaNs with the desired values as a post-processing step after a contour/clipping operation.

Thanks in advance,
-PaoloG

Paulo_Carvalho · April 13, 2024, 10:00pm

Hello,

The new points after splitting becoming NaNs is the expected behavior. By definition, interpolating, say, between 42 and NaN is an indefined operation, hence the result is NaN. Then you need to fill all the NaNs in a post-processing step like you said. In my project, I leave this decision to the user:

In the example above NDVs (non-data values) from the database are mapped to NaNs in VTK objects. So, whenever the user finds NaNs in the data after using some algorithm, one can interpolate them as a post-processing step.

best,

PC

pgallo · April 15, 2024, 7:05am

Thanks Paulo,
you’re right, but how would I interpolate the results in a post-processing step ? I would need to know the IDs of the points from which the new point was generated (by splitting the edge). Is there any way to propagate this information through the filter ?

-PaoloG

cory.quammen · April 24, 2024, 6:12pm

Hi @pgallo,

This is a tricky problem because interpolation modes are embedded deeply within VTK.

One approach could be to add to VTK another interpolation mode that changes how vtkDataSetAttributes::InterpolateEdge, vtkDataSetAttributes::InterpolatePoint, and vtkDataSetAttributes::InterpolateTime operate. Currently, these functions support linear and nearest-neighbor interpolation. You want something that is linear-or-not-nan interpolation. Since the first two functions are what the clip and slice filters call, we could modify these functions to handle this new interpolation mode. The trick, though, is that a lot of filters would need to be modified to change from the using the default interpolation mode to this new mode. It’s not impossible, but more than a little work.

A less invasive way to do this might be to write a filter that fills in values where it finds NaNs. Ideally, the algorithms that produce these geometries could give you the ids of the two points on the cell edge from which they came, but they don’t. So how else could you get them? You could use vtkIdFilter to add cell ids to the input data. The cells of the cut data will then have the originating cell id. When you look at the output of the cut filter, you could find point IDs with NaNs, find the cells in the cutter output that use them, and use that info along with the originating cell ID to find the two points in the original data set from which that point was interpolated. Finally, you could select the non-NaN point and fill that in in the output of this new filter that fills in real data values where it finds NaN values. Phwew! That’s a fair amount of work, but it should be feasible and fairly self-contained.

Paulo_Carvalho · April 24, 2024, 8:42pm

Hello,

I use external algorithms for this. For example, kriging, which is a non-biased best fit estimator that also yields the estimation error, which is very important in natural resources estimation (e.g. petroleum, minining, underground water, etc.). Interpolation is a type of estimation. For a pure-VTK alternative, please, refer to @cory.quammen 's answer which seems promising depending on the application of your code.

best,

PC

pgallo · May 3, 2024, 6:57am

Thanks @cory.quammen and @Paulo_Carvalho for your inputs,
the idea of replacing NaNs with values based on the point IDs is valid, but introduces a performance hit that we don’t really want in our use case.
We actually went with an invasive approach and built a modified version of VTK that has an additional global flag to control the interpolation behavior, which is then checked in all functions that perform an interpolation (vtkDataArray::InterpolateTuple, vtkGenericDataArray::InterpolateTuple, ArrayPair<T>::Interpolate, ArrayPair<T>::InterpolateEdge and ArrayPair<T>::InterpolateOutput) to run some custom code instead of the default linear interpolation.

Regards,
-PaoloG