vtkTriangle fails with assertion 'Arrived in a branch thought to be dead' when called from FindClosestPoint

I’m using VTK objects on several parallel MPI processes for some computations (the VTK objects are created locally on each process, so they should never interact with one another). Under certain conditions, and when operating on a different vtkUnstructuredGrid from the one I usually work with, my jobs terminate and/or hang with the assertion ‘Arrived in a branch thought to be dead’.

Specifically, I only get this message when running in parallel with MPI using batch submission (Slurm) on a high-performance cluster. I don’t get the error when running locally, or from the command line on the cluster’s login node. However, when I don’t get the error, all of the processes except for the root process eventually end up hung.

The issue also only appears when I use a vtkUnstructuredGrid object (read from an XML-format .vtu file) with polygonal cells (exported from StarCCM+). It works perfectly on tetrahedra and hexahedra.

Given that the VTK objects are all independent of one another, I’m struggling to see how this could be an error with my parallelisation (though I’m certainly not against someone suggesting it is). My only thought is that the geometry treated by each process is affected by calls to vtkMath::Random and vtkMath::Gaussian, with different seeds depending on the process rank.

I’ve managed to find where this assertion is thrown, at line 306 of vtkTriangle.cxx on GitHub, but I’m really struggling to interpret what the error means. As far as I can tell, it implies that at least one of the ‘weights’ calculated using some matrix operations is above 1, and that all of the weights are positive.

  • Does anyone understand what this actually means geometrically, and why this ‘branch’ is considered dead? Why is this combination of ‘weights’ not accounted for?
  • Does anyone have any recommendations for avoiding this issue?

Thanks!

Can’t upload attachments as a new user, but will share the problematic .vtu file as soon as the site allows.

Here’s the stack trace from the high performance execution:
SweptStroke: /home/gridsan/njenkins/VTK-9.3.0.rc1/Common/DataModel/vtkTriangle.cxx:305: virtual int vtkTriangle::EvaluatePosition(const double*, double*, int&, double*, double&, double*): Assertion `0 && "Arrived in a branch thought to be dead!"' failed.

[c-16-12-4:181873] *** Process received signal ***

[c-16-12-4:181873] Signal: Aborted (6)

[c-16-12-4:181873] Signal code: (-6)

[c-16-12-4:181873] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fa7e5b77520]

[c-16-12-4:181873] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7fa7e5bcb9fc]

[c-16-12-4:181873] [ 2] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7fa7e5b77476]

[c-16-12-4:181873] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7fa7e5b5d7f3]

[c-16-12-4:181873] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x2871b)[0x7fa7e5b5d71b]

[c-16-12-4:181873] [ 5] /lib/x86_64-linux-gnu/libc.so.6(+0x39e96)[0x7fa7e5b6ee96]

[c-16-12-4:181873] [ 6] /home/gridsan/njenkins/VTK_build/lib/libvtkCommonDataModel-9.3.so.1(_ZN11vtkTriangle16EvaluatePositionEPKdPdRiS2_RdS2_+0xe38)[0x7fa7e77e825c]

[c-16-12-4:181873] [ 7] /home/gridsan/njenkins/VTK_build/lib/libvtkCommonDataModel-9.3.so.1(_ZN14vtkGenericCell16EvaluatePositionEPKdPdRiS2_RdS2_+0x6e)[0x7fa7e747212c]

[c-16-12-4:181873] [ 8] /home/gridsan/njenkins/VTK_build/lib/libvtkCommonDataModel-9.3.so.1(+0x5de1fc)[0x7fa7e77701fc]

[c-16-12-4:181873] [ 9] /home/gridsan/njenkins/VTK_build/lib/libvtkCommonDataModel-9.3.so.1(_ZN20vtkStaticCellLocator28FindClosestPointWithinRadiusEPddS0_P14vtkGenericCellRxRiRdS4_+0xa2)[0x7fa7e77677f4]

[c-16-12-4:181873] [10] /home/gridsan/njenkins/VTK_build/lib/libvtkCommonDataModel-9.3.so.1(_ZN22vtkAbstractCellLocator16FindClosestPointEPKdPdP14vtkGenericCellRxRiRd+0xbb)[0x7fa7e72e1dad]

[c-16-12-4:181873] [11] /home/gridsan/njenkins/VTK_build/lib/libvtkCommonDataModel-9.3.so.1(+0x5e8cef)[0x7fa7e777acef]

[c-16-12-4:181873] [12] /home/gridsan/njenkins/VTK_build/lib/libvtkCommonDataModel-9.3.so.1(_ZN22vtkAbstractCellLocator16FindClosestPointEPKdPdRxRiRd+0x70)[0x7fa7e72e1ce6]

Seems to be from @seanm via https://gitlab.kitware.com/vtk/vtk/-/merge_requests/9278

Solved!
I realised the inputs to my call were NaNs due to an error elsewhere. D’oh! It may be worth altering the error message on that particular line of vtkTriangle, as it felt a bit unclear, but this is clearly a me problem. Thanks for the quick response, though.
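
For anyone landing here with the same assertion, the following is a minimal sketch of why NaN inputs can reach a branch that looks unreachable. It is a simplified stand-in and not VTK's actual code: `classifyWeights` and `isFinitePoint` are hypothetical names, and the sketch assumes the three weights are barycentric, so they sum to 1 for finite input. Under that assumption, weights that are all non-negative mean the point is inside the triangle, and otherwise at least one weight is negative. A NaN weight makes every comparison false, so neither test fires and control falls through to the final branch.

```cpp
#include <cmath>
#include <limits>
#include <string>

// Hypothetical stand-in for a region test on a triangle's barycentric
// weights. For finite input the weights sum to 1, so one of the first two
// branches is always taken. This is an illustration, not VTK's code.
std::string classifyWeights(double w0, double w1, double w2)
{
  if (w0 >= 0.0 && w1 >= 0.0 && w2 >= 0.0)
  {
    return "inside"; // all weights non-negative
  }
  if (w0 < 0.0 || w1 < 0.0 || w2 < 0.0)
  {
    return "outside"; // at least one weight negative
  }
  // Only reachable when no weight compares as either >= 0 or < 0,
  // which is what happens when the weights are NaN.
  return "dead branch";
}

// Hypothetical guard to run on a query point before handing it to a
// closest-point search: rejects NaN and infinite coordinates.
bool isFinitePoint(const double x[3])
{
  return std::isfinite(x[0]) && std::isfinite(x[1]) && std::isfinite(x[2]);
}
```

With all three weights set to `std::numeric_limits<double>::quiet_NaN()`, `classifyWeights` returns "dead branch", while any finite triple summing to 1 returns "inside" or "outside". Checking the query point with something like `isFinitePoint` before calling FindClosestPoint would have surfaced the bad input much earlier than the assertion did.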

Well, we know it isn’t actually dead code :slight_smile: . Can you please file an issue so that we can add a test case for this too?
