Bus Error when updating vtkPolyData in timed loop

I am using code based on the VTK server example (vtk.js) to create a Remote View application. My actual code is posted on Pastebin.com.
I need to change the scene in response to some external input (not from the client browser). To achieve this, I have added an observer for a TimerEvent on the vtkRenderWindowInteractor. Generally everything is working and I can push rendering changes such as changing the camera view point, but I am having trouble when I actually try to change the data in the scene.

The concrete class instantiated for the interactor is vtkGenericRenderWindowInteractor, which by default has no functioning timer implementation: nothing observes the CreateTimerEvent, so the CreateRepeatingTimer method does nothing. To remedy this, I have added timer functionality that watches for the CreateTimerEvent and uses the reactor's task.LoopingCall to invoke a TimerEvent at a regular interval.
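For readers following along, here is a minimal sketch of the kind of timer bridge described above. It is not the code from the Pastebin link. The class name GenericInteractorTimers and the make_looping_call parameter are hypothetical, and the sketch assumes the server runs on the Twisted reactor, so that twisted.internet.task.LoopingCall is available. It also assumes the interactor records the timer id and duration so they can be read back with GetTimerEventPlatformId and GetTimerEventDuration when the create and destroy events fire.

```python
class GenericInteractorTimers:
    """Drive TimerEvent on an interactor that has no timer backend of its own.

    Hypothetical helper: the name and the injectable factory are illustration
    only, not part of VTK or of the original poster's code.
    """

    def __init__(self, interactor, make_looping_call=None):
        if make_looping_call is None:
            # Assumption: the web server is running on the Twisted reactor.
            from twisted.internet.task import LoopingCall
            make_looping_call = LoopingCall
        self._make_looping_call = make_looping_call
        self._calls = {}  # timer id -> running looping call
        interactor.AddObserver("CreateTimerEvent", self._on_create)
        interactor.AddObserver("DestroyTimerEvent", self._on_destroy)

    def _on_create(self, obj, event):
        timer_id = obj.GetTimerEventPlatformId()
        interval = obj.GetTimerEventDuration() / 1000.0  # milliseconds -> seconds
        # Fire TimerEvent on the interactor at every tick of the looping call.
        call = self._make_looping_call(obj.InvokeEvent, "TimerEvent")
        call.start(interval, now=False)
        self._calls[timer_id] = call

    def _on_destroy(self, obj, event):
        call = self._calls.pop(obj.GetTimerEventPlatformId(), None)
        if call is not None:
            call.stop()
```

With something like this in place, creating GenericInteractorTimers(interactor) once before calling CreateRepeatingTimer should be enough for TimerEvent observers to start firing on the reactor thread.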

Ultimately I get a callback that appears to run on the same thread as the rest of the VTK setup / rendering code. In it, I remove the original actor that I created when the application loaded, create a new one, and add it to the renderer. When the new actor is rendered, I receive a bus error 100% of the time. It occurs specifically on line 552 of vtkPolyData::ComputeBounds (VTK 9.0.3):

// Process each cell array separately. Note that threading is only used
// if the model is big enough (since there is a cost to spinning up the
// thread pool).
for (auto ca = 0; ca < 4; ca++)
{
  if ((numCells = cellA[ca]->GetNumberOfCells()) > 250000)
  {
    // Lambda to threaded compute bounds
    vtkSMPTools::For(0, numCells, [&](vtkIdType cellId, vtkIdType endCellId) {
      vtkIdType npts, ptIdx;
      const vtkIdType* pts;
      auto iter = vtk::TakeSmartPointer(cellA[ca]->NewIterator());
      for (; cellId < endCellId; ++cellId)
      {
        iter->GetCellAtId(cellId, npts, pts); // thread-safe
        for (ptIdx = 0; ptIdx < npts; ++ptIdx)
        {
          ptUses[pts[ptIdx]] = 1; // <- this line generates the SIGBUS error
        }
      }
    }); // end lambda
  }

I am running Python 3.6.11 from Anaconda, along with a build of VTK 9.0.3 that I compiled myself in order to have support for vtk.web.

In my experience, vtkPolyData is not really suited for use in parallelized code. Methods that you would think should not change the state of the object actually do, and then you get corruption of the object. This may be the case here. You could inspect the source code to be certain.

You might want to try pulling the polydata point/cell info out into your own “bag of triangles” struct so that you can ensure that it will work properly in parallel, then process on that. That’s what we’ve been doing.
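To make the "bag of triangles" idea concrete, here is a minimal sketch. TriangleBag and extract_triangles are hypothetical names, not VTK API. The sketch assumes the input is a vtkPolyData (or anything exposing GetNumberOfPoints, GetPoint, GetNumberOfCells and GetCell) and that only three-point cells are of interest. The copy is made serially, and only the plain Python structure would then be handed to parallel code.

```python
from typing import List, NamedTuple, Tuple


class TriangleBag(NamedTuple):
    """Plain copy of the geometry, independent of any VTK object state."""
    points: List[Tuple[float, float, float]]
    triangles: List[Tuple[int, int, int]]


def extract_triangles(polydata) -> TriangleBag:
    # Copy every point coordinate out of the dataset.
    points = [tuple(polydata.GetPoint(i))
              for i in range(polydata.GetNumberOfPoints())]
    triangles = []
    for cell_id in range(polydata.GetNumberOfCells()):
        cell = polydata.GetCell(cell_id)
        if cell.GetNumberOfPoints() != 3:
            continue  # assumption: anything that is not a triangle is skipped
        triangles.append(tuple(cell.GetPointId(k) for k in range(3)))
    return TriangleBag(points, triangles)
```

Running the data through a triangulating filter beforehand would avoid silently dropping polygons with more than three points.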

Thanks. I will give that a go. My intent was to avoid any separate thread / actual parallelization by utilizing the reactor framework used by the VTK remote web server code. In that code, the render scene is updated in response to RPC events coming from the vtk.js client running in the browser. I have examined the timer callbacks in the debugger, and the timer event callback seems to run on the same thread as everything else, although it's possible I'm missing something. Updating the scene in response to server-side events seems like it should be pretty basic functionality, but so far I cannot find any examples.

Here is the backtrace for reference:

Thread 1 “python” received signal SIGBUS, Bus error.
0x00007ffff45d365f in vtkPolyData::ComputeBounds (this=0x555558032000)
at /usr/local/src/conda/vtk_with_web-9.0.3/Common/DataModel/vtkPolyData.cxx:552
warning: Source file is more recent than executable.
552 ptUses[pts[ptIdx]] = 1;
(gdb) bt
#0 0x00007ffff45d365f in vtkPolyData::ComputeBounds (this=0x555558032000)
at /usr/local/src/conda/vtk_with_web-9.0.3/Common/DataModel/vtkPolyData.cxx:552
#1 vtkPolyData::ComputeBounds (this=0x555558032000)
at /usr/local/src/conda/vtk_with_web-9.0.3/Common/DataModel/vtkPolyData.cxx:485
#2 0x00007ffff44adb75 in vtkDataSet::GetBounds (this=0x555558032000,
bounds=0x555555aa2360)
at /usr/local/src/conda/vtk_with_web-9.0.3/Common/DataModel/vtkDataSet.cxx:212
#3 0x00007ffff4c0bd7f in vtkPolyDataMapper::GetBounds (this=0x555555aa22b0)
at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/Core/vtkPolyDataMapper.cxx:123
#4 0x00007ffff4b7e10e in vtkActor::GetBounds (this=0x55555802bde0)
at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/Core/vtkActor.cxx:376
#5 0x00007ffff4bb9304 in vtkFrustumCoverageCuller::Cull (this=0x55555660f290,
ren=, propList=0x555557ee52f0,
listLength=@0x5555568f16b0: 1, initialized=@0x7fffffffa784: 0)
at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/Core/vtkFrustumCoverageCuller.cxx:91
#6 0x00007ffff4c264fc in vtkRenderer::AllocateTime (
this=this@entry=0x5555568f1460)
at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/Core/vtkRenderer.cxx:591
#7 0x00007ffff4c2814c in vtkRenderer::Render (this=0x5555568f1460) at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/Core/vtkRenderer.cxx:370
#8 0x00007ffff4c2b8b8 in vtkRendererCollection::Render (this=0x555556709340)
at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/Core/vtkRendererCollection.cxx:51
#9 0x00007ffff4c1be59 in vtkRenderWindow::DoStereoRender (this=0x5555560050c0) at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/Core/vtkRenderWindow.cxx:337
#10 0x00007ffff4c1b726 in vtkRenderWindow::Render (this=this@entry=0x5555560050c0)
at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/Core/vtkRenderWindow.cxx:297
#11 0x00007ffff3205b1f in vtkOpenGLRenderWindow::Render (this=0x5555560050c0)
at /usr/local/src/conda/vtk_with_web-9.0.3/Rendering/OpenGL2/vtkOpenGLRenderWindow.cxx:2057
#12 0x00007ffff4f79685 in vtkWebApplication::StillRender (this=0x5555568f2060, view=0x5555560050c0, quality=100)
at /usr/local/src/conda/vtk_with_web-9.0.3/Web/Core/vtkWebApplication.cxx:196
#13 0x00007ffff4f79b62 in vtkWebApplication::StillRenderToBuffer (this=this@entry=0x5555568f2060, view=view@entry=0x5555560050c0, time=3416,
quality=) at /usr/local/src/conda/vtk_with_web-9.0.3/Web/Core/vtkWebApplication.cxx:253
#14 0x00007ffff4f8c6f7 in PyvtkWebApplication_StillRenderToBuffer (self=, args=)
at /usr/local/src/conda/vtk_with_web-9.0.3/Wrapping/PythonCore/vtkPythonArgs.h:117

I suspect that simultaneous memory writes in threaded execution may be causing this problem. We can try using std::atomic<> and see if that cleans up the issue. Would you be willing to test an MR/branch?

I’m curious, what’s the hardware / software computing platform? Most systems I’ve seen are tolerant of writing the exact same data value to the same memory location, but this may be a dumb assumption to make especially given your experience.

Also, just to make sure it’s not a data access error, check that your cell array connectivity is 0-offset and does not refer to any points outside of the valid range (0 <= ptId < numberOfPoints).
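A minimal sketch of that check follows. It assumes the connectivity has already been pulled out of the dataset into plain Python lists of point ids, one list per cell. The function name find_invalid_point_ids is hypothetical.

```python
def find_invalid_point_ids(cells, number_of_points):
    """Return (cell index, position in cell, point id) for each bad reference.

    A reference is bad when it falls outside 0 <= point id < number_of_points.
    """
    bad = []
    for cell_index, point_ids in enumerate(cells):
        for position, point_id in enumerate(point_ids):
            if not 0 <= point_id < number_of_points:
                bad.append((cell_index, position, point_id))
    return bad
```

For example, find_invalid_point_ids([[0, 1, 2], [1, 2, 3]], 3) reports the last id of the second cell, since id 3 does not exist in a dataset with three points. Connectivity that was written 1-offset typically shows up this way, with the largest id equal to the number of points.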

I have no problem using a test branch.

I am not using esoteric hardware, just a Core i7-6700. I have compiled with OSMesa support using the LLVM driver. I have compiled VTK and Mesa in a conda environment with Python 3.6. It is all being run on Ubuntu 20.04.

I’m still curious how what I am attempting differs from how the web code updates the scene in response to RPC calls. When I set a breakpoint on that method, it appears to be hit on the same thread both for the original call, where it succeeds, and for the later call, where it fails.

Before I dive into this issue next week, can you help me out? This line of code in vtkPolyData::ComputeBounds() controls whether the bounds are computed serially or threaded:
if ((numCells = cellA[ca]->GetNumberOfCells()) > 250000)
Can you change the cell-count threshold to something greater than the number of cells in your dataset and then recompile? (I.e., force the bounds computation to run serially.) This will confirm whether the issue is related to threading…

I have added your changes and I am still getting the bus error. Here is a link to the diff I use against the 9.0.3 tag if you want to confirm the changes.

I am getting the “Processing Serially” output that I added. The polydata objects I am adding have around 330,000 cells, so they would previously have been handled by the multithreaded code.

I have a few conda recipes / packages that make it relatively easy to recreate my environment if you want those. I have a custom version of OSMesa, libglew, and VTK.

I removed all of the serial and parallel code that ensures only valid points are rendered and instead just used std::fill to initialize the ptUses data member to 1 instead of 0 (all points should be rendered). When I do this, I no longer get the bus error and I am able to update the data in response to server-side events. I am now getting a segfault in the point pick handler when I select a point, which I will have to debug.
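To illustrate what that workaround changes, here is a small pure-Python sketch of bounds computed with and without a used-point mask. It is an illustration under assumptions, not the VTK implementation. The function name bounds is hypothetical, and the explicit range check stands in for the out-of-range write that C++ would perform silently. Since the serial path crashed as well, one possibility (not confirmed in this thread) is that a point id in the connectivity lies outside the ptUses buffer.

```python
def bounds(points, cells=None):
    """Return (xmin, xmax, ymin, ymax, zmin, zmax), or None if nothing is used.

    With cells given, only points referenced by some cell contribute.
    With cells=None, every point contributes, which mirrors filling the
    used-point mask with 1.
    """
    if cells is None:
        used = [True] * len(points)
    else:
        used = [False] * len(points)
        for point_ids in cells:
            for point_id in point_ids:
                if not 0 <= point_id < len(points):
                    # In C++ this write would land outside the mask buffer.
                    raise IndexError("point id %d out of range" % point_id)
                used[point_id] = True
    selected = [p for p, flag in zip(points, used) if flag]
    if not selected:
        return None
    xs, ys, zs = zip(*selected)
    return (min(xs), max(xs), min(ys), max(ys), min(zs), max(zs))
```

The trade-off of skipping the mask is that points not referenced by any cell now enlarge the bounds, and any invalid connectivity is no longer touched at this stage, so it may surface somewhere else instead.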