About proper distributed AbortCheck

1. Context

A few years ago, @Stephen_Crowell @Christos_Tsolakis and @berk.geveci led an effort to improve the abort mechanism in VTK. This effort was not fully integrated, but great points and design decisions were made here: https://gitlab.kitware.com/vtk/vtk/-/issues/18463 (must read).

I’m picking this up again. Specifically, I’m looking into aborting a filter while it is running in a distributed pipeline.

2. The CheckAbort problem

With the current implementation of CheckAbort(), the idea is to magically set the AbortExecute flag on all distributed processes at the same time. Then, whenever each process calls CheckAbort() on its own time, it will abort and return in error.

Since distributed processing in VTK is implemented using MPI, this is just not possible, as there is no way to talk to a distributed process while it is engaged in communication with another process sending messages back and forth.

Moreover, even in a scenario where we could set a flag on all processes at the same time, there are still cases where it would not work, e.g.:

tick 0: proc A is processing, proc B is processing
tick 1: proc A is waiting on an MPI receive, proc B is doing some heavy processing
tick 2: all abort flags are set on all processes, proc A is still waiting on an MPI receive, proc B is still doing some heavy processing
tick 3: proc A is still waiting on an MPI receive, proc B finishes processing and checks the abort flag
tick 4: proc A is still waiting on an MPI receive, proc B aborts everything and never sends with MPI
tick 5: proc A is still waiting on an MPI receive for a message that will never come: deadlock

So clearly, the current CheckAbort() mechanism doesn’t fit distributed computing. We need a way to synchronize these checks.
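The tick sequence above can be reproduced in a small sketch. This is only a simulation under stated assumptions: two Python threads stand in for proc A and proc B, a `queue.Queue` stands in for the MPI send/receive pair, and a timeout is added so the demo terminates (a real blocking MPI receive would simply hang). None of the names below come from VTK.

```python
import queue
import threading


def simulate_unsynchronized_abort(timeout=0.2):
    """Replay the tick scenario; returns what each simulated rank observed."""
    channel = queue.Queue()         # stands in for an MPI send/receive pair
    abort_flag = threading.Event()  # the "magically" shared abort flag
    outcome = {}

    def proc_a():
        try:
            # Blocking receive: A cannot look at abort_flag while it waits.
            outcome["A"] = channel.get(timeout=timeout)
        except queue.Empty:
            # Only the demo has a timeout; real MPI would wait forever.
            outcome["A"] = "stuck in receive"

    def proc_b():
        # B's heavy processing is over; it checks the flag on its own time.
        if abort_flag.is_set():
            outcome["B"] = "aborted before send"
            return
        channel.put("payload")
        outcome["B"] = "sent"

    thread_a = threading.Thread(target=proc_a)
    thread_a.start()    # tick 1: A blocks in its receive
    abort_flag.set()    # tick 2: the flag is set "everywhere"
    thread_b = threading.Thread(target=proc_b)
    thread_b.start()    # ticks 3-4: B checks the flag and aborts
    thread_a.join()     # tick 5: A never gets its message
    thread_b.join()
    return outcome


if __name__ == "__main__":
    print(simulate_unsynchronized_abort())
```

Even though both simulated ranks share the very same flag, A never gets a chance to read it, which is the point of the scenario.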

3. Two main use cases

There are two use cases to account for: filters that use MPI and filters that do not.

For MPI-using filters, there is no magic solution: the CheckAbort calls MUST be synchronous and akin to an MPI_Barrier call. All processes wait for each other and then reduce the abort state, so that everyone aborts at the same time if any process was aborted.
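A minimal sketch of what such a collective check does, assuming the reduction is a logical OR of the per-rank flags. Threads and a `threading.Barrier` stand in for MPI ranks and the barrier/reduction; `SyncAbortCheck` and `run_ranks` are hypothetical names for illustration, not VTK API.

```python
import threading


class SyncAbortCheck:
    """Hypothetical collective check: barrier plus logical-OR reduction."""

    def __init__(self, n_ranks):
        self._flags = [False] * n_ranks
        self._barrier = threading.Barrier(n_ranks)

    def check(self, rank, local_abort):
        self._flags[rank] = local_abort
        self._barrier.wait()         # every rank has published its flag
        decision = any(self._flags)  # the "reduce" step
        self._barrier.wait()         # every rank has read before flags are reused
        return decision


def run_ranks(n_ranks, abort_rank, abort_step, n_steps=5):
    """Each simulated rank loops over steps and calls the collective check."""
    checker = SyncAbortCheck(n_ranks)
    stopped_at = [None] * n_ranks

    def work(rank):
        for step in range(n_steps):
            wants_abort = rank == abort_rank and step == abort_step
            # Every rank must reach this call the same number of times.
            if checker.check(rank, wants_abort):
                stopped_at[rank] = step
                return

    threads = [threading.Thread(target=work, args=(r,)) for r in range(n_ranks)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return stopped_at


if __name__ == "__main__":
    print(run_ranks(n_ranks=3, abort_rank=1, abort_step=2))
```

Because every rank leaves the check with the same decision, no rank can be left waiting on a message from a rank that already gave up.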

For non-MPI-using filters, there can be more leniency: the CheckAbort and the reduction of the abort state can be started at different points on different processes, and they will ultimately wait for each other and then abort together.

But there is one problem: processes don’t wait around. A single process with no data to process (because of a clip at the beginning of the pipeline, for example) would just keep going and finish updating fully while the other processes are still clipping!

There is no other choice than to synchronize at the end of the processing of each filter, before moving on to the next filter.

This is not ideal in a task-based distributed computing system, but in reality, with the VTK pipeline, a process that has little work in one filter tends to have little work in all filters, so there is no balancing to do.

4. Actual Implementation

So we need two implementations, a synchronous one and a lazy one, and these will be triggered by calls from within the filter implementation. The lazy implementation also MUST NOT depend on anything but Common modules in VTK, because we cannot add a dependency on ParallelCore.

So this must be handled using events.

Adding a vtkAlgorithm::CheckAbortAndInvoke that invokes a CHECK_ABORT event and then actually calls CheckAbort is pretty easy, as long as it documents the need for synchronous calls in MPI filters.

Adding a vtkAlgorithm::CheckAbortDone() that invokes a new CHECK_ABORT_DONE event will then take care of the synchronization at the end of filters.
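To make the event flow concrete, here is a rough sketch in plain Python, not VTK code. The `Algorithm` class is a toy stand-in for vtkAlgorithm, the event names are placeholders, and `make_reducing_observer` is a hypothetical observer that a parallel layer could install; the "reduction" is faked with a callable instead of real communication.

```python
CHECK_ABORT = "CheckAbortEvent"           # placeholder event names
CHECK_ABORT_DONE = "CheckAbortDoneEvent"


class Algorithm:
    """Toy stand-in for vtkAlgorithm: it knows about events, not about MPI."""

    def __init__(self):
        self.abort_execute = False
        self._observers = {}

    def add_observer(self, event, callback):
        self._observers.setdefault(event, []).append(callback)

    def invoke_event(self, event):
        for callback in self._observers.get(event, []):
            callback(self, event)

    def check_abort_and_invoke(self):
        # Give observers a chance to synchronize, then do the local check.
        self.invoke_event(CHECK_ABORT)
        return self.abort_execute

    def check_abort_done(self):
        # Called once at the end of the filter so ranks can meet up.
        self.invoke_event(CHECK_ABORT_DONE)


def make_reducing_observer(other_ranks_abort, log):
    """Hypothetical observer living in a layer that is allowed to use MPI."""

    def on_event(algorithm, event):
        log.append(event)
        # Stand-in for a logical-OR reduction across ranks.
        algorithm.abort_execute = algorithm.abort_execute or other_ranks_abort()

    return on_event


if __name__ == "__main__":
    seen = []
    algorithm = Algorithm()
    observer = make_reducing_observer(lambda: True, seen)
    algorithm.add_observer(CHECK_ABORT, observer)
    algorithm.add_observer(CHECK_ABORT_DONE, observer)
    print(algorithm.check_abort_and_invoke(), seen)
```

With no observer installed, the check degrades to a purely local one, which is what keeps the Common-only side free of any parallel dependency in this sketch.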

5. What about inter-process communication?

This is where it is still a bit murky to me.
I could easily observe these events from the ParaView server layer and react to them to do the actual abort state reduction / synchronization, but I feel like this should be provided by VTK somehow, and I don’t know exactly what this could look like.

A possibility could be an AbortObserver that would be added to vtkAlgorithm in a generic version, and ParallelMPI would factory-provide a specialized version that would be responsible for reducing the abort state. This is more or less how the vtkProgressObserver is implemented.
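One way to picture the factory idea, as a loose sketch only: a generic observer whose reduction is the identity, and a registration hook through which a parallel module swaps in a specialized one. Every name here (`AbortObserver`, `reduce_abort`, `register_abort_observer`, `new_abort_observer`, `FakeDistributedAbortObserver`) is made up for illustration, and the "distributed" version just ORs in a list of flags instead of talking to other ranks.

```python
class AbortObserver:
    """Generic version: no communication, the local flag is the answer."""

    def reduce_abort(self, local_abort):
        return local_abort


class FakeDistributedAbortObserver(AbortObserver):
    """Stand-in for what a parallel module could provide."""

    def __init__(self, remote_flags):
        self._remote_flags = list(remote_flags)

    def reduce_abort(self, local_abort):
        # A real version would reduce across ranks; here we OR a fixed list.
        return local_abort or any(self._remote_flags)


_observer_factory = None  # set by the parallel module, if it is loaded


def register_abort_observer(factory):
    """Let a higher-level module override which observer gets created."""
    global _observer_factory
    _observer_factory = factory


def new_abort_observer():
    """What the algorithm would call; falls back to the generic version."""
    if _observer_factory is None:
        return AbortObserver()
    return _observer_factory()


if __name__ == "__main__":
    print(new_abort_observer().reduce_abort(False))
    register_abort_observer(lambda: FakeDistributedAbortObserver([False, True]))
    print(new_abort_observer().reduce_abort(False))
```

The algorithm side only ever calls `new_abort_observer()`, so it never needs to know whether a parallel module registered anything.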

Let me know what you think!

@mwestphal Have you thought about filters that run other filters internally? Could that cause deadlocks in the same way? Especially when processing distributed datasets that may not have cells on every rank?