Using vtkMultiProcessController.Allreduce() in Python to ignore null search results

I am trying to set up a parallelized computation using VTK methods (it runs in ParaView, if that helps anyone, but I believe the core question is purely VTK). I have been able to use the example here https://blog.kitware.com/mpi4py-and-vtk/ to good effect, but I want to improve on it, and I can find very few examples of other ways to use Allreduce().

A simplified version of my code is below. The user inputs an ID for a point, and each process looks up the coordinates of that point. However, when the search is parallelized across multiple MPI processes, most of them will return a null result (the ID won’t be in the part of the dataset that process is responsible for searching). So at the end of the parallelized search, there will be one good result and the rest will be null.

    from mpi4py import MPI
    import vtk
    from vtk.numpy_interface import dataset_adapter as dsa
    import numpy as np

    # get MPI stuff: global controller, this process's rank, mpi4py communicator
    gc = vtk.vtkMultiProcessController.GetGlobalController()
    rank = gc.GetLocalProcessId()
    comm = vtk.vtkMPI4PyCommunicator.ConvertToPython(gc.GetCommunicator())

    # perform lookup function to return the point corresponding to an ID, output is
    # a 3-tuple. If the lookup can't find the ID, then set it to all large negatives.
    point = some_lookup_function(ident)
    if point is None:
        point = (-999999.0, -999999.0, -999999.0)

    # do some array formatting (I am not fluent enough to know why this is necessary,
    # I'm just blindly following the example)
    pa = vtk.vtkFloatArray()
    pa.SetNumberOfTuples(3)
    for i in range(3):
        pa.SetValue(i, point[i])
    pa_va = dsa.vtkDataArrayToVTKArray(pa)
    result = np.array(pa_va)

    # call to Allreduce (again, just blindly following the example)
    comm.Allreduce([pa_va, MPI.FLOAT], [result, MPI.FLOAT], MPI.MAX)

    print(rank, result)

I am currently getting around this by resetting null results to arbitrarily large negative numbers (-999999) and using the MAX operation. This works for the majority of my use cases, but it is not as robust as it could be. For instance, incorrect results will occur if the dataset ever contains a coordinate below -999999, because MAX is applied to each component separately and will keep the placeholder instead of the real value.
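To make that failure mode concrete, here is a minimal sketch that mimics the element-wise MAX in plain Python instead of calling MPI. The helper names (elementwise_max, simulate_allreduce_max) and the sample coordinates are made up for illustration.

```python
from functools import reduce

SENTINEL = (-999999.0, -999999.0, -999999.0)

def elementwise_max(a, b):
    """Mimic what a MAX reduction does to two equally sized buffers."""
    return tuple(max(x, y) for x, y in zip(a, b))

def simulate_allreduce_max(per_rank_points):
    """Combine one contribution per rank; None means 'ID not found here'."""
    contributions = [SENTINEL if p is None else p for p in per_rank_points]
    return reduce(elementwise_max, contributions)

# Typical case: one rank finds the point, the others contribute the sentinel.
ok = simulate_allreduce_max([None, (1.5, -2.0, 3.0), None])

# Failure case: a real coordinate lies below the sentinel, so MAX keeps the
# sentinel for that component and the point comes back corrupted.
bad = simulate_allreduce_max([None, (-2000000.0, 0.0, 5.0), None])

# No rank found the ID: the result is just the sentinel, which the caller
# has to remember to check for.
missing = simulate_allreduce_max([None, None, None])

print(ok, bad, missing)
```

In the failure case only the x component is replaced, so the result looks like a plausible point rather than an obvious error.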

So my question is, is there a better way to implement this? My first thought was to add a 0 or 1 to the vtkFloatArray that indicates whether the ID was found or not. But I don’t know how to ensure that Allreduce() only uses that element of the tuple in its operation.

In general, I need to pair the concepts of “found the ID” and “these are its coordinates”, then Allreduce() on only the first concept while returning the second concept.

I was able to answer my own question, due in part to inspiration from here: https://stackoverflow.com/questions/31388465/summing-python-objects-with-mpis-allreduce. I was afraid to deviate from the Kitware blog example at first, but once I found out that I can use the lowercase allreduce() (which communicates generic, pickled Python objects) instead of the buffer-based Allreduce(), using a custom operator via MPI.Op.Create() was much easier to figure out.

My eventual solution was to make a custom operator like so:

    def ar_operator(tuple1, tuple2, datatype):
        # each input is (x, y, z, flag); flag is 1 if that process found the ID
        indy1 = int(tuple1[3])
        indy2 = int(tuple2[3])
        # keep whichever input has its flag set; datatype is not used
        if indy1 == 1:
            output = tuple1
        else:
            output = tuple2

        return output

    compOp = MPI.Op.Create(ar_operator, commute=True)
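One way to sanity-check the operator's logic before running it under MPI is to fold it over a handful of hand-made contributions with functools.reduce. This is only a sketch: the operator is a condensed copy of the one above, the per-rank values are invented, and fold() is a hypothetical stand-in for the real reduction.

```python
from functools import reduce

def ar_operator(tuple1, tuple2, datatype):
    # Same logic as the operator above: slot 3 holds the "found" flag.
    if int(tuple1[3]) == 1:
        return tuple1
    return tuple2

# Hypothetical contributions from four ranks: [x, y, z, found_flag].
per_rank = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [-2000000.0, 4.0, 5.0, 1.0],  # the only rank that found the ID
    [0.0, 0.0, 0.0, 0.0],
]

def fold(contributions):
    # Stand-in for the reduction; None fills the unused datatype argument.
    return reduce(lambda a, b: ar_operator(a, b, None), contributions)

found = fold(per_rank)
# The order in which contributions are combined should not change the answer.
found_reversed = fold(list(reversed(per_rank)))
# If no rank found the ID, the flag in slot 3 stays 0.
nobody = fold([[0.0, 0.0, 0.0, 0.0]] * 3)

print(found, found_reversed, nobody)
```

Because the coordinates are carried along untouched, a point far below the old -999999 placeholder survives the reduction, and a flag of 0 in the final result signals that no process found the ID.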

And then change everything else following the basic MPI calls to:

    # extra import needed for vtk_to_numpy (goes with the other imports)
    from vtk.util import numpy_support as vnp

    # perform lookup function to return the point corresponding to an ID, output is
    # a 3-tuple.
    point = some_lookup_function(ident)
    if point is None:
        point = (0, 0, 0)
        logical = False
    else:
        logical = True

    # do some array formatting: three coordinates plus the found flag in slot 3
    pa = vtk.vtkFloatArray()
    pa.SetNumberOfTuples(4)
    for i in range(3):
        pa.SetValue(i, point[i])
    pa.SetValue(3, int(logical))
    result = vnp.vtk_to_numpy(pa)

    # call to allreduce; the lowercase version returns the reduced object,
    # so the return value has to be captured
    result = comm.allreduce(result, op=compOp)

    print(rank, result)
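A possible alternative, not the approach taken above, is to keep the buffer-based Allreduce() and use a built-in sum: processes that miss contribute zeros, so the element-wise sum is the found point, and the summed flag counts how many processes found it. This assumes at most one process can find a given ID. The sketch below simulates the element-wise sum in plain Python rather than calling MPI, and pack(), unpack() and simulate_allreduce_sum() are hypothetical helper names.

```python
from functools import reduce

def pack(point):
    """Return [x, y, z, flag]; zeros with flag 0 when the lookup failed."""
    if point is None:
        return [0.0, 0.0, 0.0, 0.0]
    return [float(point[0]), float(point[1]), float(point[2]), 1.0]

def simulate_allreduce_sum(buffers):
    """Element-wise sum across ranks, mimicking a SUM reduction on buffers."""
    return reduce(lambda a, b: [x + y for x, y in zip(a, b)], buffers)

def unpack(reduced):
    """Interpret the summed buffer; the flag slot holds the number of hits."""
    hits = int(round(reduced[3]))
    if hits == 0:
        return None  # no rank found the ID
    if hits > 1:
        # Summed coordinates are meaningless if several ranks contributed.
        raise ValueError("ID found on more than one rank")
    return tuple(reduced[:3])

# Made-up lookups for three ranks; only the second one finds the ID.
per_rank_lookups = [None, (-2000000.0, 4.0, 5.0), None]
point = unpack(simulate_allreduce_sum([pack(p) for p in per_rank_lookups]))
nothing = unpack(simulate_allreduce_sum([pack(None), pack(None)]))

print(point, nothing)
```

Under MPI the same idea would presumably amount to swapping MPI.MAX for MPI.SUM in the original Allreduce() call on a 4-element array, but that has not been tested here.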