vtkRedistributeDataSetFilter - cuts to MPI ranks


I’m using vtkRedistributeDataSetFilter to partition an input mesh across N MPI ranks. I then need to establish for each rank a set of candidate “neighbor” ranks that might contain shared vertices. When N is a power of 2 this is easy to do using GetCuts() API, since each cut maps to exactly one rank, and it’s easy to check intersections of bounding boxes. For other values of N, the filter uses and returns number of cuts that is a power of 2 greater than N, and it’s not obvious how these cuts map to ranks in the output dataset. Is there a way to extract this information?

I don’t think there’s a function designed to do that, but it’s doable to retrieve this information. There’s an example of how to retrieve the geometrical connectivity map in Parallel/DIY/vtkGhostUtilities.txx. Look, in VTK master, at vtkDIYGhostUtilities::ComputeLinkMapUsingBoundingBoxes. This function returns a std::vector<std::set<int>> which tells you how partitions are geometrically linked with each other. For each neighbor, it is a list of partition global ids, as defined by the decomposer you’re using. You can look at vtkDIYGhostUtilities::GenerateGhostCells to see how the decomposer can be generated. You’ll have to define a block structure that looks like that:

struct Block
  // My own bounding box
  vtkBoundingBox BoundingBox;
  // Bounding boxes of other partitions that I might be connected to
  // Mapping a partition / block global id to a bounding box
  std::map<int, vtkBoundingBox> NeighborBoundingBoxes;

You have to broadcast the bounding boxes using a similar approach as in vtkDIYGhostUtilities::ExchangeBoundingBoxes. Don’t forget to inflate the bounding boxes to make sure you don’t miss a link.

It is not very hard to retrieve what you’re looking for but it might deserve it’s own tool function doing all of that. Why do you need to have this information?

Hi Yohann, thanks a lot for your response! I will look into your suggestion and more into VTK code. It looks like it can provide exactly what I’m looking for.

We’re using VTK to read an unstructured grid dataset (.vtu/.pvtu), (re-)distribute it and convert it into our code’s internal data structure for performing parallel simulation. Our framework has its own routines for establishing and exchanging ghost cells/nodes once the local mesh partition is constructed. However, in order to avoid all-to-all communication (which can be expensive in very large parallel jobs), every rank needs to be seeded with a candidate list of neighbor ranks with which to exchange potentially matching nodes. This kind of information is typically cheaply available from the grid partitioner, and in particular it’s easy to compute when decomposition is done with a kd-tree - given that we can find a set of bounding boxes that contain current rank’s partition. So far we have this approach working well when number of ranks is a power of 2, and I’m trying to extend it to the general case.

Have you considered generating ghost cells using VTK before you convert to your internal data structure? The filter’s name is vtkGhostCellsGenerator and its code literally uses the chunk of code I told you to check out in my previous comment.

I’m not sure it made it into the latest release. It’ll be in the following one for sure.

This would make a lot of sense normally, however VTK is only one of mesh/dataset import tools in the code, there are a couple other format readers as well as an internal mesh generator, each with their own way of partitioning across ranks. So the common infrastructure for ghost generation is still needed (and preferred for consistency reasons). It’s not completely out of the question that we would let mesh importers also generate ghosts, but it’s a much bigger infrastructure change.

For the time being I implemented a very simple solution, by replicating on our side (in a simplified way, since we always have num_blocks == nearest_power_of_2(num_ranks)) the partition assignment procedure from vtkDIYKdTreeUtilities (https://gitlab.kitware.com/vtk/vtk/-/blob/d79e2977b601db1e0592c02b1536f39b6b367314/Filters/ParallelDIY2/vtkDIYKdTreeUtilities.cxx#L511). This produces exactly the information I was looking for - I can use the bounding boxes (cuts) extracted from the filter locally on each rank to determine neighbors, and the simulation works fine with any number of ranks.

Of course, it introduces a dependency on a particular implementation behavior of the redistribution filter, which is bad practice. I can’t call the VTK function mentioned above directly, because the corresponding header is private and not installed, but even that wouldn’t completely eliminate the dependency. Ideally as a longer term solution I would add a method to vtkRedistributeDataSetFilter returning a vector<int> produced by vtkDIYKdTreeUtilities::ComputeAssignments. Of course, I’m new to VTK APIs and don’t have a high-level view, so this suggestion is probably not a reasonable one.