Clarifications and Modifications of Global Ids

Yohann_Bearzi · October 1, 2021, 4:29pm

This post follows this discussion.

In VTK, cell data and point data hold a concept called “attribute types”. Some of the most noticeable ones are Normals, Scalars, and GlobalIds.

Global ids can be generated using vtkGenerateGlobalIds, or by the source when implemented.

Currently, in some filters, GlobalIds usage can be activated through a parameter (see vtkMergeCells::UseGlobalIds for instance), and in other newer ones, they are used for point / cell matching blindly if available (see vtkGhostCellsGenerator).

Using global ids instead of point positions can make some pipelines much faster because point or cell matching between partitions can be done using the ids instead of their position in 3D. So it would make sense to use them when available without having to opt-in, assuming that ids are correctly provided (which should be the case). As said earlier, it is already the case for vtkGhostCellsGenerator, but this behavior could (should?) be extended to all relevant filters.
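To make the cost difference concrete, here is a minimal pure-Python sketch (not VTK code) of matching the points of two partitions either by position or by global id. The function names, the tolerance, and the sample partitions are hypothetical. The brute-force position search is only for brevity; real filters would use a spatial locator, but they still have to compare coordinates.

```python
from math import dist

def match_by_position(points_a, points_b, tol=1e-9):
    """Pair indices (i, j) of points closer than tol (brute force, quadratic)."""
    return [(i, j)
            for i, p in enumerate(points_a)
            for j, q in enumerate(points_b)
            if dist(p, q) <= tol]

def match_by_global_id(ids_a, ids_b):
    """Pair indices (i, j) sharing a global id, with one hash lookup per point."""
    index_b = {gid: j for j, gid in enumerate(ids_b)}
    return [(i, index_b[gid]) for i, gid in enumerate(ids_a) if gid in index_b]

# Two hypothetical partitions sharing the interface points with global ids 2 and 3.
points_a = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
ids_a = [0, 1, 2, 3]
points_b = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
ids_b = [2, 3, 4, 5]

by_position = match_by_position(points_a, points_b)
by_id = match_by_global_id(ids_a, ids_b)
```

Both calls return the same pairs here, `[(2, 0), (3, 1)]`, but the id-based version never touches the coordinates and needs no tolerance.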

Beyond making filters faster, note that if two points have the same location but different data values, the only way to tell that they should not be treated as a single point is to assign them two different global ids. So using global ids is required in some circumstances.
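A small pure-Python sketch of this situation (again not VTK code; `merge_points` and the sample arrays are hypothetical): merging on position silently collapses two coincident points that carry different values, while merging on global id keeps them apart.

```python
def merge_points(points, values, keys):
    """Keep the first point seen for each key; return merged (point, value) pairs."""
    merged = {}
    for point, value, key in zip(points, values, keys):
        merged.setdefault(key, (point, value))
    return list(merged.values())

# The last two points coincide in space but carry different temperatures.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
temperature = [300.0, 350.0, 900.0]
global_ids = [10, 11, 12]

merged_by_position = merge_points(points, temperature, keys=points)
merged_by_id = merge_points(points, temperature, keys=global_ids)
```

With position as the key, only two points survive and the value 900.0 is lost; with global ids as the key, all three points are kept.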

Is there any objection to making the blind use of global ids, when available in the input, the default behavior for all relevant filters?
A current issue, not on the VTK side but in higher-level applications such as ParaView, is that there is currently no visual way to know which array is used for global ids. Once automatic usage of input global ids is enacted, this will be resolved, probably by writing the relevant array name in bold in the GUI.

Here is another point to discuss. Filters that break global id information (vtkContourFilter for instance) currently drop the ids. They could probably, as proposed in the link above, be converted to pedigree ids. What value could be given to newly generated points? Perhaps -1? And do we also automate this behavior, i.e. should the user expect global ids to be converted into pedigree ids when they cannot be carried?
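As a toy illustration of the -1 idea (a pure-Python sketch, not what vtkContourFilter does), consider contouring along mesh edges: an output point that coincides with an input point carries that point's global id as its pedigree id, and a point interpolated inside an edge gets the sentinel. The function, the sentinel name, and the sample data are all hypothetical.

```python
NEW_POINT = -1  # sentinel floated in the post for points with no input counterpart

def contour_pedigree_ids(scalars, global_ids, edges, iso):
    """Return one pedigree id per output point of an edge-based iso-contour."""
    pedigree_ids = []
    carried = set()  # input points already emitted, to avoid duplicates
    for a, b in edges:
        hits = [i for i in (a, b) if scalars[i] == iso]
        if hits:
            # The contour passes exactly through an input point: carry its id.
            for i in hits:
                if i not in carried:
                    carried.add(i)
                    pedigree_ids.append(global_ids[i])
        elif (scalars[a] - iso) * (scalars[b] - iso) < 0:
            # The contour crosses the inside of the edge: a brand new point.
            pedigree_ids.append(NEW_POINT)
    return pedigree_ids

scalars = [0.0, 1.0, 3.0, 0.0]
global_ids = [100, 101, 102, 103]
edges = [(0, 1), (1, 2), (2, 3)]

result = contour_pedigree_ids(scalars, global_ids, edges, iso=1.0)
```

Here `result` is `[101, -1]`: the contour passes through input point 101 and creates one new point on the last edge.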

mwestphal · October 4, 2021, 8:50am

I think this proposal makes sense.

Jacques-Bernard · October 4, 2021, 11:41am

Small clarification: the global identifier of an element is a unique and unambiguous identifier used to name or search for a particular element of a simulation mesh (the numbering is generally kept separately for each element type).
This identifier only has meaning if it is provided by the simulation code, because the code will keep the same identifier for a cell throughout the simulation.
The other interesting point is that the global identifier of an element does not depend on how the simulation is divided into subdomains, or on how many subdomains there are.
This greatly facilitates the comparison between two simulations (the trivial case being parametric studies without a change of topology).
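A pure-Python sketch of why this helps (not VTK code; the helper name and the sample runs are hypothetical): two runs of the same mesh, split into different numbers of subdomains, can be compared cell by cell simply by joining on the global id.

```python
def gather_by_global_id(partitions):
    """Flatten partitions of (global_ids, values) into one {global_id: value} map."""
    field = {}
    for global_ids, values in partitions:
        field.update(zip(global_ids, values))
    return field

# Same six cells: run A uses 2 subdomains, run B uses 3 with a different ordering.
run_a = [([0, 1, 2], [1.0, 2.0, 3.0]), ([3, 4, 5], [4.0, 5.0, 6.0])]
run_b = [([4, 0], [5.5, 1.0]), ([2, 5], [3.0, 6.0]), ([1, 3], [2.0, 4.0])]

field_a = gather_by_global_id(run_a)
field_b = gather_by_global_id(run_b)
difference = {gid: field_b[gid] - field_a[gid] for gid in sorted(field_a)}
```

Only the cell with global id 4 differs between the two runs (by 0.5), and finding it required no knowledge of either partitioning.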

VTK manages this very well by allowing an array of values to be associated with this notion of global identifier via SetGlobalIds. That said, a global identifier is not always available.

We note, however, that this global identifier is not naturally accessible in ParaView. Worse, FindData highlights an ID which is NOT this global identifier (even when one is set) but just the index of the element within a block, which is of no interest and has no meaning for the physicist user!

A filter like vtkContourFilter must keep the global identifier assigned to each cell of the input mesh if it has been set, but we cannot have a global identifier assigned to the nodes (it does not really make sense to keep only those of the nodes common to the input mesh and the output mesh).

Overall, I agree with Yohann. If there is a global identifier identified as such, then you might as well use it … I would even say that it should be used, because it must be assumed to be meaningful.

But if the source has not defined a global identifier or if a filter loses this notion, as for the nodes of the vtkContourFilter, then it must be considered that it does not exist.

Apart from the user interface, the global identifier is of little interest when applying a filter, with the obvious exception of vtkGhostCellsGenerator, which must use this information if it exists in order to speed up the computation.

I admit, perhaps wrongly, that I see no point in applying vtkGenerateGlobalIds, whose effect is to produce a new quantity… permanently!

Jacques-Bernard · October 4, 2021, 1:01pm

A small addition: to visually identify the quantity that serves as the global identifier for cells and the one for points (set by SetGlobalIds), we could add a ‘#’ superimposed on the “cell logo” or “point logo” (not valid for the “field logo”).

To follow up on the other conversation, our readers return:

vtkCellIds,
vtkNodeIds,
vtkMaterialIds (same value for blocks of the same name, “coloring” driven by “community”),
vtkDomainIds (same value for parts with the same number; indicates which calculation subdomain the cell or node belongs to).

In general, we prefix with ‘vtk’ any quantity constructed by our readers (i.e. not available in the database).

cory.quammen · October 6, 2021, 2:09pm

Yohann, I think using Global Ids where relevant and passing them through makes sense.

One consideration: attribute types are a somewhat old-fashioned concept in VTK that we have discussed moving away from. For example, we designate ghost cell arrays with a special array name rather than a data set attribute. Should we move in that direction for GlobalIds as well?

Yohann_Bearzi · October 6, 2021, 2:23pm

The problem I see with doing that is that global id arrays are explicitly tagged in our XML format, with no current requirement on their name. If we want to keep backward compatibility with older files while enforcing stricter global id array naming, we would need to rename the affected arrays when reading an old file, which might be confusing for users who open their file in a text editor to check the arrays.
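For readers who have not looked inside such a file, here is a minimal sketch of what "tagged, with no requirement on the name" can look like. It assumes the tag is recorded as a `GlobalIds` attribute on the `PointData` / `CellData` element, in the same way as `Scalars` or `Normals`. The sample is a trimmed, hypothetical fragment (no points or cells), and the array name `NodeNumbers` is made up.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical fragment of a VTK XML file: the array is called
# "NodeNumbers", and only the GlobalIds attribute marks it as the global ids.
SAMPLE = """<VTKFile type="UnstructuredGrid" version="1.0">
  <UnstructuredGrid>
    <Piece NumberOfPoints="2" NumberOfCells="0">
      <PointData GlobalIds="NodeNumbers">
        <DataArray type="Int64" Name="NodeNumbers" format="ascii">7 8</DataArray>
        <DataArray type="Float64" Name="Pressure" format="ascii">1.0 2.0</DataArray>
      </PointData>
    </Piece>
  </UnstructuredGrid>
</VTKFile>"""

def tagged_global_ids(xml_text):
    """Map 'PointData' / 'CellData' to the array name tagged as global ids."""
    root = ET.fromstring(xml_text)
    return {section.tag: section.get("GlobalIds")
            for section in root.iter()
            if section.tag in ("PointData", "CellData")
            and section.get("GlobalIds")}

tags = tagged_global_ids(SAMPLE)
```

Here `tags` is `{"PointData": "NodeNumbers"}`. Under a name-based convention, a reader would have to rename `NodeNumbers` on load, which is the mismatch with the on-disk file described above.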

If we move away from attribute types, would it not make things that are somewhat easy to do more difficult? For instance, let’s say I want to compare normal estimators by actually rendering them, toggling them on and off in the GUI. Do I need to rename the normal array each time I switch between normal arrays? If normals are not deemed to need a special name, how do we carry the information of which array currently represents the normals without attribute types?