Proposal: Add array component to `vtkAlgorithm::SetInputArrayToProcess()`

Proposal

I would like to change vtkAlgorithm::SetInputArrayToProcess() to optionally include a component index and add a method to fetch the component (if one has been specified) for a given input array.

I would also like to add a method to create a procedural single-component array (using vtkImplicitArray) on demand that holds the single component specified (in cases where the array has multiple components).

Motivation

I’ve run into several use cases where

  1. a filter requires a single value per point or cell; rather than force users to create an array-calculator or other filter to extract a component from an existing array, it would be nice to accept vector/tensor arrays along with a component number; or
  2. a filter produces a vector or tensor array and I would like downstream filters to process a single component of it.

Details

When SetInputArrayToProcess() is called, an information object in the algorithm's vtkInformationVector of input arrays is populated with

  • the input port whose data holds the array (INPUT_PORT information key);
  • the input connection index whose data holds the array (INPUT_CONNECTION information key);
  • the array’s association (points/cells/rows/vertices/… via the FIELD_ASSOCIATION key); and
  • the name (FIELD_NAME) or attribute type (FIELD_ATTRIBUTE_TYPE) of the array.

What I propose is to add an information key vtkDataObject::FIELD_COMPONENT that is:

  • unset when the entire array is to be used rather than a single component;
  • set to a non-negative number (a zero-based component index) when a single component of the given array is selected;
  • set to a special value (-1 or vtkDataObject::TupleL1Norm) when the L₁ norm of a tuple should be used;
  • set to a special value (-2 or vtkDataObject::TupleL2Norm) when the L₂ norm of a tuple should be used.
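The selector semantics listed above can be sketched without any VTK dependency. This is only an illustration under stated assumptions: `ReduceTuple`, `kTupleL1Norm`, and `kTupleL2Norm` are hypothetical names standing in for whatever the final API would provide, and the tuple is modeled as a plain `std::vector<double>`.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical selector values mirroring the special cases in the proposal.
constexpr int kTupleL1Norm = -1;
constexpr int kTupleL2Norm = -2;

// Reduce one tuple to a single value according to the component selector:
// a non-negative selector picks that component, -1 yields the L1 norm,
// and -2 yields the L2 norm.
double ReduceTuple(const std::vector<double>& tuple, int component)
{
  if (component >= 0)
  {
    const auto index = static_cast<std::size_t>(component);
    if (index >= tuple.size())
    {
      throw std::out_of_range("component index exceeds tuple size");
    }
    return tuple[index];
  }

  double sum = 0.0;
  if (component == kTupleL1Norm)
  {
    for (double value : tuple)
    {
      sum += std::abs(value);
    }
    return sum;
  }
  if (component == kTupleL2Norm)
  {
    for (double value : tuple)
    {
      sum += value * value;
    }
    return std::sqrt(sum);
  }
  throw std::invalid_argument("unknown component selector");
}
```

The "unset" case (use the whole array) is deliberately not handled here, since it does not reduce a tuple to a single value.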

Existing filters may safely ignore the array component when it is specified and continue to behave the way they currently do.

New SetInputArrayToProcess() signatures would be added to accept an optional component-index (existing virtual method signatures would not change for backwards compatibility):

  void SetInputArrayToProcess(
    const char* name, int fieldAssociation, int component = vtkDataObject::AllComponents);
  virtual void SetInputArrayToProcess(
    int idx, int port, int connection, int fieldAssociation, const char* name,
    int component);
  virtual void SetInputArrayToProcess(
    int idx, int port, int connection, int fieldAssociation, int fieldAttributeType,
    int component);
  virtual void SetInputArrayToProcess(int idx, int port, int connection,
    const char* fieldAssociation, const char* attributeTypeorName,
    int component);

New or modified filters may call new methods on vtkAlgorithm to fetch component information (no matter which SetInputArrayToProcess() method was called):

  virtual int GetInputArrayComponentToProcess(int idx);
  virtual vtkSmartPointer<vtkAbstractArray> GetInputArray(
    int idx, vtkInformationVector** inputInfo);
  virtual vtkSmartPointer<vtkAbstractArray> GetInputArray(
    int idx, vtkDataObject* data, int association);

The latter two methods would return either a smart pointer to the original array (if it had an acceptable number of components) or a smart pointer to a vtkImplicitArray that extracts the given component or norm from the original multi-component array, as requested by the caller of SetInputArrayToProcess().
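The "original array or lightweight adaptor" decision could look roughly like the following. This is a VTK-free sketch, not the proposed implementation: `TupleArray`, `StoredArray`, `ComponentView`, and `SelectInputArray` are hypothetical stand-ins, `std::shared_ptr` plays the role of vtkSmartPointer, and an empty `std::optional` models the "no component requested" case.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an abstract array of fixed-size tuples.
struct TupleArray
{
  virtual ~TupleArray() = default;
  virtual std::size_t NumberOfComponents() const = 0;
  virtual std::size_t NumberOfTuples() const = 0;
  virtual double Value(std::size_t tuple, std::size_t component) const = 0;
};

// An array that stores its values explicitly, one tuple after another.
class StoredArray : public TupleArray
{
public:
  StoredArray(std::size_t components, std::vector<double> values)
    : Components(components), Values(std::move(values)) {}
  std::size_t NumberOfComponents() const override { return this->Components; }
  std::size_t NumberOfTuples() const override
  {
    return this->Values.size() / this->Components;
  }
  double Value(std::size_t tuple, std::size_t component) const override
  {
    return this->Values[tuple * this->Components + component];
  }

private:
  std::size_t Components;
  std::vector<double> Values;
};

// A small adaptor exposing one component of another array.
// It stores no values, only a reference to its source.
class ComponentView : public TupleArray
{
public:
  ComponentView(std::shared_ptr<const TupleArray> source, std::size_t component)
    : Source(std::move(source)), Component(component) {}
  std::size_t NumberOfComponents() const override { return 1; }
  std::size_t NumberOfTuples() const override { return this->Source->NumberOfTuples(); }
  double Value(std::size_t tuple, std::size_t) const override
  {
    return this->Source->Value(tuple, this->Component);
  }

private:
  std::shared_ptr<const TupleArray> Source;
  std::size_t Component;
};

// Return the source itself when no component was requested or it is already
// single-component; otherwise wrap it in a view of the requested component.
std::shared_ptr<const TupleArray> SelectInputArray(
  std::shared_ptr<const TupleArray> source, std::optional<std::size_t> component)
{
  if (!component || source->NumberOfComponents() == 1)
  {
    return source;
  }
  return std::make_shared<ComponentView>(std::move(source), *component);
}
```

A filter would call something like `SelectInputArray()` at the start of its processing and treat the result as a single-component array in either case.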

@berk.geveci @mwestphal I believe you will both have opinions about this. :)

I would also like to add a method to create a procedural single-component array (using vtkImplicitArray) on demand that holds the single component specified (in cases where the array has multiple components).

I don’t understand what this means.

The concept is reasonable. I don’t love the idea of adding yet more GetInputArray() methods when there are already 6 of them. I understand the need for backwards compatibility, but I still don’t like it. I also don’t love the fact that a Get method can return a newly allocated array; that is not the norm in VTK (except for lazy evaluation, where someone else still holds the pointer). ParaView handles this use case in the UI, doesn’t it? Isn’t there some kind of upstream magic filter that does this? If this is for VTK only, maybe it is fine for people to apply a filter for the extraction?

The idea was that GetInputArray() would be called inside RequestData() and then discarded. The smart-pointer it returned would either be (a) already owned by the input data object or (b) a very small vtkImplicitArray that adapts an array already owned by the input data object.

Not that I’m aware of. All the filters that require a single component provide both an array selector (i.e., a vtkSMArrayListDomain on a string property) and a separate integer property that lets you choose a component. But that only works for one array/component pair at a time.

My plan for ParaView was to add a vtkSMArrayComponentSelectionDomain whose property-helper would be aware of how to call the new method. There would be a pqArrayComponentSelector that the statistics algorithms (and others) could use to select any number of components of different arrays of interest. For example, if you want to compute correlations between X acceleration and each component of a strain tensor, you would be allowed to do that. Or you could correlate X acceleration with Y acceleration.

The vtkImplicitArray class makes it easy to take an existing N-component array and return a single-component array that procedurally just fetches the i-th component for any given tuple. It owns a reference to the N-component array. So filters could just use that array internally as if the user had extracted the component of interest. There is a small time-overhead, but no space overhead – and if users care about speed, they can add calculator filters to the pipeline to extract components they care about.
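As a rough illustration of the "no space overhead" point, here is a VTK-free sketch of a functor that maps a tuple index to one component of a shared, interleaved buffer. My understanding is that vtkImplicitArray is templated on a backend that is callable with an index, so a functor of this general shape could serve as one; treat that as an assumption. `ComponentBackend` and its members are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Functor that fetches one component of an interleaved N-component buffer.
// The buffer is referenced through a shared pointer and never copied.
struct ComponentBackend
{
  std::shared_ptr<const std::vector<double>> Source; // tuples stored back to back
  int NumberOfComponents;                            // N, the tuple size
  int Component;                                     // zero-based component to expose

  double operator()(int tupleIndex) const
  {
    const std::size_t flat =
      static_cast<std::size_t>(tupleIndex) * static_cast<std::size_t>(this->NumberOfComponents) +
      static_cast<std::size_t>(this->Component);
    return (*this->Source)[flat];
  }
};
```

Each lookup costs one multiply-add for the index, which is the small time overhead mentioned above. Because the functor only holds a reference, later changes to the source buffer are visible through it.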

Got it. It is useful to have this irrespective of the rest of the conversation.

If I turn on “Auto convert properties” in the settings and then try to contour a dataset with vectors in it, I see all the components of the vectors (with _X, _Y, _Z names) and the magnitude in the list. Isn’t this what we are talking about? Try: Wavelet → Random Vectors → Contour after turning on the “auto convert properties” option.

I thought auto-convert only did cell↔point conversion. I agree, with auto-conversion forced, there is no need for a new domain.

Auto-convert properties in ParaView works by putting a ParaView-specific vtkPVPostFilter in the pipeline to expose the different components and do the point-to-cell and cell-to-point data conversion.