vtkCompositeDataSet::ShallowCopy and vtkCompositeDataSet::RecursiveShallowCopy

In vtkCompositeDataSet, there is two shallow copy methods, ShallowCopy and RecursiveShallowCopy.

ShallowCopy only shallow copy the structure of the composite but do not shallow copy the dataset in the leafs, it actually just copy the pointers.

RecursiveShallowCopy does it on all datasets.

It means that shallow copying a composite dataset and then adding an array to one of the leaf actually modify also the shallow copy input, which is definitely unexpected for a VTK developpers used to using ShallowCopy in general.

There is already some bugs because of that, where using a filter will modify its input.

IMO, vtkCompositeDataSet::ShallowCopy should actually RecursiveShallowCopy and a new methods (ShallowCopyStructure?) should be introduced.

What do you think ?

FYI @spyridon97 @Yohann_Bearzi @berk.geveci @will.schroeder

Regarding the first point (changing ShallowCopy behavior): I do not like breaking backward compatibility. What about adding a new method to the base vtkDataObject API like so:

class vtkDataObject {
  struct CopyOptions
  {
    CopyOptions() = default;
    bool Shallow{ true };
  };

  virtual bool Copy(
    vtkDataObject* source,
    const CopyOptions& opts = CopyOptions())
  {
    if (opts.Shallow) { this->ShallowCopy(source); }
    else { this->DeepCopy(source); }
    return true; // or false if \a source is wrong type.
  }
};

Then vtkCompositeDataSet could

class vtkCompositeDataSet {
  struct CompositeCopyOptions : CopyOptions
  {
    CompositeCopyOptions() = default;
    CompositeCopyOptions(const CopyOptions&) = default;
    bool RecursiveCopyLeafNodes{ true };
    bool CopyStructureOnly{ false };
  };
  bool Copy(vtkDataObject* source, const CopyOptions& opts) override
  {
    const auto* compositeOpts =
      dynamic_cast<const CompositeCopyOptions*>(&opts);
    if (!compositeOpts)
    {
      // Construct CompositeCopyOptions from base class options,
      // to keep base options but take default values for composite-
      // data-specific options:
      static thread_local CompositeCopyOptions extendedOpts(opts);
      compositeOpts = &extendedOpts;
    }
    // Now copy using flags from compositeOpts...
    return didCopy;
  }
};

The old API (ShallowCopy, RecursiveShallowCopy, DeepCopy, CopyStructure) can be deprecated. The new API (which can include many varieties of copying, including structure-only copies) allows data subclasses more flexibility and has the desired new behavior.

Interesting suggestion, but it feel much larger that the issue i’m raising and it out of scope of the possible effort I can provide.

Regarding the first point (changing ShallowCopy behavior): I do not like breaking backward compatibility.

I’ve been looking at this deeper and you may be right. @utkarshayachit seems to be aware of the behavior of vtkCompositeDataSet::ShallowCopy and all its implementation in different inherited classes.

He states:

  /**
   * For historical reasons, `vtkCompositeDataSet::ShallowCopy` simply pass
   * pointers to the leaf non-composite datasets. In some cases, we truly want
   * to shallow copy those leaf non-composite datasets as well. For those cases,
   * use this method.
   */
  virtual void RecursiveShallowCopy(vtkDataObject* src) = 0;

And he also implemented the vtkPartitionedDataSetCollection::ShallowCopy using the same idea.

However, there is clearly a documentation issue is this behavior is literally not documented (except above, in another method) and there is already a few incorrect usages in VTK and ParaView.

The way forward seems to be improving documentation, correct the incorrect usages and hope for the best.

I’d apreciate a feedback on this @utkarshayachit.

It doesn’t feel like more than a couple hours of work to do the first part (add the new API). There’s no need to deprecate the old API immediately and you can fix classes you need to use the new API.

I’ve heard that VTK as well as perhaps external libraries are misusing ShallowCopy. That is one of the main reasons I suggested replacing the old API with a new one. By deprecating ShallowCopy, we can identify where it is used and ensure when people update it that they have proper guidance on how to use the new API. As long as the old API is around, old broken code will be.

I agree that ShallowCopy needs more documentation, but relying on documentation to enforce proper usage is a Bad Idea™.

Hi all,

Thanks for bringing this up @mwestphal.

In my opinion the non-recursive shallow copy is quite un-intuitive and very error prone. I also don’t see exactly where it might be needed, but that could be a lack of experience on my part.

For what it is worth, I believe the best solution would be to switch the current RecursiveShallowCopy into ShallowCopy and call the current ShallowCopy something else.

1 Like

This is my initial suggestion but we can’t just do that if we consider the current implementation to be a feature instead of a bug.

The existing behavior is not a feature, but I suspect swapping the improved method with the existing implementation would cause subtle, hard-to-find bugs in some cases.

Any more inputs ? @spyridon97 @berk.geveci maybe ?

I know that backwards compatibility is big thing/feature that VTK aims for, but as @will.schroeder has told me in the past, if something is wrong, even if it’s has been there for ages, we need to deal with the consequences of fixing it. Also as @mwestphal has mentioned, I am pretty sure that ShallowCopy is now used in my places assuming that it would perform RecursiveShallowCopy. It’s indeed an API change, but we need to have special functions for special operations, and not the other way around.

1 Like

It’s a question of how shallow is shallow? You can ShallowCopy a vtkPolyData and then happily modify the copied array, modifying the input unintentionally if you don’t know it is the input data. The implementation as it is just bumps this up a level.

However, composite datasets are already different enough from non-composite datasets that they don’t need another weird thing that makes them even more irritating to use. For that reason, I would be for changing the default ShallowCopy behavior to shallow copy the leaf datasets, which goes against my usual objections of braking backwards compatibility.

1 Like

With every post except @dcthomp advocating for the change, I’ll move forward with changing the behavior. There is no way to actually deprecate the current behavior. RecursiveShallowCopy will of course be cleanly deprecated.

Here are draft MRs for VTK and ParaView, where I introduce CompositeShallowCopy and change all usage that make sense. I found a few problematic usages of vtkCompositeDataSet::ShallowCopy, but of course, I did not scan for any problematic usages of vtkDataObject::ShallowCopy

Once I get a clean CI, I’ll take care of deprecating RecursiveShallowCopy

Thinking about breaking retro compatibility, the chance of breaking someone code is very low.
Of course, some people may use ShallowCopy when they would be able to use CompositeShallowCopy, but this is ok, the overhead is minimal.

The only case I could imagine is someone relying on the dataset-sharing capabilities of the previous ShallowCopy implementation to add arrays in multiple composite dataset at once, but that sounds ridiculous to work like that with VTK tbh.

1 Like

VTK MR has been merged.

ParaView MR has been merged.