About VTK Remote Modules

mwestphal · February 11, 2022, 10:10am

VTK Remote Modules were added to VTK in 2015 to make it easy to add new modules to VTK from external repositories.

At the time, VTK was not very modular and creating modules was not as easy as it is now, so it made sense.

However, having such remote modules raises the question: are these modules part of VTK or not?
If yes, the VTK maintainers should ensure they build and are tested as well as any other part of VTK. If not, they should not be available as a build option of VTK, yet they are. So VTK remote modules can be considered actual parts of VTK.

With 9.0 and later, VTK is now much more modular than it was before and building VTK modules against VTK is the norm in many projects.

But then what is the use of remote modules?
Should we add new ones and expand this?
Should we keep the current ones but add no new ones?
Should we outright remove them?

I’d be happy to hear any input on this.

Please note that there are 4 unmaintained remote modules on the verge of being removed, as it is not acceptable to have unmaintained code in VTK: https://gitlab.kitware.com/vtk/vtk/-/merge_requests/8896

@dgobbi @jcfr @ben.boeckel @cory.quammen @berk.geveci @Charles_Gueunet @Francois_Mazen

List of current remote modules

Here is the list and status of each of the VTK remote modules:

MomentInvariants:
https://gitlab.kitware.com/vtk/MomentInvariants/
Maintained and up to date

PoissonReconstruction:
https://github.com/lorensen/PoissonReconstruction
Unmaintained

PowerCrust:
https://github.com/lorensen/Powercrust
Unmaintained

RenderingLookingGlass:
https://github.com/Kitware/LookingGlassVTKModule
Maintained and up to date

VTKSignedTensor:
https://github.com/KitwareMedical/VTKSignedTensor
Unmaintained and out of date

SplineDrivenImageSlicer:
https://github.com/martinken/midas-journal-838.git
Unmaintained and out of date

vtkDICOM:
https://github.com/dgobbi/vtk-dicom
Maintained and up to date
Note: This can be built as external as well

Remote modules vs external modules - #12 by dgobbi

mwestphal · February 11, 2022, 10:12am

I personally think VTK modularity is very important, but remote modules are the wrong answer to the right question.
The right answer would be even better modularity, examples and documentation.
An even better answer would be a community of VTK-based modules easily available somewhere (think Slicer Extensions Manager).

Currently available remote modules should either be integrated into VTK proper, or be removed and become standalone VTK-based modules.

thewtex · February 11, 2022, 2:07pm

First, please recognize that these contributions, along with the VTKExamples, either would not exist or would not be enriching “VTK” as much as they do if they were not in remote repositories and if the community members who created them did not have the agency to do so.

At the time, VTK was not very modular and creating modules was not as easy as it is now, so it made sense.

This is not at all the reason they are remote modules. From my experience working personally with the people who created, for example, the PoissonReconstruction module (David Doria and Bill Lorensen), this code would otherwise not be enriching “VTK” as much as it does now. The reasons that they are not in the VTK repository are:

There was already too much code in VTK, which makes maintenance overly difficult. This results in “Not a Bill Lorensen Dashboard”, where errors and issues from other areas of the codebase mask potential issues in changes to the code, which inhibits contributions.
There were different rules for the “less-than external contributors.” Meanwhile, code would be merged without review and without passing CI, further corrupting the dashboard and making maintenance and contributions more difficult. I do not know what the situation is now, but this was at least the case at the time.
The contribution process was overly onerous.
There is an obstructionist attitude toward “less-than” contributors adding new code or creating a new repository. This comes from the typical “I am better than you” or “not-invented-here” or “I don’t personally see why this has value for my needs.” While this is present and can cause issues to some degree or another in software development generally, it gets accentuated when there is one single repository, which inhibits maintainability and where control has to be shared.

If yes, the VTK maintainers should ensure they build and are tested as well as any other part of VTK.

It takes effort to set up the build and CI configuration, but testing can occur on different repositories.

Given that in Open Source:

Community members should be respected
Community members should be credited and recognized for their work
Community members should be provided agency

And, to make changes to other repositories, VTK maintainers can either:

a) Create a pull request.

or

b) Create a fork.

or

c) Respectfully request that the original author transfer it to another organization with the understanding that they will retain admin access to the repository.

But you are suggesting that VTK maintainers cannot contribute, or are “above” contributing, to repositories other than the VTK repository?

even better modularity, examples and documentation.

An even better answer would be a community of VTK-based modules easily available somewhere (think Slicer Extensions Manager).

@mwestphal I agree with you. But this should happen regardless of whether there are remote modules.

Should we outright remove them?

Currently available remote modules should either be integrated into VTK proper, or be removed and become standalone VTK-based modules.

Removing the code is fine. Merging into the VTK repository and claiming unilateral ownership is not.

berk.geveci · February 11, 2022, 6:45pm

I don’t like remote modules. I did at one point. I don’t anymore. My reasons:

They are referenced by a combination of a “raw” git URL and git SHA. If either of these disappears for some reason, all versions of VTK that point to them are broken unless we rewrite history.
They are the anti-pattern of modularity in my opinion. We pull them all into the main VTK build which means that although repository-wise it is modular, build/distribution-wise, it is not. VTK gets bigger and bigger.
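To make the first point concrete, here is a minimal sketch of why a URL-plus-SHA pin is fragile. Everything in it is an assumption for illustration: the `RemotePin` type, the `broken_pins` helper, the module names, URLs and SHAs are all hypothetical, and the `reachable` dictionary simply stands in for what a real `git fetch` would find.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemotePin:
    """A remote module reference: a raw git URL plus a commit SHA."""
    name: str
    url: str
    sha: str


def broken_pins(pins, reachable):
    """Return the names of pins whose URL or SHA can no longer be resolved.

    `reachable` maps each repository URL that still exists to the set of
    commit SHAs it still serves (a stand-in for a real `git fetch`).
    """
    broken = []
    for pin in pins:
        # A missing URL yields an empty set, so the SHA lookup fails too.
        if pin.sha not in reachable.get(pin.url, set()):
            broken.append(pin.name)
    return broken


# Hypothetical pins; URLs and SHAs are placeholders, not real values.
pins = [
    RemotePin("ModuleA", "https://example.org/a.git", "1111111"),
    RemotePin("ModuleB", "https://example.org/b.git", "2222222"),
    RemotePin("ModuleC", "https://example.org/c.git", "3333333"),
]

# Simulated state some years later: repository b was deleted, and
# repository c was force-pushed so the pinned commit is gone.
reachable = {
    "https://example.org/a.git": {"1111111", "aaaaaaa"},
    "https://example.org/c.git": {"ccccccc"},
}

print(broken_pins(pins, reachable))  # ['ModuleB', 'ModuleC']
```

Both failure modes produce the same result for anyone checking out an old release: the pin is recorded in that release's history, so it cannot be repaired without rewriting that history.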

One objective of remote modules that are referenced by VTK is discoverability by those who already use VTK. You check out VTK. You look at build options. Ah look. There is a powercrust module. Let me see what it does. How often does this work? Clearly for the 4 modules that Mathieu is removing, not very often. Otherwise, someone would have discovered that they are broken with current VTK. If you want to discover, for example, powercrust for VTK, all you have to do is a quick search and you get Tim Hutton’s original version.

The other point is that it makes building easier. You switch a flag and build VTK. Boom. I agree. But I have come to like modularity more. I like having separate projects that depend on VTK. I never ever write a module that gets sucked into VTK unless I plan on adding it to VTK. VTK takes a long time to configure and compile. I’d rather not rebuild it every time I change my module.

I vote for (in this particular order):
0. Continue to encourage developers to contribute to and maintain core VTK.
1. If necessary, improve VTK’s build infrastructure such that it is super easy to build VTK extensions AND applications that depend on VTK and those extensions. This includes wrapping.
2. Make it super easy to make Python wheels of these extensions.
3. Create some kind of collection of well-known VTK extensions. It could be a website or a GitHub org or something like that.
4. Make sure that we have contract testing of some of these extensions so that we can either fix them or VTK when an issue comes up.
5. Remove remote module support from VTK.

ben.boeckel · February 11, 2022, 7:26pm

This is not about “remote repositories” in the “VTK code not living in vtk/vtk.git” sense, but “VTK has flags to build code from remote repositories” through the instructions in its Remote/ directory. There is no effort here to absorb any and all VTK code out there. It’s about figuring out where the code referenced under the Remote/ directory should live. Either it is part of VTK or it is not. If it is, then it should be put into vtk/vtk.git. If it is not, we should remove the pointer from Remote/ and let it be its own thing.

IMO, it’s about figuring out who is responsible for the code. If it is VTK developers, why would they need to run around to different repos to change things? If it is external developers, why does VTK have a link to hook it into its own build?

This is about as easy as it can be. Yes, there are a lot of knobs to dial in, but they all have their purpose and the less common ones have sensible defaults.

Wheels are…fine once we provide the SDK the wheel was made from so that downstreams can use the code at install time. PyPI-acceptable wheels are a different matter. AFAIK, there’s no method that doesn’t involve duplicating the VTK libraries because PyPI will reject wheels that do not provide all libraries not on the “assumed to be available” list. VTK is not in that list.

Yep, contract testing is fine. VTK already uses its own test suite as a contract test (of sorts) to verify that its install tree has everything necessary to at least build its own test suite. Other projects can certainly follow the same pattern.
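As a rough illustration of the contract-testing idea, the sketch below runs each extension's own check against a candidate API and reports which contracts break. It is only a toy model under stated assumptions: `run_contract_tests`, the two extension checks, and the `make_sphere` / `legacy_update` functions are hypothetical stand-ins, not VTK calls.

```python
from types import SimpleNamespace


def run_contract_tests(checks, api):
    """Run each extension's check against `api`.

    Returns {extension name: None if the check passed, else an error string}.
    """
    results = {}
    for name, check in checks.items():
        try:
            check(api)
            results[name] = None
        except Exception as exc:  # any failure counts as a broken contract
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


def check_extension_one(api):
    # Hypothetical extension that only needs make_sphere().
    assert api.make_sphere(radius=2.0)["radius"] == 2.0


def check_extension_two(api):
    # Hypothetical extension that relies on a call a later release removed.
    api.legacy_update()


# Two stand-in "releases" of the library the extensions build against.
old_api = SimpleNamespace(
    make_sphere=lambda radius: {"radius": radius},
    legacy_update=lambda: None,
)
new_api = SimpleNamespace(make_sphere=lambda radius: {"radius": radius})

checks = {
    "ExtensionOne": check_extension_one,
    "ExtensionTwo": check_extension_two,
}

print(run_contract_tests(checks, old_api))
print(run_contract_tests(checks, new_api))
```

Run before a release, a report like this shows which side needs the fix: either the extension adapts, or the library restores what it dropped.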

berk.geveci · February 11, 2022, 8:08pm

I don’t understand. On PyPI, if you have a wheel that depends on the VTK wheel, does it still have to provide the VTK libraries?

ben.boeckel · February 11, 2022, 8:15pm

I know of no way to tell auditwheel (the tool behind the scenes on Linux) “the libraries are provided by dependency X”. My understanding is that PyPI runs auditwheel prior to accepting Linux wheels. There are analogous tools for macOS and Windows. There are some open MRs and issues to help make local wheels that don’t glob up everything:

Add --include and --exclude options to auditwheel repair by rossant · Pull Request #310 · pypa/auditwheel · GitHub
Add --exclude option to auditwheel repair by martinRenou · Pull Request #368 · pypa/auditwheel · GitHub
Only bundle some select libraries · Issue #339 · pypa/auditwheel · GitHub

But even if such behavior existed, how to tell PyPI to use these flags is another question.
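The bundling behavior described above can be modeled in a few lines. This is a simplified sketch, not auditwheel's implementation: the `ASSUMED_AVAILABLE` set is only an illustrative subset of the libraries a manylinux-style policy treats as present on the target system, the VTK library file names are hypothetical, and the `exclude` parameter merely mimics what the open requests listed above propose.

```python
# Illustrative subset of system libraries that a manylinux-style policy
# assumes to be present on the target machine. NOT the full or exact list.
ASSUMED_AVAILABLE = {"libc.so.6", "libm.so.6", "libpthread.so.0", "libdl.so.2"}


def libraries_to_bundle(needed, exclude=()):
    """Return the shared libraries a repair step would copy into the wheel.

    Anything not assumed available, and not explicitly excluded, is bundled.
    """
    skipped = ASSUMED_AVAILABLE | set(exclude)
    return sorted(lib for lib in needed if lib not in skipped)


# Hypothetical extension module linked against libc, libm and two VTK
# libraries (file names are placeholders).
needed = ["libc.so.6", "libvtkCommonCore.so", "libvtkFiltersCore.so", "libm.so.6"]

# Without an exclude mechanism, the VTK libraries get duplicated in the wheel.
print(libraries_to_bundle(needed))
# ['libvtkCommonCore.so', 'libvtkFiltersCore.so']

# With a hypothetical exclude list, they would be left to the VTK wheel.
print(libraries_to_bundle(
    needed, exclude=["libvtkCommonCore.so", "libvtkFiltersCore.so"]
))
# []
```

Since VTK is not among the libraries assumed to be available, a wheel that depends on the VTK wheel ends up carrying its own copy of the VTK libraries unless something like the exclude list exists.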

thewtex · February 11, 2022, 10:49pm

This is an important limitation. And network access is required to download them.

This would be less of a limitation if it were possible to build the modules externally with a local fetch.

They are the anti-pattern of modularity in my opinion. We pull them all into the main VTK build which means that although repository-wise it is modular, build/distribution-wise, it is not. VTK gets bigger and bigger.

The builds for remote modules should be in their repositories, not the main repository.

Otherwise, someone would have discovered that they are broken with current VTK. If you want to discover, for example, powercrust for VTK, all you have to do is a quick search and you get Tim Hutton’s original version.

How do you know to look for powercrust in the first place?

There should be less of an expectation that remote modules have the same level of robustness as the “core” of VTK since they are not enabled by default, and they are remote modules. A note can be placed in the CMake variable description if the impression is that this is not the case, e.g. THIS IS EXPERIMENTAL. By limiting the size of VTK, the “core” can be more robust. There are a limited number of VTK maintainers.

Yes, remote modules can and will be broken at times. The suggestion here that VTK without remote modules, especially with non-default flags, is never broken does not match my experience. If we want the core of VTK to be broken less often, then limiting its scope will help, not hurt.

I understand your pain, and building a module externally during development will make for a different experience.

I do not agree that a distinction between a code repository and a project is a problem.

In terms of benefits, there is the one-CMake variable engagement, easy pointers to related code to improve discoverability and encourage development. There is also the benefit of encouraging development of VTK classes that can be used with the project because of a shared sense of ownership that just comes from having that pointer in the repository. And, the modules include a dependency resolution system. This has to be regenerated with a superbuild or something similar.

Doesn’t it get easier to understand who is responsible for the code when it is split into multiple repositories? You look at the contributors to the repository instead of doing a git blame archaeological project.

If it is VTK developers, why would they need to run around to different repos to change things? If it is external developers, why does VTK have a link to hook it into its own build?

I am concerned that the term “external developers” here serves to unnecessarily denigrate contributors who are improving the project. And we all benefit from these improvements.

mwestphal · February 12, 2022, 7:41am

This is not acceptable imo. Such unstable and experimental code should not be part of VTK.
When someone turns on an option in CMake for VTK, they expect the option to work as intended.

I am concerned that the term “external developers” here serves to unnecessarily denigrate contributors who are improving the project. And we all benefit from these improvements.

External developers: developers who do not contribute to VTK but to other projects. An external developer becomes a VTK developer as soon as they start opening MRs in VTK and using the VTK contribution process.

VTK developers are helping many developers with this process, and the number of VTK contributions has been steadily increasing over the past few years, along with improvements in code quality and in the overall contribution process.

External developers are not improving VTK per se; they are using VTK for their own projects, which can be very beneficial for the whole VTK ecosystem and community (e.g., the VTKExamples), but these are different projects, not VTK.

berk.geveci · February 12, 2022, 2:18pm

I wonder if there is a middle ground. I wonder if we can reference certain VTK extensions inside VTK, even somehow support building them, without making them modules (i.e. part of VTK). Some kind of superbuild process within VTK. The resulting build(s) would be separate, with their own installation process etc. This would support the discovery process and ease of build. And modularity, in that one would be able to build & distribute them separately.

amaclean · February 12, 2022, 7:53pm

I should probably deprecate the examples in vtk-examples for the removed modules. There have never been any complaints about them not working, which tells me no one has tried to build them!

ben.boeckel · February 14, 2022, 7:41pm

This sounds like package management to me. I don’t think VTK should get into that game. C++ sucks at this and many people have tried (and are trying), but I don’t think we’re going to solve it.

berk.geveci · February 16, 2022, 12:56am

That wasn’t my intention. Didn’t we talk about potentially building some of the “modules” in VTK as separate entities for separate distribution? As a first step in modularization… This could be something similar.

ben.boeckel · February 16, 2022, 5:14pm

Yes, but this would still be a closed set of things. Maybe I was crossing wires. These things wouldn’t be under find_package(VTK) anymore; they’d be their own CMake package at least. How this gets split would need to be investigated, but the splits would be visible in the usage patterns (with some level of compat).