More Pythonic VTK wrapping

berk.geveci · January 28, 2024, 1:44pm

How about eval() or evaluate()? When used in the context of functions, these imply returning a value.

I don’t like overloading get_output any further since it has meant getting the output without execution from the beginning. run() has the same issue as execute() in that it does not imply returning a value.

toddy · January 28, 2024, 11:51pm

Perhaps it should have been done the other way around: GetOutput() would always perform an evaluation by default, supplemented by a caching mechanism that returns a stored value when the filter/reader is unmodified.
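
The "evaluate by default, but cache while unmodified" idea can be sketched in plain Python. This is only an illustration: CachingSource, set_value, get_output and the counters are hypothetical names, not VTK API, and the doubling stands in for real work. The modification counter is loosely similar in spirit to VTK's modified times.

```python
class CachingSource:
    """Toy stand-in for a filter or reader; not a VTK class."""

    def __init__(self, value):
        self._value = value
        self._mtime = 1          # bumped on every parameter change
        self._executed_at = 0    # mtime at which the cache was last filled
        self._cache = None
        self.executions = 0      # counts real evaluations, for illustration

    def set_value(self, value):
        self._value = value
        self._mtime += 1         # mark the object as modified

    def get_output(self):
        # Re-evaluate only when something changed since the last run.
        if self._executed_at < self._mtime:
            self._cache = self._value * 2   # placeholder for real work
            self._executed_at = self._mtime
            self.executions += 1
        return self._cache


src = CachingSource(21)
first = src.get_output()     # evaluates
second = src.get_output()    # served from the cache
src.set_value(5)
third = src.get_output()     # evaluates again after the modification
```

With this design a caller never needs a separate update step, at the cost of get_output() sometimes being expensive.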

amaclean · January 29, 2024, 3:22am

I really don’t see any issues with execute() … to my mind the pipelines are normally executed or evaluated. execute() doesn’t imply that no value is returned. I would normally expect a result … Null, Pass, Fail, the pipeline …

berk.geveci · January 29, 2024, 6:59pm

Hi folks,

Here is an update:

https://gist.github.com/berkgeveci/5dadcd8c08490a735c385c93be0ccdee

pipeline_demo.py

import vtk
from vtkmodules.util.execution_model import select_ports

w = vtk.vtkRTAnalyticSource().execute()

p = vtk.vtkRTAnalyticSource() >> vtk.vtkContourFilter(contour_values=[100])
print(p.execute().number_of_points)

p = p >> vtk.vtkShrinkPolyData()
print(p.execute().number_of_points)

This file has been truncated; see the gist for the full version.

Highlights:

There is no longer any need to create a Pipeline object explicitly. Now >> returns a pipeline object that enables reuse and composition of pipelines.
I went with the select_ports(input_port, algorithm, output_port) way of selecting ports.
I went with execute() as the alternatives suggested are not better.
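
To make the first highlight concrete, here is a minimal pure-Python sketch of how a >> operator can return a reusable pipeline object. Stage and Pipeline below are hypothetical toy classes operating on lists; they are not the classes in vtkmodules and say nothing about how the actual implementation works.

```python
class Stage:
    """Hypothetical pipeline stage wrapping a plain function."""

    def __init__(self, func):
        self.func = func

    def __rshift__(self, other):
        # stage >> something starts a new pipeline.
        return Pipeline([self]) >> other


class Pipeline:
    """Hypothetical ordered collection of stages."""

    def __init__(self, stages):
        self.stages = list(stages)

    def __rshift__(self, other):
        # Return a new pipeline so the left operand stays reusable.
        extra = other.stages if isinstance(other, Pipeline) else [other]
        return Pipeline(self.stages + extra)

    def execute(self, data=None):
        for stage in self.stages:
            data = stage.func(data)
        return data


source = Stage(lambda _: list(range(10)))
evens = Stage(lambda xs: [x for x in xs if x % 2 == 0])
double = Stage(lambda xs: [2 * x for x in xs])

p = source >> evens
q = p >> double   # p is reused, not modified
```

Because >> builds a new object each time, p can still be executed on its own after q has been created.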

ahernsean · January 29, 2024, 7:02pm

eval() and evaluate() are pretty good. They’re both better than execute(). I’m still partial to generate(). It’s the best I’ve heard suggested so far. It implies something being produced as a result.

berk.geveci · January 29, 2024, 7:20pm

Can we get some votes for eval(), execute(), and generate()?

I vote for eval().

cory.quammen · January 29, 2024, 7:21pm

Poll: eval() / execute() / generate()

banesullivan · January 29, 2024, 7:54pm

I have a strong aversion to eval since it is a Python built-in (see "Built-in Functions" in the Python 3.12.1 documentation).

will.schroeder · January 29, 2024, 8:01pm

I agree with Bane (eval bad). I also like to use VTK terminology (i.e., execute()).

Zach_Mullen · January 29, 2024, 8:04pm

Would it be possible to not have to explicitly call execute on the pipeline at all? Rather, if you try to read an output property, the internals would know that it needs to execute the pipeline and then do so?

berk.geveci · January 29, 2024, 8:05pm

Zach_Mullen wrote: "Would it be possible to not have to explicitly call execute on the pipeline at all? Rather, if you try to read an output property, the internals would know that it needs to execute the pipeline and then do so?"

Yes, but that would be a change of behavior for VTK. I am not willing to take that on. Our goal here is to simplify / integrate but not change / improve behavior necessarily.

will.schroeder · January 29, 2024, 8:06pm

Some filters have multiple inputs / outputs, and in some cases these are optional. In most cases they all need to be connected before execution makes sense.

cory.quammen · January 29, 2024, 8:22pm

If you implemented this in __getattr__ it shouldn’t require a change in VTK itself. Let’s say instead of invoking execute() you simply access an output attribute. __getattr__ could perform the update on the pipeline when the output attribute is requested and give you the result. The implementation would, I suspect, be very similar to the implementation of an execute() function. It’s mainly a syntactic preference in my view. The main advantage of an attribute is that it is perhaps easier to name: you don’t have to find a verb that also indicates a noun is returned.

That’s a good point about multiple outputs. How would this be handled with either an execute() or an output attribute? Return a tuple of VTK objects?
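
A rough sketch of the __getattr__ idea, in plain Python without VTK. LazyPipeline and its compute callable are hypothetical; the callable stands in for "update the pipeline and fetch the result".

```python
class LazyPipeline:
    """Hypothetical wrapper; `compute` stands in for updating a pipeline."""

    def __init__(self, compute):
        self._compute = compute
        self.runs = 0   # counts how often the pipeline was run

    def __getattr__(self, name):
        # __getattr__ is only called when normal lookup fails,
        # so it sees `output` but not `_compute` or `runs`.
        if name == "output":
            self.runs += 1
            return self._compute()
        raise AttributeError(name)


p = LazyPipeline(lambda: [1, 2, 3])
result = p.output   # running the pipeline happens on attribute access
```

Note that this toy version reruns the computation on every access to output; a real implementation would presumably rely on the pipeline's own up-to-date checks.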

berk.geveci · January 29, 2024, 8:34pm

This is still a behavior change. Given how we create properties, I would expect .output to call GetOutput() on the algorithm, which does not execute. This is the current behavior.

I did not think of returning a tuple of data objects. That’s doable. This currently returns the 2nd output:

select_ports(alg, 1).execute()

but if you want the other output after executing, you have to do:

alg.GetOutputDataObject(0)

Because of the way Python properties work, we cannot do:

alg.output(0)

We can implement that as a custom function if requested though.
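
The limitation with properties can be shown in a few lines of plain Python. Alg, its string outputs and get_output(port) are hypothetical stand-ins, not VTK API; the point is only that a property getter receives no arguments, so alg.output(0) ends up calling whatever the property returned.

```python
class Alg:
    """Toy algorithm with two outputs; not a VTK class."""

    def __init__(self):
        self._outputs = ["first", "second"]

    @property
    def output(self):
        # A property getter takes no arguments beyond self.
        return self._outputs[0]

    def get_output(self, port=0):
        # Hypothetical custom function that accepts a port index.
        return self._outputs[port]


alg = Alg()
try:
    alg.output(0)          # calls the returned str, not the getter
    call_failed = False
except TypeError:
    call_failed = True
```

A custom function such as get_output(port) is the kind of workaround the last sentence refers to.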

berk.geveci · January 29, 2024, 11:33pm

One more thing: many of the multiple output algorithms have specific Get methods for their other outputs. For example, the clip filter has the method GetClippedOutput(), which would have a corresponding property called clipped_output, so the following is always possible:

output = (source >> clip).execute()
c_output = clip.clipped_output

But I am starting to like this idea a lot:

output, c_output = (source >> clip).execute()
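
One possible shape for the tuple-returning behavior, sketched in plain Python. The execute helper below is hypothetical, and unwrapping a single output into a bare object is an assumption made here so that single-output filters keep reading naturally; the thread does not specify it.

```python
def execute(outputs):
    """Hypothetical helper: unwrap a single output, else return a tuple."""
    outputs = tuple(outputs)
    return outputs[0] if len(outputs) == 1 else outputs


single = execute(["mesh"])                      # one output: bare object
kept, clipped = execute(["kept", "clipped"])    # two outputs: unpack a tuple
```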

marcomusy · January 30, 2024, 9:24am

How about this

p = vtkRTAnalyticSource() >> vtkContourFilter(contour_values=[100])
p.update() # returns self (or None)
print(p.output.number_of_points)

where output is now a Python property that returns None before update() is called.
In the case of multiple outputs, output can be a tuple and you can still do

output, c_output = (source >> clip).update().output

Isn’t this the sequence of actions most similar to what we normally do in VTK?

algo = SomeSource()
algo.Update()
out = algo.GetOutput()

ahernsean · January 30, 2024, 12:41pm

I love the idea of a multi-output filter returning a tuple by default. Very Pythonic!

Aashish_Chaudhary · January 30, 2024, 4:44pm

I’m not seeing the entire thread, but having the () “function call” operator execute the pipeline would be better, as then we could treat a pipeline reference as a lambda. Another option is run(); it is shorter than eval and execute. I can understand why most of the members here voted for the execute option. That is probably most familiar to VTK experts, though run() might be a better, shorter, and more familiar option for newcomers.
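
A small sketch of the call-operator idea in plain Python. CallablePipeline is a hypothetical toy class that chains ordinary functions; it only illustrates how defining __call__ lets a pipeline be passed anywhere a function is expected.

```python
class CallablePipeline:
    """Hypothetical pipeline that chains plain functions."""

    def __init__(self, *funcs):
        self.funcs = funcs

    def execute(self, data):
        for f in self.funcs:
            data = f(data)
        return data

    # Calling the pipeline is the same as executing it.
    __call__ = execute


p = CallablePipeline(lambda x: x + 1, lambda x: x * 10)
results = list(map(p, [1, 2, 3]))   # usable wherever a function is expected
```

Keeping execute() and aliasing __call__ to it means both spellings stay available.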

MicK7 · January 31, 2024, 1:06pm

As a VTK pipeline is more or less a graph, it would be nice to have a Pythonic API coherent with networkx or another graph package that would allow plotting the pipeline graph in matplotlib. This may require creating a dedicated pipeline class.
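
As a starting point for the graph view, a linear pipeline can be described as a list of directed edges, which is a format graph packages generally accept. The helper below is a hypothetical sketch that works on stage names only; it does not inspect real VTK objects.

```python
def pipeline_edges(stage_names):
    """Return (upstream, downstream) pairs for a linear chain of stages."""
    return list(zip(stage_names, stage_names[1:]))


names = ["vtkRTAnalyticSource", "vtkContourFilter", "vtkShrinkPolyData"]
edges = pipeline_edges(names)
```

An edge list like this could then be handed to a graph library for layout and plotting; filters with several inputs or outputs would need edges that also carry port numbers.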

berk.geveci · January 31, 2024, 1:08pm

I would be happy to add support for that once the core infrastructure is completed. There is already a Pipeline class in the current implementation (which I am working to add to the MR), so it should not be hard. Contributions are welcome too.