More Pythonic VTK wrapping

How about eval() or evaluate()? When used in the context of functions, these imply returning a value.

I don’t like overloading get_output any further since it has meant getting the output without execution from the beginning. run() has the same issue as execute() in that it does not imply returning a value.

Perhaps it should have been done the other way around whereby the default behaviour of GetOutput() was to always perform an evaluation but supplemented by a caching mechanism to return a stored value when the filter/reader was unmodified.
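A minimal sketch of that "always evaluate, but cache" idea, in plain Python rather than VTK. `CachingFilter`, `set_value`, and `get_output` are hypothetical names for illustration only; the integer counter only loosely mimics VTK's modification-time mechanism.

```python
class CachingFilter:
    """Toy stand-in for a filter; this is NOT the real vtkAlgorithm API."""

    def __init__(self, func, value):
        self._func = func
        self._value = value
        self._mtime = 1        # bumped on every modification
        self._executed_at = 0  # modification time at which the cache was filled
        self._cache = None
        self.executions = 0    # counts how often the filter really ran

    def set_value(self, value):
        self._value = value
        self._mtime += 1       # loosely analogous to calling Modified()

    def get_output(self):
        # Re-run only if something changed since the last execution.
        if self._executed_at != self._mtime:
            self._cache = self._func(self._value)
            self._executed_at = self._mtime
            self.executions += 1
        return self._cache


f = CachingFilter(lambda x: x * 2, 21)
first = f.get_output()   # executes
second = f.get_output()  # unmodified, so the stored value is returned
f.set_value(5)
third = f.get_output()   # modified, so it executes again
```

With this design the caller never needs a separate update step; the cost is that reading the output can trigger work.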

I really don’t see any issues with execute() … to my mind the pipelines are normally executed or evaluated. execute() doesn’t imply that no value is returned. I would normally expect a result … Null, Pass, Fail, the pipeline …

Hi folks,

Here is an update:

Highlights:

  • No more need to create a Pipeline object explicitly. Now >> returns a pipeline object that enables reuse of pipelines (and composition)
  • I went with the select_ports(input_port, algorithm, output_port) way of selecting ports.
  • I went with execute() as the alternatives suggested are not better.
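To illustrate the first highlight, here is a rough sketch of how `>>` could return a reusable, composable pipeline object. It is plain Python, not the actual VTK implementation; `Stage`, `Pipeline`, and the `execute(data)` signature are assumptions made for the example.

```python
class Stage:
    """Hypothetical stand-in for an algorithm; wraps a plain function."""

    def __init__(self, name, func):
        self.name = name
        self.func = func

    def __rshift__(self, other):
        # stage >> something always yields a new Pipeline
        return Pipeline([self]) >> other


class Pipeline:
    """Hypothetical pipeline object: an ordered list of stages."""

    def __init__(self, stages):
        self.stages = list(stages)

    def __rshift__(self, other):
        # Accept either a single stage or another pipeline (composition).
        extra = other.stages if isinstance(other, Pipeline) else [other]
        return Pipeline(self.stages + extra)  # the left operand is not mutated

    def execute(self, data):
        for stage in self.stages:
            data = stage.func(data)
        return data


double = Stage("double", lambda x: 2 * x)
increment = Stage("increment", lambda x: x + 1)

p = double >> increment  # no explicit Pipeline(...) construction
q = p >> double          # reuse p as a building block
```

Because `>>` returns a new object each time, `p` can still be executed on its own after it has been composed into `q`.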

eval() and evaluate() are pretty good. They’re both better than execute(). I’m still partial to generate(). It’s the best I’ve heard suggested so far. It implies something being produced as a result.

Can we get some votes for eval(), execute(), and generate()?

I vote for eval().

  • eval()
  • execute()
  • generate()

I have a strong aversion to eval() since it is a Python built-in (see "Built-in Functions" in the Python 3.12.1 documentation).

I agree with Bane (eval bad). I also like to use VTK terminology (i.e., execute()).

Would it be possible to not have to explicitly call execute on the pipeline at all? Rather, if you try to read an output property, the internals would know that it needs to execute the pipeline and then do so?

Yes, but that would be a change of behavior for VTK. I am not willing to take that on. Our goal here is to simplify / integrate but not change / improve behavior necessarily.

Some filters have multiple inputs / outputs, and in some cases these are optional. In most cases they all need to be connected before execution makes sense.

If you implemented this in __getattr__ it shouldn’t require a change in VTK itself. Let’s say that instead of invoking execute() you simply access an output attribute. __getattr__ could perform the update on the pipeline when the output attribute is requested and give you the result. The implementation would, I suspect, be very similar to the implementation of an execute() function. It’s mainly a syntactic preference in my view. The main advantage of an attribute is that it is perhaps easier to name: you don’t have to find a verb that also indicates a noun is returned.
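A rough sketch of the __getattr__ idea, in plain Python. `LazyPipeline`, `update`, and `update_count` are hypothetical names for the illustration; a real wrapper would delegate to the VTK pipeline update instead of calling a function.

```python
class LazyPipeline:
    """Hypothetical wrapper that updates itself when .output is read."""

    def __init__(self, compute):
        self._compute = compute
        self.update_count = 0

    def update(self):
        self.update_count += 1
        self._result = self._compute()

    def __getattr__(self, name):
        # __getattr__ is only consulted when normal attribute lookup fails,
        # so regular attributes such as update_count never reach this point.
        if name == "output":
            self.update()
            return self._result
        raise AttributeError(name)


p = LazyPipeline(lambda: [1, 2, 3])
count_before = p.update_count
result = p.output  # reading the attribute triggers the update
count_after = p.update_count
```

In this toy version every read of `output` re-runs the computation; a real implementation would presumably rely on the pipeline skipping work when nothing has been modified.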

That’s a good point about multiple outputs. How would this be handled with either an execute() or an output attribute? Return a tuple of VTK objects?

This is still a behavior change. Given how we create properties, I would expect .output to call GetOutput() on the algorithm, which does not execute. This is the current behavior.

I did not think of returning a tuple of data objects. That’s doable. This currently returns the 2nd output:

select_ports(alg, 1).execute()

but if you want the other output after executing, you have to do:

alg.GetOutputDataObject(0)

Because of the way Python properties work, we cannot do:

alg.output(0)

We can implement that as a custom function if requested though.
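The limitation comes from how properties behave: reading `alg.output` already returns the value, so the trailing `(0)` tries to call that value. A small plain-Python demonstration, where `Alg` and the `get_output(port)` helper are hypothetical stand-ins and the outputs are just strings:

```python
class Alg:
    """Hypothetical stand-in for an algorithm with two output ports."""

    def __init__(self, outputs):
        self._outputs = list(outputs)

    @property
    def output(self):
        # A property takes no arguments; it always returns port 0 here.
        return self._outputs[0]

    def get_output(self, port=0):
        # The kind of custom function that could be added on request.
        return self._outputs[port]


alg = Alg(["kept", "clipped"])

try:
    alg.output(1)  # evaluates alg.output first, then calls the result
    call_failed = False
except TypeError:
    call_failed = True  # the returned object is not callable

second_output = alg.get_output(1)
```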

One more thing: many of the multiple-output algorithms have specific Get methods for their other outputs. For example, the clip filter has the method GetClippedOutput(), which would have a corresponding property called clipped_output, so the following is always possible:

output = (source >> clip).execute()
c_output = clip.clipped_output

But I am starting to like this idea a lot:

output, c_output = (source >> clip).execute()
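One way the tuple-returning behavior could look, sketched in plain Python. `FakeAlgorithm` is a hypothetical stand-in, and returning a bare object for single-output algorithms is an assumption of this sketch, not something settled in the thread.

```python
class FakeAlgorithm:
    """Hypothetical stand-in whose outputs are plain Python values."""

    def __init__(self, *outputs):
        self._outputs = outputs

    def execute(self):
        # Single output: return the object itself.
        # Multiple outputs: return a tuple, one entry per output port.
        if len(self._outputs) == 1:
            return self._outputs[0]
        return tuple(self._outputs)


contour = FakeAlgorithm("contour")
clip = FakeAlgorithm("kept", "clipped")

single = contour.execute()
output, c_output = clip.execute()  # tuple unpacking, one name per port
```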

How about this:

p = vtkRTAnalyticSource() >> vtkContourFilter(contour_values=[100])
p.update() # returns self (or None)
print(p.output.number_of_points)

where output is now a Python property that returns None before update() is called.
In the case of multiple outputs, output can be a tuple and you can still do:

output, c_output = (source >> clip).update().output

Isn’t this the most similar sequence of actions to what we normally do in VTK?

algo = SomeSource()
algo.Update()
out = algo.GetOutput()
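The update()-then-output variant proposed above could be sketched like this in plain Python. `Pipe` is a hypothetical class; here update() returns self so that calls can be chained, which is one of the two options mentioned.

```python
class Pipe:
    """Hypothetical pipeline with an explicit update() step."""

    def __init__(self, compute):
        self._compute = compute
        self._output = None  # nothing is available before update()

    def update(self):
        self._output = self._compute()
        return self  # allows pipe.update().output

    @property
    def output(self):
        # Reading the property never triggers execution.
        return self._output


p = Pipe(lambda: ("kept", "clipped"))
before = p.output                      # None, since update() was not called
output, c_output = p.update().output   # multiple outputs as a tuple
```

This keeps reading the output free of side effects, mirroring the Update() / GetOutput() split.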

I love the idea of a multi-output filter returning a tuple by default. Very Pythonic!

I’m not seeing the entire thread, but using the () “function call” operator to execute the pipeline would be better, as we could then treat a pipeline reference as a lambda. Another option is run(): it is shorter than eval() and execute(). I can understand why most of the members here voted for the execute() option. It is probably the most familiar to VTK experts, though run() might be a better, shorter, and more familiar option for newcomers.
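A small sketch of the call-operator idea, in plain Python. `CallablePipeline` and its `execute(data)` signature are hypothetical; the point is only that defining __call__ lets a pipeline be passed anywhere a function is expected.

```python
class CallablePipeline:
    """Hypothetical pipeline that can be invoked like a function."""

    def __init__(self, *funcs):
        self._funcs = funcs

    def execute(self, data):
        for func in self._funcs:
            data = func(data)
        return data

    def __call__(self, data):
        # pipeline(data) is simply shorthand for pipeline.execute(data)
        return self.execute(data)


pipeline = CallablePipeline(lambda x: x + 1, lambda x: x * 10)

direct = pipeline(2)                      # behaves like a lambda
mapped = list(map(pipeline, [1, 2, 3]))   # usable wherever a callable fits
```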

As a VTK pipeline is more or less a graph, it would be nice to have a Pythonic API consistent with networkx or another graph package, which would allow plotting the pipeline graph in matplotlib. This may require creating a dedicated pipeline class.

I would be happy to add support for that once the core infrastructure is completed. There is already a Pipeline class in the current implementation (which I am working to add to the MR) so it should not be hard. Contributions are welcome too :)