vtkCollection... am I using it well?

Hello, I have a Python project where I use (visualize, model, interpolate, etc.) several VTK objects at the same time, including polydata, images, structured grids, etc.

At the moment I am collecting all these objects in a vtkCollection, since I had the impression that this would be more “correct” and efficient.

Since I also need to store various metadata for each VTK object (both numerical and strings), a simpler way to collect the objects and their metadata would be to add them to a dictionary with one key for each object, the object itself as a value, and the metadata as other values. This would also allow searching/selecting objects based on the metadata (e.g. select all objects with data_a == “b”), which is a key requirement.

My first question is: is it correct that the first solution, with objects collected in a vtkCollection, is more efficient (e.g. faster access to objects), or would using the dictionary be OK as well?

The second question is whether there is some other (better) method that someone can suggest.

Thanks very much!

Both the VTK collection and the Python dict store object pointers, so they are equally fast.

It is a bit more complicated to store metadata in VTK objects (in field data), so searching based on metadata in VTK objects would probably be a little slower than just searching in a Python dict. However, the difference would only become noticeable if you had thousands of VTK objects in the collection and/or you wanted to look up data hundreds of times per second (both of these are quite unlikely).

So the plain Python dictionary is simpler, as fast as vtkCollection for storing (pointers to) objects, and faster for searching/selecting based on metadata.
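As a rough illustration of the dictionary approach, here is a minimal sketch. The keys, the metadata fields, and the `select` helper are hypothetical names chosen for the example, and the placeholder classes are only there so the snippet also runs on a machine without VTK installed.

```python
try:
    from vtk import vtkPolyData, vtkImageData
except ImportError:
    # Stand-ins (assumption): let the sketch run where VTK is not installed.
    class vtkPolyData:
        pass

    class vtkImageData:
        pass

# One entry per object: the key names it, the value holds the object
# (a reference to it) together with its metadata.
objects = {
    "top_surface": {"vtk_object": vtkPolyData(), "age": 120, "composition": "quartzite"},
    "fault_trace": {"vtk_object": vtkPolyData(), "age": 35, "composition": "limestone"},
    "density_grid": {"vtk_object": vtkImageData(), "age": 35, "composition": "limestone"},
}


def select(entries, **criteria):
    """Return the keys of all entries whose metadata match every criterion."""
    return [
        key
        for key, entry in entries.items()
        if all(entry.get(field) == value for field, value in criteria.items())
    ]


limestone_keys = select(objects, composition="limestone")
limestone_objects = [objects[key]["vtk_object"] for key in limestone_keys]
print(limestone_keys)
```

The selection is a linear scan over all entries, which should be fine for the collection sizes discussed here.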

Great, I’ll use the dictionary!

Thanks very much!

One more question: suppose I store a polydata or some other object in a dictionary, and nowhere else, i.e. I create it as a “value” in a dictionary (with a key to call it). If I delete the dictionary item, do I also delete the VTK object, or will it still occupy some space in my RAM? In theory, in Python, objects that are no longer referenced by anything are removed from memory, but I do not know if this also applies in this case.

Thanks very much!

So, everything works in Python as you would expect. If you store a VTK object in a Python variable then it will increment the VTK object reference count by one (and when you delete the Python variable or set it to a different value then the reference count will be decremented by one). When the reference count of the VTK object reaches zero (nothing references the object anymore) then it is deleted from memory.
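One way to check this behaviour yourself is with a weak reference, which observes an object without keeping it alive. This is only a sketch: it assumes CPython (where an object is freed as soon as its last reference goes away), and the placeholder class is a stand-in so the snippet also runs without VTK installed.

```python
import weakref

try:
    from vtk import vtkPolyData
except ImportError:
    # Stand-in (assumption): lets the sketch run where VTK is not installed.
    class vtkPolyData:
        pass

# The dictionary holds the only strong reference to the object.
store = {"surface": vtkPolyData()}

# A weak reference does not count as a reference that keeps the object alive.
probe = weakref.ref(store["surface"])

alive_before = probe() is not None
del store["surface"]  # drop the last strong reference
alive_after = probe() is not None

print(alive_before, alive_after)
```

If anything else still holds the object (another variable, a list, a renderer's actor or mapper), it stays in memory after the dictionary item is deleted.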

Perfect!

Still one more question: in another issue you suggested to try using xarray for metadata etc…

Now suppose I’d like to store my (several) vtk objects in a data structure together with their metadata, and to be able to search, select, etc., the vtk objects based on the metadata.

By using xarray I could store the vtk object (actually the pointer to the vtk object) in a “column” (or dimension in xarray terms) and the metadata in other columns. Do you see any problem with this data structure?

To further explore this idea, probably the same can be done also with Pandas. Which one would be better in your opinion, Pandas or xarray?

Many thanks!

VTK does not support xarray, so you would not directly benefit from storing image metadata in xarray if you only use VTK. If you only work with images and you also use other image processing packages, such as ITK, then it could simplify things if you store basic image metadata (origin, spacing, axis directions) in standard fields in xarray, and store your custom metadata there, too.

If you only use VTK or you use all kinds of data objects, not just images, then probably the simplest and most flexible is to store metadata in an associated dictionary object.

Hi and thanks for the prompt answer!

No, I’m not using images specifically. The idea is different.

I have a collection of VTK objects, for instance polydata surfaces and polylines, some structured mesh etc. Each one has some associated metadata that defines its role in a model (a geological model of the subsurface).

At some point in my application I need to select objects according to some specific metadata (e.g. geological age), some other time according to another one (e.g. composition), etc.

My present solution is to store the VTK objects in one vtkCollection, where they can be retrieved with an index, and the metadata in a Pandas dataframe, with an index that is synced with the one in the vtkCollection. In this way I can search the Pandas dataframe for all objects with a set of properties, find the indexes, then call the VTK objects… it works but it is pretty complicated!

The new idea is: if I can put the VTK objects themselves (actually pointers to the objects) in the table, everything would be much simpler.

The data structure will look like:

Id | vtkObject | name | age | composition | etc
---|-----------|------|-----|-------------|----
1  | poly1     | aa   | 120 | quartzite   | …
2  | poly2     | bb   | 35  | limestone   | …

I made a test with Pandas and I have seen that I can put a VTK object in a cell. But I was wondering:

  1. whether there is some problem that you are aware of with this approach

  2. if this could work better with xarray.
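For reference, the table layout described above could be sketched in Pandas roughly as follows. The column names and values are taken from the example table; the variable names and the placeholder class (used only when VTK is not installed) are assumptions made for the illustration.

```python
import pandas as pd

try:
    from vtk import vtkPolyData
except ImportError:
    # Stand-in (assumption): lets the sketch run where VTK is not installed.
    class vtkPolyData:
        pass

poly1, poly2 = vtkPolyData(), vtkPolyData()

# The vtkObject column has dtype "object": each cell holds a reference
# to the VTK object, not a copy of it.
table = pd.DataFrame(
    {
        "vtkObject": [poly1, poly2],
        "name": ["aa", "bb"],
        "age": [120, 35],
        "composition": ["quartzite", "limestone"],
    },
    index=pd.Index([1, 2], name="Id"),
)

# Select objects by metadata with ordinary boolean indexing.
young = table.loc[table["age"] < 100, "vtkObject"].tolist()
quartzite = table.loc[table["composition"] == "quartzite", "vtkObject"].tolist()
print(table[["name", "age", "composition"]])
```

Because the cells hold references, operations that copy the table copy the references only, so the same VTK object can appear in several tables at once.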

Thanks very much!

All these options can be good. If it’s the first time you design an application then you’ll inevitably make many mistakes, but you’ll learn from them, so it will not be entirely wasted time.

If you want to reach your goals more quickly, learn from good examples instead of figuring out everything by yourself, and build something that makes an impact, then I would recommend not writing a new application from scratch. Instead, find an open-source application that does something similar to what you want to achieve and customize/extend that. If you describe your overall goal then we might be able to recommend related VTK-based applications/frameworks.

Thanks! I’ve already “visited” many projects of this kind, but they are all focused on providing streamlined functions for the visualization of a few objects, and not the management of a project with many objects, relationships between these objects, metadata, etc.

The most relevant projects I have seen so far are PyVista and Vedo (formerly vtkplotter), and actually I am already re-using some of their code, but the overall management of the collection of objects is not their focus.

If you have useful examples… thanks very much!

Otherwise I’ll try the hard way as outlined above. Could you confirm that a pointer to a VTK object stored in a Pandas or xarray dataframe behaves like one stored in a simple dictionary? So e.g. if I delete the row that points to the object, the object will have zero references and will be deleted from memory?

Thanks very much!

PyVista and Vedo are just libraries, not applications.

For example, you can customize/extend ParaView for general technical visualization or 3D Slicer for medical applications. If you describe your overall goal and most important features then we may give more specific advice.

Yes. You store a shared pointer in all cases.

Thank you very much!

I’ll study 3D Slicer and ParaView and come back.