Rust bindings

Short update: WrapVTK is awesome! It provides me with a lot of information, and as far as I can tell right now I will probably not need to extract anything more. To generate the bindings completely from scratch, I have split the work into the following individual problems.

  1. Generate xml data using WrapVTK
  2. Parse xml data
  3. Construct inheritance hierarchy for all classes in all modules
  4. Identify traits as non-inherited public methods
  5. Define 1:1 conversion between C++ and Rust types
    • Remove or handle modifiers such as const or unsigned
    • Destructure generics, e.g. std::array<float, 3> ↔ [f32; 3]
    • Handle the positioning of pointers/references
  6. Generate Code for ffi and traits
    • Generate C++ code which can be wrapped in Rust
    • Generate CMakeLists file
    • Determine linker args
    • Generate Rust ffi glue and implementation code
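
Steps 3 and 4 of the list above could be sketched roughly as follows. This is only an illustration under assumptions: `ClassInfo`, `inherited_methods` and `own_methods` are hypothetical names, and the real XML data carries far more detail than a parent name and a list of method names.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, simplified view of one class parsed from the XML.
pub struct ClassInfo {
    pub parent: Option<String>,
    pub public_methods: Vec<String>,
}

/// Collects every method name reachable through the superclass chain of `class`.
fn inherited_methods(classes: &HashMap<String, ClassInfo>, class: &str) -> BTreeSet<String> {
    let mut inherited = BTreeSet::new();
    let mut current = classes.get(class).and_then(|c| c.parent.clone());
    while let Some(name) = current {
        match classes.get(&name) {
            Some(info) => {
                inherited.extend(info.public_methods.iter().cloned());
                current = info.parent.clone();
            }
            // The superclass lives in a module that has not been parsed.
            None => break,
        }
    }
    inherited
}

/// Methods that `class` declares itself and that no superclass already provides.
pub fn own_methods(classes: &HashMap<String, ClassInfo>, class: &str) -> Vec<String> {
    let inherited = inherited_methods(classes, class);
    classes
        .get(class)
        .map(|info| {
            info.public_methods
                .iter()
                .filter(|m| !inherited.contains(*m))
                .cloned()
                .collect()
        })
        .unwrap_or_default()
}
```

The methods returned by `own_methods` are the candidates for a per-class trait; overridden methods are deliberately filtered out because the superclass trait would already cover them.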

In particular, point 5 seems to be a tough nut to crack since it requires me to partially parse the C++ types. I have thought about using libclang to handle some of the parsing, but right now it seems like more work and I would only need a very minor subset of its functionality.
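
To give an idea of how small such a partial parser could start out, here is a minimal sketch. It assumes the type string has already been normalized, handles only one trailing `*` or `&`, and leaves template arguments untouched; `CppType`, `Indirection` and `parse_cpp_type` are hypothetical names and not part of any existing tool.

```rust
/// How a value is passed, derived from a trailing `*` or `&`.
#[derive(Debug, PartialEq)]
pub enum Indirection {
    Value,
    Pointer,
    Reference,
}

/// Hypothetical decomposition of a normalized C++ type string.
#[derive(Debug, PartialEq)]
pub struct CppType {
    pub is_const: bool,
    pub base: String,
    pub indirection: Indirection,
}

/// Splits a type such as `const char*` into qualifier, base type and indirection.
/// Only a single level of `*` or `&` is handled; templates stay inside `base`.
pub fn parse_cpp_type(input: &str) -> CppType {
    let trimmed = input.trim();
    let (rest, indirection) = if let Some(r) = trimmed.strip_suffix('*') {
        (r, Indirection::Pointer)
    } else if let Some(r) = trimmed.strip_suffix('&') {
        (r, Indirection::Reference)
    } else {
        (trimmed, Indirection::Value)
    };
    // Match `const` as a whole word so that other identifiers stay intact.
    let mut is_const = false;
    let words: Vec<&str> = rest
        .split_whitespace()
        .filter(|w| {
            if *w == "const" {
                is_const = true;
                false
            } else {
                true
            }
        })
        .collect();
    CppType { is_const, base: words.join(" "), indirection }
}
```

Cases such as `const char* const` or nested pointers are not covered and would need a more careful tokenizer.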

In the future I will also have to decide how to publish the generated code. I will probably provide crates with corresponding version numbers (e.g. vtk-rs094, vtk-rs091) and also provide the tool wrap-vtk, which generates the bindings, as a separate crate.

I also made a quick visual of the number of public (non-inherited) methods per module.

Yeah, point 5 is a doozy. It’s important to manage expectations.

Regarding generics, the Python wrappers are able to parse the templates (to a certain degree) and wrap specializations of those templates, given a list of parameters to use for instantiation. This is done by some very ugly C code that you probably don’t want to see. What I’m thinking is this: I could add an option to vtkWrapXML to make it do what the Python wrappers do, that is, instantiate the templates over a range of parameters.

One thing I didn’t see in your list was a way to manage overloaded method calls. That is, when a method is overloaded in C++, how do you figure out which overload to call from Rust based on the Rust argument types? This was particularly tricky for the Python wrappers because Python doesn’t even have the concept of overloads.

Rust also does not have a built-in notion of overloading methods. There are some ugly workarounds, but the idiomatic solution is to provide a separately named method for each signature.

Option1

  1. Determine all methods with an identical name.
  2. Parse their arguments and try to determine how the arguments can be combined.

Let’s assume that we have the following methods.

void set_name(int idx, int port, const char* name);
void set_name(int port, const char* name);
void set_name(const char* name);

In Rust, we define 3 methods for the bindings with different names. We link them to generated C++ methods with the same name.

mod ffi {
    // Declarations only; the bodies come from the generated C++ glue.
    pub fn set_name1(idx: i32, port: i32, name: &str);
    pub fn set_name2(port: i32, name: &str);
    pub fn set_name3(name: &str);
}

We can then write one method with optional arguments which calls one of these 3 methods depending on which arguments are specified.

// Dispatches to the matching overload; an index can only be given together with a port.
fn set_name(port_idx: Option<(i32, Option<i32>)>, name: &str) {
    match port_idx {
        Some((port, Some(idx))) => ffi::set_name1(idx, port, name),
        Some((port, None)) => ffi::set_name2(port, name),
        None => ffi::set_name3(name),
    }
}

Option2

Another way would be to determine the “meaning” of the parameters and thus define new methods with different names. We would still need to make sure that they do not collide with already present ones. For example:

void set_name(int port, const char* name);
void set_name(const char* name);

could be converted to

mod ffi {
    fn set_name_with_port(port: i32, name: &str);
    fn set_name(name: &str);
}

This could be very problematic in the case where we have unnamed arguments.
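
A naming heuristic along these lines might look like the following sketch. It is an assumption, not a settled design: `overload_name` is a hypothetical helper that uses a descriptive suffix only when every additional parameter is named and otherwise falls back to a numeric suffix. Checking for collisions with existing method names is left out.

```rust
/// Builds a Rust method name for one overload.
/// `extra_params` holds the names of the parameters that the shortest overload
/// lacks; `None` marks an unnamed parameter. `index` is the overload's position.
pub fn overload_name(base: &str, extra_params: &[Option<&str>], index: usize) -> String {
    if extra_params.is_empty() {
        return base.to_string();
    }
    // Use the descriptive form only when every extra parameter has a name.
    let names: Option<Vec<&str>> = extra_params.iter().copied().collect();
    match names {
        Some(names) => format!("{}_with_{}", base, names.join("_and_")),
        None => format!("{}_{}", base, index),
    }
}
```

With this fallback, unnamed arguments end up with names in the style of Option1, so the two options can coexist in one generator.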

Summary

I think there are tradeoffs between discoverability, overall code size, conciseness and type ergonomics. Option2 in particular is hard to solve: automatically generating function names would require a good understanding of what each overload does, which parsing argument names alone does not guarantee and which is impossible when the arguments are unnamed.
I personally prefer the readability of Option2, but I am not settled on this yet. I will see what the situation is in practice and whether I can generate reasonable function names, even if I have to rely on heuristics to determine a suitable name.

To be honest, I am not completely sure what you mean. I am facing multiple interacting problems. The first and simplest, but annoying, problem is that types can contain spaces and that their keywords can appear in varying order while denoting the same type. This applies in particular to modifiers such as const, but also to combinations such as unsigned int.

I have not yet considered any template conversions and will stick to providing bindings for the basic classes. In Rust, it should not be necessary to explicitly use types such as vtkNew<T> for wrapping but I will probably use it behind the ffi.

Could you clarify which problem your comments would address? I do not seem to fully grasp this.

WrapVTK normalizes the types, so for example its output will always use const long long and never int long signed const long regardless of whatever crazy ordering and spacing is used in the C++ header. Once normalized, the primitive types in the .xml files are:

 "void"
 "char"
 "signed char"
 "unsigned char"
 "short"
 "unsigned short"
 "int"
 "unsigned int"
 "long" # width is platform-specific
 "unsigned long" # width is platform-specific
 "long long"
 "unsigned long long"
 "float"
 "double"
 "long double" # not used by VTK

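Given that normalized list, the primitive part of the type mapping can be a plain lookup. The following is a minimal sketch: `rust_ffi_type` is a hypothetical helper, and the choice to map onto the `std::ffi` aliases (rather than fixed-width types such as `i32`) is an assumption that sidesteps the platform-specific width of `long`.

```rust
/// Maps a normalized C++ primitive name to the name of a Rust FFI type.
/// Returns `None` for names without a direct counterpart, such as `long double`.
pub fn rust_ffi_type(cpp: &str) -> Option<&'static str> {
    Some(match cpp {
        // Valid for return types only; `void*` would map to `*mut std::ffi::c_void`.
        "void" => "()",
        "char" => "std::ffi::c_char",
        "signed char" => "std::ffi::c_schar",
        "unsigned char" => "std::ffi::c_uchar",
        "short" => "std::ffi::c_short",
        "unsigned short" => "std::ffi::c_ushort",
        "int" => "std::ffi::c_int",
        "unsigned int" => "std::ffi::c_uint",
        "long" => "std::ffi::c_long",
        "unsigned long" => "std::ffi::c_ulong",
        "long long" => "std::ffi::c_longlong",
        "unsigned long long" => "std::ffi::c_ulonglong",
        "float" => "std::ffi::c_float",
        "double" => "std::ffi::c_double",
        _ => return None,
    })
}
```
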
I’ll try to expand/clarify my comments on generics. You mentioned std::array<float, 3>, but VTK defines several class templates of its own. The ones that are most used are vtkVector and its superclass vtkTuple:

template <typename T, int Size>
class vtkVector : public vtkTuple<T, Size>
template <typename T, int Size>
class vtkTuple

There are also templates for special-purpose vtkDataArray subclasses.

template <class ValueTypeT>
class vtkSOADataArrayTemplate
  : public vtkGenericDataArray<vtkSOADataArrayTemplate<ValueTypeT>, ValueTypeT>

Taking this as an example, the Python wrappers take the template that’s defined in the C++ header, and substitute ValueTypeT with float in all the class interface methods to get a type that can be wrapped in Python.

In C++: vtkSOADataArrayTemplate<float>
In Python: vtkSOADataArrayTemplate_IfE (the suffix denotes <float>)

The same is done for several other primitive types (short, int, etc.). Other templates in VTK are also specialized for non-primitive types such as std::string or for counts (e.g. the sized vtkVector template mentioned above).

Currently, vtkSOADataArrayTemplate.xml just lists vtkSOADataArrayTemplate as a template:

<class name="vtkSOADataArrayTemplate" template="1">
  <tparam name="ValueTypeT" type="typename" />
  ...
</class>

But it could be modified to also output the template instantiations that are used by VTK:

<class name="vtkSOADataArrayTemplate<float>">
  ... give definition with "float" in place of "ValueTypeT"
</class>

etc. for double, int, etc. This would be similar to what the Python wrappers do internally, and the Python wrappers do it because Python doesn’t itself have generics, and therefore it’s the individual specializations that are wrapped.
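
The substitution step itself can be illustrated with a small sketch. This is not how the Python wrappers or vtkWrapXML do it internally; `instantiate` is a hypothetical helper that replaces whole-word occurrences of a template parameter in a signature string, and the signature used to try it out is only illustrative.

```rust
/// Replaces every whole-word occurrence of `param` in `signature` with `concrete`.
pub fn instantiate(signature: &str, param: &str, concrete: &str) -> String {
    let is_ident = |c: char| c.is_alphanumeric() || c == '_';
    let mut out = String::new();
    let mut token = String::new();
    // A trailing sentinel space flushes the last token without special-casing the end.
    for c in signature.chars().chain(std::iter::once(' ')) {
        if is_ident(c) {
            token.push(c);
        } else {
            out.push_str(if token == param { concrete } else { token.as_str() });
            token.clear();
            out.push(c);
        }
    }
    out.pop(); // drop the sentinel space again
    out
}
```

For example, `instantiate("void SetValue(vtkIdType idx, ValueTypeT value)", "ValueTypeT", "float")` yields the same signature with `float value` as the second parameter, while identifiers that merely contain the parameter name are left alone.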

Type Normalization

THAT is SO NICE! I was wondering all along how often these scenarios actually occur, and since I have tackled another problem in the meantime, I will not spend any more effort on this.
My idea for the platform-specific types long and unsigned long would be to use 32-bit types on the Rust side. As far as I can tell, converting such a value to the C++ input never narrows it: it either stays identical or is widened to 64 bits, which should not cause any problems. The C++ functions would still use the native type.
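
A minimal sketch of these conversions, using the `std::ffi::c_long` and `std::ffi::c_ulong` aliases; the helper names are hypothetical. Passing values into C++ is lossless on every platform. The opposite direction, which is not covered by the idea above, can overflow wherever `long` is 64 bits wide, so the sketch makes it fallible.

```rust
use std::ffi::{c_long, c_ulong};

/// Keeps or widens a 32-bit Rust value for a C++ `long` parameter.
pub fn to_c_long(value: i32) -> c_long {
    c_long::from(value)
}

/// Keeps or widens a 32-bit Rust value for a C++ `unsigned long` parameter.
pub fn to_c_ulong(value: u32) -> c_ulong {
    c_ulong::from(value)
}

/// A returned `long` may not fit into 32 bits, so this direction can fail.
pub fn from_c_long(value: c_long) -> Option<i32> {
    i32::try_from(value).ok()
}
```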

Generics

I understand now. I think for now I will focus on function signatures which use the class templates as input/output. I do not plan to support the usage of the generics themselves within Rust for my first version of vtk-rs.

vtkVector<double, 3> GetPosition();
void SetPosition(vtkVector<double, 3> &position);

I will then have to create conversions between vtkVector<T, N> and my chosen Rust equivalent. In this case, I would probably opt for the array type [T; N]. In this way, I can expose the function without having to deal with the class template vtkVector<T, N> explicitly.

Overall, this of course means that all functionality provided by vtkVector<T, N> is lost and not directly exposed within the crate.
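
A minimal sketch of how such a conversion could look on the Rust side. It assumes that the generated C++ shim copies the three components into a caller-provided buffer; `shim_get_position` is a hypothetical stand-in written in Rust only so that the example runs without any C++ code.

```rust
/// Stand-in for a generated C++ shim that would copy a `vtkVector<double, 3>`
/// into a caller-provided buffer. In real bindings this would be an extern declaration.
extern "C" fn shim_get_position(out: *mut f64) {
    let position = [1.0_f64, 2.0, 3.0];
    // SAFETY: the caller guarantees that `out` points to three writable f64 values.
    unsafe { std::ptr::copy_nonoverlapping(position.as_ptr(), out, 3) };
}

/// Safe wrapper that exposes the result as a plain Rust array.
pub fn get_position() -> [f64; 3] {
    let mut out = [0.0_f64; 3];
    shim_get_position(out.as_mut_ptr());
    out
}
```

Copying through a flat buffer avoids making any assumption about the memory layout of vtkVector on the Rust side.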

Future Plans

Having the template instantiations in the XML output would be particularly useful for the next steps of exposing the templates themselves. I see two (partially distinct) ways forward in exposing them.

1. Expose only what is used by VTK.

This would be a good starting point and very reasonable to tackle as a next step.
However, it might already be more than a finite number of cases can cover. Consider a function which takes a generic:

template <int N> void TakeGeneric(vtkVector<double, N>);

If we wanted to expose this functionality exhaustively, we would have to write bindings for each possible case.

void TakeGeneric_0(vtkVector<double, 0>);
void TakeGeneric_1(vtkVector<double, 1>);
...
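
On the Rust side, a finite list of such bindings could be folded back into a single method name with a trait and a declarative macro. The sketch below is an assumption about one possible design: the `TakeGeneric` trait, the macro and the `shim_take_generic_*` functions are hypothetical, and the shims return the received length (unlike the `void` original) purely so the dispatch can be checked.

```rust
/// Hypothetical trait that unifies the per-size bindings behind one method name.
pub trait TakeGeneric {
    fn take_generic(self) -> usize;
}

// Stand-ins for the generated shims `TakeGeneric_2` and `TakeGeneric_3`.
fn shim_take_generic_2(v: [f64; 2]) -> usize {
    v.len()
}
fn shim_take_generic_3(v: [f64; 3]) -> usize {
    v.len()
}

/// Implements `TakeGeneric` for each listed array size using the given shim.
macro_rules! impl_take_generic {
    ($($n:literal => $shim:ident),* $(,)?) => {
        $(
            impl TakeGeneric for [f64; $n] {
                fn take_generic(self) -> usize {
                    $shim(self)
                }
            }
        )*
    };
}

impl_take_generic!(2 => shim_take_generic_2, 3 => shim_take_generic_3);
```

Sizes that were not generated simply do not implement the trait, so using them fails at compile time rather than at link time.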

I do not know if VTK uses any public template functions. But certainly, constructors of these classes would qualify.

2. Provide Macros to generate Bindings for Templated Classes

This approach is very intricate and would probably require a lot of work. Since Rust is a compiled language with a macro system, it is in principle possible to recreate the templating approach of C++. I do not know whether any dedicated crates exist for this. In principle (although it may be very hard in practice), one could provide a Rust proc-macro which scans Rust code and generates the appropriate C++ code, which can then be linked together.
However, this is difficult and far from straightforward, and I will probably never go down this path.