UString In VTK

Good morning everyone,

Whilst implementing a new feature in VTK, one of my colleagues ran into performance issues with array type conversion. More precisely, he was casting something to a vtkFloatArray using the vtkArrayDownCast template function. After looking at the code, this function is nothing but a passthrough to the internal SafeDownCast function contained in every VTK object. The function looks like this and can be found here:

  static vtkTypeBool IsTypeOf(const char* type)                                                    \
  {                                                                                                \
    if (!strcmp(thisClassName, type))                                                              \
    {                                                                                              \
      return 1;                                                                                    \
    }                                                                                              \
    return superclass::IsTypeOf(type);                                                             \
  }                                                                                                \
  vtkTypeBool IsA(const char* type) override { return this->thisClass::IsTypeOf(type); }           \
  static thisClass* SafeDownCast(vtkObjectBase* o)                                                 \
  {                                                                                                \
    if (o && o->IsA(thisClassName))                                                                \
    {                                                                                              \
      return static_cast<thisClass*>(o);                                                           \
    }                                                                                              \
    return nullptr;                                                                                \
  }

Looking at the code, it is clear that the IsTypeOf function is the culprit for two reasons:

  • the call to strcmp
  • the recursion into superclass::IsTypeOf whenever strcmp reports a mismatch

This results in multiple branching recursive calls, which can be costly (especially if the SafeDownCast is called multiple times in a row).
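To make the cost concrete, here is a minimal sketch of the same pattern on a three-level hierarchy. The classes Base, Middle and Leaf, the counter, and CountComparisons are hypothetical stand-ins, not VTK code; the point is only that a query for an ancestor (or for an unrelated name) walks the whole chain, one strcmp per level.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical hierarchy (Base <- Middle <- Leaf), not real VTK classes.
// Each level mimics the IsTypeOf pattern quoted above and counts strcmp calls.
static int g_strcmpCalls = 0;

struct Base
{
  static bool IsTypeOf(const char* type)
  {
    ++g_strcmpCalls;
    return std::strcmp("Base", type) == 0; // root: no superclass to defer to
  }
  virtual bool IsA(const char* type) const { return Base::IsTypeOf(type); }
  virtual ~Base() = default;
};

struct Middle : Base
{
  static bool IsTypeOf(const char* type)
  {
    ++g_strcmpCalls;
    if (std::strcmp("Middle", type) == 0)
    {
      return true;
    }
    return Base::IsTypeOf(type); // mismatch: try the superclass
  }
  bool IsA(const char* type) const override { return Middle::IsTypeOf(type); }
};

struct Leaf : Middle
{
  static bool IsTypeOf(const char* type)
  {
    ++g_strcmpCalls;
    if (std::strcmp("Leaf", type) == 0)
    {
      return true;
    }
    return Middle::IsTypeOf(type); // mismatch: try the superclass
  }
  bool IsA(const char* type) const override { return Leaf::IsTypeOf(type); }
};

// Returns how many strcmp calls were needed to answer obj.IsA(type).
int CountComparisons(const Base& obj, const char* type)
{
  assert(type != nullptr); // strcmp on a null pointer is undefined behaviour
  g_strcmpCalls = 0;
  obj.IsA(type);
  return g_strcmpCalls;
}
```

With a Leaf object, asking for "Leaf" costs one comparison, while asking for "Base" or for a name that is not in the hierarchy costs three. VTK's array classes sit deeper than this, so the chain there is correspondingly longer.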

One potential, easy-to-implement option that we could add to VTK is the ustring type (see here). Quoting the comments:

/// A ustring is an alternative to char* or std::string for storing
/// strings, in which the character sequence is unique (allowing many
/// speed advantages for assignment, equality testing, and inequality
/// testing).
///
/// The implementation is that behind the scenes there is a hash set of
/// allocated strings, so the characters of each string are unique.  A
/// ustring itself is a pointer to the characters of one of these canonical
/// strings.  Therefore, assignment and equality testing is just a single
/// 32- or 64-bit int operation, the only mutex is when a ustring is
/// created from raw characters, and the only malloc is the first time
/// each canonical ustring is created.

This new type could be used in place of the const char* used for the internal type name, and we could then replace the strcmp used in IsTypeOf with a pointer comparison. Although I’ve not tested it, I’m 90% sure we would get performance improvements (this is based on personal knowledge from my other projects). Also, if we can get some improvements, this would affect a large part of VTK (and ParaView), as VTK has 5664 calls to this function and ParaView 48449.
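As a rough illustration of the idea (not the linked ustring implementation, and not existing VTK API), a minimal interning helper could look like the sketch below. InternTypeName and SameType are hypothetical names. Equal character sequences always map to the same canonical pointer, so the type check becomes a single pointer comparison, and the only lock is taken when a name is interned.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical helper, not part of VTK or of the linked ustring code.
// Returns a canonical pointer for the given characters: equal strings
// always yield the same address.
inline const char* InternTypeName(const std::string& name)
{
  static std::mutex mutex;
  static std::unordered_set<std::string> table;
  std::lock_guard<std::mutex> lock(mutex);
  // unordered_set is node-based: elements are never moved by a rehash,
  // so the pointer returned by c_str() stays valid for the table's lifetime.
  return table.insert(name).first->c_str();
}

// Compares two interned names. Both pointers must come from InternTypeName;
// the characters themselves are never inspected.
inline bool SameType(const char* a, const char* b)
{
  assert(a != nullptr && b != nullptr);
  return a == b;
}
```

A class would intern its name once (for example in a function-local static) and IsTypeOf would then compare pointers instead of calling strcmp. Names that arrive as raw strings at run time would have to go through InternTypeName first, which is where the hashing cost and the mutex end up.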

Tell me what you think!

This is basically an interned string implementation (if I’m not mistaken). Interning typenames might be useful, but note that there are places in the code where the types come in as strings, so some way to programmatically access the interned string database would be necessary.

Cc: @brad.king since I know you’ve worked with interning strings before.

For reference, the quoted infrastructure is a hand-implementation of RTTI that has been in VTK since before the C++98 standard even existed to define standard RTTI for the language.

The linked ustring looks pretty complete, but unless VTK were to adopt it for more purposes it may be overkill for RTTI.

It may be possible to convert the implementation to use the real C++ RTTI infrastructure in combination with some kind of map to convert incoming string names into the corresponding type_info. That map would be similar to what ustring does to intern raw strings.

Yes indeed, the mentioned code is a hand-implementation of RTTI. We could indeed use typeid to do the SafeDownCast, and, as you said, have a map from strings to type_info.
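A minimal sketch of what that could look like, assuming a hypothetical registry filled in by each class; Object, DataArray, FloatArray, TypeRegistry, RegisterType, IsExactly and this SafeDownCast are illustrative names, not VTK API. One caveat worth making explicit: comparing typeid results only answers "is this exactly that type", so the inheritance-aware down-cast in the sketch relies on dynamic_cast rather than on typeid alone.

```cpp
#include <cassert>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical stand-ins for VTK classes.
struct Object
{
  virtual ~Object() = default;
};
struct DataArray : Object
{
};
struct FloatArray : DataArray
{
};

// Hypothetical registry mapping incoming string names to type_index values.
inline std::unordered_map<std::string, std::type_index>& TypeRegistry()
{
  static std::unordered_map<std::string, std::type_index> registry;
  return registry;
}

template <typename T>
void RegisterType(const std::string& name)
{
  const std::type_index index(typeid(T));
  auto result = TypeRegistry().emplace(name, index);
  // A name must not be registered for two different types.
  assert(result.first->second == index);
}

// Exact-type check: true only if the dynamic type of obj is the type
// registered under name. Unknown names are never a match.
inline bool IsExactly(const Object& obj, const std::string& name)
{
  auto it = TypeRegistry().find(name);
  return it != TypeRegistry().end() && it->second == std::type_index(typeid(obj));
}

// Inheritance-aware down-cast using the language's own RTTI.
template <typename T>
T* SafeDownCast(Object* o)
{
  return dynamic_cast<T*>(o);
}
```

The string lookup is only paid when a type name arrives as text; code that already knows the target type at compile time goes straight to the cast.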

I’m not familiar enough with VTK to see if there would be other places that would benefit from ustring. I’ll have a deeper look into that to see if that’s feasible!

Note that there was an investigation into RTTI in mid-2019 (by Bill Lorensen). It didn’t work as we wanted because VTK’s class hierarchy is too deep for some platforms (macOS?) to give the correct answer. It wasn’t actually faster there either.

Good to know, thanks!

My point (and what @brad.king said) is that we could come up with a faster solution than strcmp (an int64 comparison, for example). I will try to put something together and test the performance!

@Thomas_Caissard, @ben.boeckel, @brad.king
I already tried to push this idea 2 years ago without success. https://gitlab.kitware.com/vtk/vtk/-/merge_requests/4444