Summary
VTK now includes a constexpr
(compile-time) string hashing algorithm. This enables things such as more legible (and faster) code when comparing a test string to multiple potential matches:
#include "vtkStringToken.h"
using namespace vtk::literals; // for ""_hash used below.
vtkStringToken task;
switch (task.GetId())
{
case "foo"_hash: foo(); break;
case "bar"_hash: bar(); break;
}
The other use-case for vtkStringToken
is to tokenize a vtkStringArray
into a vtkTypeUInt32Array
and use the resulting integer array for visualizations.
Details
In the example above, the ""_hash
string-literal operator computes the hash of the strings at build time. Then, at run time, the task
’s hash is computed and compared to the fixed integer hash values in the switch()
statement.
You can also construct string tokens at run-time like so:
const char* ex1 = "baz";
std::string ex2a("foo"), ex2b("bar");
vtkStringToken ex1t(ex1, std::strlen(ex1));
vtkStringToken ex2t(ex2a + ex2b);
The following sections cover more implementation details and caveats when using string tokens.
Comparisons and Containers
Exact equality and inequality tests between string tokens is fast: a single integer comparison, while inequality tests attempt to compare the source strings alphanumerically so that lexical ordering is preserved.
ex1t == ex2t; // Fast integer math
ex1t != ex2t; // Fast integer math
ex1t < ex2t; // Slow string lexical comparison
The string token provides a specialization of std::hash<vtkStringToken>
which simply returns the hash-code. This allows you to use string tokens in unordered maps and sets.
Because the less-than (<
) operator is slow, you should always prefer unordered sets and maps when using string tokens for indexing.
Recovering the string from a hash
Hashes computed at run-time by calling the constructor of vtkStringToken
can be recovered by calling the token’s Data()
method. This is possible because the vtkStringToken
class owns a static instance of vtkStringManager
, which holds a map from hash values to the original strings.
If you call token.Data()
on a hash computed at build time (or even at run-time using the static vtkStringToken::HashString()
method), the string will not be present in the vtkStringManager
and an empty string will be returned; a warning message will also be emitted on the first failed lookup.
Hash algorithm
We use the Fowler-Noll-Vo FNV-1a hash algorithm to produce a 32-bit unsigned integer (type-aliased to vtkStringToken::Hash
).
This hash is not intended for cryptographic use – not that a serious person would think to use a visualization library to store passwords… but just in case you are feeling a bit silly, don’t do it.
String literal operators
VTK provides two string-literal operators to perform hashing. Both are declared in the vtk::literals
namespace in vtkStringToken.h
.
-
""_token
, which produces avtkStringToken
instance by hashing the string. -
""_hash
, which produces an integer of typevtkStringToken::Hash
by hashing the string.
Both of these perform the hash computation during compilation. As mentioned above, this means the vtkStringToken::Data()
method will return an empty string. If you create a class that returns tokens created with the string-literal operators (for example, to avoid hash computation in a tight loop), you can work around this by constructing run-time hashes of the strings at some point in your code that is not run frequently (such as a static initializer called from your class constructor or application setup).