Hi everyone,
I’m currently working on a new and more generic way to load data for VTK readers.
Current status of data loading in VTK
In VTK, all readers provide at least SetFileName. Some of them, such as the legacy readers, XML readers, and PLY reader, also have a function such as SetInputString or similar, while others, e.g. vtkOBJReader, have no such functionality. As far as I know, there is no other way of loading data.
New approach
The new approach is to expose a “SetInputStream” function that enables the user to specify a std::istream as input for readers.
Most readers use kwsys::fstream internally, and the ones that accept a string as input use std::istringstream, so they should not need much refactoring.
Exposing this would let users plug in custom streams, which in turn would let them extend what VTK currently supports.
I have already implemented this feature locally for vtkXMLReader.
Resource
Since dealing with the standard iostream library can be hard, especially for C++ neophytes, I made a small PoC of a kwsys module that makes resource creation easier while staying compatible with C++ iostreams.
The module defines a higher-level interface base class, kwsys::Resource, which only has read and seek virtual functions and nothing else. The user can then give this Resource to a kwsys::ResourceStream, which holds a kwsys::ResourceStreambuf. This kwsys::ResourceStream is a std::istream, so it can be passed to the new SetInputStream function.
Here is the synopsis of the header:
namespace @KWSYS_NAMESPACE@ {

class Resource {
public:
  virtual ~Resource() = default;
  virtual std::streamsize read(void* buffer, std::streamsize bytes);
  virtual std::streampos seek(std::streampos pos, std::ios_base::seekdir which);
};

template<typename CharT, typename Traits = std::char_traits<CharT>, std::size_t BufferSizeValue = 1024>
class BasicResourceStreambuf : public std::basic_streambuf<CharT, Traits> {
public:
  explicit BasicResourceStreambuf(Resource& resource);
  void SetResource(Resource& resource);
  Resource& GetResource() const;
};

template<typename CharT, typename Traits = std::char_traits<CharT>, std::size_t BufferSizeValue = 1024>
class BasicResourceStream : public std::basic_istream<CharT, Traits> {
public:
  BasicResourceStream();
  explicit BasicResourceStream(Resource& resource);
  void SetResource(Resource& resource);
};

using ResourceStreambuf = BasicResourceStreambuf<char>;
using ResourceStream = BasicResourceStream<char>;

}
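To make the read/seek contract more concrete, here is a minimal, standard-library-only sketch of how a streambuf could sit on top of such a resource. It is not the PoC code: everything lives in a hypothetical `sketch` namespace, `read` and `seek` are made pure virtual for simplicity, the non-template `ResourceStreambuf` and the 16-byte buffer are assumptions chosen to keep the example short, and `seek` is assumed to treat its first argument as an offset relative to `which`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <ios>
#include <istream>
#include <streambuf>
#include <string>

namespace sketch {

// Hypothetical stand-in for the proposed kwsys::Resource (pure virtual here).
class Resource {
public:
  virtual ~Resource() = default;
  virtual std::streamsize read(void* buffer, std::streamsize bytes) = 0;
  virtual std::streampos seek(std::streampos pos, std::ios_base::seekdir which) = 0;
};

// Non-owning view over a memory block; the caller keeps the block alive.
class MemoryResource : public Resource {
public:
  MemoryResource(const char* data, std::size_t size) : Data(data), Size(size) {
    assert(data != nullptr || size == 0);
  }
  std::streamsize read(void* buffer, std::streamsize bytes) override {
    if (bytes <= 0) { return 0; }
    const std::size_t n = std::min(static_cast<std::size_t>(bytes), Size - Pos);
    if (n > 0) { std::memcpy(buffer, Data + Pos, n); }
    Pos += n;
    return static_cast<std::streamsize>(n);
  }
  std::streampos seek(std::streampos pos, std::ios_base::seekdir which) override {
    const std::streamoff off = pos; // assumed: an offset relative to `which`
    std::streamoff base = 0;
    if (which == std::ios_base::cur) { base = static_cast<std::streamoff>(Pos); }
    if (which == std::ios_base::end) { base = static_cast<std::streamoff>(Size); }
    const std::streamoff target = base + off;
    if (target < 0 || target > static_cast<std::streamoff>(Size)) {
      return std::streampos(std::streamoff(-1)); // out of range: report failure
    }
    Pos = static_cast<std::size_t>(target);
    return std::streampos(target);
  }
private:
  const char* Data;
  std::size_t Size;
  std::size_t Pos = 0;
};

// Input-only streambuf that refills a small internal buffer from a Resource.
class ResourceStreambuf : public std::streambuf {
public:
  explicit ResourceStreambuf(Resource& resource) : Res(&resource) {
    setg(Buffer.data(), Buffer.data(), Buffer.data()); // empty get area
  }
protected:
  int_type underflow() override {
    if (gptr() < egptr()) { return traits_type::to_int_type(*gptr()); }
    const std::streamsize n =
      Res->read(Buffer.data(), static_cast<std::streamsize>(Buffer.size()));
    if (n <= 0) { return traits_type::eof(); }
    setg(Buffer.data(), Buffer.data(), Buffer.data() + n);
    return traits_type::to_int_type(*gptr());
  }
  pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                   std::ios_base::openmode which) override {
    if (!(which & std::ios_base::in)) { return pos_type(off_type(-1)); }
    // The resource is ahead of the stream by the bytes still unread in Buffer.
    if (dir == std::ios_base::cur) { off -= static_cast<off_type>(egptr() - gptr()); }
    const pos_type result = Res->seek(pos_type(off), dir);
    if (result != pos_type(off_type(-1))) {
      setg(Buffer.data(), Buffer.data(), Buffer.data()); // drop stale data
    }
    return result;
  }
  pos_type seekpos(pos_type pos, std::ios_base::openmode which) override {
    return seekoff(off_type(pos), std::ios_base::beg, which);
  }
private:
  Resource* Res;
  std::array<char, 16> Buffer;
};

// std::istream that owns its streambuf and only references the resource.
class ResourceStream : public std::istream {
public:
  explicit ResourceStream(Resource& resource) : std::istream(nullptr), Buf(resource) {
    rdbuf(&Buf); // also clears the badbit set by the null-buffer constructor
  }
private:
  ResourceStreambuf Buf;
};

} // namespace sketch
```

With these definitions, constructing a `sketch::MemoryResource` over `buffer.data()` and `buffer.size()` and passing it to a `sketch::ResourceStream` gives a regular std::istream on which std::getline, seekg, and tellg work, without copying the buffer. A user only has to implement read and seek; the buffering and the iostream plumbing stay in one place.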
Example
Here is an example that loads data from memory, using a MemoryResource, which is a kwsys::Resource.
// This is a resource that replaces the SetInputString functions
// Unlike std::stringstream, it is a view: it does not copy the buffer
// buffer is a standard container
vtksys::MemoryResource resource{buffer.data(), buffer.size()};
// The resource is only referenced, so it must stay alive until the user's last Update() call
vtksys::ResourceStream stream{resource}; // Assign the resource, the stream will pass it down to the streambuf
// reader is a vtkXMLReader subclass
reader->SetReadFromInputStream(true); // Must be set to use user-provided stream
reader->SetInputStream(stream); // This is the new function, for all vtkXMLReader
reader->Update();
Wrapping
Having a dedicated class for resources may allow the wrappers to offer an interface for custom resources in Python, C#, Java, …, using a mechanism similar to vtkPythonAlgorithm, which can be subclassed from Python code.
Note
The example is also interesting because it reveals a problem with vtkXMLReader::SetInputString: its memory footprint. When using this function, the reader holds 2 copies of the data while decoding, because it stores the input string and then creates a std::istringstream, which also copies the data.
Here is the memory footprint for the following code, with buffer containing a 700 MiB VTI file:
1: std::string buffer{...};
reader->SetReadFromInputString(true);
2: reader->SetInputString(buffer);
3: reader->Update(); // 3 is during update, after this->OpenStream()

And the memory footprint using the new approach:
1: const std::string buffer{...};
vtksys::MemoryResource resource{buffer.data(), buffer.size()};
vtksys::ResourceStream stream{resource};
reader->SetReadFromInputStream(true); // Must be set to use user-provided stream
2: reader->SetInputStream(stream);
3: reader->Update(); // 3 is during update, after this->OpenStream()

It would be possible, though, to remove this problem by storing only the std::istringstream and freeing the initial buffer after SetInputString. This change is not very hard, and I have it locally, but it requires a change in vtkXMLUnstructuredGridReader.
I haven’t tested if other readers share this behavior.
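As a rough illustration of the "keep only the std::istringstream" idea, here is a small standard-library sketch. It is not the actual vtkXMLReader change: `MoveIntoStream` is a hypothetical helper, and it assumes the caller is allowed to give up its buffer. Two copies still exist briefly during the call, but only the stream's copy remains afterwards.

```cpp
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical helper: hand the content of `buffer` to `stream`, then release
// the original allocation so that only the stream's internal copy remains.
inline void MoveIntoStream(std::string& buffer, std::istringstream& stream) {
  stream.str(buffer);         // the stream's stringbuf makes its own copy
  stream.clear();             // reset any error or EOF state from earlier use
  std::string().swap(buffer); // swap with an empty string to free the original
}

} // namespace sketch
```

After the call, `buffer` is empty and the stream can be read as usual.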
Conclusion
That is everything I wanted to show for this proposal!
Here are some questions I would like to ask:
- What are your thoughts on this proposal?
- Do you think this feature should belong to kwsys or to VTK directly?
- Should the resources/streams be manipulated through shared pointers?
Thank you, Alexy.
Ping: @Francois_Mazen @finetjul

