Hi everyone,
I’m currently working on a new and more generic way to load data for VTK readers.
Current status of data loading in VTK
In VTK, all readers have at least SetFileName. Some of them, such as the legacy readers, the XML readers, and the PLY reader, also have a function such as SetInputString or similar, while others, e.g. vtkOBJReader, have no such functionality. There is no other way of loading data, as far as I know.
New approach
The new approach is to expose a "SetInputStream" function that lets the user specify a std::istream as input for readers.
Most of the readers use kwsys::fstream internally, and the ones that accept a string as input use std::istringstream, so they would not need much refactoring.
Exposing this would let users plug in custom streams, which in turn would let them extend what is currently supported in VTK.
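To illustrate why a std::istream entry point is enough, here is a minimal sketch in plain C++ (no VTK involved). CountLines and CountLinesInString are hypothetical names made up for this example; the point is only that code written against std::istream does not care which concrete stream it receives.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical parsing routine: it depends only on std::istream,
// so it works with a file stream, a string stream, or a custom stream.
std::size_t CountLines(std::istream& input)
{
  std::size_t count = 0;
  std::string line;
  while (std::getline(input, line))
  {
    ++count;
  }
  return count;
}

// Hypothetical helper that feeds the routine from memory.
std::size_t CountLinesInString(const std::string& text)
{
  std::istringstream stream(text);
  return CountLines(stream);
}
```

The same CountLines could be called with a std::ifstream or with any user-defined class deriving from std::istream, without changing the routine itself.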
I have already implemented this feature locally for vtkXMLReader.
Resource
Since dealing with the standard iostream library can be hard, especially for C++ neophytes, I made a small PoC of a kwsys module that makes resource creation easier while staying compatible with C++ iostreams.
The module defines a higher-level interface base class, kwsys::Resource, which only has read and seek virtual functions, and nothing else. The user can then give this Resource to a kwsys::ResourceStream, which holds a kwsys::ResourceStreambuf. This kwsys::ResourceStream is a std::istream, so it can be passed to the new SetInputStream function.
Here is the synopsis of the header:
namespace @KWSYS_NAMESPACE@ {

// Minimal interface a user implements to provide data
class Resource {
public:
  virtual ~Resource() = default;
  virtual std::streamsize read(void* buffer, std::streamsize bytes);
  virtual std::streampos seek(std::streampos pos, std::ios_base::seekdir which);
};

// Stream buffer that pulls its data from a Resource
template<typename CharT, typename Traits = std::char_traits<CharT>, std::size_t BufferSizeValue = 1024>
class BasicResourceStreambuf : public std::basic_streambuf<CharT, Traits> {
public:
  explicit BasicResourceStreambuf(Resource& resource);
  void SetResource(Resource& resource);
  Resource& GetResource() const;
};

// Input stream that owns a BasicResourceStreambuf
template<typename CharT, typename Traits = std::char_traits<CharT>, std::size_t BufferSizeValue = 1024>
class BasicResourceStream : public std::basic_istream<CharT, Traits> {
public:
  BasicResourceStream();
  explicit BasicResourceStream(Resource& resource);
  void SetResource(Resource& resource);
};

using ResourceStreambuf = BasicResourceStreambuf<char>;
using ResourceStream = BasicResourceStream<char>;

}
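For readers less familiar with stream buffers, here is a rough sketch of how a read/seek interface like the one above can be adapted to a std::istream. It is not the PoC code: ResourceSketch, MemoryResourceSketch, ResourceStreambufSketch and ReadAllThroughStream are hypothetical names written for this illustration, the interface is made pure virtual for simplicity, the buffer is deliberately tiny to exercise refills, and only seekpos is handled (seekoff, and therefore tellg, is left out).

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstring>
#include <ios>
#include <istream>
#include <iterator>
#include <streambuf>
#include <string>

// Hypothetical stand-in for the proposed Resource interface.
class ResourceSketch
{
public:
  virtual ~ResourceSketch() = default;
  // Copies up to `bytes` bytes into `buffer`, returns the count actually read.
  virtual std::streamsize read(void* buffer, std::streamsize bytes) = 0;
  // Moves the read position, returns the new absolute position.
  virtual std::streampos seek(std::streampos pos, std::ios_base::seekdir which) = 0;
};

// Hypothetical non-owning view over a block of memory.
class MemoryResourceSketch : public ResourceSketch
{
public:
  MemoryResourceSketch(const char* data, std::size_t size) : Data(data), Size(size) {}

  std::streamsize read(void* buffer, std::streamsize bytes) override
  {
    if (bytes <= 0 || Pos >= Size) return 0;
    const std::size_t count = std::min(static_cast<std::size_t>(bytes), Size - Pos);
    std::memcpy(buffer, Data + Pos, count);
    Pos += count;
    return static_cast<std::streamsize>(count);
  }

  std::streampos seek(std::streampos pos, std::ios_base::seekdir which) override
  {
    std::streamoff base = 0;
    if (which == std::ios_base::cur) base = static_cast<std::streamoff>(Pos);
    if (which == std::ios_base::end) base = static_cast<std::streamoff>(Size);
    std::streamoff target = base + static_cast<std::streamoff>(pos);
    // Clamp the target to the valid range [0, Size].
    target = std::max<std::streamoff>(0, std::min(target, static_cast<std::streamoff>(Size)));
    Pos = static_cast<std::size_t>(target);
    return std::streampos(target);
  }

private:
  const char* Data;
  std::size_t Size;
  std::size_t Pos = 0;
};

// Illustrative stream buffer: refills a small internal buffer from the resource.
class ResourceStreambufSketch : public std::streambuf
{
public:
  explicit ResourceStreambufSketch(ResourceSketch& resource) : Res(&resource) {}

protected:
  int_type underflow() override
  {
    if (this->gptr() == this->egptr())
    {
      const std::streamsize count =
        Res->read(Buffer.data(), static_cast<std::streamsize>(Buffer.size()));
      if (count <= 0) return traits_type::eof();
      this->setg(Buffer.data(), Buffer.data(), Buffer.data() + count);
    }
    return traits_type::to_int_type(*this->gptr());
  }

  pos_type seekpos(pos_type pos, std::ios_base::openmode which) override
  {
    if ((which & std::ios_base::in) == 0) return pos_type(off_type(-1));
    this->setg(nullptr, nullptr, nullptr); // discard buffered data
    return Res->seek(pos, std::ios_base::beg);
  }

private:
  ResourceSketch* Res;          // referenced only, must outlive the streambuf
  std::array<char, 16> Buffer{}; // small on purpose
};

// Reads the whole text back through the resource-backed stream.
std::string ReadAllThroughStream(const std::string& text)
{
  MemoryResourceSketch resource(text.data(), text.size());
  ResourceStreambufSketch streambuf(resource);
  std::istream stream(&streambuf);
  return std::string(std::istreambuf_iterator<char>(stream), std::istreambuf_iterator<char>());
}
```

The user-facing part is only the two functions of the resource class; all the get-area bookkeeping stays inside the stream buffer, which is the appeal of the design for people who do not want to touch iostream internals.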
Example
Here is an example that loads data from memory using a MemoryResource, which is a kwsys::Resource.
// This is a resource that replaces the SetInputString functions
// Unlike std::stringstream, it is a view: it does not copy the buffer
// buffer is a standard container
vtksys::MemoryResource resource{buffer.data(), buffer.size()};
// The resource is only referenced, so it must stay alive until the user's last Update() call
vtksys::ResourceStream stream{resource}; // Assign the resource, the stream will pass it down to the streambuf
// reader is a vtkXMLReader subclass
reader->SetReadFromInputStream(true); // Must be set to use user-provided stream
reader->SetInputStream(stream); // This is the new function, for all vtkXMLReader
reader->Update();
Wrapping
Having a custom class for resources may let wrappers offer an interface for defining custom resources in Python, C#, Java, … using a mechanism similar to vtkPythonAlgorithm, which can be subclassed from Python code.
Note
The example is also interesting because it reveals a problem with vtkXMLReader::SetInputString: its memory footprint. When using this function, the reader holds 2 copies of the data when decoding, because it stores the input string and then creates a std::istringstream, which also copies the data.
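The copy made by std::istringstream can be checked with plain standard C++. The sketch below is independent of VTK, and StreamOwnsItsOwnCopy is a hypothetical name used only here: it overwrites the source string after constructing the stream and verifies that the stream still yields the original text.

```cpp
#include <sstream>
#include <string>

// Returns true if the stream still yields the original text after the
// source string has been overwritten, i.e. if the stream holds its own copy.
bool StreamOwnsItsOwnCopy()
{
  std::string buffer = "original";
  std::istringstream stream(buffer); // the const std::string& constructor copies
  buffer.assign("XXXXXXXX");         // overwrite the source string
  std::string word;
  stream >> word;
  return word == "original";
}
```

With a large input, that second copy is exactly what doubles the footprint while the original string is still being held.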
Here is the memory footprint for the following code, with buffer containing a 700 MiB VTI:
1: std::string buffer{...};
reader->SetReadFromInputString(true);
2: reader->SetInputString(buffer);
3: reader->Update(); // 3 is during update, after this->OpenStream()
And the memory footprint using the new approach:
1: const std::string buffer{...};
vtksys::MemoryResource resource{buffer.data(), buffer.size()};
vtksys::ResourceStream stream{resource};
reader->SetReadFromInputStream(true); // Must be set to use user-provided stream
2: reader->SetInputStream(stream);
3: reader->Update(); // 3 is during update, after this->OpenStream()
That said, this problem could be removed by storing only the std::istringstream and freeing the initial buffer after SetInputString. This change is not very hard, and I have it locally, but it causes a change in vtkXMLUnstructuredGridReader.
I haven’t tested if other readers share this behavior.
Conclusion
Here are all the things I wanted to show for this proposal!
Here are some questions I want to ask:
- What are your thoughts on this proposal?
- Do you think this feature should belong to kwsys or to VTK directly?
- Should the resources/streams be manipulated through shared pointers?
Thank you, Alexy.
Ping: @Francois_Mazen @finetjul