Reducing loading time

I want to work with time-series data in .vti format, where each time step is in a separate file. For testing purposes I have the files halfcylinder01.00.vti, halfcylinder05.00.vti and halfcylinder10.00.vti.

As a preprocessing step I do the following:
I load the data, merge the single scalar arrays into a vector array, calculate the magnitude of the vectors and set it as the scalars, delete the single arrays (since I merged them into the vectors I need), and then write the result out again as a .vti file for further use, roughly like the sketch below.
Can I do some additional clean-up steps besides deleting the arrays to reduce the loading time through the image reader?
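
Here is roughly what that preprocessing step looks like (just a sketch; the array names vx, vy, vz are placeholders for the component arrays my files actually contain):

#include <vtkSmartPointer.h>
#include <vtkXMLImageDataReader.h>
#include <vtkXMLImageDataWriter.h>
#include <vtkImageData.h>
#include <vtkPointData.h>
#include <vtkDataArray.h>
#include <vtkFloatArray.h>
#include <cmath>

// Merge the single-component arrays into a vector array, compute the vector
// magnitude as the new scalars, remove the single arrays and write the result.
void preprocess(const char* inputFile, const char* outputFile)
{
  auto reader = vtkSmartPointer<vtkXMLImageDataReader>::New();
  reader->SetFileName(inputFile);
  reader->Update();

  vtkImageData* image = reader->GetOutput();
  vtkPointData* pd = image->GetPointData();

  // placeholder names for the three scalar components
  vtkDataArray* vx = pd->GetArray("vx");
  vtkDataArray* vy = pd->GetArray("vy");
  vtkDataArray* vz = pd->GetArray("vz");
  vtkIdType n = image->GetNumberOfPoints();

  auto vectors = vtkSmartPointer<vtkFloatArray>::New();
  vectors->SetName("Vectors");
  vectors->SetNumberOfComponents(3);
  vectors->SetNumberOfTuples(n);

  auto magnitude = vtkSmartPointer<vtkFloatArray>::New();
  magnitude->SetName("Magnitude");
  magnitude->SetNumberOfTuples(n);

  for (vtkIdType i = 0; i < n; ++i)
  {
    double v[3] = { vx->GetComponent(i, 0), vy->GetComponent(i, 0), vz->GetComponent(i, 0) };
    vectors->SetTuple(i, v);
    magnitude->SetTuple1(i, std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]));
  }

  // remove the single-component arrays and keep only the merged data
  pd->RemoveArray("vx");
  pd->RemoveArray("vy");
  pd->RemoveArray("vz");
  pd->SetVectors(vectors);
  pd->SetScalars(magnitude);

  auto writer = vtkSmartPointer<vtkXMLImageDataWriter>::New();
  writer->SetFileName(outputFile);
  writer->SetInputData(image);
  writer->Write();
}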

My next question is whether there are any examples of caching image data. Sadly, I did not find much besides the vtkImageCacheFilter class reference, and I do not understand how to apply it to my data. Can I save the different image data objects in the cache and then somehow directly load the time step I need?
Any pointers to documentation I could read would help me a lot.

Thank you in advance!

The best approach depends on what you want to achieve and what resources you can assume your users have (how much CPU and GPU memory, how fast their CPU and GPU are).

If you have plenty of memory, you can create an actor for your data set at each time point and may then be able to render at hundreds of frames per second. Even volume rendering can be achievable at this high frame rate if you have a good GPU with lots of memory.

In contrast, if it took too much time or memory to run all the processing on all time points in advance, then you could just load, process, and render when the user changes the current time point.

In terms of file format, VTI is not a popular format for images (it is only used by VTK), so if you have 4D data, I would recommend using the NRRD file format. It is a very simple format, but it allows you to clearly specify the dimensions of 4D data sets. For example, with compression turned off, you can typically load a 4D cardiac CT (10 frames, about 2 GB of data in total) in about 10-15 seconds.


Thank you for your answer.

It is 3D data and is also available in NetCDF and Amira formats. I think I do not understand the following:

“If you have plenty of memory then you would have an actor for your data set at each time point and then you may be able to render at hundreds of frames per second. Even volume rendering can be achievable at this high frame rate if you have a good GPU with lots of memory.”

How could I accomplish having an actor for each time step? Can I somehow load, say, 10 images into a buffer and, if the user inspects later time steps, swap the images in the buffer in a FIFO way?

“In contrast, if it took too much time or memory to run all the processing on all time points in advance, then you could just load, process, and render when the user changes the current time point.”

It takes some time to load the file; that is exactly what I want to change with some kind of buffer for a few time steps.

I am sorry if I did not understand your answer, I am not very experienced in VTK!

For each time step, you can run the reading and processing pipeline, grab the resulting data object, set it in an actor, add it to the renderer, hide it, and save that actor in a structure. To replay the sequence, you would hide the current actor and show the next one in a loop. Since all the processing has already been done and all the graphics resources have already been allocated, uploaded to the GPU, etc., the switch between time points will be very quick.
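
A minimal sketch of that idea, assuming .vti input files and a simple surface-extraction step as a stand-in for your actual processing pipeline:

#include <vtkSmartPointer.h>
#include <vtkXMLImageDataReader.h>
#include <vtkDataSetSurfaceFilter.h>
#include <vtkPolyData.h>
#include <vtkPolyDataMapper.h>
#include <vtkActor.h>
#include <vtkRenderer.h>
#include <vector>
#include <string>

// Build one hidden actor per time step up front, then replay by toggling visibility.
std::vector<vtkSmartPointer<vtkActor>> BuildTimeStepActors(
  const std::vector<std::string>& fileNames, vtkRenderer* renderer)
{
  std::vector<vtkSmartPointer<vtkActor>> actors;
  for (const std::string& fileName : fileNames)
  {
    auto reader = vtkSmartPointer<vtkXMLImageDataReader>::New();
    reader->SetFileName(fileName.c_str());

    auto surface = vtkSmartPointer<vtkDataSetSurfaceFilter>::New();
    surface->SetInputConnection(reader->GetOutputPort());
    surface->Update(); // run the whole pipeline for this time step now

    auto mapper = vtkSmartPointer<vtkPolyDataMapper>::New();
    mapper->SetInputData(surface->GetOutput()); // each actor keeps its own data

    auto actor = vtkSmartPointer<vtkActor>::New();
    actor->SetMapper(mapper);
    actor->VisibilityOff(); // keep hidden until this time step is shown

    renderer->AddActor(actor);
    actors.push_back(actor);
  }
  return actors;
}

// Replay: hide the previous time step and show the current one.
void ShowTimeStep(const std::vector<vtkSmartPointer<vtkActor>>& actors, int previous, int current)
{
  actors[previous]->VisibilityOff();
  actors[current]->VisibilityOn();
}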

Since this approach loads and processes all the images, it needs time to complete. To improve the user experience, you could display the first time point immediately and process the other frames in the background while the user is busy looking at the ones that are already loaded. Or you could store a 4x4x4 downsampled version of the data set that can be loaded about 50x faster (in a few seconds), and while the user looks at this somewhat blurred sequence you load the full-resolution data in the background.
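
As a rough sketch of producing such a downsampled preview (note that VTK's imaging filters generally operate on the active scalar array; file names are placeholders):

#include <vtkSmartPointer.h>
#include <vtkXMLImageDataReader.h>
#include <vtkImageShrink3D.h>
#include <vtkXMLImageDataWriter.h>

int main()
{
  auto reader = vtkSmartPointer<vtkXMLImageDataReader>::New();
  reader->SetFileName("halfcylinder01.00.vti");

  auto shrink = vtkSmartPointer<vtkImageShrink3D>::New();
  shrink->SetInputConnection(reader->GetOutputPort());
  shrink->SetShrinkFactors(4, 4, 4); // keep every 4th sample in each direction
  shrink->AveragingOn();             // average the skipped samples instead of dropping them

  auto writer = vtkSmartPointer<vtkXMLImageDataWriter>::New();
  writer->SetInputConnection(shrink->GetOutputPort());
  writer->SetFileName("halfcylinder01.00_preview.vti");
  writer->Write();
  return 0;
}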

Amira is a proprietary format, so I would not use it for archiving. NetCDF is just a container, not a concrete 4D image format (for comparison, MINC2 is a concrete file format that uses HDF5 as its container), so no software other than yours would be able to use the files. It could make sense if you want to store the full-resolution and the downsampled arrays in the same file, but then I would consider using the more modern OME-Zarr format, which is as flexible as NetCDF (it is based on Zarr) but is a concrete file format that more and more medical image computing software supports.

I will try that. Is there some kind of array to store actors in?

This data structure is specific to your application. For example, it can be an array of structures, where each structure contains the actor along with some additional metadata. At minimum, the metadata is the time point value, but if you have multiple views then you will also need to store the input of the mapper and, for each view, an actor (and for convenient access, maybe the mapper too).
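
As a sketch, such a record could look like this (the exact fields depend on your application and the number of views):

#include <vtkSmartPointer.h>
#include <vtkDataObject.h>
#include <vtkPolyDataMapper.h>
#include <vtkActor.h>
#include <vector>

struct TimeStepEntry
{
  double timeValue = 0.0;                                   // time point this entry belongs to
  vtkSmartPointer<vtkDataObject> processedData;             // output of the processing pipeline (input of the mappers)
  std::vector<vtkSmartPointer<vtkPolyDataMapper>> mappers;  // one per view (optional, for convenient access)
  std::vector<vtkSmartPointer<vtkActor>> actors;            // one per view
};

// the whole sequence is then just a vector of these entries
using TimeSequence = std::vector<TimeStepEntry>;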

These things get complicated by the time you implement a basic feature set. So, I always recommend not creating yet another application from scratch, but finding a VTK-based open-source application that does something similar to what you need and customizing/extending that. Tens of millions of dollars were spent implementing applications such as ParaView and 3D Slicer, which you can use completely free of charge, without any restrictions, and they already take care of most of the things you need. They are general-purpose frameworks, so you always need to do some customization to get optimal performance and user experience for a specific workflow, but that is still much less work to implement and maintain than redeveloping everything from plain VTK.

The problem is that I need it to be in VTK because it is an ongoing project…
I have already implemented the different filters like glyphing, streamlines and LIC, which work fine and fast on the vtkImageData, and if no filter is activated it shows the polydata.
The documentation is just hard to understand.
For example, couldn't I read the different .vti files and store the image data in the vtkImageCacheFilter, and then for each time step load the image data out of the cache and run it through the geometry filter to see the polydata?
Is there no rudimentary way to implement this, perhaps with the use of vtkTemporalDataSetCache? It just seems like a problem everyone who views time-dependent data with VTK should run into.

Thank you for your time!

VTK has some classes to help with time sequences, but when I tried to use them while implementing universal 4D data support in 3D Slicer (images, transforms, meshes, markups, cameras, etc.), I found them overly complicated (for what they provide) and incomplete. This infrastructure was probably developed for ParaView and is only used there.

vtkImageCacheFilter was introduced in 1999, and there is no sign of anyone using this class, or even mentioning it, after 2004.

ParaView uses vtkTemporalDataSetCache and some other temporal processing classes. The features all seem to be focused on meshes, and I'm not sure how useful they are for images. You probably need to dig into these classes some more and/or seek help from ParaView developers (maybe on the ParaView forum).

ParaView and 3D Slicer (and MITK, etc.) are built on VTK, too, so it should not be hard to bring over anything from your existing project. What you gain is that you don't need to develop and maintain all the basic features (data import/export, interactive visualization in multiple views, time sequence support, etc.); you can rely on the application platform to take care of those and focus on the actual problem you want to solve. I know this is often a hard sell: it can be difficult to let go of things that took a lot of effort to develop, and even if you know what's right, convincing your supervisors might not be easy. Let us know if we can help with any of this.


So switching to ParaView is sadly not an option.
If I switch to .vtk files and unstructured grids, would there be an option to cache them for different time steps?
For example, I tried to load the time steps with the unstructured grid reader and then save the output as data objects in a vtkDataObjectCollection, like this:

void savetimestep(const vtkSmartPointer<vtkUnstructuredGridReader>& unstructuredGridReader)
{
  unstructuredGridReader->SetFileName(fileString.c_str());
  unstructuredGridReader->Update();
  objectBuffer(unstructuredGridReader->GetOutputDataObject(0));
}

void objectBuffer(const vtkSmartPointer<vtkDataObject>& dataObject_)
{
  objColl->AddItem(dataObject_);
}

But it seems that all entries in my list point to the same object, which also makes sense.
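
I guess the reader reuses its single output object, so maybe I should deep-copy the output before buffering it? Something like this (just a sketch, I am not sure it is the right way):

#include <vtkSmartPointer.h>
#include <vtkUnstructuredGrid.h>
#include <vtkUnstructuredGridReader.h>
#include <vtkDataObjectCollection.h>
#include <string>

// copy the reader output into a new grid so each buffered time step keeps
// its own data instead of pointing at the reader's reused output
void savetimestep(vtkUnstructuredGridReader* reader, const std::string& fileName,
                  vtkDataObjectCollection* objColl)
{
  reader->SetFileName(fileName.c_str());
  reader->Update();

  auto copy = vtkSmartPointer<vtkUnstructuredGrid>::New();
  copy->DeepCopy(reader->GetOutput());
  objColl->AddItem(copy);
}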

What I want to do is something like this: activate one of my different filters, like glyph, streamline or LIC, then play the sequence and see how it changes over time.

The different filters work fine on the unstructured grid; it is just that the loading time for a .vtk file is too long.

Is there some kind of array of unstructured grids?

Or am I going in circles now?

Greetings and thank you, I really appreciate your help!

If you need the fastest possible speed, you can run the pipeline for all time points (potentially in background threads) and add all the resulting actors to the renderer. When you replay the time sequence, you just switch the visibility of the actors on and off. I described this and the alternative options above; I don't have anything more to add to this discussion.


Thank you for your help!
I think I found a very different way of solving this problem, using MPI. It might not be the cleanest solution, but if I do it with VTK's MPI classes I kind of have my buffer and the needed speed. Although I will have some idle time and the workload will not be divided exactly evenly, I hope that this will be enough for me.

I hope you can assist me with your expertise one more time!

As I said, I used MPI to read the data faster, and then I send the data to the root process, like this:

vtkSmartPointer<vtkUnstructuredGridReader> gridReaderArray[27];
for (int i = 0; i < 27; i++)
{
  gridReaderArray[i] = vtkSmartPointer<vtkUnstructuredGridReader>::New();
}
// reading happens here
// ... and afterwards the data is sent to the root process
mpiComm->Send(gridReaderArray[timestep]->GetOutput(), root, tagNumber);

On the root process I save the received data as unstructured grids:

vtkSmartPointer<vtkUnstructuredGrid> arrayGrid[151];

for (int i = 0; i < 81; i++)
{
  arrayGrid[i] = vtkSmartPointer<vtkUnstructuredGrid>::New();
  mpiComm->Receive(arrayGrid[i], recFrom, tagNumber);
}

With the data set surface filter (which was faster than the geometry filter) I now get a much faster result.
Is there any way I can allocate the needed space for the array at runtime?
I mean the same way I would in C++ with malloc, if I do not want a constant as in arrayGrid[151]?
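
What I mean is something like this (just a sketch of the idea, using std::vector instead of malloc):

#include <vtkSmartPointer.h>
#include <vtkUnstructuredGrid.h>
#include <vector>

// dynamically sized buffer instead of the fixed arrayGrid[151]
std::vector<vtkSmartPointer<vtkUnstructuredGrid>> arrayGrid;

void allocateGrids(int numberOfTimeSteps)
{
  arrayGrid.resize(numberOfTimeSteps);
  for (int i = 0; i < numberOfTimeSteps; ++i)
  {
    arrayGrid[i] = vtkSmartPointer<vtkUnstructuredGrid>::New();
  }
}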

There is also a fast mode on the geometry filter and the surface filter, but I cannot activate it. I tried:

geometryFilter->SetInputData(arrayGrid[timestep]);
geometryFilter->FastModeOn();
geometryFilter->Update();

The same happens for the surface filter; the compiler says it is not a member of that class.
I am also kind of unsure whether I even used the smart pointers correctly to build the grid array.
After I saved the time steps in my grid array, I tried to load them into actors to speed things up as you said, but I failed horribly. You said above something about saving the mapper too. So if I try to, e.g., use a glyph filter on the input,
would I do it like this:

glyphMask->SetInputData(arrayGrid[timestep]);
glyphMask->SetOnRatio(500);
glyphMask->Update();
// select the vector array on the mask output after it has been generated
glyphMask->GetOutput()->GetPointData()->SetActiveVectors("Vectors");
glyph->SetInputConnection(glyphMask->GetOutputPort());
glyph->SetSourceConnection(glyphArrow->GetOutputPort());
glyph->ScalingOn();
glyph->SetVectorModeToUseVector();
glyph->SetColorModeToColorByVector();
glyph->SetScaleFactor(0.4);
glyph->OrientOn();
glyph->Update();
glyphMapper->SetInputConnection(glyph->GetOutputPort());
glyphActor[timestep]->SetMapper(glyphMapper);
renderer->AddActor(glyphActor[timestep]); // add the actor for this time step
setActorVisibility(renderer, renWin);     // sets all other actors invisible
glyphActor[timestep]->SetVisibility(1);

And after one loop over all time steps I could just set the visibilities? That is what I tried. Or do I need to make an array of mappers as well? I am unsure whether I am doing that right.

Greetings Tassilo