There are two common causes for vtkImageReslice to be slower than it should be:
1. Some people only need to extract a single slice from the volume, but write their pipeline such that the OutputExtent of vtkImageReslice is a 3D extent. Then, after vtkImageReslice has produced a full 3D volume, they display just one slice of that output volume. All the extra work of reslicing the full volume instead of just extracting one slice can make the filter take e.g. 100x longer than it should. I don't know whether this fits your situation; you would have to show me your code.
2. If you let the VTK pipeline automatically update the filter or reader that precedes vtkImageReslice, instead of calling Update() manually on that reader/filter, then a special feature of the pipeline called "streaming" can automatically be engaged. This can cause the upstream pipeline to be re-executed every time that vtkImageReslice itself is updated, resulting in a slowdown.
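As a rough illustration of both points, here is a minimal sketch of how the reslice filter could be set up to produce a single 2D slice, with the upstream reader updated explicitly beforehand. The helper name `prepare_single_slice` and its arguments are hypothetical, and the sagittal axes are just one possible choice; the VTK methods it calls (`Update`, `GetOutputPort`, `SetInputConnection`, `SetOutputDimensionality`, `SetResliceAxesDirectionCosines`, `SetResliceAxesOrigin`, `SetInterpolationModeToLinear`) are the standard ones, but I can't say how closely this matches your pipeline.

```python
def prepare_single_slice(source, reslice, origin):
    """Configure `reslice` (a vtkImageReslice) for one sagittal slice.

    `source` is the reader or filter that feeds vtkImageReslice.
    `origin` is a point (x, y, z), in the input volume's coordinates,
    that the slice should pass through. Both names are hypothetical.
    """
    # Bring the upstream reader/filter up to date explicitly, instead of
    # leaving that to the pipeline when the reslice filter is updated.
    source.Update()
    reslice.SetInputConnection(source.GetOutputPort())

    # Request a 2D output so that only one slice is computed,
    # rather than a full 3D output extent.
    reslice.SetOutputDimensionality(2)

    # Assumed sagittal orientation: output x runs along volume y,
    # output y along volume z, and the slice normal along volume x.
    reslice.SetResliceAxesDirectionCosines(0, 1, 0,
                                           0, 0, 1,
                                           1, 0, 0)
    reslice.SetResliceAxesOrigin(origin[0], origin[1], origin[2])
    reslice.SetInterpolationModeToLinear()
    return reslice
```

With the Python bindings this might be called as `prepare_single_slice(reader, vtk.vtkImageReslice(), center)`, where `reader` and `center` come from your own code. To move to another slice, only `SetResliceAxesOrigin` needs to be called again before the next update.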
A further consideration is the way that the CPU cache handles different memory access patterns. This doesn't impact performance as much as the two items described above, but if you benchmark sagittal and coronal slicing separately, it is normal to see that sagittal slicing is 2 or 3 times slower than coronal. You should check whether this is the case with your code. Note that speed issues related to the cache are expected, and there isn't much to be done about them unless you want to do cache-specific profiling and optimization.
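To see why the access pattern differs, here is a small sketch in plain Python (no VTK needed). It relies on the fact that VTK image data is stored with x varying fastest, then y, then z. The 2-byte voxel size and the 64-byte cache line are assumptions for illustration, and the model ignores interpolation, which touches neighbouring voxels as well.

```python
def voxel_offset(x, y, z, dims):
    """Linear offset (in voxels) of voxel (x, y, z); x varies fastest."""
    nx, ny, _ = dims
    return x + nx * (y + ny * z)


def inner_stride(orientation, dims):
    """Offset between neighbouring samples along the slice's first axis."""
    base = voxel_offset(0, 0, 0, dims)
    if orientation == "sagittal":
        # x is fixed, so the slice's first axis walks along volume y.
        return voxel_offset(0, 1, 0, dims) - base
    if orientation in ("coronal", "axial"):
        # The slice's first axis walks along volume x.
        return voxel_offset(1, 0, 0, dims) - base
    raise ValueError("unknown orientation: " + orientation)


def voxels_used_per_cache_line(stride, bytes_per_voxel=2, line_bytes=64):
    """How many voxels of one fetched cache line the slice actually needs.

    bytes_per_voxel and line_bytes are assumed values, not measured ones.
    """
    voxels_per_line = line_bytes // bytes_per_voxel
    return max(1, voxels_per_line // stride)


if __name__ == "__main__":
    dims = (512, 512, 297)  # volume size mentioned in this thread
    for orientation in ("sagittal", "coronal"):
        stride = inner_stride(orientation, dims)
        used = voxels_used_per_cache_line(stride)
        print(orientation, "stride:", stride, "voxels used per line:", used)
```

For the 512x512x297 volume this gives a stride of 512 voxels for sagittal (one useful voxel per cache line under the assumptions above) and a stride of 1 for coronal (all 32 voxels of each line are used). The real slowdown is smaller than that ratio suggests because hardware prefetching and the interpolation arithmetic also play a part.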
Finally, note that vtkImageReslice is multithreaded but uses only the CPU, so it cannot be expected to come anywhere close to the performance of GPU interpolation. In fact, 5 ms to extract a 512x297 sagittal slice from a 512x512x297 volume seems to be in the right ballpark for an i3 CPU.