Here is a related Stack Overflow question:
Please look at the second answer.
Here is the current implementation in VTK:
https://gitlab.kitware.com/vtk/vtk/-/blob/master/Common/Core/vtkMath.cxx#L1610
Thank you for the answer
In general, source code needs to be optimized for readability first. If you can pinpoint a performance bottleneck, then performance optimization may be reasonable. It usually makes the code less readable, but if the improvement is perceptible, it may be worth it. Therefore, to answer this question we would need to see:
Very often, “clever” code runs faster in some environments and slower in others, as seems to be the case here, too (see some of the answers in the Stack Overflow discussion above). So, evaluating the performance impact can take a lot of effort. In the end, you may need to add a switch that allows selecting the best implementation for a specific hardware/software environment (see, for example, the Optimization flag of vtkImageReslice), which further complicates everything.
For all these reasons, performance optimization is rarely, if ever, driven by the availability of clever algorithms, but rather by performance profiling of important real-world use cases. Performance profiling will also help in determining whether improving matrix multiplication speed is really your best option. Most likely, you will find that you can achieve much better performance for that particular use case by avoiding those matrix multiplications altogether through caching, reorganizing the code, etc.
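To illustrate the caching point, here is a minimal sketch. It assumes a hypothetical row-major `Mat4` type and made-up helper names (`Multiply`, `Apply`, `TransformNaive`, `TransformCached`); none of these are VTK APIs. The idea is that hoisting a loop-invariant matrix product out of the loop removes almost all of the multiplications, without touching the readable triple-loop implementation.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical types for illustration only (not VTK's): a row-major 4x4
// matrix and a homogeneous point.
using Mat4 = std::array<double, 16>;
using Vec4 = std::array<double, 4>;

// Plain, readable triple-loop product c = a * b.
Mat4 Multiply(const Mat4& a, const Mat4& b)
{
  Mat4 c{}; // value-initialized to zeros
  for (int i = 0; i < 4; ++i)
    for (int j = 0; j < 4; ++j)
      for (int k = 0; k < 4; ++k)
        c[4 * i + j] += a[4 * i + k] * b[4 * k + j];
  return c;
}

// Matrix-vector product r = m * p.
Vec4 Apply(const Mat4& m, const Vec4& p)
{
  Vec4 r{};
  for (int i = 0; i < 4; ++i)
    for (int k = 0; k < 4; ++k)
      r[i] += m[4 * i + k] * p[k];
  return r;
}

// Naive version: recomputes the same a * b product for every point.
std::vector<Vec4> TransformNaive(const Mat4& a, const Mat4& b,
                                 const std::vector<Vec4>& pts)
{
  std::vector<Vec4> out;
  out.reserve(pts.size());
  for (const Vec4& p : pts)
    out.push_back(Apply(Multiply(a, b), p));
  return out;
}

// Cached version: a * b does not depend on the point, so compute it once.
std::vector<Vec4> TransformCached(const Mat4& a, const Mat4& b,
                                  const std::vector<Vec4>& pts)
{
  const Mat4 ab = Multiply(a, b);
  std::vector<Vec4> out;
  out.reserve(pts.size());
  for (const Vec4& p : pts)
    out.push_back(Apply(ab, p));
  assert(out.size() == pts.size());
  return out;
}
```

Both functions return the same result, but the cached one performs a single 4x4 product regardless of the number of points. Whether such a pattern exists in a given workload is exactly what profiling would have to show.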
+1 to Andras’s comments.
My experience suggests that 1) redesigning algorithms, 2) avoiding excessive new/delete, 3) designing efficient APIs, and/or 4) threading routinely produce 5-100x performance gains. Low-level optimization typically produces very modest gains (which may be worth it for important workflows, as Andras suggests), at the expense of code complexity. And when we are talking about millions of LOC, code complexity is a many-headed medusa that I’d rather not face.
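As a rough sketch of points 2) and 3), assuming two hypothetical functions (`ScaledCopy` and `ScaleInto`, not taken from VTK): an API that returns a fresh container allocates on every call, while one that writes into a caller-owned buffer lets the caller reuse the same storage across calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convenient but allocation-heavy: every call creates a new vector.
std::vector<double> ScaledCopy(const std::vector<double>& in, double factor)
{
  std::vector<double> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = in[i] * factor;
  return out;
}

// Reuse-friendly: the caller owns `out`. Once `out` has enough capacity,
// resize() does not reallocate, so repeated calls reuse the same storage.
void ScaleInto(const std::vector<double>& in, double factor,
               std::vector<double>& out)
{
  out.resize(in.size());
  assert(out.size() == in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = in[i] * factor;
}
```

In a loop that calls this many times with same-sized inputs, `ScaleInto` allocates once on the first call and never again. How much that matters in practice depends on the workload and the allocator, so it is still something to confirm by profiling.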