Just a quick status update. I got the full filter framework running: threading, templates, array dispatch, and a new data structure / algorithm design (to eliminate memory fragmentation and enable threading). There are also optimization paths for the different types of smoothing (simple, boundary, feature, nonmanifold).
(I’m probably going to be sorry that I am saying the following, but here goes…) I am seeing a ~20x speedup on my machine at this point (Bones.vtk, my desktop 40-thread machine, highest level of algorithm optimization - no boundary, feature, or nonmanifold edges). I expect the speeds will likely drop as we clean this up, but I am now seeing the time go from ~4.5 secs to ~0.225 secs. Of course, all it takes is one bug to change everything, so don’t count on this (yet) - but I am hopeful.
This is an interesting algorithm to work on because threading, algorithm/data structure redesign, and templating/dispatch each make a big difference. I wish I could say which is most important, but it’s hard to separate the effects from one another. For example, while threading is important, because I had to use an array of std::atomic, thread scalability likely suffers. And there was a tremendous amount of GetPoint/SetPoint use during iteration, so dispatch really cut out a lot of data access overhead. The original algorithm was decidedly serial and used excessive amounts of fragmented memory.
Anyway, I need to come up for air for a little while, as next week is very demanding. I’ll keep poking at the algorithm when I can and push an MR in a week or two for you to start testing. I also want to do a little benchmarking etc. moving forward.