VTKSMP stdthread performance

Thanks for doing the check. I looked through the VTK codebase and there are around a dozen other classes where this GetStorage issue is likely to be severely hurting STDThread performance.

I wonder how TBB implements its thread-local storage. With the TBB backend, calling Local() does not seem to slow things down at all (or at least, not significantly). TBB (i.e. oneTBB) is open source, but I’ve never looked through their code.
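
For anyone following along, here is a rough sketch of the kind of change I have in mind for those classes. It is not VTK code: ToyThreadLocal, AccumulatePerElement, AccumulateHoisted and RunChunks are hypothetical names, and the toy Local() simply assumes the cost comes from a locked lookup keyed on the thread id on every call, which may not match what the STDThread backend really does. The point is only the difference between calling Local() per element and calling it once per chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a thread-local container whose Local() performs
// a locked lookup keyed on the calling thread's id. NOT VTK's implementation.
template <typename T>
class ToyThreadLocal
{
public:
  T& Local()
  {
    std::lock_guard<std::mutex> lock(this->Mutex);
    // operator[] value-initializes the entry on first use; references into
    // an unordered_map stay valid when other entries are inserted.
    return this->Storage[std::this_thread::get_id()];
  }

  // Combine the per-thread values. Call only after all workers have joined.
  T Sum()
  {
    std::lock_guard<std::mutex> lock(this->Mutex);
    T total{};
    for (auto& entry : this->Storage)
    {
      total += entry.second;
    }
    return total;
  }

private:
  std::mutex Mutex;
  std::unordered_map<std::thread::id, T> Storage;
};

// Slow pattern: one Local() lookup (lock + hash) per element.
void AccumulatePerElement(ToyThreadLocal<long long>& tl, const std::vector<int>& data,
  std::size_t begin, std::size_t end)
{
  for (std::size_t i = begin; i < end; ++i)
  {
    tl.Local() += data[i];
  }
}

// Faster pattern: one Local() lookup per chunk, then reuse the reference.
void AccumulateHoisted(ToyThreadLocal<long long>& tl, const std::vector<int>& data,
  std::size_t begin, std::size_t end)
{
  long long& local = tl.Local();
  for (std::size_t i = begin; i < end; ++i)
  {
    local += data[i];
  }
}

// Split data into numThreads contiguous chunks, run kernel on each chunk in
// its own std::thread, and return the combined per-thread sums.
template <typename Kernel>
long long RunChunks(Kernel kernel, const std::vector<int>& data, unsigned numThreads)
{
  assert(numThreads > 0);
  ToyThreadLocal<long long> tl;
  std::vector<std::thread> workers;
  const std::size_t chunk = (data.size() + numThreads - 1) / numThreads;
  for (unsigned t = 0; t < numThreads; ++t)
  {
    const std::size_t begin = std::min(data.size(), t * chunk);
    const std::size_t end = std::min(data.size(), begin + chunk);
    workers.emplace_back([&tl, &data, kernel, begin, end] { kernel(tl, data, begin, end); });
  }
  for (auto& w : workers)
  {
    w.join();
  }
  return tl.Sum();
}
```

Both kernels give the same result (e.g. RunChunks(AccumulateHoisted, data, 4)), so the hoisted version should be a drop-in change wherever the loop body does not need a fresh lookup. How much it helps in practice would have to be measured per class.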