Adding a couple of information keys to enable temporal filters to run in situ

Yohann_Bearzi · May 2, 2023, 2:44pm

There are currently problems when running some temporal filters in situ. Since the entire time series is not available up front, temporal filters need to be able to produce a correct output for the time steps available so far, and then to be run on upcoming time steps using some internal cache. An example of a filter failing in situ is vtkTemporalStatistics.

However, for performance reasons, we should still be able to run a filter on the entire time series without having to generate intermediate states, typically when using ParaView without Catalyst.

As a consequence, in filters for which it applies, there should be 2 modes: a regular mode (compute all the time steps at once), and an in situ mode (access to incomplete temporal data). In the regular mode, one can just call Update() to get the output. In situ, one needs to call UpdateTimeStep(double t) for each time step.
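To illustrate the two modes, here is a minimal sketch in plain Python. It is a toy model, not the VTK API: the class `RunningMean` and its methods `update` and `update_time_step` are hypothetical stand-ins for a temporal statistics filter, for Update() and for UpdateTimeStep(double t).

```python
class RunningMean:
    """Toy stand-in for a temporal statistics filter (hypothetical, not VTK)."""

    def __init__(self):
        # Internal cache: enough state to resume when a new time step arrives.
        self.count = 0
        self.total = 0.0

    def update(self, values):
        # Regular mode: the whole time series is available, compute in one pass.
        self.count = len(values)
        self.total = float(sum(values))
        return self.total / self.count

    def update_time_step(self, value):
        # In situ mode: fold one new time step into the cached state and
        # return a result that is correct for the steps seen so far.
        self.count += 1
        self.total += value
        return self.total / self.count


series = [1.0, 2.0, 6.0]

# Regular mode: one call on the full series.
batch_mean = RunningMean().update(series)

# In situ mode: one call per time step, each producing a valid partial result.
in_situ = RunningMean()
partial_means = [in_situ.update_time_step(v) for v in series]
```

The last partial result matches the batch result, while the regular mode never has to materialize the intermediate states.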

Ideally, temporal filters shouldn’t need an extra parameter, so that users cannot produce wrong outputs because they forgot to turn ON the in situ mode inside a filter that needs it. Instead, I propose that we add a new information key that should be set on the source output information in situ. The key would be passed downstream, and all filters that need to change their behavior would know when to do so. We could call it INCOMPLETE_TIME_STEPS. When set to true, filters for which it applies can cache whatever data they need in order to produce a complete dataset for each time step. An implementation is proposed here.
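The propagation idea can be sketched with a toy pipeline in plain Python. This is only an illustration under assumptions: `Stage`, `output_info`, `mode` and `build_chain` are hypothetical names, and a dict stands in for the output information object. It does not reflect how the VTK executive actually copies keys.

```python
INCOMPLETE_TIME_STEPS = "INCOMPLETE_TIME_STEPS"


class Stage:
    """Toy pipeline stage (hypothetical): a source if it has no upstream."""

    def __init__(self, upstream=None, own_info=None):
        self.upstream = upstream
        self.own_info = dict(own_info or {})

    def output_info(self):
        # Start from whatever the upstream stage advertises, then add our own
        # keys, so anything set on the source reaches every downstream stage.
        info = self.upstream.output_info() if self.upstream else {}
        info.update(self.own_info)
        return info

    def mode(self):
        # A filter picks its behavior from the key, not from a user parameter.
        if self.output_info().get(INCOMPLETE_TIME_STEPS):
            return "in situ"
        return "regular"


def build_chain(in_situ):
    # source -> intermediate filter -> temporal filter
    source_info = {INCOMPLETE_TIME_STEPS: True} if in_situ else None
    source = Stage(own_info=source_info)
    intermediate = Stage(upstream=source)
    return Stage(upstream=intermediate)
```

With this layout, `build_chain(True).mode()` reports the in situ mode for the last filter even though only the source set the key, which is the point of not exposing a per-filter switch.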

This key would be automatically set on all sources when using ParaView Catalyst (see implementation here).

An issue arises when INCOMPLETE_TIME_STEPS is set: the filter doesn’t know when to reinitialize its cache. The temporal cache held by the affected filters needs to be wiped when the user wants to rerun all the time steps, for example after changing a parameter in the visualization. There is currently no way to tell whether a filter is being updated to add a new time step or to recompute from scratch. To provide this context, I propose adding a new information key that gets set automatically by the executive: WIPE_ALGORITHM_CACHE. It is set to 1 when any upstream filter has called Modified(), and it gets propagated downstream by the pipeline. When an affected temporal filter sees it set, it knows to wipe its temporal cache. You can find how it would be implemented and how to use the key here.
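Here is a rough sketch of the wipe mechanism in plain Python. Everything in it is hypothetical and simplified: `Source`, `CachingFilter` and `Executive` are toy classes, a dict stands in for the request information, and a modification counter stands in for the modification time. It only shows the intended behavior, not the linked implementation.

```python
WIPE_ALGORITHM_CACHE = "WIPE_ALGORITHM_CACHE"


class Source:
    """Toy upstream object with a modification counter (hypothetical)."""

    def __init__(self):
        self.mtime = 0

    def modified(self):
        self.mtime += 1


class CachingFilter:
    """Toy temporal filter that accumulates the time steps it has seen."""

    def __init__(self):
        self.cache = []

    def request_data(self, request, t):
        # Wipe the temporal cache when the executive says upstream changed.
        if request.get(WIPE_ALGORITHM_CACHE):
            self.cache.clear()
        self.cache.append(t)
        return list(self.cache)


class Executive:
    """Toy executive: sets the wipe key when upstream was modified."""

    def __init__(self, source, filt):
        self.source = source
        self.filt = filt
        self.last_mtime = source.mtime

    def update_time_step(self, t):
        changed = self.source.mtime != self.last_mtime
        self.last_mtime = self.source.mtime
        request = {WIPE_ALGORITHM_CACHE: 1 if changed else 0}
        return self.filt.request_data(request, t)


source = Source()
executive = Executive(source, CachingFilter())
first = executive.update_time_step(0.0)    # new step, cache grows
second = executive.update_time_step(1.0)   # new step, cache grows
source.modified()                          # user changes a parameter
restart = executive.update_time_step(0.0)  # cache wiped, rerun from scratch
```

The filter itself never inspects upstream objects: it only reacts to the key, so the decision of when to wipe stays in the executive.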

Does anyone have any comments or suggestions?

mwestphal · May 2, 2023, 2:52pm

@nicolas.vuaille @Francois_Mazen @Timothee_Couble @lgivord

danlipsa · May 2, 2023, 2:56pm

To me, WIPE_ALGORITHM_CACHE is too generic. I would rename that and link it to the first key. So have something like the following instead:

WIPE_INCOMPLETE_TIME_STEPS_CACHE

Yohann_Bearzi · May 2, 2023, 3:04pm

Good point. Now that I think about it, this key as I implemented it so far always gets set when an upstream filter called Modified(), even when INCOMPLETE_TIME_STEPS is not set.

So maybe I should either set it only when INCOMPLETE_TIME_STEPS is set and rename it WIPE_INCOMPLETE_TIME_STEPS_CACHE, or leave the code as is and rename it UPSTREAM_CALLED_MODIFIED, so that this key is available if someone needs this information for another reason?

danlipsa · May 2, 2023, 3:18pm

In that case UPSTREAM_CALLED_MODIFIED seems fine. I cannot think of a case where an upstream filter gets its Modified() called and you should not clear the cache.

nicolas.vuaille · May 3, 2023, 8:08am

@jfausty It may be of interest for the Particles Tracking filters

Jens_Munk_Hansen · May 3, 2023, 11:06pm

I’m personally struggling a bit to find a solution for a client that receives data and needs to expose TIME_STEPS downstream. The problem is that it doesn’t know the time range from the start, and further downstream I need a temporal cache with a certain depth. My current solution is a dirty hack that calls a home-made time range update for each Update(). I guess the solution proposed here could be of use to me.
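For what it's worth, the combination described above (a list of time steps that grows as data arrives, plus a temporal cache of fixed depth) can be sketched in plain Python as follows. The class `BoundedTemporalCache` and its methods are hypothetical and have nothing to do with the actual VTK classes. It is only meant to pin down the behavior being discussed.

```python
from collections import deque


class BoundedTemporalCache:
    """Toy receiver state (hypothetical): growing time steps, bounded cache."""

    def __init__(self, depth):
        # All time values received so far, in arrival order.
        self.time_steps = []
        # Only the most recent `depth` datasets are kept; deque drops the
        # oldest entry automatically once maxlen is reached.
        self.cache = deque(maxlen=depth)

    def add(self, t, data):
        self.time_steps.append(t)
        self.cache.append((t, data))

    def time_range(self):
        # The range can only be known up to the latest received step.
        return (self.time_steps[0], self.time_steps[-1])

    def cached_times(self):
        return [t for t, _ in self.cache]


receiver = BoundedTemporalCache(depth=3)
for step in range(5):
    receiver.add(float(step), {"step": step})
```

After five steps, the advertised time steps cover everything received, while only the last three datasets remain available to downstream filters.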