How to properly cleanup vtk to use it across different threads on Windows (python-vtk)

filsilva · August 14, 2021, 7:30pm

Hi All,

For our project we (@devmessias ) need to use timers that are executed in other threads and need to call some vtk functions on them. This works fine on Linux and Mac as the same OpenGL context can be shared across different threads. However on Windows, it seems that this is not allowed and there may be a need to clear the context after using it in a thread, and maybe initialize it correctly in the new thread. So my question is how to properly do this cleanup using python vtk?

Here is an example:

from threading import Timer
import numpy as np
import vtk
import time


class IntervalTimer:
    def __init__(self, seconds, callback, *args, **kwargs):
        """Implements an object with the same behavior as setInterval from JS

        Parameters
        ----------
        seconds : float
            A positive float number. Represents the total amount of
            seconds between each call
        callback : function
            The function to be called
        *args : args
            args to be passed to callback
        **kwargs : kwargs
            kwargs to be passed to callback

        """
        self._timer = None
        self.seconds = seconds
        self.callback = callback
        self.args = args
        self.kwargs = kwargs
        self.is_running = False
        self.start()

    def _run(self):
        self.is_running = False
        self.start()
        self.callback(*self.args, **self.kwargs)

    def start(self):
        """Start the timer"""
        if self.is_running:
            return

        self._timer = Timer(self.seconds, self._run)
        self._timer.daemon = True
        self._timer.start()
        self.is_running = True

    def stop(self):
        """Stop the timer"""
        if self._timer is None:
            return

        self._timer.cancel()
        if self._timer.is_alive():
            self._timer.join()
        self.is_running = False
        self._timer = None


State = {'in_request': False}


def callback(*args, **kwargs):
    print('init callback')
    window2image_filter = kwargs['window2image_filter']
    if State['in_request']:
        return

    State['in_request'] = True
    window2image_filter.Update()
    window2image_filter.Modified()
    vtk_image = window2image_filter.GetOutput()
    vtk_array = vtk_image.GetPointData().GetScalars()
    # num_components = vtk_array.GetNumberOfComponents()

    w, h, _ = vtk_image.GetDimensions()
    np_arr = np.frombuffer(vtk_array, dtype='uint8')
    if np_arr is None:
        State['in_request'] = False
        return

    # do some stuff

    print('callback finished') 
    State['in_request'] = False


colors = vtk.vtkNamedColors()
bkg = map(lambda x: x / 255.0, [26, 51, 102, 255])
colors.SetColor("BkgColor", *bkg)

cylinder = vtk.vtkCylinderSource()
cylinder.SetResolution(8)

cylinderMapper = vtk.vtkPolyDataMapper()
cylinderMapper.SetInputConnection(cylinder.GetOutputPort())


cylinderActor = vtk.vtkActor()
cylinderActor.SetMapper(cylinderMapper)
cylinderActor.GetProperty().SetColor(colors.GetColor3d("Tomato"))


ren = vtk.vtkRenderer()
renWin = vtk.vtkRenderWindow()
renWin.AddRenderer(ren)
iren = vtk.vtkRenderWindowInteractor()
iren.SetRenderWindow(renWin)

ren.AddActor(cylinderActor)
ren.SetBackground(colors.GetColor3d("BkgColor"))
renWin.SetSize(300, 300)
renWin.SetWindowName('CylinderExample')

iren.Initialize()

window2image_filter = vtk.vtkWindowToImageFilter()
window2image_filter.SetInput(renWin)
timer = IntervalTimer(
        1,
        callback,
        *[],
        **{'window2image_filter': window2image_filter}
    )
print('Timer created') 


ren.ResetCamera()
ren.GetActiveCamera().Zoom(1.5)
renWin.Render()
time.sleep(1000)

This works on Linux and Mac but fails on Windows with:

2021-08-14 15:06:15.122 (   1.718s) [                ]vtkWin32OpenGLRenderWin:217    ERR| vtkWin32OpenGLRenderWindow (000002DAF5AC0390): wglMakeCurrent failed in MakeCurrent(), error: The requested resource is in use.

2021-08-14 15:06:15.168 (   1.764s) [                ]vtkWin32OpenGLRenderWin:217    ERR| vtkWin32OpenGLRenderWindow (000002DAF5AC0390): wglMakeCurrent failed in MakeCurrent(), error: The requested resource is in use.

thanks!

devmessias · August 14, 2021, 8:10pm

Note: In our case we can’t initialize the vtk window instance.

iren.Initialize()
iren.Start()

Thus, we can’t use the VTK TimerEvents.

Paulo_Carvalho · August 15, 2021, 12:47pm

Hello, Filipi,

I also have issues in my program with libraries that are not thread-safe. APIs can be thread-unsafe for many reasons: use of static information (e.g. random number generators), access to shared resources, or simply old or ill-written code that overuses global objects. I use mutexes to force the threads to call thread-unsafe functions one at a time. Below is a straightforward C++ example that illustrates the use of a mutex and a lock object. Python certainly has methods and classes to define critical sections in your code:

/** This is in a .cpp file */
#include <mutex>

std::mutex mutexObjectiveFunction;  // a static or global mutex object

// (...)

/** This function is called by multiple threads. */
void doSomethingCoolMT() {

    // A lock object used as a green/red light to control access to
    // thread-unsafe code. std::defer_lock leaves the mutex unlocked for now.
    std::unique_lock<std::mutex> lck(mutexObjectiveFunction, std::defer_lock);

    // (...) thread-safe code

    // Compute FFT (this is a critical section)
    lck.lock();
    spectral::forward(tmp, sum);  // fftw crashes when called simultaneously
    lck.unlock();

    // (...) thread-safe code

    // Compute RFFT (this is also a critical section)
    lck.lock();
    spectral::backward(rfftResult, tmp);  // fftw crashes when called simultaneously
    lck.unlock();

    // (...) thread-safe code
}
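
For readers following along in Python, the same critical-section pattern can be sketched with `threading.Lock`. This is only an illustration: `unsafe_increment` and `counter` are hypothetical stand-ins for a thread-unsafe library call and its shared state, not anything from VTK or fftw.

```python
import threading

# Hypothetical shared state touched by a call that is not thread-safe.
counter = {"value": 0}
lock = threading.Lock()  # module-level lock, like the global std::mutex above


def unsafe_increment():
    # Read-modify-write on shared state: racy if two threads interleave here.
    current = counter["value"]
    counter["value"] = current + 1


def worker(n_iterations):
    for _ in range(n_iterations):
        # Thread-safe code could run here, outside the lock.
        with lock:  # critical section: one thread at a time
            unsafe_increment()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])
```

Using `with lock:` rather than explicit `acquire()`/`release()` calls means the lock is released even if the protected call raises or returns early.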

regards,

Paulo

devmessias · August 15, 2021, 1:43pm

Hi @Paulo_Carvalho, thanks for answering us.

You’re right. We can use a mutex in Python to avoid unsafe calls. But we still don’t know how to swap OpenGL contexts between different threads in Python VTK.

Paulo_Carvalho · August 15, 2021, 2:05pm

So, may I ask what is the reason for multiple OpenGL contexts in the same program? I mean, what kind of effect or feature you need to achieve? Maybe it’s perfectly doable with a single context without having to manage multiple ones. I’ve never needed that. My project opens multiple rendering windows simultaneously without the need for that.

Like one of the answers in your referenced topics:

As long as the OpenGL context is touched from only one thread at a time, you should not run into any problems.

That said, you likely actually need just one OpenGL context and all you need to do is to encase thread-unsafe VTK calls in critical sections. My two cents on this.

take care,

Paulo

devmessias · August 15, 2021, 5:11pm

So, may I ask what is the reason for multiple OpenGL contexts in the same program? I mean, what kind of effect or feature you need to achieve?

I’m currently working on a streaming system for VTK/FURY for my Google Summer of Code project. More info can be found here: Python GSoC - Post #1 - A Stadia-like system for data visualization - demvessias's Blog. The main point is that until now we must have this behavior. But maybe I missed something.

Thanks

filsilva · August 15, 2021, 9:01pm

Thanks @Paulo_Carvalho,

Indeed it would be great if we could use the same OpenGL context in different threads. However, this does not seem to work well on Windows. This is because on Windows only one thread at a time may have a given context current (it does not matter whether the calls are thread-safe).

We have updated the example here to include a mutex lock (just to make sure that this is not caused by a missing mutex) and to remove any further calls to VTK in the main thread. The idea is to demonstrate that if we initialize VTK in the main thread, stop using it there, and then start using it in another thread (like in a timer), VTK on Windows raises the error mentioned above.

from threading import Timer, Lock
import numpy as np
import vtk
import time

mutex = Lock()

class IntervalTimer:
    def __init__(self, seconds, callback, *args, **kwargs):
        """Implements an object with the same behavior as setInterval from JS

        Parameters
        ----------
        seconds : float
            A positive float number. Represents the total amount of
            seconds between each call
        callback : function
            The function to be called
        *args : args
            args to be passed to callback
        **kwargs : kwargs
            kwargs to be passed to callback

        """
        self._timer = None
        self.seconds = seconds
        self.callback = callback
        self.args = args
        self.kwargs = kwargs
        self.is_running = False
        self.start()

    def _run(self):
        self.is_running = False
        self.start()
        self.callback(*self.args, **self.kwargs)

    def start(self):
        """Start the timer"""
        if self.is_running:
            return

        self._timer = Timer(self.seconds, self._run)
        self._timer.daemon = True
        self._timer.start()
        self.is_running = True

    def stop(self):
        """Stop the timer"""
        if self._timer is None:
            return

        self._timer.cancel()
        if self._timer.is_alive():
            self._timer.join()
        self.is_running = False
        self._timer = None


State = {'in_request': False}


def callback(*args, **kwargs):
    # `with` releases the lock on every exit path, including the early
    # returns below, so the next timer tick cannot deadlock on it.
    with mutex:
        print('init callback')
        window2image_filter = kwargs['window2image_filter']
        if State['in_request']:
            return

        State['in_request'] = True
        window2image_filter.Update()
        window2image_filter.Modified()
        vtk_image = window2image_filter.GetOutput()
        vtk_array = vtk_image.GetPointData().GetScalars()
        # num_components = vtk_array.GetNumberOfComponents()

        w, h, _ = vtk_image.GetDimensions()
        np_arr = np.frombuffer(vtk_array, dtype='uint8')
        if np_arr is None:
            State['in_request'] = False
            return

        # do some stuff

        print('callback finished')
        State['in_request'] = False

colors = vtk.vtkNamedColors()
bkg = map(lambda x: x / 255.0, [26, 51, 102, 255])
colors.SetColor("BkgColor", *bkg)

cylinder = vtk.vtkCylinderSource()
cylinder.SetResolution(8)

cylinderMapper = vtk.vtkPolyDataMapper()
cylinderMapper.SetInputConnection(cylinder.GetOutputPort())


cylinderActor = vtk.vtkActor()
cylinderActor.SetMapper(cylinderMapper)
cylinderActor.GetProperty().SetColor(colors.GetColor3d("Tomato"))


ren = vtk.vtkRenderer()
renWin = vtk.vtkRenderWindow()
renWin.SetOffScreenRendering(1)
renWin.AddRenderer(ren)

ren.AddActor(cylinderActor)
ren.SetBackground(colors.GetColor3d("BkgColor"))


window2image_filter = vtk.vtkWindowToImageFilter()
window2image_filter.SetInput(renWin)
ren.ResetCamera()
ren.GetActiveCamera().Zoom(1.5)
renWin.Render()


timer = IntervalTimer(
        1,
        callback,
        *[],
        **{'window2image_filter': window2image_filter}
    )
print('Timer created') 


time.sleep(1000)

We create a vtk window with offscreen rendering and no iren, do some initialization in the Main Thread. Next, we run a timer that will use some vtk functions in a new thread. This works on Linux and Mac, but gives an error on Windows.

In the links posted by @devmessias it seems that there are ways to clean up the context in one thread before using it in another. However, just cleaning it up does not work; we may need to initialize a new context with the same properties (or linked to the previous context). Again, this seems to be a problem only on Windows, because the context is bound to the thread and cannot be accessed from other threads even if you are using a mutex.

dr_ppetrov · August 24, 2021, 9:59am

Just my opinion: I took a look at the link you shared. The architecture is extremely complicated. Also this:
"
Sharing data between process

We want to avoid any kind of unnecessary duplication of data or expensive copy/write actions. We can achieve this economy of computational resources using the multiprocessing module from python.
"
Trust me, you better have unnecessary copies rather than try to do shared memory operations all over the place.

Usually the most time-consuming parts are IO and processing (like a VTK filter). Those operations you can always do in a background worker thread, with a clear continuation on the main thread where the rendering takes place. Since you have a network application, after rendering you need yet another background task to send the data over the wire. Again, rendering is not your bottleneck, and multithreaded rendering is just a big mess, on Windows or not. Your IO, processing and network stack are what should not block; the rest stays blocking on the main thread.

devmessias · August 24, 2021, 1:54pm

Hi @dr_ppetrov . Thanks for your answer.

We are not trying to use two or more threads to improve the rendering and reduce the time. We just want to swap the context between the threads.

We are already avoiding unnecessary copies using SharedMemory and RawArrays. All the heavy stuff happens without copying. But we want to do this not to avoid copies, but to avoid blocking the main thread.

I agree with you: rendering is not our bottleneck. The reason we want to do this is just to have non-blocking behavior inside a Jupyter thread, not to improve the performance.

Here is a small video about this behavior and how it works on a Linux or macOS machine. We can change a VTK instance in the main thread and call other things without blocking.

Solving that on Windows will also help us to solve another set of problems that we are facing today.

Using just a mutex does not work on Windows when we call:

window2image_filter.Update()
window2image_filter.Modified()
vtk_image = window2image_filter.GetOutput()
vtk_array = vtk_image.GetPointData().GetScalars()

Thanks

dr_ppetrov · August 25, 2021, 10:43am

Oh well, you do need some sort of continuation out of the Jupyter thread. I am afraid I have no knowledge of Jupyter. I have only done it using Qt, and I can provide a complete example. Anyhow, as you found out, any call to VTK that triggers an OpenGL call must be on the main thread. For a similar reason, you cannot have a Qt GUI on any thread but the main thread.

devmessias · August 25, 2021, 12:38pm

Hi all. We have talked with @ken-martin from Kitware about this, maybe in the future we can have a feature to deal with that.

First, it seems that we did not describe our problem very well. I’ll describe it better now and provide a temporary workaround that @filsilva and I agreed to use for a while. Maybe this temporary workaround can be useful for someone grappling with a similar problem in the future.

For our WebRTC streaming we have at least two processes A (CLIENT) and B (SERVER).
https://user-images.githubusercontent.com/6979335/121934889-33ff1480-cd1e-11eb-89a4-562fbb953ba4.png

B deals with the video encoding and the web server. B encodes the frame buffer information stored in a shared memory resource created by process A. Thus, all the heavy stuff has no side effects on A. We’re not trying to improve the performance of A.

A deals with all the vtk stuff and other type of commands that a user can request through a command or script to change the vtk instance. Therefore, the vtk rendering shouldn’t block the main thread.

A has at least 3 threads: T1, T2 and T3.

T1 : VTK thread (main thread)

T2: Extracts the frame buffer from the VTK instance and writes into the shared memory resource Image

T3: Deals with the user interaction from B. This is not important for us now, but this thread will also perform vtk calls.
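
As a rough illustration of the shared memory resource that T2 writes and B reads, here is a minimal sketch using the standard `multiprocessing.shared_memory` module with NumPy views. The frame size and the pixel value are made up for the example, and both "sides" run in one process for brevity; in the real system B would attach from another process using the block's name.

```python
import numpy as np
from multiprocessing import shared_memory

# Hypothetical frame size (height, width, RGB); not taken from the thread.
height, width, channels = 4, 6, 3
n_bytes = height * width * channels

# "Process A" side: create the shared block and write a frame into it.
shm_a = shared_memory.SharedMemory(create=True, size=n_bytes)
frame_a = np.ndarray((height, width, channels), dtype=np.uint8, buffer=shm_a.buf)
frame_a[:] = 0
frame_a[1, 2] = (255, 99, 71)  # pretend this pixel came from the renderer

# "Process B" side: attach to the same block by name (no pixel data is copied).
shm_b = shared_memory.SharedMemory(name=shm_a.name)
frame_b = np.ndarray((height, width, channels), dtype=np.uint8, buffer=shm_b.buf)
pixel_seen_by_b = tuple(int(v) for v in frame_b[1, 2])
print(pixel_seen_by_b)

# Drop the NumPy views before closing, then release the block.
del frame_a, frame_b
shm_b.close()
shm_a.close()
shm_a.unlink()
```

Only the block's name has to be passed between the processes; the pixel data itself stays in place.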

The main point here is that we’re not calling renderWindow.Start() on T1, so no VTK window appears and blocks T1. But T2 must run the following code:

```python
# ...
# check if the resource is busy, using a lock or another approach like a global var
window2image_filter.Update()
window2image_filter.Modified()
vtk_image = window2image_filter.GetOutput()
vtk_array = vtk_image.GetPointData().GetScalars()
w, h, _ = vtk_image.GetDimensions()
np_arr = np.frombuffer(vtk_array, dtype='uint8')
# write np_arr into the shared memory
```

This works very well on Linux and macOS; the video shows the behavior on a Linux machine.

Similar behavior is expected when running a regular Python script.

But on Windows we have a problem: how to release the context in T1 and make it current in T2.

In VTK, it seems the only way to release a context on Windows is to call the Clean method of a vtkWin32OpenGLRenderWindow instance.
VTK/vtkWin32OpenGLRenderWindow.cxx at a94916ed1f471f17616630f2caeb6f77918dc5a9 · Kitware/VTK · GitHub

However, this will also invoke CleanUpRenderers. Thus, T2 will not be able to extract the frame buffer from the VTK instance.

Here is our temporary workaround for this problem, using PyOpenGL:

```python
import threading
import time
from threading import Lock

import numpy as np
import vtk
from OpenGL.WGL import wglMakeCurrent

mutex = Lock()


def save_image(window2image_filter):
    with mutex:
        # Swap: make the render window's OpenGL context current in this thread.
        renWin.MakeCurrent()

        window2image_filter.Update()
        window2image_filter.Modified()
        vtk_image = window2image_filter.GetOutput()
        vtk_array = vtk_image.GetPointData().GetScalars()

        w, h, _ = vtk_image.GetDimensions()
        np_arr = np.frombuffer(vtk_array, dtype='uint8')

        print("Max value in buffer: ", np.max(np_arr))


colors = vtk.vtkNamedColors()

cylinder = vtk.vtkCylinderSource()

cylinderMapper = vtk.vtkPolyDataMapper()
cylinderMapper.SetInputConnection(cylinder.GetOutputPort())

cylinderActor = vtk.vtkActor()
cylinderActor.SetMapper(cylinderMapper)
cylinderActor.GetProperty().SetColor(colors.GetColor3d("Tomato"))

ren = vtk.vtkRenderer()
renWin = vtk.vtkRenderWindow()
renWin.SetOffScreenRendering(1)
renWin.AddRenderer(ren)

ren.AddActor(cylinderActor)
ren.SetBackground(colors.GetColor3d("BkgColor"))


window2image_filter = vtk.vtkWindowToImageFilter()
window2image_filter.SetInput(renWin)
ren.ResetCamera()
ren.GetActiveCamera().Zoom(1.5)
renWin.Render()
# Cleanup: release the context from the main thread so another thread can take it.
wglMakeCurrent(renWin.GetGenericDisplayId(), None)

thread_a = threading.Thread(target=save_image, args=(window2image_filter,))
thread_a.start()
time.sleep(10)

# do other stuff here

thread_a.join()
```

We could also use a Queue of functions in Python to avoid the blocking behavior, doing all the work, including the VTK calls, in T2, like this:

https://gist.github.com/devmessias/aade53a647cae602aa2398cd0decba18

vtk_queue_py.py

from queue import Queue
import queue
from threading import Thread
import numpy as np
import time
import vtk
import threading


queue_of_functions = Queue()

This file has been truncated.
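
Since the gist is truncated here, the following is not its content. It is only a minimal sketch of the general "queue of functions" idea, assuming a single worker thread that executes every queued call in order. `fake_render` is a hypothetical stand-in for a VTK call, and the `_STOP` sentinel and the thread name are choices made for this example.

```python
import queue
import threading

# Queue of (function, args) pairs. A single worker thread runs them in order,
# so every call that touches the renderer happens on the same thread.
queue_of_functions = queue.Queue()
results = []
_STOP = object()  # sentinel telling the worker to exit


def worker():
    while True:
        item = queue_of_functions.get()
        if item is _STOP:
            break
        func, args = item
        results.append(func(*args))


def fake_render(frame_id):
    # Hypothetical stand-in for a VTK call such as renWin.Render().
    return (frame_id, threading.current_thread().name)


thread = threading.Thread(target=worker, name="vtk-thread")
thread.start()

# Any thread may enqueue work without waiting for the rendering itself.
for frame_id in range(3):
    queue_of_functions.put((fake_render, (frame_id,)))

queue_of_functions.put(_STOP)
thread.join()
print(results)
```

Because only the worker thread ever runs the queued functions, the OpenGL context never has to move between threads.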

But our project will also be used by people with little programming knowledge. It will be hard to explain to them how this Queue/Thread approach works and how they should use it.

In addition, we want to solve this in order to understand how to solve another set of problems that we’re facing today.

thanks for discussing this problem with us @dr_ppetrov and @Paulo_Carvalho.

dr_ppetrov · September 10, 2021, 2:23pm

Hi Bruno,

I am afraid to say, at least for me, things are not any more clear. For starters, I am looking at the architecture image you shared and I cannot get my head around it: it shows a client and a server together on one host machine yet user-A is on another machine I assume through web-browser.

But what I find more problematic is this:

T1 and T3 must be on the same thread, frankly regardless of the OS, and then T2 is completely unnecessary. You do understand that the function ‘save_image’ is not saving any image, right? It is simply extracting the buffer, which is a plain memory read taking less than 1 ms, so why not simply block on it? It is not like you are doing any processing, IO or networking!?

Looking back at your ‘save_image’: if I were you, I would execute it on the main GUI thread all the way up to this line:
vtk_image = window2image_filter.GetOutput()
Only from that point on can you pass execution to some background thread, assuming vtk_image is allocated on the heap.
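
A minimal sketch of that split, assuming the frame is copied into a plain NumPy array before it leaves the main thread. `grab_frame` and `process_frame` are hypothetical names; in the real code `grab_frame` would wrap the `window2image_filter.Update()` and `GetOutput()` calls.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def grab_frame(frame_id):
    # Hypothetical stand-in for the main-thread part. In the real code this is
    # where window2image_filter.Update() and GetOutput() would run, followed
    # by a copy of the pixels into an array the renderer no longer owns.
    return np.full((2, 2, 3), frame_id, dtype=np.uint8)


def process_frame(frame):
    # Background part: touches only plain NumPy data, never the OpenGL context.
    return int(frame.sum())


with ThreadPoolExecutor(max_workers=1) as pool:
    # The main thread grabs each frame, then hands it to the worker thread.
    futures = [pool.submit(process_frame, grab_frame(i)) for i in range(3)]
    processed = [future.result() for future in futures]

print(processed)
```

Here every OpenGL-related call stays on the main thread, and only the already-extracted data crosses the thread boundary.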

and finally this:

so you are not building an end product? Is it some sort of middleware/framework ?

devmessias · September 10, 2021, 6:36pm

I don’t know how this will be possible without blocking the main thread. We must have a timer to check if the Queue has any element inside and perform the related user interaction.

I know that T2 doesn’t “save an image” and that it takes only a small amount of time, except at high resolutions. But as I said, I am not trying to improve the performance. In addition, Python has the GIL, so there is no reason to use multithreading to improve the performance.

This is just a decision about user experience: we don’t want to block anything. We could avoid threads and use asyncio in Python. With asyncio everything happens on the same thread, but this gives a different behavior. With the coroutine approach, after starting the streaming system any synchronous computation will block the streaming.

stream.start()
while True:
    # do some (synchronous) stuff here; it blocks the stream
    ...

If we want non-blocking behavior, we must add awaitable statements, like this:

stream.start()
while True:
    # needs `import asyncio`, and must run inside a coroutine or a Jupyter cell
    await asyncio.sleep(1)
    # do some stuff here
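
To make the difference concrete, here is a small self-contained sketch. It does not use the real streaming API: `stream` is a hypothetical coroutine that just records one tick per frame. It shows that a synchronous `time.sleep` keeps the coroutine from running at all, while awaiting lets it proceed.

```python
import asyncio
import time

ticks = []  # one entry per "frame" the hypothetical stream managed to send


async def stream(n_frames):
    # Hypothetical stand-in for the streaming loop.
    for i in range(n_frames):
        ticks.append(i)
        await asyncio.sleep(0.01)


async def main():
    task = asyncio.create_task(stream(3))

    time.sleep(0.05)  # synchronous work: the event loop cannot run
    sent_while_blocked = len(ticks)

    await task  # awaiting hands control back to the event loop
    return sent_while_blocked, len(ticks)


sent_while_blocked, sent_total = asyncio.run(main())
print(sent_while_blocked, sent_total)
```

No frame is sent during the blocking call; all of them go out once the main coroutine awaits.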

I don’t like this approach. However, I’m using the asyncio approach if the streaming system is running in a Windows machine outside of WSL.