The vtk wheel from PyPI does not work on a machine that has a GPU but no display connected. This commonly comes up when people use trame with VTK/ParaView for remote rendering and run their server process on a cloud machine that provisions a GPU without any screen. Sometimes the GPUs themselves do not even have ports to plug in a monitor. In these scenarios, VTK could be more adaptable by falling back to a suitable OpenGL context backend at runtime.
In order for VTK to use the GPU, users are asked to uninstall the vtk wheel and install another one, “vtk-egl”:
pip install "vtk-egl" --extra-index-url https://wheels.vtk.org
I did some research into why vtk-egl is not distributed on PyPI. I think the reason is that libEGL.so cannot be bundled with the vtk wheel on PyPI, because libEGL is part of the Linux graphics driver stack and ships with the NVIDIA driver. I think PyPI also doesn’t allow bundling arbitrary shared libraries in the wheel, and all libs must be built from source. We could distribute Mesa’s libEGL, but that’s not great for NVIDIA GPUs.
If I can adapt VTK to not link with libEGL at compile time, there wouldn’t be a need for an entirely separate wheel. Imagine running a single command, pip install vtk, on your laptop or in the cloud, and VTK automatically uses X when it can open a display or EGL when a display is not available.
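The runtime decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in VTK: the helper names `can_open_x_display` and `pick_render_window` are made up for this post, and the probe simply asks Xlib (through `ctypes`) whether `$DISPLAY` can be opened.

```python
import ctypes
import ctypes.util
import os


def can_open_x_display() -> bool:
    """Return True only if an X display can actually be opened (hypothetical helper)."""
    if not os.environ.get("DISPLAY"):
        return False
    libname = ctypes.util.find_library("X11")
    if libname is None:
        return False
    try:
        x11 = ctypes.CDLL(libname)
    except OSError:
        return False
    # Declare the Xlib signatures so pointers are not truncated on 64-bit systems.
    x11.XOpenDisplay.restype = ctypes.c_void_p
    x11.XOpenDisplay.argtypes = [ctypes.c_char_p]
    x11.XCloseDisplay.argtypes = [ctypes.c_void_p]
    display = x11.XOpenDisplay(None)  # None means "use $DISPLAY"
    if not display:
        return False
    x11.XCloseDisplay(display)
    return True


def pick_render_window(display_ok: bool) -> str:
    """Map the probe result to the render window class named in this post."""
    return "vtkXOpenGLRenderWindow" if display_ok else "vtkEGLRenderWindow"


if __name__ == "__main__":
    print(pick_render_window(can_open_x_display()))
```

The point of the sketch is that nothing here needs libEGL or libX11 at build time; both are looked up only when the process runs.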
As I started looking deeper into this, the first problem was that the existing OpenGL context loader in VTK, glew, could not handle both EGL and GLX in the same build (issue vtk/vtk#18547). @meak worked on that issue for a while and came up with a nice solution: Michael changed VTK to use glad2 instead of glew. glad2 is a neat tool that generates custom OpenGL/Vulkan/WGL/EGL/GLX dynamic loaders. I’ve worked with glad in the past and liked the simplicity of it.
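To make "dynamic loader" concrete, here is a rough Python sketch of what such a loader does conceptually: try a list of candidate library names at runtime, then resolve entry points by name. It is not glad2's generated code. The helper names are hypothetical, and the candidate sonames (`libEGL.so.1`, `libEGL.so`) are my assumption about a typical Linux system.

```python
import ctypes
from typing import Optional, Sequence


def load_first_available(names: Sequence[str]) -> Optional[ctypes.CDLL]:
    """Return the first library that can be opened at runtime, or None."""
    for name in names:
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue  # not installed under this name, try the next candidate
    return None


def resolve(lib: Optional[ctypes.CDLL], symbol: str):
    """Look up a function by name; None if the library or symbol is missing."""
    if lib is None:
        return None
    return getattr(lib, symbol, None)


if __name__ == "__main__":
    # Assumed sonames; adjust for your distribution.
    egl = load_first_available(["libEGL.so.1", "libEGL.so"])
    print("EGL usable:", resolve(egl, "eglGetDisplay") is not None)
```

Because a missing library is an ordinary runtime result rather than a link error, the same binary can run on machines with and without EGL.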
I’ve picked up @meak’s commits from https://gitlab.kitware.com/michael.migliore/vtk/-/commit/cea5e2cd7c938c7770151fbafee2823857f8322d and made a few more changes in my use-glad branch that extend the dynamic loading to GLX and OpenGL as well.
Here, I’m emulating a headless machine by running the vtkProbeOpenGLVersion executable inside a docker container with/without an X display or GPU and it seems to do the right thing.
- No X display and no GPU
  Expects vtkEGLRenderWindow with Mesa drivers

      # emulate a disconnected display for the docker container.
      $ xhost -
      access control enabled, only authorized clients can connect
      # emulate no x and no gpu
      $ docker run --rm -it -v$PWD:/vtk kitware/vtk:ci-fedora39-20240721 /bin/bash
      $ ./vtk/in_docker_build/bin/vtkProbeOpenGLVersion-9.3
      2024-07-30 22:24:36.855 ( 0.002s) [ 7FD571B46C40]vtkRuntimeOpenGLRenderW:85 WARN| Unable to open X display (null)
      Class: vtkEGLRenderWindow succeeded in finding a working OpenGL
      OpenGL vendor string: Mesa
      OpenGL renderer string: llvmpipe (LLVM 17.0.6, 256 bits)
      OpenGL version string: 4.5 (Compatibility Profile) Mesa 23.3.6

- X display without GPU
  Expects vtkXOpenGLRenderWindow with Mesa drivers

      # emulate a connected display for the docker container.
      $ xhost +
      access control disabled, clients can connect from any host
      # emulate no gpu
      $ docker run --rm -it -v /tmp/.X11-unix:/tmp/.X11-unix:Z -e DISPLAY=$DISPLAY -v$PWD:/vtk kitware/vtk:ci-fedora39-20240721 /bin/bash
      $ ./vtk/in_docker_build/bin/vtkProbeOpenGLVersion-9.3
      Class: vtkXOpenGLRenderWindow succeeded in finding a working OpenGL
      server glx vendor string: NVIDIA Corporation
      server glx version string: 1.4
      server glx extensions: GLX_EXT_visual_info GLX_EXT_visual_rating GLX_EXT_import_context GLX_SGIX_fbconfig GLX_SGIX_pbuffer GLX_SGI_video_sync GLX_SGI_swap_control GLX_EXT_swap_control GLX_EXT_swap_control_tear GLX_EXT_texture_from_pixmap GLX_EXT_buffer_age GLX_ARB_create_context GLX_ARB_create_context_profile GLX_EXT_create_context_es_profile GLX_EXT_create_context_es2_profile GLX_ARB_create_context_no_error GLX_ARB_create_context_robustness GLX_NV_delay_before_swap GLX_EXT_stereo_tree GLX_EXT_libglvnd GLX_ARB_context_flush_control GLX_NV_robustness_video_memory_purge GLX_NV_multigpu_context GLX_ARB_multisample GLX_NV_float_buffer GLX_ARB_fbconfig_float GLX_EXT_framebuffer_sRGB GLX_NV_copy_image GLX_NV_copy_buffer
      client glx vendor string: Mesa Project and SGI
      client glx version string: 1.4
      glx extensions: GLX_ARB_context_flush_control GLX_ARB_create_context GLX_ARB_create_context_no_error GLX_ARB_create_context_profile GLX_ARB_create_context_robustness GLX_ARB_fbconfig_float GLX_ARB_framebuffer_sRGB GLX_ARB_get_proc_address GLX_ARB_multisample GLX_EXT_create_context_es2_profile GLX_EXT_create_context_es_profile GLX_EXT_framebuffer_sRGB GLX_EXT_import_context GLX_EXT_texture_from_pixmap GLX_EXT_visual_info GLX_EXT_visual_rating GLX_MESA_query_renderer GLX_SGIX_fbconfig GLX_SGIX_pbuffer
      OpenGL vendor string: Mesa
      OpenGL renderer string: llvmpipe (LLVM 17.0.6, 256 bits)
      OpenGL version string: 4.5 (Core Profile) Mesa 23.3.6

- GPU without X display
  Expects vtkEGLRenderWindow with NVIDIA driver

      $ xhost -
      access control enabled, only authorized clients can connect
      $ docker run --rm -it -v /tmp/.X11-unix:/tmp/.X11-unix:Z -v $PWD:/vtk -e DISPLAY=$DISPLAY --runtime=nvidia --gpus=all -e NVIDIA_DRIVER_CAPABILITIES=graphics,display,compute kitware/vtk:ci-fedora39-20240721 /bin/bash
      $ ./vtk/in_docker_build/bin/vtkProbeOpenGLVersion-9.3
      Authorization required, but no authorization protocol specified
      2024-07-30 vtkRuntimeOpenGLRenderWindowFactory.cxx:85 WARN| Unable to open X display :1
      Authorization required, but no authorization protocol specified
      Authorization required, but no authorization protocol specified
      Authorization required, but no authorization protocol specified
      Class: vtkEGLRenderWindow succeeded in finding a working OpenGL
      OpenGL vendor string: NVIDIA Corporation
      OpenGL renderer string: NVIDIA RTX A2000 Laptop GPU/PCIe/SSE2
      OpenGL version string: 4.6.0 NVIDIA 555.42.02

- Finally, with a GPU and X display (this is the most common use case on a PC)
  Expects vtkXOpenGLRenderWindow with NVIDIA driver

      $ xhost +
      access control disabled, clients can connect from any host
      $ docker run --rm -it -v /tmp/.X11-unix:/tmp/.X11-unix:Z -v $PWD:/vtk -e DISPLAY=$DISPLAY --runtime=nvidia --gpus=all -e NVIDIA_DRIVER_CAPABILITIES=graphics,display,compute kitware/vtk:ci-fedora39-20240721 /bin/bash
      $ ./vtk/in_docker_build/bin/vtkProbeOpenGLVersion-9.3
      Class: vtkXOpenGLRenderWindow succeeded in finding a working OpenGL
      server glx vendor string: NVIDIA Corporation
      server glx version string: 1.4
      server glx extensions: GLX_EXT_visual_info GLX_EXT_visual_rating GLX_EXT_import_context GLX_SGIX_fbconfig GLX_SGIX_pbuffer GLX_SGI_video_sync GLX_SGI_swap_control GLX_EXT_swap_control GLX_EXT_swap_control_tear GLX_EXT_texture_from_pixmap GLX_EXT_buffer_age GLX_ARB_create_context GLX_ARB_create_context_profile GLX_EXT_create_context_es_profile GLX_EXT_create_context_es2_profile GLX_ARB_create_context_no_error GLX_ARB_create_context_robustness GLX_NV_delay_before_swap GLX_EXT_stereo_tree GLX_EXT_libglvnd GLX_ARB_context_flush_control GLX_NV_robustness_video_memory_purge GLX_NV_multigpu_context GLX_ARB_multisample GLX_NV_float_buffer GLX_ARB_fbconfig_float GLX_EXT_framebuffer_sRGB GLX_NV_copy_image GLX_NV_copy_buffer
      client glx vendor string: NVIDIA Corporation
      client glx version string: 1.4
      glx extensions: GLX_ARB_get_proc_address GLX_ARB_multisample GLX_EXT_visual_info GLX_EXT_visual_rating GLX_EXT_import_context GLX_SGI_video_sync GLX_SGIX_fbconfig GLX_SGIX_pbuffer GLX_SGI_swap_control GLX_EXT_swap_control GLX_EXT_swap_control_tear GLX_EXT_buffer_age GLX_ARB_create_context GLX_ARB_create_context_profile GLX_NV_float_buffer GLX_ARB_fbconfig_float GLX_EXT_texture_from_pixmap GLX_EXT_framebuffer_sRGB GLX_NV_copy_image GLX_NV_copy_buffer GLX_EXT_create_context_es_profile GLX_EXT_create_context_es2_profile GLX_ARB_create_context_no_error GLX_ARB_create_context_robustness GLX_NV_delay_before_swap GLX_EXT_stereo_tree GLX_ARB_context_flush_control GLX_NV_robustness_video_memory_purge GLX_NV_multigpu_context
      OpenGL vendor string: NVIDIA Corporation
      OpenGL renderer string: NVIDIA RTX A2000 Laptop GPU/PCIe/SSE2
      OpenGL version string: 4.5.0 NVIDIA 555.42.02

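If anyone wants to repeat the four checks above without eyeballing the logs, something like the following Python sketch could compare the probe output against the expectation for each scenario. The function names and the `EXPECTED` table are mine, written for this post; the parser assumes the output has one field per line, as vtkProbeOpenGLVersion prints it in a terminal.

```python
import re

# (x_display_available, gpu_available) -> (render window class, OpenGL vendor),
# taken from the four scenarios listed in this post.
EXPECTED = {
    (False, False): ("vtkEGLRenderWindow", "Mesa"),
    (True, False): ("vtkXOpenGLRenderWindow", "Mesa"),
    (False, True): ("vtkEGLRenderWindow", "NVIDIA Corporation"),
    (True, True): ("vtkXOpenGLRenderWindow", "NVIDIA Corporation"),
}


def parse_probe_output(text: str):
    """Extract (class name, OpenGL vendor) from the probe's output."""
    cls = re.search(r"^\s*Class:\s*(\w+)", text, re.M)
    vendor = re.search(r"^\s*OpenGL vendor string:\s*(.+)$", text, re.M)
    return (
        cls.group(1) if cls else None,
        vendor.group(1).strip() if vendor else None,
    )


def matches_expectation(text: str, x_display: bool, gpu: bool) -> bool:
    """True if the output matches what this post expects for the scenario."""
    return parse_probe_output(text) == EXPECTED[(x_display, gpu)]


SAMPLE = """\
Class: vtkEGLRenderWindow succeeded in finding a working OpenGL
OpenGL vendor string: Mesa
OpenGL renderer string: llvmpipe (LLVM 17.0.6, 256 bits)
OpenGL version string: 4.5 (Compatibility Profile) Mesa 23.3.6
"""

if __name__ == "__main__":
    print(matches_expectation(SAMPLE, x_display=False, gpu=False))
```

The vendor pattern is anchored to lines starting with "OpenGL vendor string", so the "server glx vendor string" and "client glx vendor string" lines in the X cases are ignored.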
Many unit tests pass with EGL, so I’m close to creating an MR.
- Is it a good idea to remove the VTK_OPENGL_HAS_EGL build setting? It does not make sense anymore because VTK will no longer rely on find_package(OpenGL) to discover EGL at compile time. Can we remove it and act as if EGL always exists?
- Can the Utility target VTK::opengl also be removed? All its targets are now dynamically discovered, except OSMesa. That should be trivial to implement.