Deriving coordinates for display (pixel) values

I have a few related questions. The first one is the main question I am trying to work through.
I am trying to obtain, as exactly as possible, the coordinates for each pixel in a display. I know how to convert z-buffer information into distance from the camera, but I would like to pair this information with a set of coordinates (Cartesian, spherical, or cylindrical).

I believe this requires going from display (or view) coordinates to world coordinates, and I have not found a way to do this yet. I know that through vtkRenderer I can get access to a DisplayToWorld() function, but I have not found any example of how to use it. Also, is this the fastest way? Setting aside the fact that not all pixels on the display may have valid world coordinates, is there a transformation using matrix multiplications that could be used instead?

I appreciate any insights or guidance in this matter.


vtkRenderer's DisplayToWorld() is quite straightforward: you pass a vtkVector3d with the display coordinates and get back another vtkVector3d with the world coordinates.

You can use simple code like this (untested):

world_coords = renderer.DisplayToWorld(display_coords)



Thanks Paulo! Yes… it seems this processes one vector at a time, and I was hoping for something that would let me process an entire set of vectors without a loop (just using matrix multiplication). However, I did use a loop and it seemed pretty fast.


To vectorize your computation, you can:

  1. Call your vtkCamera’s GetCompositeProjectionTransformMatrix() method to get a 4x4 matrix P (a 3D transform in homogeneous coordinates). P combines the projection and view transforms;
  2. Invert P to get P’;
  3. Define the view-to-display matrix V:

V = [nx/2,    0, 0, nx/2,
        0, ny/2, 0, ny/2,
        0,    0, 1,    0,
        0,    0, 0,    1]

where nx and ny are the width and height of the display port in number of pixels;
  4. Invert V to get V’;
  5. Do x = P’ * V’ * c, where * is matrix multiplication, c is the matrix with the many input display coordinates, and x is the matrix with the many output world coordinates. You have to fill the c matrix with the XY screen coordinates as n column vectors [X, Y, 0, 1], where n is the number of samples. So the shape of c is 4 rows by n columns; x will have the same shape.
  6. Normalize x: x[i,:] /= x[3,:], i = 0, 1, 2.
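The steps above can be sketched in NumPy. Since running VTK is outside the scope of this sketch, P is a placeholder here (the identity); in practice you would fill it from vtkCamera’s GetCompositeProjectionTransformMatrix(). The pixel coordinates and display size are made up for illustration:

```python
import numpy as np

# Hypothetical composite projection matrix P (step 1). The identity is
# used only so the sketch is self-contained; in practice, fill it from
# vtkCamera.GetCompositeProjectionTransformMatrix().
P = np.eye(4)

nx, ny = 640, 480  # width and height of the display port, in pixels

# View-to-display matrix V (step 3).
V = np.array([[nx / 2, 0.0,    0.0, nx / 2],
              [0.0,    ny / 2, 0.0, ny / 2],
              [0.0,    0.0,    1.0, 0.0   ],
              [0.0,    0.0,    0.0, 1.0   ]])

# Steps 2 and 4: invert P and V.
P_inv = np.linalg.inv(P)
V_inv = np.linalg.inv(V)

# Step 5: build c as 4 rows by n columns, one [X, Y, 0, 1] column per pixel.
pixels = np.array([[0.0, 0.0], [nx - 1, ny - 1], [nx / 2, ny / 2]])  # n = 3
n = pixels.shape[0]
c = np.zeros((4, n))
c[0:2, :] = pixels.T
c[3, :] = 1.0

# One matrix product for all pixels at once: homogeneous world coordinates.
x = P_inv @ V_inv @ c

# Step 6: normalize by the homogeneous component.
x[0:3, :] /= x[3, :]
```

With P = identity this maps the screen center to the origin and the corner (0, 0) to (-1, -1, 0), i.e. it undoes only the view-to-display mapping; a real P would carry the result the rest of the way into world space.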

As a side note, here’s a tip for newcomers: the etiquette on a collaborative forum is to mark the useful answer as the solution. This serves three purposes: 1) it signals to others which answer solves the posed problem; 2) it signals to the forum that the question was solved; and, more importantly, 3) it motivates people to stop, read your question, and write an answer.


I had developed something like you proposed; however, one of the difficulties I have run into is inverting the projection matrix. My understanding is that VTK should be able to do this (one of the instance methods of vtkMatrix4x4 is Invert()), yet when I have used it I do not get any result back (or I get None), which I interpret as VTK not being able to invert the matrix. I guess this might be the case, for instance, if you are visualizing a terrain and the display coordinates correspond to world points that are not on the terrain. Any thoughts on this?

Yes… I realized that and have started doing this.


What do you mean by “not being able to invert”? Maybe your input matrix is singular; in that case the Invert() method won’t work. Computing the inverse of a singular matrix is akin to division by zero. Ill-conditioned (near-singular) matrices also produce invalid inverses.
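To illustrate with NumPy (used here only as a stand-in; VTK’s Invert() is a separate implementation): inverting a singular matrix raises an error, and the condition number flags near-singular cases before you attempt the inverse:

```python
import numpy as np

# A singular 4x4 matrix: one zero on the diagonal, as in a degenerate
# projection, so no inverse exists.
singular = np.diag([1.0, 1.0, 1.0, 0.0])

try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False  # inversion fails for a singular matrix

# The condition number diagnoses how close a matrix is to singular:
# effectively infinite for singular matrices, modest for well-behaved ones.
cond_bad = np.linalg.cond(singular)
cond_good = np.linalg.cond(np.eye(4))
```

A very large condition number is the numerical symptom to check for when an inversion silently produces garbage rather than failing outright.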

Anyway, there are several methods out there for inverting matrices. Some are more robust, some are more accurate, some are faster… I don’t know how it is implemented in VTK, but computational linear algebra is far from trivial, which is why there are highly specialized libraries to handle it reliably.
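As a small illustration of that trade-off (again using NumPy as a stand-in for such a library, not VTK’s internals): for a well-conditioned system it is generally preferable to solve directly rather than form an explicit inverse, and a pseudo-inverse offers a least-squares fallback for singular cases. The 2x2 system here is made up for the example:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Preferred: solve A x = b directly (more accurate and faster than
# forming inv(A) and then multiplying).
x_solve = np.linalg.solve(A, b)

# Works, but less robust numerically: explicit inverse.
x_inv = np.linalg.inv(A) @ b

# Fallback for singular or near-singular matrices: the Moore-Penrose
# pseudo-inverse gives the minimum-norm least-squares solution.
x_pinv = np.linalg.pinv(A) @ b
```

All three agree on a well-conditioned matrix; they diverge in robustness only as the matrix approaches singularity.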