10. Write about perspective image transformation.

April 30, 2013 legend


A perspective transformation (also called an imaging transformation) projects 3D points onto a
plane. Perspective transformations play a central role in image processing because they provide
an approximation to the manner in which an image is formed by viewing a 3D world. These
transformations are fundamentally different from linear transformations such as rotation,
translation and scaling, because they are nonlinear: they involve division by coordinate values.

Figure 10 shows a model of the image formation process. The camera coordinate system (x, y, z)
has the image plane coincident with the xy plane and the optical axis (established by the center
of the lens) along the z axis. Thus the center of the image plane is at the origin, and the center of
the lens is at coordinates (0, 0, λ). If the camera is in focus for distant objects, λ is the focal length
of the lens. Here the assumption is that the camera coordinate system is aligned with the world
coordinate system (X, Y, Z).

Let (X, Y, Z) be the world coordinates of any point in a 3-D scene, as shown in Fig. 10. We assume throughout the following discussion that Z > λ; that is, all points of interest lie in front of the lens. The first step is to obtain a relationship that gives the coordinates (x, y) of the projection of the point (X, Y, Z) onto the image plane. This is easily accomplished by the use of similar triangles. With reference to Fig. 10,

$$\frac{x}{\lambda} = -\frac{X}{Z - \lambda} = \frac{X}{\lambda - Z}, \qquad \frac{y}{\lambda} = -\frac{Y}{Z - \lambda} = \frac{Y}{\lambda - Z}$$

Fig. 10. Basic model of the imaging process. The camera coordinate system (x, y, z) is aligned with the world coordinate system (X, Y, Z).

where the negative signs in front of X and Y indicate that image points are actually inverted, as the geometry of Fig. 10 shows.
The image-plane coordinates of the projected 3-D point follow directly from the similar-triangle relations:

$$x = \frac{\lambda X}{\lambda - Z}, \qquad y = \frac{\lambda Y}{\lambda - Z}$$
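As a minimal sketch of the projection step, the function below evaluates x = λX/(λ − Z) and y = λY/(λ − Z) directly. The function name `project` and the parameter name `lam` (standing for λ) are illustrative choices, not part of the original text, and the check on Z simply enforces the stated assumption Z > λ.

```python
def project(X, Y, Z, lam):
    """Project the world point (X, Y, Z) onto the image plane z = 0.

    lam stands for lambda, the distance from the image plane to the
    lens center. Assumes Z > lam (the point lies in front of the lens).
    """
    if Z <= lam:
        raise ValueError("expected Z > lam (point in front of the lens)")
    # The factor is negative when Z > lam, which is why the image is inverted.
    scale = lam / (lam - Z)
    return scale * X, scale * Y


x, y = project(1.0, 2.0, 5.0, lam=1.0)
print(x, y)  # -0.25 -0.5
```

The division by (λ − Z) is what makes the mapping nonlinear: doubling Z does not simply double or halve the image coordinates.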

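These equations are nonlinear because they involve division by the variable Z. Although we could use them directly as shown, it is often convenient to express them in linear matrix form, as is done for rotation, translation and scaling. This is easily accomplished by using homogeneous coordinates: the homogeneous coordinates of a point with Cartesian coordinates (X, Y, Z) are (kX, kY, kZ, k), where k is an arbitrary nonzero constant, and conversion back to Cartesian form is accomplished by dividing the first three homogeneous coordinates by the fourth. A point in the Cartesian world coordinate system may be expressed in vector form as

$$w = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$

and its homogeneous counterpart is

$$w_h = \begin{bmatrix} kX \\ kY \\ kZ \\ k \end{bmatrix}$$

If we define the perspective transformation matrix as

$$P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -\frac{1}{\lambda} & 1 \end{bmatrix}$$

the product $P w_h$ yields a vector denoted $c_h$:

$$c_h = P w_h = \begin{bmatrix} kX \\ kY \\ kZ \\ -\frac{kZ}{\lambda} + k \end{bmatrix}$$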

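The elements of c_h are the camera coordinates in homogeneous form. As indicated, these
coordinates can be converted to Cartesian form by dividing each of the first three components of
c_h by the fourth. Thus the Cartesian coordinates of any point in the camera coordinate system are
given in vector form by

$$c = \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \frac{\lambda X}{\lambda - Z} \\ \frac{\lambda Y}{\lambda - Z} \\ \frac{\lambda Z}{\lambda - Z} \end{bmatrix}$$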

The first two components of c are the (x, y) coordinates in the image plane of a projected 3-D
point (X, Y, Z). The third component is of no interest in terms of the model in Fig. 10. As shown
next, this component acts as a free variable in the inverse perspective transformation.
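The matrix formulation can be sketched in a few lines of plain Python. This is an illustration under stated assumptions: P is taken to be the 4×4 identity with −1/λ placed in the fourth row, third column, and the helper names (`perspective_matrix`, `mat_vec`, `to_homogeneous`, `to_cartesian`, `project_homogeneous`) are hypothetical names chosen for this example.

```python
def perspective_matrix(lam):
    """Assumed form of P: identity with -1/lam in row 4, column 3."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, -1.0 / lam, 1.0],
    ]


def mat_vec(M, v):
    """Multiply matrix M (list of rows) by column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def to_homogeneous(point, k=1.0):
    """(X, Y, Z) -> (kX, kY, kZ, k) for a nonzero scale factor k."""
    X, Y, Z = point
    return [k * X, k * Y, k * Z, k]


def to_cartesian(vh):
    """Divide the first three homogeneous components by the fourth."""
    return [component / vh[3] for component in vh[:3]]


def project_homogeneous(point, lam, k=1.0):
    """Compute c = cartesian(P w_h) for a world point."""
    c_h = mat_vec(perspective_matrix(lam), to_homogeneous(point, k))
    return to_cartesian(c_h)


c = project_homogeneous((1.0, 2.0, 5.0), lam=1.0)
print(c)  # [-0.25, -0.5, -1.25]
```

The first two entries agree with evaluating λX/(λ − Z) and λY/(λ − Z) directly, and the result does not depend on the choice of k, since the scale factor cancels in the final division.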

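The inverse perspective transformation maps an image point back into 3-D:

$$w_h = P^{-1} c_h$$

where the inverse matrix is

$$P^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & \frac{1}{\lambda} & 1 \end{bmatrix}$$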

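Suppose that an image point has coordinates (x_0, y_0, 0), where the 0 in the z location simply
indicates that the image plane is located at z = 0. This point may be expressed in homogeneous
vector form as

$$c_h = \begin{bmatrix} k x_0 \\ k y_0 \\ 0 \\ k \end{bmatrix}$$

Applying the inverse transformation then yields the homogeneous world coordinate vector

$$w_h = P^{-1} c_h = \begin{bmatrix} k x_0 \\ k y_0 \\ 0 \\ k \end{bmatrix}$$

or, in Cartesian coordinates

$$w = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} x_0 \\ y_0 \\ 0 \end{bmatrix}$$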

This result is obviously unexpected because it gives Z = 0 for any 3-D point. The problem here is
caused by mapping a 3-D scene onto the image plane, which is a many-to-one transformation.
The image point (x_0, y_0) corresponds to the set of collinear 3-D points that lie on the line passing
through (x_0, y_0, 0) and (0, 0, λ). The equations of this line in the world coordinate system follow
from the projection equations; that is,

$$X = \frac{x_0}{\lambda}(\lambda - Z), \qquad Y = \frac{y_0}{\lambda}(\lambda - Z)$$

The equations above show that unless something is known about the 3-D point that generated an
image point (for example, its Z coordinate), it is not possible to completely recover the 3-D point
from its image. This observation, which certainly is not unexpected, can be used to formulate the
inverse perspective transformation by using the z component of c_h as a free variable instead of 0.
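To illustrate why extra knowledge is needed, the sketch below recovers a world point from an image point only once a depth Z is supplied. It uses the line through (x_0, y_0, 0) and (0, 0, λ), written as X = (x_0/λ)(λ − Z) and Y = (y_0/λ)(λ − Z). The name `back_project` and its parameters are hypothetical, chosen for this example.

```python
def back_project(x0, y0, Z, lam):
    """Recover (X, Y, Z) from image point (x0, y0) given a known depth Z.

    Evaluates the line through (x0, y0, 0) and the lens center (0, 0, lam)
    at the supplied Z. lam stands for lambda.
    """
    factor = (lam - Z) / lam
    return x0 * factor, y0 * factor, Z


# The same image point maps to different world points at different depths.
print(back_project(-0.25, -0.5, 5.0, lam=1.0))  # (1.0, 2.0, 5.0)
print(back_project(-0.25, -0.5, 9.0, lam=1.0))  # (2.0, 4.0, 9.0)
```

Both recovered points project to the same image coordinates, which is the many-to-one behavior described above.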
Thus, by letting