... stored in 4 × 4 [homogeneous matrices](https://www.heuristic42.com/9/rendering/matrices/#homogeneous-matrices) and applied
by multiplying vertex coordinates. By pre-multiplying and combining transformations, a vertex
can be transformed to a desired space with a single operation. World space is common to all
instances and provides view-independent relative positions to objects in the scene.
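As a concrete sketch of combining transformations (using the GLM maths library here as an assumption; the matrix values are hypothetical), the model, view and projection matrices are pre-multiplied once and a vertex is then taken from model space to clip space with a single matrix-vector product:

```cpp
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

int main()
{
    // Hypothetical transforms for one object instance and one camera.
    glm::mat4 model = glm::translate(glm::mat4(1.0f), glm::vec3(2.0f, 0.0f, -5.0f));
    glm::mat4 view  = glm::lookAt(glm::vec3(0.0f, 1.0f, 3.0f),   // camera position
                                  glm::vec3(0.0f, 0.0f, -5.0f),  // look-at target
                                  glm::vec3(0.0f, 1.0f, 0.0f));  // up direction
    glm::mat4 proj  = glm::perspective(glm::radians(60.0f), 16.0f / 9.0f, 0.1f, 100.0f);

    // Pre-multiply once. With column vectors the rightmost matrix applies first,
    // so this reads model -> world -> eye -> clip.
    glm::mat4 modelViewProj = proj * view * model;

    // A model-space position (w = 1) reaches clip space in a single operation.
    glm::vec4 vertexModel(0.5f, 0.5f, 0.0f, 1.0f);
    glm::vec4 vertexClip = modelViewProj * vertexModel;
    return 0;
}
```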
With the scene defined, a viewer is necessary to relate a 2D image to the 3D world. Thus, a
virtual camera model is defined, sometimes only implicitly, and split into two parts:
a view and a projection.
**Eye space**, or camera space, is given by the [camera](/8/rendering/cameras/)’s view, commonly position and direction.
The transformation can be thought of as moving the entire world/scene to account for the viewer's
location and rotation, which despite appearances is mathematically inexpensive. This analogy
works for other transforms but may be more intuitive here. Like the previous spaces, eye space
is linear and infinite, with viewing volume bounds or projection yet to be defined.
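To make the "moving the world" analogy concrete, here is a minimal sketch continuing with GLM (the camera pose is an assumption): the view matrix is the inverse of the camera's own placement in the world, so applying it moves every world-space point into eye space and leaves the camera at the origin.

```cpp
// A hypothetical camera pose in world space.
glm::vec3 eye(4.0f, 2.0f, 8.0f);
glm::vec3 target(0.0f, 0.0f, 0.0f);
glm::vec3 up(0.0f, 1.0f, 0.0f);

// lookAt() returns the world-to-eye transform directly: the inverse of the
// camera's own placement. Applying it "moves the world", not the camera.
glm::mat4 view = glm::lookAt(eye, target, up);

// The camera position maps to the eye-space origin...
glm::vec4 cameraInEyeSpace = view * glm::vec4(eye, 1.0f);      // ~(0, 0, 0, 1)

// ...and any world-space point becomes a position relative to the viewer.
glm::vec4 pointEye = view * glm::vec4(1.0f, 0.0f, -3.0f, 1.0f);
```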
**Clip space** exists after a projection matrix has been applied and, as the name suggests, it is here that geometry is clipped to the viewing volume: triangles entirely outside can be discarded and those intersecting the bounds are clipped.
A typical projection for a virtual environment is a [perspective projection](/11/rendering/matrices/projection/#perspective), which computes intersections between a plane and lines to each vertex. An alternative interpretation is that the $x$ and $y$ coordinates are divided by $z$, via the normalization that follows, to shrink distant objects in the image. Orthographic projections are supported with the same operations, but the purpose of clip space is much more apparent with perspective projections.
Clip space is actually 4D, and the next step to *NDC* is a *perspective divide*, dividing transformed vectors by $w$. While the vector space is 4D, $w$ is linearly dependent on $z$, so it is not an additional *basis*.
The viewing volume to clip to is $-w \le x \le w$, $-w \le y \le w$ and $-w \le z \le w$ (the exact $z$ range depends on the convention, as noted below). Clipping is particularly necessary for vertices near or crossing the $w = 0$ plane, which would otherwise produce incorrect results for perspective projections due to a division by zero, or a negative $w$ that inverts the $x$ and $y$ coordinates.
One convenience is that the projection has already been applied, so there is no need to calculate viewing volume planes in eye space, as would be required were clipping to happen earlier. Another convenience is that the space is still linear, so regular vector operations can, and must, be performed before the perspective divide. For example, $z$ becomes non-linear in *NDC*, although that is more of a depth buffer feature, and perspective-correct vertex interpolation must be performed in clip space, before normalization. The downside is that it takes some extra maths to work with $w$. See [Clip Space](/27/rendering/spaces/clip_space/).
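As a sketch of what the clip test looks like in practice (continuing the GLM snippets above; the eye-space position and projection parameters are assumptions), the vertex is transformed by the projection matrix and compared against $\pm w$ before any divide takes place:

```cpp
// Eye-space position of a vertex (w = 1).
glm::vec4 positionEye(1.0f, 0.5f, -10.0f, 1.0f);

// Perspective projection into clip space; no divide has happened yet.
glm::mat4 proj = glm::perspective(glm::radians(60.0f), 16.0f / 9.0f, 0.1f, 100.0f);
glm::vec4 clip = proj * positionEye;

// Inside test against the clip-space viewing volume (OpenGL-style z range).
bool inside = -clip.w <= clip.x && clip.x <= clip.w &&
              -clip.w <= clip.y && clip.y <= clip.w &&
              -clip.w <= clip.z && clip.z <= clip.w;

// For this projection w equals -z_eye, which is why geometry at or behind the
// viewer (w <= 0) must be clipped before the perspective divide.
```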
**Normalized device coordinates** (NDC) are the result of the [perspective divide](/11/rendering/matrices/projection/#clip-space-and-the-perspective-divide), where
the 4D homogeneous vectors are normalized after the perspective transformation to clip space: $\mathsf{NDC} = \mathsf{clip}_{xyz}/\mathsf{clip}_w$. The 3D components of visible points commonly range between $-1$ and $1$, although the exact range depends on the projection matrix; a notable case is the difference in the $z$ range between the projections created by OpenGL ($-1$ to $1$) and DirectX ($0$ to $1$) software. To undo a perspective divide when transforming from NDC to eye space, simply transform by the inverse projection matrix and re-normalize with the same divide, rather than attempting to compute and scale by $w$.
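A sketch of the round trip (continuing with GLM and the `proj` and `positionEye` from the previous snippet): the divide by $w$ produces NDC, and the inverse projection followed by the same divide recovers the eye-space point without reconstructing the original $w$.

```cpp
// Forward: clip space to NDC via the perspective divide.
glm::vec4 clip = proj * positionEye;
glm::vec3 ndc  = glm::vec3(clip) / clip.w;

// Backward: NDC to eye space. Transform by the inverse projection with w = 1,
// then re-normalize with the same divide; the original w is never needed.
glm::vec4 unprojected     = glm::inverse(proj) * glm::vec4(ndc, 1.0f);
glm::vec3 positionEyeBack = glm::vec3(unprojected) / unprojected.w;
```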
**Image space** linearly scales NDC to the pixel dimensions, for example $\frac{\mathsf{NDC}_{xy} + 1}{2} \cdot \mathsf{resolution}$. This is where rasterization would be performed. Alternatively, points in image space can be projected back into 3D rays in world space (from the near to the far clipping plane, for example) by multiplying by the inverse projection and view matrices, e.g. for raytracing.
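A final sketch (GLM again, with an assumed resolution; `proj`, `view` and `ndc` carry over from the snippets above): mapping NDC to pixel coordinates, and the reverse path where a pixel is unprojected through the inverse projection and view matrices into a world-space ray from the near to the far plane.

```cpp
glm::vec2 resolution(1920.0f, 1080.0f);

// NDC xy in [-1, 1] mapped to pixel coordinates.
glm::vec2 pixel = (glm::vec2(ndc) + 1.0f) * 0.5f * resolution;

// Reverse: a pixel back to a world-space ray, e.g. for raytracing.
glm::vec2 ndcXY = pixel / resolution * 2.0f - 1.0f;
glm::mat4 invViewProj = glm::inverse(proj * view);

// Points on the near and far planes (OpenGL-style NDC z range of [-1, 1]).
glm::vec4 nearWorld = invViewProj * glm::vec4(ndcXY, -1.0f, 1.0f);
glm::vec4 farWorld  = invViewProj * glm::vec4(ndcXY,  1.0f, 1.0f);
nearWorld /= nearWorld.w;
farWorld  /= farWorld.w;

// Ray origin and direction in world space.
glm::vec3 rayOrigin    = glm::vec3(nearWorld);
glm::vec3 rayDirection = glm::normalize(glm::vec3(farWorld) - glm::vec3(nearWorld));
```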