by Richard Russell, August 2015
Note that the code in this tutorial requires Windows 8.1 or Windows 10
This tutorial is closely based on Microsoft's Direct3D 11 Tutorial 4, but with the code translated from C++ to BBC BASIC for Windows. You should refer to the original for a detailed explanation of the code.
In the previous tutorial, we successfully rendered a triangle in the centre of our application window. We haven't paid much attention to the vertex positions that we have picked in our vertex buffer. In this tutorial, we will delve into the details of 3D positions and transformation.
The outcome of this tutorial will be a 3D object rendered to screen. Whereas previous tutorials focused on rendering a 2D object onto a 3D world, here we show a 3D object.
The source files, libraries etc. may be downloaded from here.
In the previous tutorial, the vertices of the triangle were placed strategically to perfectly align themselves on the screen. However, this will not always be the case. Thus, we need a system to denote objects in 3D space and a system to display them.
World space is a space shared by every object in the scene. It is used to define the spatial relationships between the objects that we wish to render. The X axis goes to our right; the Y axis goes up; and the Z axis goes forward, the same direction as we are facing.
View space, sometimes called camera space, is similar to world space in that it is typically used for the entire scene. However, in view space, the origin is at the viewer or camera. The view direction (where the viewer is looking) defines the positive Z axis. An “up” direction defined by the application becomes the positive Y axis.
Projection space refers to the space reached after applying the projection transformation from view space. In this space, visible content has X and Y coordinates ranging from -1 to 1, and a Z coordinate ranging from 0 to 1.
Screen space is often used to refer to locations in the frame buffer. Because the frame buffer is usually a 2D texture, screen space is a 2D space. The top-left corner is the origin, with coordinates (0, 0); positive X goes to the right and positive Y goes down. For a buffer that is w pixels wide and h pixels high, the bottom-right pixel has the coordinates (w - 1, h - 1).
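The viewport mapping from projection space to screen space makes the relationship between these last two spaces concrete. The sketch below is illustrative only and is not part of the tutorial code (Direct3D performs this mapping automatically from the viewport dimensions); FNscreenx and FNscreeny are hypothetical helpers:

  REM Sketch: the mapping Direct3D applies from projection space
  REM (x and y in the range -1 to +1) to screen space, for a buffer
  REM w% pixels wide and h% pixels high:
  DEF FNscreenx(x, w%) = (x + 1) / 2 * w%
  DEF FNscreeny(y, h%) = (1 - y) / 2 * h%

Note that Y is inverted: positive Y points up in projection space but down in screen space.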
Transformation is most commonly used to convert vertices from one space to another. In 3D computer graphics, there are logically three such transformations in the pipeline: world, view, and projection transformation. Individual transformation operations such as translation, rotation, and scaling are covered in the next tutorial.
World transformation converts vertices from object space to world space. It usually consists of one or more scalings, rotations, and translations, based on the size, orientation, and position we would like to give to the object. Every object in the scene has its own world transformation matrix, because each object has its own size, orientation, and position.
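For example, a world matrix can be composed by multiplying scaling, rotation and translation matrices together. The sketch below is illustrative only (the scale factor, angle and offsets are invented for the example, and this tutorial's code leaves the world matrix alone); it uses BBC BASIC's built-in dot operator for matrix multiplication, with the row-vector layout that Direct3D expects (translation in the bottom row):

  REM Sketch: compose a world matrix from a scaling, a rotation
  REM about the Y axis, and a translation (row-vector convention):
  DIM mScale(3,3), mRotY(3,3), mTrans(3,3), mTemp(3,3), mWorld(3,3)
  s = 2.0 : a = RAD(30) : tx = 1.0 : ty = 0.0 : tz = 3.0
  mScale() = s,0,0,0, 0,s,0,0, 0,0,s,0, 0,0,0,1
  mRotY()  = COS(a),0,-SIN(a),0, 0,1,0,0, SIN(a),0,COS(a),0, 0,0,0,1
  mTrans() = 1,0,0,0, 0,1,0,0, 0,0,1,0, tx,ty,tz,1
  mTemp()  = mScale() . mRotY()
  mWorld() = mTemp() . mTrans()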
After vertices are converted to world space, view transformation converts those vertices from world space to view space. Recall from the earlier discussion that view space is how the world appears from the viewer's (or camera's) perspective. In view space, the viewer is located at the origin, looking out along the positive Z axis.
Projection transformation converts vertices from 3D spaces, such as world and view space, to projection space. One of the parameters that defines the transformation is called the field of view (FOV). The FOV determines which objects are visible from a particular position while looking in a particular direction. In computer graphics, the FOV is contained in a view frustum, which is defined by six planes in 3D. Two of these planes are parallel to the XY plane; they are called the near-Z and far-Z planes. The other four planes are defined by the viewer's horizontal and vertical fields of view.
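For reference, the standard left-handed perspective projection matrix can be written down directly from the vertical field of view, the aspect ratio, and the near-Z and far-Z planes. The sketch below shows the conventional formula; it is an assumption that PROC_MatrixPerspectiveFovLH (used later in this tutorial) builds exactly this matrix, and the library routine should be used in practice:

  REM Sketch: the standard D3D left-handed perspective matrix.
  REM fovy is the vertical field of view in radians; zn and zf
  REM are the near-Z and far-Z planes.
  DEF PROCperspective_sketch(m(), fovy, aspect, zn, zf)
  LOCAL xs, ys
  ys = 1 / TAN(fovy / 2) : REM cotangent of half the vertical FOV
  xs = ys / aspect
  m() = 0
  m(0,0) = xs
  m(1,1) = ys
  m(2,2) = zf / (zf - zn) : m(2,3) = 1
  m(3,2) = -zn * zf / (zf - zn)
  ENDPROC

After the perspective divide this maps Z = zn to 0 and Z = zf to 1, matching the projection-space depth range described above.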
In the previous tutorial, we wrote a program that renders a single triangle to the screen. When we created the vertex buffer, the vertex positions that we used were directly in projection space, so we didn't have to perform any transformation. Now that we have an understanding of 3D space and transformation, we are going to modify the program so that the vertex buffer is defined in object space, as it should be. Then we will modify our vertex shader to transform the vertices from object space to projection space.
Since we are now representing things in three dimensions, we have changed the flat triangle of the previous tutorial to a cube. This will allow us to demonstrate these concepts much more clearly.
REM Create vertex buffer:
DIM vertices{(7)} = SimpleVertex{}
FOR v% = 0 TO 7
  READ x, y, z, r, g, b, a
  vertices{(v%)}.Pos.x% = FN_f4(x)
  vertices{(v%)}.Pos.y% = FN_f4(y)
  vertices{(v%)}.Pos.z% = FN_f4(z)
  vertices{(v%)}.Color.x% = FN_f4(r)
  vertices{(v%)}.Color.y% = FN_f4(g)
  vertices{(v%)}.Color.z% = FN_f4(b)
  vertices{(v%)}.Color.w% = FN_f4(a)
NEXT v%
DATA -1.0,  1.0, -1.0,  0.0, 0.0, 1.0, 1.0
DATA  1.0,  1.0, -1.0,  0.0, 1.0, 0.0, 1.0
DATA  1.0,  1.0,  1.0,  0.0, 1.0, 1.0, 1.0
DATA -1.0,  1.0,  1.0,  1.0, 0.0, 0.0, 1.0
DATA -1.0, -1.0, -1.0,  1.0, 0.0, 1.0, 1.0
DATA  1.0, -1.0, -1.0,  1.0, 1.0, 0.0, 1.0
DATA  1.0, -1.0,  1.0,  1.0, 1.0, 1.0, 1.0
DATA -1.0, -1.0,  1.0,  0.0, 0.0, 0.0, 1.0
On a cube, many triangles share the same vertices, and it would be a waste of space to define the same points over and over again. As such, there is a method to specify just the eight corner points, and then to let Direct3D know which points to pick for each triangle. This is done through an index buffer, which contains a list of indices into the vertex buffer specifying which points to use in each triangle. With the 28-byte SimpleVertex structure used here, spelling out all 36 vertices (12 triangles of 3 vertices each) would take 1008 bytes, whereas 8 vertices plus 36 16-bit indices take only 296 bytes. The code below shows which points make up each of our triangles:
REM Create index buffer:
DIM indices{(5,5)} = WORD{}
FOR r% = 0 TO DIM(indices{()},1)
  FOR c% = 0 TO DIM(indices{()},2)
    READ indices{(r%,c%)}.l&
  NEXT c%
NEXT r%
DATA 3,1,0, 2,1,3
DATA 0,5,4, 1,5,0
DATA 3,4,7, 0,4,3
DATA 1,6,5, 2,6,1
DATA 2,7,6, 3,7,2
DATA 6,4,5, 7,4,6
bd.Usage% = D3D11_USAGE_DEFAULT
bd.ByteWidth% = DIM(WORD{}) * (DIM(indices{()},1)+1) * (DIM(indices{()},2)+1)
bd.BindFlags% = D3D11_BIND_INDEX_BUFFER
bd.CPUAccessFlags% = 0
InitData.pSysMem% = indices{(0,0)}
SYS ID3D11Device.CreateBuffer%, pd3dDevice%, bd{}, InitData{}, ^pIndexBuffer% TO hr%
IF hr% <> 0 OR pIndexBuffer% = 0 ERROR 100, "ID3D11Device::CreateBuffer (index) failed: "+STR$~hr%

REM Set index buffer:
SYS ID3D11DeviceContext.IASetIndexBuffer%, pImmediateContext%, pIndexBuffer%, \
\   DXGI_FORMAT_R16_UINT, 0
In our vertex shader from the previous tutorial, we took the input vertex position and output the same position without any modification. We could do this because the input vertex position was already defined in projection space. Now, because the input vertex position is defined in object space, we must transform it before outputting it from the vertex shader. We do this in three steps: transform from object space to world space, from world space to view space, and from view space to projection space. A vector is transformed by multiplying it by a matrix; in HLSL, this is done using the mul() intrinsic function. Our variable declarations and new vertex shader are shown below:
//
// Constant Buffer Variables
//
cbuffer ConstantBuffer : register( b0 )
{
    matrix World;
    matrix View;
    matrix Projection;
}

//
// Vertex Shader
//
VS_OUTPUT VS( float4 Pos : POSITION, float4 Color : COLOR )
{
    VS_OUTPUT output = (VS_OUTPUT)0;
    output.Pos = mul( Pos, World );
    output.Pos = mul( output.Pos, View );
    output.Pos = mul( output.Pos, Projection );
    output.Color = Color;
    return output;
}
We have updated our vertex shader to transform using matrices, but we also need to define three matrices in our program:
REM Define the three transformation matrices:
DIM mWorld(3,3), mView(3,3), mProjection(3,3)
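The world matrix should also be initialized. The cube in this tutorial is not scaled, rotated or translated, so the identity matrix is appropriate (shown here as a plain array assignment; the downloadable code may do this differently):

  REM Initialize the world matrix to the identity:
  mWorld() = 1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1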
In addition to the matrices, we also need an ID3D11Buffer object that represents the constant buffer into which they will be transferred:
REM Create the constant buffer:
DIM ConstantBuffer{mWorld{}=XMMATRIX{}, mView{}=XMMATRIX{}, mProjection{}=XMMATRIX{}}
bd.Usage% = D3D11_USAGE_DEFAULT
bd.ByteWidth% = DIM(ConstantBuffer{})
bd.BindFlags% = D3D11_BIND_CONSTANT_BUFFER
bd.CPUAccessFlags% = 0
SYS ID3D11Device.CreateBuffer%, pd3dDevice%, bd{}, NULL, ^pConstantBuffer% TO hr%
IF hr% <> 0 OR pConstantBuffer% = 0 ERROR 100, "ID3D11Device::CreateBuffer (constant) failed: "+STR$~hr%
The next thing that we need to do is come up with the three matrices that we will use for the transformations. We would like to set up our camera so that it is situated at [0 1 -5], looking at the point [0 1 0]. We can call PROC_MatrixLookAtLH() to conveniently compute a view matrix for us, using the up vector [0 1 0] since we would like the +Y direction always to point upwards:
REM Initialize the view matrix:
DIM Eye(2), At(2), Up(2)
Eye() = 0.0, 1.0, -5.0 : At() = 0.0, 1.0, 0.0 : Up() = 0.0, 1.0, 0.0
PROC_MatrixLookAtLH(mView(), Eye(), At(), Up())
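For reference, the conventional left-handed look-at construction is sketched below; it is an assumption that PROC_MatrixLookAtLH builds this standard matrix, and the library routine should be used in practice. The camera's Z axis points from the eye towards the target, X is perpendicular to both the up vector and Z, and Y completes the basis:

  REM Sketch: the standard left-handed look-at matrix.
  DEF PROClookat_sketch(m(), e(), at(), up())
  LOCAL i%, x(), y(), z()
  DIM x(2), y(2), z(2)
  z() = at() - e() : z() /= MOD(z()) : REM normalized view direction
  x(0) = up(1)*z(2) - up(2)*z(1) : REM x = normalize(cross(up, z))
  x(1) = up(2)*z(0) - up(0)*z(2)
  x(2) = up(0)*z(1) - up(1)*z(0)
  x() /= MOD(x())
  y(0) = z(1)*x(2) - z(2)*x(1) : REM y = cross(z, x)
  y(1) = z(2)*x(0) - z(0)*x(2)
  y(2) = z(0)*x(1) - z(1)*x(0)
  FOR i% = 0 TO 2
    m(i%,0) = x(i%) : m(i%,1) = y(i%) : m(i%,2) = z(i%) : m(i%,3) = 0
  NEXT
  m(3,0) = -(x(0)*e(0) + x(1)*e(1) + x(2)*e(2))
  m(3,1) = -(y(0)*e(0) + y(1)*e(1) + y(2)*e(2))
  m(3,2) = -(z(0)*e(0) + z(1)*e(1) + z(2)*e(2))
  m(3,3) = 1
  ENDPROC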
Finally, to come up with a projection matrix, we call PROC_MatrixPerspectiveFovLH() with a 90 degree (PI/2 radians) vertical field of view, an aspect ratio of 640/512 which comes from our back-buffer size, and near and far Z planes at 0.01 and 100 respectively. This means that anything closer than 0.01 or further away than 100 will not be visible on the screen:
REM Initialize the projection matrix:
PROC_MatrixPerspectiveFovLH(mProjection(), PI/2, Width%/Height%, 0.01, 100)
We have the matrices, and now we must write them to the constant buffer when rendering so that the GPU can read them. Also, because matrices are arranged differently in memory in BBC BASIC (row-major) and HLSL (column-major by default), we must transpose the matrices before updating them:
PROC_MatrixTranspose(ConstantBuffer{}, ConstantBuffer.mWorld{}, mWorld())
PROC_MatrixTranspose(ConstantBuffer{}, ConstantBuffer.mView{}, mView())
PROC_MatrixTranspose(ConstantBuffer{}, ConstantBuffer.mProjection{}, mProjection())
SYS ID3D11DeviceContext.UpdateSubresource%, pImmediateContext%, pConstantBuffer%, \
\   0, NULL, ConstantBuffer{}, 0, 0
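Finally, when rendering, the constant buffer must be bound to the vertex shader's b0 slot and the cube drawn using all 36 indices. A minimal sketch, assuming the library exposes VSSetConstantBuffers and DrawIndexed in the same style as the other ID3D11DeviceContext calls used above:

  REM Bind the constant buffer to slot b0 of the vertex shader:
  SYS ID3D11DeviceContext.VSSetConstantBuffers%, pImmediateContext%, 0, 1, ^pConstantBuffer%
  REM Draw 36 indices, starting at index 0, with base vertex 0:
  SYS ID3D11DeviceContext.DrawIndexed%, pImmediateContext%, 36, 0, 0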