Réponses:
Oui, vous pouvez ajouter un vecteur dans le cas d'une traduction. La raison d'utiliser une matrice revient à avoir un moyen uniforme de gérer différentes transformations combinées.
Par exemple, la rotation est généralement effectuée à l'aide d'une matrice (consultez le commentaire @MickLH pour connaître d'autres moyens de gérer les rotations), afin de traiter plusieurs transformations (rotation / translation / mise à l'échelle / projection, etc.) de manière uniforme. vous devez les encoder dans une matrice.
Eh bien, plus techniquement parlant; une transformation mappe un point / vecteur sur un autre point / vecteur.
p` = T(p);
où p` est le point transformé et T (p) la fonction de transformation.
Étant donné que nous n'utilisons pas de matrice, nous devons le faire pour combiner plusieurs transformations:
p1 = T (p);
p final = M (p1);
Une matrice ne peut pas seulement combiner plusieurs types de transformations en une seule matrice (par exemple, affine, linéaire, projective).
Using a matrix gives us the opportunity to combine chains of transformations and then batch multiply them. This saves us a ton of cycles usually by the GPU (thanks to @ChristianRau for pointing it out).
Tfinal = T * R * P; // translaterotateproject
pfinal = Tfinal*p;
It's also good to point out that GPUs and even some CPUs are optimized for vector operations; CPUs using SIMD and GPUs being data driven parallel processors by design, so using matrices fits perfectly with hardware acceleration (actually, GPUs were designed to fit matrix/vector operations).
If all you are ever going to do is move along a single axis and never apply any other transformation then what you are suggesting is fine.
The real power of using a matrix is that you can easily concatenate a series of complex operations together, and apply the same series of operations to multiple objects.
Most cases aren't that simple and if you rotate you object first, and want to transform along its local axes instead of the world axes you'll find you can't simply add 10 to one of the numbers and have it work out correctly.
To succinctly answer the "why" question, it's because a 4x4 matrix can describe rotation, translation, and scaling operations all at once. Being able to describe any of these in a consistent manner simplifies a lot of things.
Different kinds of transformations can be more simply represented with a different mathematical operations. As you note, translation can be done just by adding. Uniform scaling by multiplying by a scalar. But an appropriately crafted 4x4 matrix can do anything. So using 4x4's consistently makes code and interfaces much simpler. You pay some complexity in understanding these 4x4's, but then lots of things get easier and faster because of it.
the reason to use a 4x4 matrix is so that the operation is a linear transformation. this is an example of homogeneous coordinates. The same thing is done in the 2d case (using a 3x3 matrix). The reason for using homogeneous coordinates is so that all 3 geometric tansformations can be done using one operation; otherwise one would need to do a 3x3 matrix multiply and a 3x3 matrix addition (for the translation). this link from cegprakash is useful.
Translations cannot be represented by 3D matrices
A simple argument is that translation can take the origin vector:
0
0
0
away from the origin, say to x = 1
:
1
0
0
But that would require a matrix such that:
| a b c | |0| |1|
| d e f | * |0| = |0|
| g h i | |0| |0|
But that is impossible.
Another argument is the Singular Value Decomposition theorem, which says that every matrix can be made up with two rotation and one scaling operation. No translations there.
Why matrices can be used?
Many modeled objects (e.g. a car chassis) or part of modeled objects (e.g. a car tire, a driving wheel) are solids: the distances between vertexes never change.
The only transformations we want to do to on them are rotations and translations.
Matrix multiplication can encode both rotations and translations.
Rotation matrices have explicit formulas, e.g.: a 2D rotation matrix for angle a
is of form:
cos(a) -sin(a)
sin(a) cos(a)
There are analogous formulas for 3D, but note that 3D rotations take 3 parameters instead of just 1.
Translations are less trivial and will be discussed later. They are the reason we need 4D matrices.
Why is it cool to use matrices?
Because the composition of multiple matrices can be pre-calculated by matrix multiplication.
E.g., if we are going to translate one thousand vectors v
of our car chassis with matrix T
and then rotate with matrix R
, instead of doing:
v2 = T * v
and then:
v3 = R * v2
for each vector, we can pre-calculate:
RT = R * T
and then do just one multiplication for every vertex:
v3 = RT * v
Even better: if we then want place the vertexes of tire and driving wheel relative to the car, we just multiply the previous matrix RT
by the matrix relative to the car itself.
This naturally leads to maintaining a stack of matrices:
How adding one dimension solves the problem
Let's consider the case from 1D to 2D which is easier to visualize.
A matrix in 1D is just one number, and as we've seen in 3D it can't do a translation, only a scaling..
But if we add the extra dimension as:
| 1 dx | * |x| = | x + dx |
| 0 1 | |1| | 1 |
and we then forget about the new extra dimension, we get:
x + dx
as we wanted.
This 2D transformation is so important that it has a name: shear transformation.
It is cool to visualize this transformation:
Note how every horizontal line (fixed y
) is just translated.
We just happened to take the line y = 1
as our new 1D line, and translate it with a 2D matrix.
Things are analogous in 3D, with 4D shear matrices of the form:
| 1 0 0 dx | | x | | x + dx |
| 0 1 0 dy | * | y | = | y + dy |
| 0 0 1 dz | | z | | z + dz |
| 0 0 0 1 | | 1 | | 1 |
And our old 3D rotations / scaling are now of form:
| a b c 0 |
| d e f 0 |
| g h i 0 |
| 0 0 0 1 |
This Jamie King video tutorial is also worth watching.
Affine space
Affine space is the space generated by all our 3D linear transformations (matrix multiplications) together with the 4D shear (3D translations).
If we multiply a shear matrix and a 3D linear transformation, we always get something of the form:
| a b c dx |
| d e f dy |
| g h i dz |
| 0 0 0 1 |
This is the most general possible affine transformation, which does 3D rotation / scaling and translation.
One important property is that if we multiply 2 affine matrices:
| a b c dx | | a2 b2 c2 dx2 |
| d e f dy | * | d2 e2 f2 dy2 |
| g h i dz | | g2 h2 i2 dz2 |
| 0 0 0 1 | | 0 0 0 1 |
we always get another affine matrix of form:
| a3 b3 c3 (dx + dx2) |
| d3 e3 f3 (dy + dy2) |
| g3 h3 i3 (dz + dz2) |
| 0 0 0 1 |
Mathematicians call this property closure, and is required to define a space.
For us, it means that we can keep doing matrix multiplications to per-calculate final transformations happily, which is why use used matrices in the first place, without ever getting more general 4D linear transformations which are not affine.
Frustum projection
But wait, there is one more important transformation that we do all the time: glFrustum
, which makes an object 2x further, appear 2x smaller.
First get some intuition about glOrtho
vs glFrustum
at: https://stackoverflow.com/questions/2571402/explain-the-usage-of-glortho/36046924#36046924
glOrtho
can be done just with translations + scaling , but how can we implement glFrustum
with matrices?
Suppose that:
z = -1
that is a square of length 2z = -2
If only we allowed more general 4-vectors of type:
(x, y, z, w)
with w != 0
, and in addition we identify every (x, y, z, w)
with (x/w, y/w, z/w, 1)
, then a frustum transformation with the matrix would be:
| 1 0 0 0 | | x | | x | | x / -z |
| 0 1 0 0 | * | y | = | y | identified to | y / -z |
| 0 0 1 0 | | z | | z | | -1 |
| 0 0 -1 0 | | w | | -z | | 0 |
If we throw away z
and w
at the end, we get:
x_proj = x / -z
y_proj = y / -z
which is exactly what we wanted! We can verify that for some values, e.g.:
z == -1
, exactly on the plane we're projecting to, x_proj == x
and y_proj == y
.z == -2
, then x_proj = x/2
: objects are half size.Note how the glFrustum
transform is not of affine form: it cannot be implemented just with rotations and translations.
The mathematical "trickery" of adding the w
and dividing by it is called homogeneous coordinates
See also: related Stack Overflow question: https://stackoverflow.com/questions/2465116/understanding-opengl-matrices
See this video to understand the concepts of model, view and projection.
4x4 matrices are not just used for translating a 3D object. But also for various other purposes.
See this to understand how the vertices in the world are represented as 4D Matrices and how they are transformed.