Why do we use 4x4 matrices to transform things in 3D?


36

To translate a vector by 10 units in the X direction, why do we have to use a matrix?

We can just add 10 to mat[0][0], and we get the same result.

Answers:


28

Yes, you can add a vector in the case of a translation. The reason to use a matrix boils down to having a uniform way to handle different combined transformations.

For example, rotation is usually done using a matrix (see @MickLH's comment for other ways to handle rotations), so in order to deal with multiple transformations (rotation / translation / scaling / projection, etc.) in a uniform way, you need to encode them in a matrix.

More technically speaking, a transformation maps a point / vector to another point / vector.

p` = T(p); 

where p` is the transformed point and T is the transformation function.

If we don't use a matrix, we need to do this to combine multiple transformations:

p1 = T (p);

pfinal = M (p1);

A matrix can do more than represent several types of transformation (e.g. affine, linear, projective) in a single form.

It also lets us combine a whole chain of transformations into one matrix and then apply it to a batch of points with a single multiplication each. This usually saves a lot of cycles on the GPU (thanks to @ChristianRau for pointing it out).

Tfinal = T * R * P; // translate, rotate, project

pfinal = Tfinal*p;
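As a rough sketch of the idea above, in plain Python with no libraries: two transformations are combined once, and the combined matrix is then applied to every point. The helper names (mat_mul, transform, translation, rotation_z) are made up for this illustration, and the column-vector convention (rightmost matrix acts first) is an assumption.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, p):
    """Apply the 4x4 matrix m to the 3D point p, treated as (x, y, z, 1)."""
    v = (p[0], p[1], p[2], 1.0)
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])

def translation(dx, dy, dz):
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]

def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

T = translation(10, 0, 0)
R = rotation_z(math.pi / 2)
combined = mat_mul(T, R)  # with column vectors, R is applied first, then T

points = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
one_step = [transform(combined, p) for p in points]          # one multiply per point
two_steps = [transform(T, transform(R, p)) for p in points]  # two multiplies per point
```

Both lists hold the same points up to floating-point rounding; the difference is that the combined matrix is computed only once, however many points there are.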

It's also good to point out that GPUs and even some CPUs are optimized for vector operations; CPUs using SIMD and GPUs being data driven parallel processors by design, so using matrices fits perfectly with hardware acceleration (actually, GPUs were designed to fit matrix/vector operations).


Yes, I know that a matrix is useful for rotation, but every tutorial guides me to use a matrix for such a simple calculation :D
ngoaho91

1
Saying rotation can "only" be done with a matrix is incorrect, off the top of my head Quaternions and Trigonometry would work just fine also
MickLH

17
And even more than that, once you have rotation and translation both as 4x4 matrices, you can just multiply them and have the combined transformation in one single matrix, without the need to transform every vertex by thousands of different transformations using different constructs. The fact that a 4x4 matrix is overkill for a single translation or a single rotation is outweighed by the fact that you usually don't just transform a vertex by a single translation or a single rotation.
Chris says Reinstate Monica

1
@concept3d Yeah, I know, the answer is good. Yet the even bigger advantage gained from the uniform way of using a matrix is not only uniformity, but representation of an entire chain of transformations in a single operation. While that might have been implied, I found it unclear and important enough to mention it explicitly. But the answer was still good anyway, it wasn't a critique.
Chris says Reinstate Monica

1
Yes, trig calculates the rotational matrix, but vector math actually "rotates" the points using the trig-infused dataset. When I said trigonometry, I was implying using it directly, not through a matrix, to generate some simple things.
MickLH

6

If all you are ever going to do is move along a single axis and never apply any other transformation then what you are suggesting is fine.

The real power of using a matrix is that you can easily concatenate a series of complex operations together, and apply the same series of operations to multiple objects.

Most cases aren't that simple: if you rotate your object first and then want to translate it along its local axes instead of the world axes, you'll find you can't simply add 10 to one of the numbers and have it work out correctly.
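A small sketch of that last point, assuming column vectors (the rightmost matrix acts first) and using made-up helper names: after a 90 degree rotation about Z, moving 10 units along the object's own X axis is not the same as adding 10 to the world X coordinate.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, p):
    """Apply a 4x4 affine matrix to the 3D point p, treated as (x, y, z, 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
rotate_z_90 = [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
move_x_10 = [[1, 0, 0, 10], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Order matters: the two products below are different matrices.
local_move = mat_mul(rotate_z_90, move_x_10)  # move along the object's own X axis
world_move = mat_mul(move_x_10, rotate_z_90)  # move along the world X axis

origin = (0, 0, 0)
local_result = apply(local_move, origin)  # ends up on the world Y axis
world_result = apply(world_move, origin)  # ends up on the world X axis
```

The object's origin lands at roughly (0, 10, 0) in the first case and at (10, 0, 0) in the second, so "just add 10 to x" only matches the world-axis move.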


5

To succinctly answer the "why" question, it's because a 4x4 matrix can describe rotation, translation, and scaling operations all at once. Being able to describe any of these in a consistent manner simplifies a lot of things.

Different kinds of transformations can be represented more simply with different mathematical operations. As you note, translation can be done just by adding, and uniform scaling by multiplying by a scalar. But an appropriately crafted 4x4 matrix can do any of these. So using 4x4 matrices consistently makes code and interfaces much simpler. You pay some complexity in understanding these 4x4 matrices, but then lots of things get easier and faster because of it.


2
This should have been the selected answer.
Engineer

4

The reason to use a 4x4 matrix is so that the whole operation, translation included, is a single linear transformation (one matrix multiplication). This is an example of homogeneous coordinates. The same thing is done in the 2D case (using a 3x3 matrix). The reason for using homogeneous coordinates is so that all 3 geometric transformations (rotation, scaling, translation) can be done using one operation; otherwise one would need to do a 3x3 matrix multiply and a 3-component vector addition (for the translation). This link from cegprakash is useful.
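To make the comparison concrete, here is a small sketch (hypothetical helper names, plain Python) that does the same affine transformation both ways: as a 3x3 multiply followed by a vector addition, and as a single 4x4 multiply in homogeneous coordinates.

```python
def affine_3x3(A, t, p):
    """Two operations: a 3x3 multiply followed by a vector addition."""
    return tuple(sum(A[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))

def homogeneous_4x4(A, t):
    """Pack the same A and t into a single 4x4 matrix."""
    return [A[i] + [t[i]] for i in range(3)] + [[0, 0, 0, 1]]

def apply_4x4(m, p):
    """One operation: multiply (x, y, z, 1) by the 4x4 matrix, then drop w."""
    v = list(p) + [1]
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])

A = [[2, 0, 0], [0, 3, 0], [0, 0, 4]]  # a non-uniform scaling
t = [10, 0, -1]                        # a translation
p = (1, 1, 1)

two_ops = affine_3x3(A, t, p)
one_op = apply_4x4(homogeneous_4x4(A, t), p)
```

Both calls give (12, 3, 3). The 4x4 form costs a little extra arithmetic per point, but it is a plain matrix, so it can be multiplied with other 4x4 matrices ahead of time.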


2
You should elaborate. A succinct explanation is better than only linking to wikipedia.
Seth Battin

3

Translations cannot be represented by 3D matrices

A simple argument is that translation can take the origin vector:

0
0
0

away from the origin, say to x = 1:

1
0
0

But that would require a matrix such that:

| a b c |   |0|   |1|
| d e f | * |0| = |0|
| g h i |   |0|   |0|

But that is impossible: any 3x3 matrix multiplied by the zero vector gives the zero vector.

Another argument is the Singular Value Decomposition theorem, which says that every matrix can be decomposed into two rotations and one scaling operation. No translations there.

Why can matrices be used?

Many modeled objects (e.g. a car chassis) or parts of modeled objects (e.g. a car tire, a driving wheel) are solids: the distances between vertexes never change.

The only transformations we want to apply to them are rotations and translations.

Matrix multiplication can encode both rotations and translations.

Rotation matrices have explicit formulas, e.g.: a 2D rotation matrix for angle a is of form:

cos(a) -sin(a)
sin(a)  cos(a)

There are analogous formulas for 3D, but note that 3D rotations take 3 parameters instead of just 1.

Translations are less trivial and will be discussed later. They are the reason we need 4D matrices.

Why is it cool to use matrices?

Because the composition of multiple matrices can be pre-calculated by matrix multiplication.

E.g., if we are going to translate one thousand vectors v of our car chassis with matrix T and then rotate with matrix R, instead of doing:

v2 = T * v

and then:

v3 = R * v2

for each vector, we can pre-calculate:

RT = R * T

and then do just one multiplication for every vertex:

v3 = RT * v

Even better: if we then want to place the vertexes of the tire and the driving wheel relative to the car, we just multiply the previous matrix RT by the matrix relative to the car itself.

This naturally leads to maintaining a stack of matrices:

  • calculate chassis matrix
  • multiply by tire matrix (push)
  • remove tire matrix (pop)
  • multiply by driving wheel matrix (push)
  • ...
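The stack described above can be sketched like this. It is only an illustration: the MatrixStack class, the helper names and the part positions are all invented for the example, and only translations are used to keep it short.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(dx, dy, dz):
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]

class MatrixStack:
    """Each entry is the product of everything pushed so far."""

    def __init__(self):
        self._stack = [translation(0, 0, 0)]  # start with the identity

    def push(self, m):
        self._stack.append(mat_mul(self._stack[-1], m))

    def pop(self):
        self._stack.pop()

    def top(self):
        return self._stack[-1]

def position(m):
    """Read the translation column of an affine 4x4 matrix."""
    return (m[0][3], m[1][3], m[2][3])

stack = MatrixStack()
stack.push(translation(100, 0, 0))   # chassis placed in the world (made-up value)
stack.push(translation(2, 0, -1))    # tire, relative to the chassis
tire_position = position(stack.top())
stack.pop()                          # back to the chassis matrix
stack.push(translation(0.5, 1, 0))   # driving wheel, relative to the chassis
wheel_position = position(stack.top())
```

With these values the tire ends up at (102, 0, -1) and the driving wheel at (100.5, 1, 0) in world coordinates, and the chassis matrix is never recomputed.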

How adding one dimension solves the problem

Let's consider the case from 1D to 2D which is easier to visualize.

A matrix in 1D is just one number, and just as we've seen in 3D, it can't do a translation, only a scaling.

But if we add the extra dimension as:

| 1 dx | * |x|  = | x + dx |
| 0  1 |   |1|    |      1 |

and we then forget about the new extra dimension, we get:

x + dx

as we wanted.

This 2D transformation is so important that it has a name: shear transformation.

It is cool to visualize this transformation:

[Image: a 2D shear transformation]

Note how every horizontal line (fixed y) is just translated.

We just happened to take the line y = 1 as our new 1D line, and translate it with a 2D matrix.
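A tiny sketch of this 1D case (function names are made up for the illustration): the 2x2 shear matrix shifts each horizontal line by y * dx, so the line y = 1 is shifted by exactly dx.

```python
def shear(dx):
    """The 2x2 shear matrix | 1 dx | / | 0 1 |."""
    return [[1, dx], [0, 1]]

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-component column vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def translate_1d(x, dx):
    """Translate a 1D point by embedding it on the line y = 1."""
    new_x, _ = apply(shear(dx), (x, 1))
    return new_x  # forget the extra coordinate again

shifted = translate_1d(3, 10)

# How far the point x = 0 moves on the lines y = 0, 1 and 2.
line_shifts = {y: apply(shear(10), (0, y))[0] for y in (0, 1, 2)}
```

Here shifted is 13, and line_shifts shows that the line y = 0 does not move, y = 1 moves by 10 and y = 2 moves by 20.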

Things are analogous in 3D, with 4D shear matrices of the form:

| 1 0 0 dx |   | x |   | x + dx |
| 0 1 0 dy | * | y | = | y + dy |
| 0 0 1 dz |   | z |   | z + dz |
| 0 0 0  1 |   | 1 |   |      1 |

And our old 3D rotations / scaling are now of form:

| a b c 0 |
| d e f 0 |
| g h i 0 |
| 0 0 0 1 |

This Jamie King video tutorial is also worth watching.

Affine space

Affine space is the space generated by all our 3D linear transformations (matrix multiplications) together with the 4D shear (3D translations).

If we multiply a shear matrix and a 3D linear transformation, we always get something of the form:

| a b c dx |
| d e f dy |
| g h i dz |
| 0 0 0  1 |

This is the most general possible affine transformation, which does 3D rotation / scaling and translation.

One important property is that if we multiply 2 affine matrices:

| a b c dx |   | a2 b2 c2 dx2 |
| d e f dy | * | d2 e2 f2 dy2 |
| g h i dz |   | g2 h2 i2 dz2 |
| 0 0 0  1 |   |  0  0  0   1 |

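we always get another affine matrix of form:

| a3 b3 c3 dx3 |
| d3 e3 f3 dy3 |
| g3 h3 i3 dz3 |
|  0  0  0   1 |

where the upper-left 3x3 block is the product of the two 3x3 blocks, and the new translation is the first 3x3 block applied to the second translation, plus the first translation, e.g. dx3 = a*dx2 + b*dy2 + c*dz2 + dx. The translations simply add up, as (dx + dx2), only when the first 3x3 block is the identity.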

Mathematicians call this property closure, and it is required to define a space.

For us, it means that we can happily keep doing matrix multiplications to pre-calculate final transformations, which is why we used matrices in the first place, without ever getting more general 4D linear transformations which are not affine.
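Closure can be checked numerically with a short sketch. The two matrices below are arbitrary made-up examples; the check confirms that the bottom row of the product is still 0 0 0 1, and that the last column is the first matrix's 3x3 block applied to the second translation, plus the first translation.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# First matrix: rotate 90 degrees about Z, scale Z by 2, translate by (5, 0, 0).
M1 = [[0, -1, 0, 5],
      [1,  0, 0, 0],
      [0,  0, 2, 0],
      [0,  0, 0, 1]]

# Second matrix: a pure translation by (1, 2, 3).
M2 = [[1, 0, 0, 1],
      [0, 1, 0, 2],
      [0, 0, 1, 3],
      [0, 0, 0, 1]]

product = mat_mul(M1, M2)
bottom_row = product[3]
translation_column = [row[3] for row in product[:3]]

# 3x3 block of M1 applied to M2's translation, plus M1's translation.
expected_translation = [
    sum(M1[i][k] * M2[k][3] for k in range(3)) + M1[i][3] for i in range(3)
]
```

For these inputs the translation column is [3, 1, 6], which differs from the plain sum of the two translations, [6, 2, 3], because the first matrix also rotates and scales.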

Frustum projection

But wait, there is one more important transformation that we do all the time: glFrustum, which makes an object that is 2x further away appear 2x smaller.

First get some intuition about glOrtho vs glFrustum at: https://stackoverflow.com/questions/2571402/explain-the-usage-of-glortho/36046924#36046924

glOrtho can be done just with translations + scaling, but how can we implement glFrustum with matrices?

Suppose that:

  • our eye is at the origin, looking at -z
  • the screen (near plane) is at z = -1 and is a square of side length 2
  • the far plane of the frustum is at z = -2

If only we allowed more general 4-vectors of type:

(x, y, z, w)

with w != 0, and in addition we identify every (x, y, z, w) with (x/w, y/w, z/w, 1), then a frustum transformation with the matrix would be:

| 1 0  0 0 |   | x |   |  x |               | x / -z |
| 0 1  0 0 | * | y | = |  y | identified to | y / -z |
| 0 0  1 0 |   | z |   |  z |               |     -1 |
| 0 0 -1 0 |   | w |   | -z |               |      1 |

If we throw away z and w at the end, we get:

  • x_proj = x / -z
  • y_proj = y / -z

which is exactly what we wanted! We can verify that for some values, e.g.:

  • if z == -1, exactly on the plane we're projecting to, x_proj == x and y_proj == y.
  • if z == -2, then x_proj = x/2: objects are half size.

Note how the glFrustum transform is not of affine form: it cannot be implemented just with rotations and translations.

The mathematical "trickery" of adding the w and dividing by it is called homogeneous coordinates.
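A minimal sketch of the projection above. It uses the simplified matrix from this answer, not the full matrix that glFrustum actually builds, and the function name project is made up: multiply by the matrix, divide by w, then throw away z and w.

```python
# Simplified frustum matrix from the text: it copies x, y, z and sets w = -z.
FRUSTUM = [[1, 0,  0, 0],
           [0, 1,  0, 0],
           [0, 0,  1, 0],
           [0, 0, -1, 0]]

def project(p):
    """Project the 3D point p onto the plane z = -1, with the eye at the origin."""
    v = (p[0], p[1], p[2], 1.0)
    x, y, z, w = (sum(FRUSTUM[i][k] * v[k] for k in range(4)) for i in range(4))
    # Perspective divide: identify (x, y, z, w) with (x/w, y/w, z/w, 1).
    return (x / w, y / w)

on_near_plane = project((1, 1, -1))  # exactly on the plane we project to
twice_as_far = project((1, 1, -2))   # twice as far from the eye
```

The first point is left unchanged at (1.0, 1.0), and the second comes out at (0.5, 0.5), i.e. half the size. A point with z = 0 would give w = 0 and a division by zero, which is one reason a real frustum starts at a near plane in front of the eye.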

See also: related Stack Overflow question: https://stackoverflow.com/questions/2465116/understanding-opengl-matrices


@Downvoters, please explain so I can learn and improve.
Ciro Santilli 新疆改造中心法轮功六四事件

Personally I think this is just long and rambling, the part that addresses the original question isn't anything new (that isn't well covered by other answers) and the rest of it is irrelevant, making it really hard to wade through.
Josh

@JoshPetrie thanks for feedback! I think those who don't already understand would be more likely to understand from my answer, as it is more exemplified and visual. If you find specific errors or points which are completely irrelevant, point them so I can improve. Cheers.
Ciro Santilli 新疆改造中心法轮功六四事件

As I said, I think most of the answer is irrelevant. The question asks "why use 4x4 matrices, why can't we just add?" The answer to that is well covered with an explanation like "yes, you can add, but a matrix lets you translate/rotate/scale as well, but due to how matrix math works a 3x3 can't encode the translate, but a 4x4 one can." If you cover that at all in this wall of text, it's very hard to find. The rest of it is a primer on matrix math that was not asked about, and while it would probably be fine as an answer to another question, I don't think it's a good fit with this question.
Josh

1
I appreciated the attention to detail. To address the previous user's concern, the answer should be rearranged to start at "Translations cannot be represented by 3D matrices". This answers the immediate question posed and the OP can continue on to the further well written and enthusiastic details provided; those finer details being what I'm interested in here so I may be biased, but this is certainly not "rambling".
dskinner

1

See this video to understand the concepts of model, view and projection.

4x4 matrices are not just used for translating a 3D object. But also for various other purposes.

See this to understand how the vertices in the world are represented as 4D Matrices and how they are transformed.


1
This doesn't actually answer the OP question.
concept3d

Edited. Sounds good?
cegprakash
Licensed under cc by-sa 3.0 with attribution required.