As long as it is coherent
No contradictions
No ambiguities
It is nice if it is useful
or at least beautiful
We use capital letters to represent matrices: \(A, B\) and \(C.\)
Their components are represented with upper-case letters and two
subscripts, one for the row and one for the column,
like \(A_{ij}\) and \(B_{12}.\)
Please notice that in \(B_{12}\) the subscript has two parts: row 1 and column 2.
In case of ambiguity we can use commas and write \(B_{1,2}.\)
The dimensions of a matrix are the number of its rows and the number of its columns.
For a matrix of 3 rows and 4 columns we write the dimensions as \(3 × 4.\)
As long as they have the same size
If we have two matrices \(A\) and \(B\) of the same dimensions \(m\times n\), then we can add them in the natural way, component by component. For \(2\times 2\) matrices:
\[ A+B= \begin{bmatrix} A_{11} + B_{11} & A_{12} + B_{12}\\ A_{21} + B_{21} & A_{22} + B_{22} \end{bmatrix} \]
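For example, with numbers chosen only for illustration, \[ \begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6\\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 1+5 & 2+6\\ 3+7 & 4+8 \end{bmatrix} = \begin{bmatrix} 6 & 8\\ 10 & 12 \end{bmatrix} \]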
There is a matrix equivalent to 0
(which one?)
If a set of equations is represented in matrix form as \(y=Ax\),
and we change the scale of all the equations
(for example, if we use microliters instead of liters),
then the equations are still valid.
But now the matrix \(A\) has to change by a scale factor, let’s say \(α\).
The new matrix \(B\) will have components \(B_{ij}=α A_{ij}\).
This operation is called multiplication by a scalar. We write
\[ α A = \begin{bmatrix} α A_{11} & α A_{12} \\ α A_{21} & α A_{22} \end{bmatrix} \]
In this context a scalar is any single number, real or complex.
Naturally, we can change the scale whenever we want, so multiplication by a scalar is commutative: \[α A = A α\]
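A small numeric illustration, with the numbers chosen arbitrarily: \[ 2 \begin{bmatrix} 1 & -3\\ 0 & 5 \end{bmatrix} = \begin{bmatrix} 2 & -6\\ 0 & 10 \end{bmatrix} \]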
We multiply everything by the same number
\[ A = \begin{bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22} \end{bmatrix} \qquad αA = \begin{bmatrix} αA_{11} & αA_{12}\\ αA_{21} & αA_{22} \end{bmatrix} \]
This single number \(α\) is called a scalar
(Think of it as a change of scale)
Something easy and useful is to “turn the matrix sideways”
This is called transposition
If \(𝐀∈ ℝ^{m× n},\) then we can build a new matrix \(𝐀ᵀ∈ ℝ^{n× m}\) such that each component of \(𝐀ᵀ\) is defined by \[(Aᵀ)_{ij} = A_{ji}\]
This is called \(𝐀\) transposed and is written \(𝐀ᵀ\)
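For example, transposing a \(2× 3\) matrix gives a \(3× 2\) matrix: \[ \begin{bmatrix} 1 & 2 & 3\\ 4 & 5 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 4\\ 2 & 5\\ 3 & 6 \end{bmatrix} \]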
One practical application of transposition is to write column vectors in the text.
Now we can write \(𝐛=(b_1,…,b_n)ᵀ\) in one line, instead of \[ 𝐛= \begin{bmatrix} b_{1} \\ \vdots\\ b_{n} \end{bmatrix} \]
In general \(𝐀≠𝐀ᵀ.\)
If \(𝐀\) is square, then sometimes \(𝐀=𝐀ᵀ\)
In that case we say that \(𝐀\) is symmetric.
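For example, \[ 𝐀 = \begin{bmatrix} 1 & 7\\ 7 & 3 \end{bmatrix} \] is symmetric: exchanging rows and columns leaves it unchanged.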
\[ 𝐚+𝐛= \begin{bmatrix} a_{1} + b_{1} \\ a_{2} + b_{2} \end{bmatrix} \qquad α𝐚= \begin{bmatrix} αa_{1} \\ αa_{2} \end{bmatrix} \]
We use bold lower-case letters to represent vectors, such as \(𝐚\) and \(𝐛.\)
The components of the vector are written with the same letter and a subscript, such as \(a_i\) and \(b_n.\)
(Figure: a vector before and after a sum, and after multiplication by a scalar.)
A vector \(𝐚=(a_1,a_2)\) corresponds to a point in the plane.
The distance between the origin \((0,0)\) and the point \((a_1,a_2)\) is called the magnitude of the vector.
(Sometimes we say size or length)
We write it as \(\Vert 𝐚 \Vert\) and we call it “the norm of the vector”
Since ancient times it has been known that \(a^2+b^2=c^2.\)
The sum of the squares of the sides is the square of the diagonal.
Using Pythagoras' theorem we find the length/size/magnitude of a vector:
\[\Vert 𝐚\Vert=\sqrt{a_1^2+a_2^2}\] or in general \[\Vert 𝐚\Vert=\sqrt{a_1^2+\cdots +a_n^2}\]
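For example, if \(𝐚=(3,4)^T\) then \(\Vert 𝐚\Vert=\sqrt{3^2+4^2}=\sqrt{25}=5.\)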
\[ \begin{align} y_1 & = a_{1,1}\,x_1 + a_{1,2}\,x_2\\ y_2 & = a_{2,1}\,x_1 + a_{2,2}\,x_2 \end{align} \] is the same as \[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}= \begin{bmatrix} a_{1,1} & a_{1,2}\\ a_{2,1} & a_{2,2} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]
\[ \begin{align} y_1 & = a_{1,1}\,x_1 + a_{1,2}\,x_2\\ y_2 & = a_{2,1}\,x_1 + a_{2,2}\,x_2 \end{align} \] is the same as \[ y_1 = \begin{bmatrix} a_{1,1} & a_{1,2}\\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\\ y_2 = \begin{bmatrix} a_{2,1} & a_{2,2}\\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]
\[ y_1 = a_{1,1}\,x_1 + a_{1,2}\,x_2 \]
is a row vector times a column vector
\[ y_1 = \begin{bmatrix} a_{1,1} & a_{1,2}\\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a_{1,1} \\ a_{1,2}\\ \end{bmatrix}^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]
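For instance, with arbitrary numbers, \[ \begin{bmatrix} 2 & 3 \end{bmatrix} \begin{bmatrix} 5 \\ 7 \end{bmatrix} = 2\cdot 5 + 3\cdot 7 = 31 \]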
Each equation has the same shape: \[y=a_1 x_1 + a_2 x_2\]
This is called the dot product.
If \(𝐚=(a_1,a_2)^T\) and \(𝐛=(b_1,b_2)^T\) then their dot product is \[𝐚 ⋅ 𝐛 = a_1 b_1 + a_2 b_2 = \sum_i a_i b_i\]
This is a way to combine two vectors that will be very useful later.
It is good to remember that the dot product can be written using matrix multiplication \[𝐚⋅𝐛 = 𝐚ᵀ𝐛= 𝐛ᵀ𝐚\]
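As a quick numeric check: if \(𝐚=(1,2)^T\) and \(𝐛=(3,-4)^T,\) then \(𝐚 ⋅ 𝐛 = 1\cdot 3 + 2\cdot(-4) = -5,\) and \(𝐚ᵀ𝐛\) gives the same number.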
Note: sometimes this is called the inner product
If the vector \(𝐱\) is \((a,0)\) then it is located on the horizontal axis.
If we want to rotate it by 90 degrees, we will get the vector \(𝐱'=(0,a).\)
Rotating it again we get \(𝐱''=(-a,0)\)
If the vector \(𝐲\) is \((a,b),\) then we can write it as \(𝐲= (a,0) + (0,b)\) and each part can rotate separately.
The part \((a,0)\) becomes \((0,a),\) and \((0,b)\) becomes \((-b,0).\)
Therefore \(𝐲'=(-b,a).\) The same reasoning shows that rotating again by 90 degrees will result in \(𝐲''=(-a,-b).\)
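As a concrete check, rotating \(𝐲=(2,1)\) by 90 degrees gives \(𝐲'=(-1,2),\) and rotating once more gives \(𝐲''=(-2,-1).\)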
We can draw some conclusions from this exercise.
First, changing the signs of each component of a vector will give us a new vector of the same size, but pointing in the opposite direction. This should not be very surprising.
The second conclusion is that the dot product of two perpendicular vectors is zero. Indeed we can see that \[𝐱⋅𝐱'=(a,0)⋅ (0,a)= 0a + 0a = 0\] and \[𝐲⋅𝐲'=(a, b)⋅ (-b, a)= ab - ab = 0.\]
This result is not limited to vectors of the same magnitude.
If the vector \(𝐳\) is parallel to \(𝐲\) but with different length, then \(𝐳 = α 𝐲\) for some number \(α.\)
Then \(𝐳\) should also be orthogonal to \(𝐲'.\) Indeed, we have \[𝐳⋅𝐲'= (α𝐲)⋅𝐲'= α(𝐲⋅ 𝐲') = 0.\]
This motivates the following characterization: two vectors \(𝐮\) and \(𝐯\) are said to be orthogonal if and only if their dot product is 0. The word orthogonal literally means right angle, and is a fancy way of saying perpendicular.
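For instance, \((2,3)\) and \((-6,4)\) are orthogonal, since \(2\cdot(-6) + 3\cdot 4 = -12 + 12 = 0.\)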
Notice that this implies that the vector \(𝟎\) is always orthogonal to any other vector.
Vectors in polar coordinates
Each vector can also be represented in polar coordinates: \(\mathbf{a}=(a_1,a_2)=(a\cos(\theta),a\sin(\theta))\) and \(\mathbf{b}=(b_1,b_2)=(b\cos(\phi),b\sin(\phi)),\) where \(a=\Vert\mathbf{a}\Vert\) and \(b=\Vert\mathbf{b}\Vert.\)
So the dot product is also
\[\mathbf{a}\cdot\mathbf{b} =a\cos(\theta)b\cos(\phi) + a\sin(\theta)b\sin(\phi)=a b \cos(\theta-\phi)\]
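The last step uses the trigonometric identity \(\cos(\theta)\cos(\phi)+\sin(\theta)\sin(\phi)=\cos(\theta-\phi).\)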
The dot product is a number proportional to the length of each vector and to the cosine of the angle between them.
If two vectors are perpendicular to each other, their dot product is zero.
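Indeed, in that case the angle between them is 90 degrees, and \(\cos(90^\circ)=0,\) so the product above is zero.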