A vector is also called a directed quantity. It is a geometric object that has both magnitude and direction and satisfies the parallelogram law. The concept opposite to a vector is a scalar, which only has magnitude and in most cases has no direction. We usually use a line segment with an arrow to represent a vector. Vectors in the Cartesian plane are shown below. It should be noted that a vector expresses magnitude and direction, and it does not specify a start point or end point, so the same vector can be drawn at any position. For example, the two vectors
There are many algebraic ways to represent a vector. For a vector in two-dimensional space, the following notations are all acceptable:
The size of a vector is called its norm, and it is a scalar. For a vector in two-dimensional space, its norm can be calculated with the formula below:
Notice that
Vectors of the same dimension can be added to produce a new vector. The operation is to add the corresponding components of the vectors, as shown below:
Vector addition satisfies the parallelogram law. In other words, if vectors
A vector
We can use NumPy arrays to represent vectors. Vector addition can be done by adding two arrays, and scalar multiplication of a vector can be done by multiplying an array by a scalar, so we will not repeat that here.
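For readers who want a quick reminder anyway, here is a minimal sketch of those operations. The vectors `u` and `v` are illustrative values chosen for this example, not taken from the text above.

```python
import numpy as np

# Two illustrative 2D vectors (values chosen only for this sketch)
u = np.array([3.0, 4.0])
v = np.array([1.0, 2.0])

total = u + v                 # add corresponding components
scaled = 2 * u                # multiply every component by the scalar 2
length = np.linalg.norm(u)    # Euclidean norm: sqrt(3**2 + 4**2)

print(total)    # [4. 6.]
print(scaled)   # [6. 8.]
print(length)   # 5.0
```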
The dot product is one of the most important operations between two vectors. The operation is to multiply corresponding components and then add the products together, so the result of a dot product is a scalar. Geometrically, it is equal to the product of the norms of the two vectors multiplied by the cosine of the angle between them.
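The two descriptions of the dot product can be compared numerically. The sketch below uses two hypothetical vectors whose angle is known to be 45 degrees, so the geometric form can be evaluated without first using the dot product itself.

```python
import numpy as np

# Illustrative vectors: a lies on the x-axis, b on the diagonal, so the angle is 45 degrees.
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

# Algebraic form: multiply corresponding components, then add the products.
dot_by_components = float(np.sum(a * b))

# Geometric form: product of the norms times the cosine of the angle.
angle = np.pi / 4
dot_by_geometry = np.linalg.norm(a) * np.linalg.norm(b) * np.cos(angle)

print(dot_by_components)
print(np.isclose(dot_by_components, dot_by_geometry))   # True
print(np.isclose(dot_by_components, np.dot(a, b)))      # True
```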
Suppose we use a three-dimensional vector to represent a user's preference for comedy, romance, and action movies. We use numbers from 1 to 5 to represent the degree of liking, where 5 means like very much, 4 means like, 3 means neutral, 2 means dislike, and 1 means strongly dislike. Then the vector below represents a user who likes comedy very much, strongly dislikes romance, and feels neutral about action movies.

$$
\boldsymbol{u} = (5, 1, 3)
$$

Now suppose two movies are released. One is a romantic comedy, and the other is a comedy action movie. We also represent these two movies with 3-dimensional vectors, as shown below:

$$
\boldsymbol{m_1} = (4, 5, 1) \qquad \boldsymbol{m_2} = (5, 1, 5)
$$
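If we need to recommend one movie to this user, which one should we recommend? We can compute the dot product of the user vector $\small{\boldsymbol{u}}$ with each movie vector and divide it by the product of the two norms. The result is the cosine of the angle between the two vectors, and the closer it is to 1, the better the movie matches the user's preferences.

Some people may say that even by looking at it, the movie represented by vector $\small{\boldsymbol{m_2}}$ is the better match for this user, and the calculation lets us check that. In NumPy, the dot product of two vectors can be computed with the `dot` function, and the norm of a vector can be calculated with the `norm` function in NumPy's `linalg` module, as shown below.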
import numpy as np

u = np.array([5, 1, 3])    # user preferences: comedy, romance, action
m1 = np.array([4, 5, 1])   # romantic comedy
m2 = np.array([5, 1, 5])   # comedy action movie
print(np.dot(u, m1) / (np.linalg.norm(u) * np.linalg.norm(m1)))  # 0.7302967433402214
print(np.dot(u, m2) / (np.linalg.norm(u) * np.linalg.norm(m2)))  # 0.9704311900788593

In two-dimensional space, the cross product of two vectors is defined like this:
For three-dimensional space, the cross product of two vectors is itself a vector:
Since the result of a cross product is a vector, the results of
In NumPy, we can use the cross function to calculate the cross product of vectors, as shown below.
print(np.cross(u, m1))  # [-14 7 21]
print(np.cross(m1, u))  # [ 14 -7 -21]

A determinant is usually written as
Determinants arise from vectors, so the properties of a determinant are really properties of the vectors that form its rows or columns.
- If one row or one column of $\small{det(\boldsymbol{A})}$ is all 0, then $\small{det(\boldsymbol{A}) = 0}$.
- If one row or one column of $\small{det(\boldsymbol{A})}$ has a common factor $\small{k}$, then $\small{k}$ can be taken out, and $\small{det(\boldsymbol{A}) = k \cdot det(\boldsymbol{A^{'}})}$.
- If every element in one row or one column of $\small{det(\boldsymbol{A})}$ is the sum of two numbers, this determinant can be split into the sum of two determinants.
- If two rows or two columns are proportional, then $\small{det(\boldsymbol{A}) = 0}$.
- If swapping two rows or two columns gives $\small{det(\boldsymbol{A^{'}})}$, then $\small{det(\boldsymbol{A}) = -det(\boldsymbol{A^{'}})}$.
- If we add $\small{k}$ times one row or one column to another row or column, the value of the determinant does not change.
- If we swap rows and columns in a determinant (that is, transpose it), the value of the determinant does not change.
- The determinant of the product of square matrices $\small{\boldsymbol{A}}$ and $\small{\boldsymbol{B}}$ is equal to the product of their determinants, that is $\small{det(\boldsymbol{A}\boldsymbol{B}) = det(\boldsymbol{A})det(\boldsymbol{B})}$. In a special case, if every row of an $\small{n \times n}$ matrix is multiplied by the constant $\small{r}$, the determinant becomes $\small{r^n}$ times the original one.
- If $\small{\boldsymbol{A}}$ is an invertible matrix, then $\small{det(\boldsymbol{A}^{-1}) = (det(\boldsymbol{A}))^{-1}}$.
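Several of the properties listed above can be spot-checked numerically. The sketch below is only a sanity check on two hypothetical matrices `A` and `B` chosen for this example, not a proof. It compares floating-point results with `np.isclose` because `np.linalg.det` returns rounded values.

```python
import numpy as np

# Two illustrative invertible 3x3 matrices (assumed values for this sketch)
A = np.array([[1.0, 3.0, 5.0], [2.0, 4.0, 6.0], [4.0, 7.0, 9.0]])
B = np.array([[2.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 3.0, 1.0]])
det = np.linalg.det
n = A.shape[0]

sheared = A.copy()
sheared[2] += 3 * sheared[0]   # add 3 times row 0 to row 2

checks = {
    "row swap flips the sign": np.isclose(det(A[[1, 0, 2]]), -det(A)),
    "adding k times a row changes nothing": np.isclose(det(sheared), det(A)),
    "transpose changes nothing": np.isclose(det(A.T), det(A)),
    "det(AB) = det(A) det(B)": np.isclose(det(A @ B), det(A) * det(B)),
    "scaling every row by r gives r**n": np.isclose(det(2 * A), 2**n * det(A)),
    "det(inverse) = 1 / det": np.isclose(det(np.linalg.inv(A)), 1 / det(A)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```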
The general formula of an
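For a second-order determinant, the formula above becomes:

$$
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}
$$

For a third-order determinant, the formula above becomes:

$$
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}
$$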
Higher-order determinants can be expanded into several lower-order determinants by cofactors, as shown below:
Here,
A matrix is a rectangular array made by arranging a series of elements. The elements in a matrix can be numbers, symbols, or mathematical formulas. Matrices can perform addition, subtraction, scalar multiplication, transpose, matrix multiplication, and other operations, as shown below.
One operation worth mentioning is matrix multiplication. It is defined only when the number of columns of the first matrix
For example:
Matrix multiplication satisfies associativity and distributivity over matrix addition:
Associativity:
Left distributive law:
Right distributive law:
Matrix multiplication does not satisfy the commutative law. In general, the product
One basic application of matrix multiplication is in systems of linear equations. A system of linear equations can be written in vector form as:
where
Matrices are also a convenient way to express linear transformations. If the discussion above feels hard to understand, I recommend the video series The Essence of Linear Algebra, which can give you a better understanding of linear algebra.
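To make the idea of a matrix as a linear transformation more concrete, here is a small sketch with two hypothetical 2x2 matrices: `R` rotates the plane by 90 degrees counter-clockwise and `S` is a horizontal shear. It also spot-checks the multiplication laws discussed earlier on these particular matrices. A numeric check on one example illustrates the laws but does not prove them.

```python
import numpy as np

# Illustrative transformations (assumed values for this sketch)
R = np.array([[0, -1], [1, 0]])   # rotate 90 degrees counter-clockwise
S = np.array([[1, 1], [0, 1]])    # horizontal shear
C = np.array([[2, 0], [0, 3]])    # scale x by 2 and y by 3
v = np.array([1, 0])

print(R @ v)    # rotating (1, 0) gives [0 1]
print(R @ S)    # shear first, then rotate
print(S @ R)    # rotate first, then shear: a different matrix

print(np.array_equal(R @ S, S @ R))                   # False: not commutative
print(np.array_equal((R @ S) @ C, R @ (S @ C)))       # True: associativity
print(np.array_equal(R @ (S + C), R @ S + R @ C))     # True: left distributive law
print(np.array_equal((S + C) @ R, S @ R + C @ R))     # True: right distributive law
```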
NumPy provides a module for linear algebra and also a matrix type for representing matrices. Of course, we can also use a two-dimensional array to represent a matrix. Officially, using the matrix class is not recommended. It is better to use two-dimensional arrays, and the matrix class may be removed in future versions. Still, with these wrapped classes and functions, we can perform many matrix operations very easily.
We can create matrix objects with the following code:
m1 = np.matrix('1 2 3; 4 5 6')
m1

Note: The `matrix` constructor can accept either an array-like object or a string.
Output:
matrix([[1, 2, 3],
[4, 5, 6]])
m2 = np.asmatrix(np.array([[1, 1], [2, 2], [3, 3]]))
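m2

Note: In older NumPy releases, the `asmatrix` function can also be replaced with the `mat` function. They are actually the same function. `mat` was removed from the main namespace in NumPy 2.0, so `asmatrix` is the safer choice.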
Output:
matrix([[1, 1],
[2, 2],
[3, 3]])
m1 * m2

Output:
matrix([[14, 14],
[32, 32]])
Note: Be careful about the difference between multiplication of `matrix` objects and `ndarray` objects. The `*` operator for a `matrix` object means matrix multiplication. If you want to do matrix multiplication with two two-dimensional arrays, you should use the `@` operator or the `matmul` function, not the `*` operator.
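The difference is easy to see on two small two-dimensional arrays. The arrays `a` and `b` below are illustrative values chosen for this sketch.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

elementwise = a * b               # multiplies matching elements
matrix_product = a @ b            # rows of a times columns of b
same_as_matmul = np.matmul(a, b)  # function form of the @ operator

print(elementwise)      # [[ 5 12] [21 32]]
print(matrix_product)   # [[19 22] [43 50]]
```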
Useful properties of matrix objects are shown below.
| Property | Description |
|---|---|
| `A` | Get the corresponding `ndarray` object |
| `A1` | Get the flattened `ndarray` object |
| `I` | The inverse matrix of an invertible matrix |
| `T` | The transpose of the matrix |
| `H` | The conjugate transpose of the matrix |
| `shape` | The shape of the matrix |
| `size` | The number of elements in the matrix |
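Since two-dimensional arrays are the recommended way to represent matrices, it can help to see how the same information is obtained from an `ndarray`. The mapping below is a sketch of one common set of equivalents, with an illustrative array `a`.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])

transpose = a.T                  # counterpart of matrix.T
inverse = np.linalg.inv(a)       # counterpart of matrix.I
conj_transpose = a.conj().T      # counterpart of matrix.H
flattened = a.ravel()            # counterpart of matrix.A1

print(a.shape, a.size)           # (2, 2) 4
print(flattened)                 # [1. 2. 3. 4.]
```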
The methods of matrix objects are almost the same as the methods of ndarray objects that we talked about earlier, so I will not repeat them here.
NumPy's linalg module contains a group of standard matrix decomposition operations, and functions such as inverse and determinant. The table below lists some commonly used linear-algebra-related functions in numpy and linalg.
| Function | Description |
|---|---|
| `diag` | Return the diagonal elements of a square matrix as a one-dimensional array, or turn a one-dimensional array into a square matrix with zeros off the diagonal |
| `matmul` | Matrix multiplication |
| `trace` | Compute the sum of diagonal elements |
| `norm` | Compute the norm of a matrix or vector |
| `det` | Compute the determinant |
| `matrix_rank` | Compute the rank of a matrix |
| `eig` | Compute eigenvalues and eigenvectors |
| `inv` | Compute the inverse of a non-singular square matrix |
| `pinv` | Compute the Moore-Penrose generalized inverse |
| `qr` | QR decomposition |
| `svd` | Singular value decomposition |
| `solve` | Solve the linear system $\small{\boldsymbol{A}\boldsymbol{x}=\boldsymbol{b}}$ |
| `lstsq` | Compute the least-squares solution of $\small{\boldsymbol{A}\boldsymbol{x}=\boldsymbol{b}}$ |
Let us try a few of the functions above, starting with the inverse of a matrix.
m3 = np.array([[1., 2.], [3., 4.]])
m4 = np.linalg.inv(m3)
m4

Output:
array([[-2. , 1. ],
[ 1.5, -0.5]])
np.around(m3 @ m4)

Note: The `around` function rounds array elements. By default, the number of decimal places is `0`.
Output:
array([[1., 0.],
[0., 1.]])
Note: A matrix multiplied by its inverse matrix gives the identity matrix.
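Rounding hides the small floating-point error in the product. An alternative that I find clearer is to compare the product with the identity matrix within a tolerance, using `np.allclose` and `np.eye`. This is a sketch of that check, not the approach used above.

```python
import numpy as np

m3 = np.array([[1.0, 2.0], [3.0, 4.0]])
m4 = np.linalg.inv(m3)

product = m3 @ m4
# True when every element of the product is within a small tolerance of the identity matrix
is_identity = np.allclose(product, np.eye(2))
print(is_identity)   # True
```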
Let us compute the determinant:
m5 = np.array([[1, 3, 5], [2, 4, 6], [4, 7, 9]])
np.linalg.det(m5)

Output (the last digits may differ slightly because of floating-point rounding):

2.0
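The result can be cross-checked with a cofactor expansion along the first row, which reduces the third-order determinant to three second-order ones. The helper `det2` below is a hypothetical name introduced only for this sketch.

```python
import numpy as np

m5 = np.array([[1, 3, 5], [2, 4, 6], [4, 7, 9]])

def det2(a, b, c, d):
    """Second-order determinant of [[a, b], [c, d]]."""
    return a * d - b * c

# Expand along the first row: each entry times its signed 2x2 minor.
(a, b, c), (d, e, f), (g, h, i) = m5.tolist()
by_hand = a * det2(e, f, h, i) - b * det2(d, f, g, i) + c * det2(d, e, g, h)

print(by_hand)                                   # 2
print(np.isclose(np.linalg.det(m5), by_hand))    # True
```

Because the hand computation uses Python integers it is exact, while `np.linalg.det` works in floating point, which is why the comparison uses `np.isclose`.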
Let us compute the rank of the matrix:
np.linalg.matrix_rank(m5)

Output:
3
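Let us solve a linear system:

$$
\begin{cases}
x_1 + 2x_2 + x_3 = 8 \\
3x_1 + 7x_2 + 2x_3 = 23 \\
2x_1 + 2x_2 + x_3 = 9
\end{cases}
$$

For the system above, we can write it in matrix form as:

$$
\boldsymbol{A} = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 7 & 2 \\ 2 & 2 & 1 \end{bmatrix}, \quad
\boldsymbol{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad
\boldsymbol{b} = \begin{bmatrix} 8 \\ 23 \\ 9 \end{bmatrix}, \quad
\boldsymbol{A}\boldsymbol{x} = \boldsymbol{b}
$$

The condition for a system of linear equations to have a unique solution is that the rank of the coefficient matrix $\small{\boldsymbol{A}}$ equals the rank of the augmented matrix $\small{(\boldsymbol{A} \mid \boldsymbol{b})}$, and both equal the number of unknowns. The code below checks this condition.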
A = np.array([[1, 2, 1], [3, 7, 2], [2, 2, 1]])
b = np.array([8, 23, 9]).reshape(-1, 1)
print(np.linalg.matrix_rank(A))
print(np.linalg.matrix_rank(np.hstack((A, b))))

Note: When reshaping an array object, if one argument is `-1`, the number of elements in that dimension is computed automatically from the total number of elements and the other dimension sizes.
Output:
3
3
np.linalg.solve(A, b)

Output:
array([[1.],
[2.],
[3.]])
Note: The result above shows that the solution to the linear system is $\small{x_1 = 1, x_2 = 2, x_3 = 3}$.
There is another way to solve the linear system. You can stop and think about why:
np.linalg.inv(A) @ b

Output:
array([[1.],
[2.],
[3.]])
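One way to see why this works: when the coefficient matrix is invertible, multiplying both sides of the matrix equation on the left by its inverse leaves the vector of unknowns alone on the left-hand side. The sketch below repeats both approaches on the same `A` and `b`, confirms that they agree, and substitutes the solution back into the system.

```python
import numpy as np

A = np.array([[1, 2, 1], [3, 7, 2], [2, 2, 1]])
b = np.array([8, 23, 9]).reshape(-1, 1)

x_solve = np.linalg.solve(A, b)      # solves the system directly
x_inv = np.linalg.inv(A) @ b         # forms the inverse first, then multiplies

print(np.allclose(x_solve, x_inv))   # True: both routes give the same solution
print(np.allclose(A @ x_solve, b))   # True: substituting the solution reproduces b
```

In practice `solve` is usually the better choice, since it avoids building the inverse explicitly, which generally costs more work and can lose accuracy.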
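Besides arrays, NumPy also wraps a data type for doing polynomial operations. A polynomial is a sum of terms, each of which is the product of a coefficient and a non-negative integer power of a variable, in the form:

$$
f(x) = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0
$$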
Before NumPy 1.4, we could use the poly1d type to represent polynomials. It is still available now, but NumPy officially provides the newer numpy.polynomial module. In addition to ordinary power-series polynomials, it also supports Chebyshev polynomials, Laguerre polynomials, and others.
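Create `poly1d` objects, for example $\small{3x^2 + 2x + 1}$ and $\small{x^2 + 2x + 3}$ (coefficients are listed from the highest power to the lowest)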
p1 = np.poly1d([3, 2, 1])
p2 = np.poly1d([1, 2, 3])
print(p1)
print(p2)

Output:

   2
3 x + 2 x + 1
   2
1 x + 2 x + 3
Get the coefficients of a polynomial
print(p1.coefficients)
print(p2.coeffs)

Output:
[3 2 1]
[1 2 3]
Arithmetic operations on two polynomials

print(p1 + p2)
print(p1 * p2)

Output:

   2
4 x + 4 x + 4
   4     3      2
3 x + 8 x + 14 x + 8 x + 3
Substitute a value of $\small{x}$ to evaluate the polynomial

print(p1(3))
print(p2(3))

Output:
34
18
Differentiate and integrate a polynomial
print(p1.deriv())
print(p1.integ())

Output:

6 x + 2
   3     2
1 x + 1 x + 1 x
Find the roots of a polynomial
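For example, given the polynomial $\small{x^2 + 3x + 2}$, its roots (the solutions of $\small{x^2 + 3x + 2 = 0}$) can be obtained as follows:

p3 = np.poly1d([1, 3, 2])
print(p3.roots)

Output: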
[-2. -1.]
If we use the Polynomial class from the numpy.polynomial module to represent polynomial objects, then the corresponding operations are as follows:
from numpy.polynomial import Polynomial
p3 = Polynomial((2, 3, 1))
print(p3) # print the polynomial
print(p3(3)) # let x=3 and evaluate the polynomial
print(p3.roots()) # compute the roots of the polynomial
print(p3.degree()) # get the degree of the polynomial
print(p3.deriv()) # differentiate
print(p3.integ())    # compute the indefinite integral

Output:
2.0 + 3.0·x + 1.0·x²
20.0
[-2. -1.]
2
3.0 + 2.0·x
0.0 + 2.0·x + 1.5·x² + 0.33333333·x³
The Polynomial class also has a class method named fit, which can find the least-squares solution of a polynomial. The so-called least-squares solution is to use the least-squares method to find the coefficients of the function that best matches the data by minimizing the sum of squared errors. Suppose the polynomial is
For example, suppose we want to use collected historical data of monthly income and online-shopping spending to build a prediction model, so that we can predict a person's online-shopping spending amount from that person's monthly income. The income and spending data we collected are stored in the two arrays below.
x = np.array([
25000, 15850, 15500, 20500, 22000, 20010, 26050, 12500, 18500, 27300,
15000, 8300, 23320, 5250, 5800, 9100, 4800, 16000, 28500, 32000,
31300, 10800, 6750, 6020, 13300, 30020, 3200, 17300, 8835, 3500
])
y = np.array([
2599, 1400, 1120, 2560, 1900, 1200, 2320, 800, 1650, 2200,
980, 580, 1885, 600, 400, 800, 420, 1380, 1980, 3999,
3800, 725, 520, 420, 1200, 4020, 350, 1500, 560, 500
])

We can first draw a scatter plot to see whether the two groups of data have a positive correlation or a negative correlation. Positive correlation means larger values in array `x` also match larger values in array `y`, while negative correlation means larger values in array `x` match smaller values in array `y`.
import matplotlib.pyplot as plt
plt.figure(dpi=120)
plt.scatter(x, y, color='blue')
plt.show()

Output:
If we want to study the correlation of the two groups of data quantitatively, we can compute the covariance or correlation coefficient. The corresponding NumPy functions are cov and corrcoef.
np.corrcoef(x, y)

Output:
array([[1. , 0.92275889],
[0.92275889, 1. ]])
Note: The correlation coefficient is a value between `-1` and `1`. The closer it is to `1`, the stronger the positive correlation is. The closer it is to `-1`, the stronger the negative correlation is. If it is close to `0`, it means the two groups of data do not have an obvious linear correlation. The correlation coefficient between monthly income and online-shopping spending above is `0.92275889`, which means they have a strong positive correlation.
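To show how the two functions relate, the sketch below rebuilds the correlation coefficient from the covariance: the covariance of the two arrays divided by the product of their standard deviations. The arrays `xs` and `ys` are small illustrative values, not the income data above, and `ddof=1` is passed to `np.std` so that it matches the sample covariance that `np.cov` computes by default.

```python
import numpy as np

# Small illustrative arrays (assumed values for this sketch)
xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ys = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

cov_xy = np.cov(xs, ys)[0, 1]    # off-diagonal entry is the covariance of xs and ys
r_manual = cov_xy / (np.std(xs, ddof=1) * np.std(ys, ddof=1))
r_numpy = np.corrcoef(xs, ys)[0, 1]

print(np.isclose(r_manual, r_numpy))   # True
```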
From the work above, we confirm that there is a strong positive correlation between income and online-shopping spending, so we use these data to build a regression model and find a straight line that can fit these data points well. Here we can use the fit method mentioned above. The code is shown below.
from numpy.polynomial import Polynomial
Polynomial.fit(x, y, deg=1).convert().coef

Note: `deg=1` means the highest power in the regression model is first degree, so the model has the form $\small{y=ax+b}$. If you want a model like $\small{y=ax^2+bx+c}$, you need to set `deg=2`, and so on.
Output:
array([-2.94883437e+02, 1.10333716e-01])
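According to the output above, our regression equation should be $\small{y = 0.110333716x - 294.883437}$. The code below draws this line on top of the scatter plot.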
import matplotlib.pyplot as plt
plt.scatter(x, y, color='blue')
plt.scatter(x, 0.110333716 * x - 294.883437, color='red')
plt.plot(x, 0.110333716 * x - 294.883437, color='darkcyan')
plt.show()

Output:
If we do not use the fit method of the Polynomial type, we can also do the same thing with the polyfit function provided by NumPy. Interested readers can study that by themselves.
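As a starting point, here is a minimal sketch of `polyfit` next to `Polynomial.fit`. It uses synthetic points that lie exactly on a known straight line (an assumption made so the expected coefficients are obvious) rather than the income data. The main thing to watch is the ordering: `polyfit` returns coefficients from the highest power to the lowest, while the `coef` attribute of a `Polynomial` lists them from the lowest power to the highest.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Synthetic points lying exactly on y = 2x + 1 (illustration only)
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = 2 * xs + 1

slope, intercept = np.polyfit(xs, ys, deg=1)                      # highest power first
coef_low_to_high = Polynomial.fit(xs, ys, deg=1).convert().coef   # lowest power first

print(np.isclose(slope, 2.0), np.isclose(intercept, 1.0))         # True True
print(np.allclose(coef_low_to_high, [intercept, slope]))          # True
```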
Note: Some images in this chapter come from Wikipedia.






