I have a question about the step where the data are projected onto the new basis. We learnt that to project a point (or a matrix of points) onto a vector, we compute the dot product of the two, and that if the vector we are projecting onto is orthonormal, we can simply multiply the data by that vector.
In PCA, we multiply the data matrix by the eigenvectors of the covariance matrix to project the data onto the new basis. So my question is: what makes those vectors orthonormal, so that we can multiply them directly by the data? Why didn't we compute the full dot-product projection?
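To make the question concrete, here is a minimal NumPy sketch (not from the tutorial; the data are random) showing that when the direction is a unit vector, the per-point dot products and the matrix product are the same thing:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # 100 points in 3-D

# An arbitrary direction, normalized to unit length.
w = np.array([1.0, 2.0, 2.0])
w = w / np.linalg.norm(w)

# Projection of each point = dot product with w ...
proj_dots = np.array([x @ w for x in X])

# ... which is exactly the matrix product X @ w.
proj_mat = X @ w

assert np.allclose(proj_dots, proj_mat)
```

So the matrix multiplication *is* the dot-product projection, computed for all points at once; the question reduces to why the new basis vectors have unit length.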

First things first: you take the eigenvectors W of the covariance matrix. This is related to the SVD of the data matrix: X = SW' = UDV',
where the scores are S = UD and the loadings are W = V.
The SVD guarantees that all of these eigenvectors are orthogonal to each other, which means they are exactly the components (the new axes, geometrically) that we want.
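A minimal NumPy sketch of this claim (random centered data, not the tutorial's dataset): the loadings W = V from the SVD come out orthonormal, and the scores computed as S = XW match S = UD:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
X = X - X.mean(axis=0)                  # center the data

# SVD: X = U D V'  (NumPy returns V' as Vt)
U, d, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt.T                                # loadings = eigenvectors of cov(X)

# Columns of W are orthonormal: W'W = I
assert np.allclose(W.T @ W, np.eye(4))

# Scores two ways: S = X W  and  S = U D
S1 = X @ W
S2 = U * d                              # scale each column of U by d
assert np.allclose(S1, S2)
```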

First, from the definition of the covariance matrix, you should notice that it's symmetric, so its eigenvectors are orthogonal. Here's a link that explains in detail why:

Orthonormal means orthogonal + each vector has norm 1. You can make an orthogonal set of vectors orthonormal simply by normalizing each one to length 1.
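In practice you rarely need to do the normalization yourself: numerical eigensolvers for symmetric matrices return unit-length eigenvectors by convention. A small NumPy sketch (random symmetric matrix, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3))
C = A @ A.T                              # symmetric, like a covariance matrix

# eigh handles symmetric matrices; its eigenvectors come back
# already orthonormal (orthogonal AND unit length).
vals, V = np.linalg.eigh(C)
assert np.allclose(V.T @ V, np.eye(3))   # orthonormal columns

# If you only had orthogonal vectors, normalizing makes them orthonormal:
v = np.array([3.0, 4.0])
u = v / np.linalg.norm(v)
assert np.isclose(np.linalg.norm(u), 1.0)
```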

That link was helpful for explaining why the eigenvectors of a symmetric matrix are orthogonal, but I'm still stuck on their being orthonormal.
I learnt that we can easily make orthogonal vectors orthonormal, but the tutorial relies on the eigenvectors of the symmetric matrix already being orthonormal (when transforming data points from the old basis to the new basis, we multiply the data matrix directly by the eigenvectors of the covariance matrix to get the points in the new space, S = XW).
So is there a proof that the eigenvectors of a symmetric matrix are orthonormal, or did I misunderstand the tutorial?

The proof I think you're looking for is how SVD (Singular Value Decomposition) gets you eigenvectors. This handout does a good job, I think, of explaining it in the context of the W1D5 tutorial: https://www.cns.nyu.edu/~david/handouts/svd.pdf

But I think the videos in the tutorial briefly touch on this intuition by thinking of a matrix as just a linear transformation.

With SVD you can decompose a matrix into three parts: two matrices that perform rotational transformations and one that performs a scaling (or stretch) transformation, similar to the image below
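A quick NumPy check of that rotation–stretch–rotation picture (random 2x2 matrix, purely illustrative): U and V are orthogonal (pure rotations/reflections), and the singular values in D carry all the stretching:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(2, 2))

U, d, Vt = np.linalg.svd(A)

# U and V are orthogonal matrices (rotations/reflections) ...
assert np.allclose(U @ U.T, np.eye(2))
assert np.allclose(Vt @ Vt.T, np.eye(2))

# ... and d holds the stretch factors, so A = U D V' reconstructs exactly.
assert np.allclose(A, U @ np.diag(d) @ Vt)
```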