Math for ML Notes

Posted on June 2, 2019
Tags: machinelearning

Notes on Math for ML

Gaussian transformations are linear transformations that bring a matrix into row-echelon form.

np.linalg.inv(x) computes the inverse of a square matrix x.
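
A quick sketch of that call (the 2x2 matrix below is made up for illustration):

    import numpy as np

    # A small invertible matrix, just for illustration.
    x = np.array([[2.0, 1.0],
                  [1.0, 1.0]])

    x_inv = np.linalg.inv(x)                   # raises LinAlgError if x is singular
    print(np.allclose(x @ x_inv, np.eye(2)))   # True: x @ x^{-1} = I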

A matrix \(A_{m \times n}\) is a map between vector spaces: \[ A_{m \times n} :: \mathbb{R}^{n} \rightarrow \mathbb{R}^{m} \]

Given a vector space V, a set of vectors Q that can build any element of this vector space is a “generating set”: V = span(Q). This says Q spans V.

Note there can be multiple generating sets that span V, holding dependent or independent vectors.
We call a minimal such set the basis, and it is easy to see the basis is linearly independent.
\(\widetilde{A}\) is a minimal set that spans V.
\(\widetilde{A}\) is a maximal set that is linearly independent.

Note minimal and maximal here are not unique, so the basis is not unique:
there can be multiple bases, but they will all have the same number of elements.
\[ \left\{ \begin{bmatrix}1 \\0 \end{bmatrix}, \begin{bmatrix}0 \\1 \end{bmatrix} \right\}\qquad \text{Canonical Basis}\] \[\left\{ \begin{bmatrix}1 \\0 \end{bmatrix}, \begin{bmatrix}1 \\1 \end{bmatrix} \right\}\qquad \text{Another Basis}\]
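
As a concrete sketch of the second basis in action (the vector v is made up), the coordinates of a vector with respect to a non-canonical basis come from solving a linear system:

    import numpy as np

    # Columns are the basis vectors [1,0] and [1,1] from above.
    B = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    v = np.array([3.0, 2.0])

    coords = np.linalg.solve(B, v)   # solve B @ coords = v
    print(coords)                    # [1. 2.]: v = 1*[1,0] + 2*[1,1]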

0.0.0.1 Common misunderstanding with linear (in)dependence:

Despite the fact that one can envision \(\begin{bmatrix}1 \\0 \end{bmatrix}\) as being a component of \(\begin{bmatrix}1 \\1 \end{bmatrix}\), these two basis vectors are independent: neither is a scalar multiple of the other, so no nontrivial linear combination of them yields the zero vector.

Analogy:

0.0.0.2 Change of Basis

Let \(A :: V \rightarrow W\)
\(B, \widetilde{B} \in basis(V)\)
\(C, \widetilde{C} \in basis(W)\)
\(S :: \widetilde{B} \rightarrow B\)
\(T :: \widetilde{C} \rightarrow C\)

The goal is to design a change of basis for the matrix \(A\), which we define as
\(\widetilde{A} :: \widetilde{B} \rightarrow \widetilde{C}\)

Theorem: each vector of one basis can be written as a linear combination of the vectors of another basis. Using this theorem we can design the linear combinations S and T, and the change of basis becomes \(\widetilde{A} = T^{-1} A S\).
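
A minimal numerical sketch of that formula (all three matrices are made up; the columns of S and T hold the new basis vectors expressed in the old bases):

    import numpy as np

    # A maps coordinates w.r.t. B (in V) to coordinates w.r.t. C (in W).
    A = np.array([[1.0, 2.0],
                  [0.0, 1.0]])

    # Columns of S: vectors of B~ in terms of B; columns of T: vectors of C~ in terms of C.
    S = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    T = np.array([[2.0, 0.0],
                  [0.0, 1.0]])

    A_tilde = np.linalg.inv(T) @ A @ S   # A~ = T^{-1} A S
    print(A_tilde)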

The kernel or null space of a matrix or linear transformation is the preimage of the zero vector, i.e. the set of vectors that hit the zero vector: \(\ker(A) = \{v : Av = 0\}\).
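
For example (a sketch using SciPy's null_space helper on a made-up rank-deficient matrix):

    import numpy as np
    from scipy.linalg import null_space

    # Rank-1 matrix, so its kernel is one-dimensional.
    A = np.array([[1.0, 2.0],
                  [2.0, 4.0]])

    K = null_space(A)             # orthonormal basis for ker(A), shape (2, 1)
    print(np.allclose(A @ K, 0))  # True: every kernel vector maps to zero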

Matrices are linear mappings, also called vector space homomorphisms or linear transformations. Linear maps can be classified as, e.g., isomorphisms (bijective), endomorphisms (\(V \rightarrow V\)), and automorphisms (bijective \(V \rightarrow V\)).

Finite-dim vector spaces V and W are isomorphic iff \(\dim(V) = \dim(W)\).

Affine subspace or linear manifold: a linear subspace shifted by an offset, \(L = x_0 + U\).

1 Geometric interpretation

Norm on a vector space := a function \(\| \cdot \| :: V \rightarrow \mathbb{R}\) that is absolutely homogeneous, satisfies the triangle inequality, and is positive definite.
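
A quick sketch with numpy's built-in norms (the vector is made up):

    import numpy as np

    v = np.array([3.0, -4.0])
    print(np.linalg.norm(v, 1))   # L1 norm: |3| + |-4| = 7
    print(np.linalg.norm(v, 2))   # L2 (Euclidean) norm: sqrt(9 + 16) = 5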

2 Matrix Decomposition

Triangular matrix: determinant = product of the diagonal entries, \(\det(T) = \prod_i t_{ii}\).
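
A one-line numerical check of this fact (the triangular matrix is made up):

    import numpy as np

    # Upper-triangular example matrix.
    T = np.array([[2.0, 5.0, 1.0],
                  [0.0, 3.0, 4.0],
                  [0.0, 0.0, 7.0]])

    print(np.isclose(np.linalg.det(T), np.prod(np.diag(T))))  # True: 2*3*7 = 42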

3 Continuous optimization

4 Linear regression

5 my old courses

5.1 Lin Reg

  • Loss: \(\frac{1}{2} \| y' - y \|^2\)
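
A small sketch of minimizing this squared-error loss with least squares (synthetic data, all names illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))                 # 100 samples, 3 features
    w_true = np.array([1.0, -2.0, 0.5])
    y = X @ w_true + 0.1 * rng.normal(size=100)   # noisy targets

    # Least-squares solution minimizes (1/2) ||X w - y||^2.
    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(w_hat)   # close to w_true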

5.2 0-1 Loss

  • Loss: \(\frac{1}{n}\sum_{i=1}^{n} \mathrm{sigmoid}(-y_i f(x_i))\), a smooth surrogate for the 0-1 loss

  • Log loss

  • Decision:

5.3 Perceptron

  • Outputs: 0 or 1
  • Loss: \(-(y'-y)(w^Tx)\)
  • Derivative: \(-(y'-y)x\)
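
A tiny numpy sketch of the perceptron update that follows from this derivative, trained on a made-up OR-gate dataset:

    import numpy as np

    def step(z):
        return (z >= 0).astype(float)      # perceptron output: 0 or 1

    # Toy OR-gate data; the bias is folded in as a constant 1 feature.
    X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
    t = np.array([0.0, 1.0, 1.0, 1.0])     # targets y'

    w = np.zeros(3)
    for _ in range(10):
        for x, y_true in zip(X, t):
            y = step(w @ x)
            w += (y_true - y) * x          # minus the derivative -(y'-y)x

    print(step(X @ w))   # matches t after training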

5.4 Logistic Regression

  • Outputs: 0 to 1
  • Loss: \(-\log(P(X=y'))\)
    • \(-(y'\log(y)+(1-y')\log(1-y))\)
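
A short sketch of this cross-entropy loss (names and inputs made up):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def log_loss(y_true, y_pred):
        # -(y' log(y) + (1 - y') log(1 - y))
        return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

    y_pred = sigmoid(np.array([2.0, -1.0]))   # model outputs in (0, 1)
    y_true = np.array([1.0, 0.0])
    print(log_loss(y_true, y_pred))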

5.5 Fixed basis

  • Decision: \(\phi_i(x) = e^{-(x-r_i)^2}\)
    • \(\phi_i(x)\) is bounded between 0 and 1
    • \(\phi_i(x)\) is inversely related to \(dist(x,r_i)\), meaning the closer \(x\) is to the basis center \(r_i\), the closer \(\phi_i(x)\) is to 1

\[ \phi(x) = \begin{bmatrix} e^{-(x-r_1)^2} \\ e^{-(x-r_2)^2} \\ e^{-(x-r_3)^2} \\ e^{-(x-r_4)^2}\end{bmatrix} \tag{4 basis}\]
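
A small sketch of these four Gaussian basis functions (the centers r_i are made up):

    import numpy as np

    # Four basis centers r_i, chosen arbitrarily for illustration.
    r = np.array([-1.0, 0.0, 1.0, 2.0])

    def phi(x):
        # Gaussian basis functions: phi_i(x) = exp(-(x - r_i)^2)
        return np.exp(-(x - r) ** 2)

    print(phi(0.9))   # largest entry corresponds to the nearest center, r_3 = 1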