Represents a linear relationship between input features $x_1, \dots, x_n$ and a predicted output.

Equation

We assume $x_0 = 1$, and so our equation is:

$$ h_\theta(x) = \theta_0 + \theta_1x_1 + \theta_2x_2 + ... + \theta_nx_n $$
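
As a minimal sketch of this equation in plain Python (the parameter and feature values below are made-up, purely for illustration):

```python
# Hypothesis as a plain sum: theta_0 + theta_1*x_1 + ... + theta_n*x_n
def h(theta, x):
    # theta: [theta_0, theta_1, ..., theta_n]
    # x:     [x_1, ..., x_n]  (x_0 = 1 is implicit here)
    result = theta[0]  # theta_0 * x_0 with x_0 = 1
    for theta_j, x_j in zip(theta[1:], x):
        result += theta_j * x_j
    return result

print(h([1.0, 2.0, 3.0], [4.0, 5.0]))  # 1 + 2*4 + 3*5 = 24.0
```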


Vectorized

We can vectorize the equation by creating the following vectors:

$$ \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \\ \end{bmatrix} \in \mathbb{R}^{n+1} \quad x = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_n \\ \end{bmatrix} \in \mathbb{R}^{n+1} $$

We multiply the transpose of $\theta$ with $x$ (recall $x_0 = 1$, so $\theta_0 x_0$ reduces to $\theta_0$) to get the same result as the Equation above:

$$ \begin{bmatrix} \theta_0 & \theta_1 & \theta_2 & ... & \theta_n \end{bmatrix} \times \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_n \\ \end{bmatrix} = \theta_0 + \theta_1x_1 + \theta_2x_2 + ... + \theta_nx_n $$

The simplified equation is:

$$ h_\theta(x) = \theta^Tx $$
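
A small NumPy sketch of the vectorized form, reusing the hypothetical values from the example above (note $x_0 = 1$ is now an explicit entry of $x$):

```python
import numpy as np

theta = np.array([1.0, 2.0, 3.0])
x = np.array([1.0, 4.0, 5.0])  # x_0 = 1 included explicitly

# theta^T x; .T is a no-op on 1-D arrays, so this is just the dot product
h = theta.T @ x
print(h)  # 24.0, same as the element-wise sum
```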

If you are given a training set $X$ where each row is the equivalent of $x^T$ (the vector from above), then you can apply the hypothesis to every example at once using matrix-vector multiplication:

$$ H = X\theta $$
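
A sketch of the bulk application with a hypothetical three-example training set (first column is $x_0 = 1$ for every row):

```python
import numpy as np

theta = np.array([1.0, 2.0, 3.0])

# Each row is one example's x^T, with x_0 = 1 in the first column.
X = np.array([
    [1.0, 4.0, 5.0],
    [1.0, 0.0, 1.0],
    [1.0, 2.0, 2.0],
])

H = X @ theta  # one hypothesis value per training example
print(H)       # [24.  4. 11.]
```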

<aside> 💡 You will see this applied in the vectorized versions of Cost Function and Gradient Descent.

</aside>