Represents a linear relationship between input features $x_1, \dots, x_n$ and a predicted output.

Equation

We assume $x_0 = 1$, and so our equation is:

$$ h_\theta(x) = \theta_0 + \theta_1x_1 + \theta_2x_2 + ... + \theta_nx_n $$
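
As a minimal sketch of this equation in plain Python (the parameter and feature values below are made-up, purely for illustration):

```python
# Hypothesis as a plain sum: theta_0 + theta_1*x_1 + ... + theta_n*x_n
def h(theta, x):
    # theta: [theta_0, theta_1, ..., theta_n]
    # x:     [x_1, ..., x_n]  (x_0 = 1 is implicit here)
    result = theta[0]  # theta_0 * x_0 with x_0 = 1
    for theta_j, x_j in zip(theta[1:], x):
        result += theta_j * x_j
    return result

print(h([1.0, 2.0, 3.0], [4.0, 5.0]))  # 1 + 2*4 + 3*5 = 24.0
```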


Vectorized

We can vectorize the equation by creating the following vectors:

$$ \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \\ \end{bmatrix} \in \mathbb{R}^{n+1} \quad x = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_n \\ \end{bmatrix} \in \mathbb{R}^{n+1} $$

We multiply the transpose of $\theta$ with $x$ (recall $x_0 = 1$, so $\theta_0 x_0$ reduces to $\theta_0$) to get the same result as the Equation above:

$$ \begin{bmatrix} \theta_0 & \theta_1 & \theta_2 & ... & \theta_n \end{bmatrix} \times \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_n \\ \end{bmatrix} = \theta_0 + \theta_1x_1 + \theta_2x_2 + ... + \theta_nx_n $$

The simplified equation is:

$$ h_\theta(x) = \theta^Tx $$
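
A small NumPy sketch of the vectorized form, reusing the hypothetical values from the example above (note $x_0 = 1$ is now an explicit entry of $x$):

```python
import numpy as np

theta = np.array([1.0, 2.0, 3.0])
x = np.array([1.0, 4.0, 5.0])  # x_0 = 1 included explicitly

# theta^T x; .T is a no-op on 1-D arrays, so this is just the dot product
h = theta.T @ x
print(h)  # 24.0, same as the element-wise sum
```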

If you are given a training set $X$ where each row is the equivalent of $x^T$ (the vector from above), then you can apply the hypothesis to every example at once using matrix-vector multiplication:

$$ H = X\theta $$
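
A sketch of the bulk application with a hypothetical three-example training set (first column is $x_0 = 1$ for every row):

```python
import numpy as np

theta = np.array([1.0, 2.0, 3.0])

# Each row is one example's x^T, with x_0 = 1 in the first column.
X = np.array([
    [1.0, 4.0, 5.0],
    [1.0, 0.0, 1.0],
    [1.0, 2.0, 2.0],
])

H = X @ theta  # one hypothesis value per training example
print(H)       # [24.  4. 11.]
```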

<aside> 💡 You will see this applied in the vectorized versions of Cost Function and Gradient Descent.

</aside>