Finds the cost associated with the Linear Hypothesis.

<aside> ❓ May also work for other linear hypotheses.

</aside>

Equation

  1. Find the squared difference between the hypothesis result and the actual value: $(h_\theta(x^{(i)}) - y^{(i)})^2$
  2. Sum all the squared differences together, dividing by two to make the derivative simpler to find later
  3. Divide by $m$ (the number of training set items) to find the mean

$$ \text{Cost}(h_\theta(x^{(i)}),y^{(i)})= \frac{1}{2} (h_\theta(x^{(i)}) - y^{(i)})^2 $$

The complete cost function is:

$$ J(\theta) = \frac{1}{2m}\sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)})^2 $$
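
As a quick worked example (this tiny training set is made up for illustration), take the points $(1,1)$, $(2,2)$, $(3,3)$ with hypothesis $h_\theta(x) = \theta x$ and try $\theta = 0.5$:

$$ J(0.5) = \frac{1}{2 \cdot 3}\left[(0.5-1)^2 + (1-2)^2 + (1.5-3)^2\right] = \frac{3.5}{6} \approx 0.58 $$

At $\theta = 1$ every prediction matches its target exactly, so $J(1) = 0$.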

Below is a graph of several different $\theta$ values and their associated costs. On the right, the cost is plotted with respect to $\theta$ (you can see that the hypothesis is a perfect fit when $\theta$ is equal to 1).

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/06ca1ce4-49bd-4c4e-b4c3-86c6c7d5b58f/cost_function_example.png

Vectorized

Given a matrix $X$ where each row is an item of the training set and each column contains one of the variables $x_0 \rightarrow x_n$ (where $x_0=1$), you can calculate the cost across every item in a single step by using the following formula (which uses the Linear Hypothesis).

$$ J(\theta) = \frac{1}{2m} (X\theta - y)^T(X\theta - y) $$

Python

Using the squared error cost function and the Linear Hypothesis:

import numpy as np

def costFunction(X, y, theta):
    m = len(y)               # number of training set items
    err = X @ theta - y      # prediction error for every item at once
    # err.T @ err is the sum of squared errors; .item() extracts the scalar,
    # whether err is a 1-D vector or an m-by-1 column
    return (err.T @ err).item() / (2 * m)
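
A minimal usage sketch (the data below is hypothetical, chosen to match the worked example above):

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])   # three items; the first column is x0 = 1
y = np.array([1.0, 2.0, 3.0])

print(costFunction(X, y, np.array([0.0, 1.0])))   # perfect fit, prints 0.0
print(costFunction(X, y, np.array([0.0, 0.5])))   # prints ~0.5833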

<aside> 💡 You can view a Jupyter Notebook using costFunction here.

</aside>

MATLAB

Using the squared error cost function and the Linear Hypothesis: