Logistic regression needs a modified cost function: with the sigmoid hypothesis, the squared-error cost used for linear regression is no longer convex, so gradient descent is not guaranteed to find the best parameters.

Equation

  1. If $y=1$, take the negative log of the hypothesis; if $y=0$, take the negative log of one minus the hypothesis:

$$ \text{Cost}(h_\theta(x),y)= \left\{ \begin{array}{ll} -\log{(h_\theta(x))} & \text{if } y = 1 \\ -\log{(1 - h_\theta(x))} & \text{if } y = 0 \\ \end{array} \right. $$

The equivalent equation written in a single line is below; when $y=1$ the second term vanishes, and when $y=0$ the first term vanishes:

$$ \text{Cost}(h_\theta(x),y) = -y\log(h_\theta(x)) - (1-y)\log(1-h_\theta(x)) $$

The complete cost function, averaging the per-example cost over all $m$ training examples, is:

$$ J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^m y^{(i)}\log(h_\theta(x^{(i)})) + (1-y^{(i)})\log(1-h_\theta(x^{(i)})) \right] $$

![Cost curves: $-\log(h_\theta(x))$ for $y=1$ and $-\log(1-h_\theta(x))$ for $y=0$](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/34acb34a-b6fc-4bc6-b828-736756cc5b05/logistic_cost.png)

Takeaways from the graph:

$$ \begin{array}{ll} \text{Cost}(h_\theta(x),y) = 0 & \text{if } h_\theta(x) = y \\ \text{Cost}(h_\theta(x),y) \rightarrow \infty & \text{if } y=0 \text{ and } h_\theta(x) \rightarrow 1 \\ \text{Cost}(h_\theta(x),y) \rightarrow \infty & \text{if } y=1 \text{ and } h_\theta(x) \rightarrow 0 \\ \end{array} $$
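
These limits are easy to verify numerically. A minimal sketch of the single-line cost in NumPy (`cost` is just an illustrative helper, not part of the course code):

import numpy as np

def cost(h, y):
    # Per-example cost: -y*log(h) - (1 - y)*log(1 - h)
    return -y * np.log(h) - (1 - y) * np.log(1 - h)

print(cost(0.99, 1))  # ~0.01: near-zero cost when h is close to y
print(cost(0.01, 1))  # ~4.61: cost blows up as h -> 0 while y = 1
print(cost(0.99, 0))  # ~4.61: cost blows up as h -> 1 while y = 0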

Vectorized

Using the sigmoid function $g$ and the design matrix $X$, the hypothesis and cost for all training examples can be computed at once:

$$ h = g(X\theta) \\ J(\theta) = \frac{1}{m}\left( -y^T\log(h) - (1-y)^T\log(1-h) \right) $$

Python

A minimal NumPy implementation, assuming y and theta are 1-D arrays and X already includes the intercept column:

import numpy as np

def costFunction(X, y, theta):
    # Vectorized cost: J = (1/m) * (-y' * log(h) - (1 - y)' * log(1 - h))
    m = len(y)
    h = 1 / (1 + np.exp(-(X @ theta)))  # h = g(X*theta), the sigmoid hypothesis
    return (-y @ np.log(h) - (1 - y) @ np.log(1 - h)) / m
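
For example, on a tiny hand-made dataset (hypothetical values, just to exercise the function):

X = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]])  # intercept column plus one feature
y = np.array([1.0, 0.0, 1.0])
theta = np.zeros(2)
print(costFunction(X, y, theta))  # log(2) ~ 0.693 when theta is all zeros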

<aside> 💡 You can view a Jupyter Notebook using costFunction here.

</aside>

MATLAB

A minimal MATLAB/Octave sketch of the same vectorized cost, using the same argument order as the Python version:
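
function J = costFunction(X, y, theta)
  % Vectorized cost: J = (1/m) * (-y' * log(h) - (1 - y)' * log(1 - h))
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));  % h = g(X*theta), the sigmoid hypothesis
  J = (1 / m) * (-y' * log(h) - (1 - y)' * log(1 - h));
end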