The normal equation finds the optimal $\theta$ values for a linear regression problem analytically, in a single step rather than by iterating.
Pros:

- No need to choose a learning rate $\alpha$.
- No iteration; $\theta$ is computed in one step.

Cons:

- Computing $(X^TX)^{-1}$ is slow when the number of features $n$ is large (roughly $O(n^3)$).
- $X^TX$ may be non-invertible (e.g. redundant or too many features), so in practice the pseudo-inverse is used.
<aside> 💡 Begin considering gradient descent once the number of features reaches roughly $n = 10{,}000$, since computing $(X^TX)^{-1}$ scales at about $O(n^3)$.
</aside>
$$ \theta = (X^TX)^{-1}X^Ty $$
```python
import numpy as np
from numpy.linalg import pinv

def normalEquation(X, y):
    Xt = np.transpose(X)
    # Use the pseudo-inverse so a singular X^T X is still handled.
    return pinv(Xt @ X) @ Xt @ y
```
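As a quick usage sketch (the synthetic data, seed, and coefficient values below are made up for illustration), you can prepend a column of ones to the feature matrix so the first entry of $\theta$ acts as the intercept, then call the normalEquation function defined above:

```python
import numpy as np

# Synthetic data with a known relationship y ≈ 4 + 3x (illustrative values).
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=(100, 1))
y = 4 + 3 * x[:, 0] + rng.normal(scale=0.1, size=100)

# Prepend a column of ones so theta[0] is the intercept term.
X = np.hstack([np.ones((100, 1)), x])

theta = normalEquation(X, y)
print(theta)  # roughly [4, 3]
```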
<aside>
💡 You can view a Jupyter Notebook that uses normalEquation here.
</aside>
```octave
function theta = normalEquation(X, y)
  % theta = (X' * X)^(-1) * X' * y, computed via the pseudo-inverse.
  theta = pinv(X' * X) * X' * y;
end
```
<aside>
💡 You can view the code example for normalEquation
with added comments here.
</aside>
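If you want to double-check either implementation, one option (a minimal sketch, assuming the X, y, and normalEquation from the Python example above) is to compare the result against NumPy's built-in least-squares solver, which should agree to numerical precision:

```python
import numpy as np

# Assumes X (with a leading column of ones), y, and normalEquation from above.
theta_ne = normalEquation(X, y)
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(theta_ne, theta_lstsq))  # expected: True
```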