Symbols and notation used throughout the notes:


Basic notation

$$ \begin{array}{ll}
m &= \text{number of training examples} \\
n &= \text{number of features} \\
x &= \text{feature (input variable)} \\
x^{(i)} &= \text{features of the $i$th training example} \\
x_j &= \text{$j$th feature} \\
y &= \text{target (output variable)} \\
y^{(i)} &= \text{target of the $i$th training example} \\
h &= \text{hypothesis (maps $x$ to $y$)} \\
\theta &= \text{parameter (of $h$)} \\
\alpha &= \text{learning rate} \\
\lambda &= \text{regularization parameter}
\end{array} $$

Neural networks

$$ \begin{array}{ll}
a_i^{(j)} &= \text{activation of unit $i$ in layer $j$} \\
\Theta^{(j)} &= \text{matrix of weights controlling mapping from layer $j$ to $j+1$} \\
L &= \text{total number of layers in network} \\
s_l &= \text{number of units (w/o bias unit) in layer $l$} \\
K &= \text{number of output units/classes}
\end{array} $$
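Note that $\Theta^{(j)}$ has dimension $s_{j+1} \times (s_j + 1)$, the $+1$ accounting for the bias unit. A minimal forward-propagation sketch below makes the shapes concrete, using a made-up 3-layer network ($s_1 = 3$, $s_2 = 4$, $K = 2$) with random weights; all values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical network: L = 3 layers, s_1 = 3 input units,
# s_2 = 4 hidden units, K = s_3 = 2 output units/classes.
rng = np.random.default_rng(0)
Theta1 = rng.normal(size=(4, 3 + 1))  # Theta^(1): layer 1 -> 2, shape s_2 x (s_1 + 1)
Theta2 = rng.normal(size=(2, 4 + 1))  # Theta^(2): layer 2 -> 3, shape s_3 x (s_2 + 1)

x = np.array([0.5, -1.2, 2.0])        # one input example

# Forward propagation: a^(1) is the input with the bias unit prepended.
a1 = np.concatenate([[1.0], x])
a2 = np.concatenate([[1.0], sigmoid(Theta1 @ a1)])  # activations a_i^(2)
a3 = sigmoid(Theta2 @ a2)             # output layer: h_Theta(x), length K
```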