Recommendations:
In general, some terms that are often used loosely or inconsistently are:
Code Symbol | Math Symbol | Definition | Dimensions |
---|---|---|---|
X | X | Input matrix (independent variables) | (numExamples, inputLayerSize) |
Y | Y | Label (classification) matrix | (numExamples, outputLayerSize) |
y(i) | y(i) | Target (dependent variable) for the i(th) example | (1, outputLayerSize) |
W | Wl | Weight matrix for the l(th) layer (dimensions shown for the first layer) | (inputLayerSize, hiddenLayerSize) |
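z2 | z[2] | Layer 2 pre-activation (weighted input to the activation function) | (numExamples, hiddenLayerSize) |
a2 | a[2] | Layer 2 activity (output of the activation function applied to z2) | (numExamples, hiddenLayerSize) |
z3 | z[3] | Layer 3 pre-activation (weighted input to the activation function) | (numExamples, outputLayerSize) |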
J | J | Cost function | (1, outputLayerSize) |
W | W | the weight Matrix for the l(th) layer | weight |
b(l) | b(l) | the bias vector for the l(th) layer | bias |
L | L | Loss function | |
m | m | number of examples in the Training dataset | |
n(subx) | n(subx) | input size | |
n(suby) | n(suby) | number of Classifications (Output Size) | |
𝑦̂ | 𝑦̂ | y-hat, the predicted output vector (the model's estimate of the dependent variables); can also be denoted a(number of layers) | |
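
To make the dimensions in the table concrete, here is a minimal sketch of a forward pass that checks the shape of each quantity. Several things in it are assumptions rather than part of the table: the layer sizes (4, 2, 3, 1), the names `W1`, `W2` and `yHat`, the choice of a sigmoid activation, and the shape of the second weight matrix, which is inferred from the listed dimensions of `z3`. Bias terms are left out for brevity. The small `shape` and `matmul` helpers only stand in for what an array library would normally provide.

```python
import math
import random


def shape(M):
    """Return (rows, cols) of a list-of-lists matrix."""
    return (len(M), len(M[0]))


def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    assert shape(A)[1] == shape(B)[0], "inner dimensions must match"
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def sigmoid(M):
    """Element-wise logistic sigmoid (an assumed activation function)."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in M]


def random_matrix(rows, cols, rng):
    """Matrix of standard-normal samples, used only as placeholder data."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes, chosen only for illustration.
numExamples, inputLayerSize, hiddenLayerSize, outputLayerSize = 4, 2, 3, 1

rng = random.Random(0)
X = random_matrix(numExamples, inputLayerSize, rng)        # (numExamples, inputLayerSize)
W1 = random_matrix(inputLayerSize, hiddenLayerSize, rng)   # (inputLayerSize, hiddenLayerSize)
W2 = random_matrix(hiddenLayerSize, outputLayerSize, rng)  # assumed: (hiddenLayerSize, outputLayerSize)

z2 = matmul(X, W1)   # (numExamples, hiddenLayerSize)
a2 = sigmoid(z2)     # same shape as z2
z3 = matmul(a2, W2)  # (numExamples, outputLayerSize)
yHat = sigmoid(z3)   # same shape as z3

for name, M in [("X", X), ("z2", z2), ("a2", a2), ("z3", z3), ("yHat", yHat)]:
    print(name, shape(M))
```

With one row per example, every intermediate quantity keeps `numExamples` as its first dimension, and only the second dimension changes from layer to layer.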