Overview#
The Cost function is the average of the Loss function over the entire Training dataset. The goal is to find the parameters w and b that minimize the overall Cost function.
Where:
- w is the weight, a Vector
- b is the bias, a Scalar
- L is the Loss function
- m is number of examples in the Training dataset
- ŷ is the predicted output vector. It can also be denoted a^[L], where L is the number of layers
- y is the truth from Training dataset
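As a concrete illustration, here is a minimal sketch of computing the Cost as the average Loss over m examples. The squared-error Loss used here is an assumption for illustration; the same structure applies to any per-example Loss:

```python
import numpy as np

def loss(y_hat, y):
    # Assumed squared-error Loss for a single example: L = 1/2 (y_hat - y)^2
    return 0.5 * (y_hat - y) ** 2

def cost(y_hat, y):
    # Cost J = average of the per-example Losses over all m training examples
    m = y.shape[0]
    return np.sum(loss(y_hat, y)) / m

y_hat = np.array([0.9, 0.2, 0.8])  # predictions ŷ
y     = np.array([1.0, 0.0, 1.0])  # ground truth from the Training dataset
print(cost(y_hat, y))
```

The Loss measures error on one example; the Cost aggregates it over the whole Training dataset.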
Equivalently, the Cost function is the sum of the Loss functions over your Training dataset, plus (optionally) a model-complexity penalty such as a regularization term.
A Loss function is a part of a Cost function, which in turn is a type of objective function.
The Cost function is generally represented by "J".
The entire concept of training an Artificial Neural Network is minimizing the Cost function. Normally, you optimise the weights on the synapses, as you will NOT have control over the input data.
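The minimization itself is usually done by gradient descent on the weights. A hedged sketch, assuming a simple linear model ŷ = Xw + b and a mean-squared-error Cost (the function name and learning rate are illustrative choices, not from the source):

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, steps=1000):
    # Minimize J(w, b) = (1/m) * sum( 1/2 (Xw + b - y)^2 ) by gradient descent
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(steps):
        y_hat = X @ w + b
        err = y_hat - y
        w -= lr * (X.T @ err) / m   # dJ/dw averaged over m examples
        b -= lr * err.mean()        # dJ/db averaged over m examples
    return w, b

# Recover y = 2x + 1 from a few noiseless training points
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w, b = gradient_descent(X, y)
```

Each step moves w and b downhill on the Cost surface; the input data X stays fixed, consistent with the point above that only the weights are under your control.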
Common Examples:#
- Mean Squared Error (MSE): MSE(θ) = (1/N) ∑_{i=1}^{N} (f(x_i | θ) − y_i)^2
- SVM cost function: SVM(θ) = ‖θ‖^2 + C ∑_{i=1}^{N} ξ_i
  (there are additional constraints connecting the ξ_i with C and with the Training dataset)
- Squared-error cost: J = ∑ 1/2 (y − ŷ)^2
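The MSE and squared-error formulas above can be checked numerically. A small sketch with made-up predictions and labels (the values are illustrative, not from the source):

```python
import numpy as np

y     = np.array([1.0, 0.0, 1.0, 1.0])  # ground truth
y_hat = np.array([0.8, 0.1, 0.9, 0.7])  # model outputs f(x_i | θ)

# MSE(θ) = (1/N) * sum_i (f(x_i|θ) - y_i)^2
mse = np.mean((y_hat - y) ** 2)

# J = sum_i 1/2 (y_i - ŷ_i)^2  (the unaveraged squared-error cost)
j = np.sum(0.5 * (y - y_hat) ** 2)
```

Note the two differ only by the 1/2 factor and whether the sum is divided by N; both are minimized at the same parameters.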