How does back propagation work

Backpropagation explained: how neural networks learn :)


Backpropagation (or back-propagation) is a fundamental technique in neural network learning. But how does it work? In this article we will investigate the essence of back-propagation.

Back-propagation: how does it work, and why does it work?

The original paper by David E. Rumelhart, Geoffrey E. Hinton & Ronald J. Williams states: "The procedure repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector. As a result of the weight adjustments, internal 'hidden' units which are not part of the input or output come to represent important features of the task domain, and the regularities in the task are captured by the interactions of these units. The ability to create useful new features distinguishes back-propagation from earlier, simpler methods such as the perceptron-convergence procedure." (https://www.nature.com/articles/323533a0)

Imagine standing on top of a hill in a 900,000-dimensional universe, overlooking a hilly landscape, and wanting to reach the bottom of a valley that lies as low (above sea level) as possible. How would you proceed? Well, you could look at which direction would take you closer to your goal, take a first step in that direction, and then repeat this. At least, this is how the back-propagation algorithm would try to do it.

The method described above is also called 'gradient descent'. In the context of neural networks, the height of the hill is the cost function you try to minimize (a low error means good accuracy for your network), and your position on the hill is the vector of weight values that form the 'strength' of the connections in your current network (in the example there were 900,000 weights, hence the 900,000 dimensions). Back-propagation looks at the gradient in every 'direction' and determines which step is a step in the 'right direction'. Back-propagation cannot 'see' the optimal solution in the distance: it makes a small improvement to the current solution and, by repeating this, reaches a good or optimal solution in the end.
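To make this concrete, here is a minimal sketch of gradient descent (not code from the original article) on a made-up one-dimensional cost function: starting from an arbitrary position, we repeatedly step against the gradient until we arrive near the bottom of the valley.

```python
# Minimal gradient-descent sketch on a toy, made-up cost function (illustrative only).

def cost(w):
    # toy "hill height": a parabola with its lowest point at w = 3
    return (w - 3.0) ** 2

def gradient(w):
    # slope of the toy cost with respect to w
    return 2.0 * (w - 3.0)

w = 10.0             # arbitrary starting position "on the hill"
learning_rate = 0.1  # size of each downhill step

for step in range(50):
    w -= learning_rate * gradient(w)  # take a small step in the downhill direction

print(w, cost(w))  # w ends up close to 3.0, the bottom of the valley
```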

Backpropagation takes into account the sensitivity of the final cost function (the error term) to a change in each weight. Neural connections with a bigger effect get a proportionally bigger adjustment, and neural connections with a small effect receive smaller adjustments.
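In other words (a generic update rule, not something spelled out in the original article), each weight is nudged by an amount proportional to the partial derivative of the cost with respect to that weight, scaled by a learning rate. A tiny illustration with made-up numbers:

```python
# Made-up sensitivities, just to show that the adjustments are proportional to them.
learning_rate = 0.1

grad_big_effect = 4.0    # connection whose change strongly influences the error
grad_small_effect = 0.2  # connection whose change barely influences the error

adjust_big = -learning_rate * grad_big_effect      # -0.4: large correction
adjust_small = -learning_rate * grad_small_effect  # -0.02: small correction
print(adjust_big, adjust_small)
```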

An effect of the adjustment of the weights by back-propagation is that neurons that fire together also wire together. For instance, in image recognition, the neurons that 'see' the number '8' become strongly connected to the neurons in the next layer that classify the number or concept '8' (or 'think of' the number/concept '8').

This is the way the network learns and stores information/knowledge.

Backpropagation computes the sensitivity of the connections (on the final error) starting with the last layer and then working backwards to the first layer. The calculated sensitivities are then used to adjust all the weights in the network. The intuition is that this is a nudge in the right direction: it decreases the error term (or cost function) and therefore increases the accuracy of the network.
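The following is a minimal sketch (not the article's own code) of this backward pass for a tiny, made-up network with two inputs, two sigmoid hidden units and one sigmoid output, trained on a single example with a squared-error cost. The sensitivities (deltas) are computed from the output layer back towards the first layer and then used to adjust the weights.

```python
# Hedged sketch of one forward + backward pass, repeated many times (illustrative only).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 2))  # input -> hidden weights
W2 = rng.normal(size=(1, 2))  # hidden -> output weights

x = np.array([0.5, -1.0])  # a single made-up input example
t = np.array([1.0])        # its desired output
learning_rate = 0.5

for step in range(1000):
    # forward pass: compute activations layer by layer
    h = sigmoid(W1 @ x)   # hidden activations
    y = sigmoid(W2 @ h)   # network output
    error = y - t         # derivative of 0.5 * (y - t)^2 with respect to y

    # backward pass: propagate sensitivities from the output layer back to the first layer
    delta_out = error * y * (1 - y)                   # sensitivity at the output unit
    delta_hidden = (W2.T @ delta_out) * h * (1 - h)   # sensitivities at the hidden units

    # weight adjustments, proportional to each connection's sensitivity
    W2 -= learning_rate * np.outer(delta_out, h)
    W1 -= learning_rate * np.outer(delta_hidden, x)

print(sigmoid(W2 @ sigmoid(W1 @ x)))  # close to the target 1.0 after training
```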

References: http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html
