Backpropagation and stochastic gradient descent
Backpropagation, or the backward propagation of errors, is the most commonly used supervised learning algorithm for adapting the connection weights.
Considering the error or the cost as a function of the weights W and b, a local minimum of the cost function can be approached with gradient descent, which consists of changing the weights along the negative error gradient:

$$ W \leftarrow W - \lambda \, \frac{\partial \, \text{cost}}{\partial W} \qquad\qquad b \leftarrow b - \lambda \, \frac{\partial \, \text{cost}}{\partial b} $$

Here, λ is the learning rate, a positive constant defining the speed of the descent.
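As a standalone illustration (not part of the chapter's Theano code), the following NumPy sketch applies this update rule to a toy linear model with a mean squared error cost; here g_W and g_b are the analytic gradients of that cost:

import numpy as np

# Toy data: targets generated by a known linear function of the inputs
rng = np.random.RandomState(0)
x = rng.randn(100, 3)
y = x.dot(np.array([1.0, -2.0, 0.5])) + 0.3

W = np.zeros(3)
b = 0.0
learning_rate = 0.1

for step in range(100):
    err = x.dot(W) + b - y              # prediction error
    g_W = 2 * x.T.dot(err) / len(y)     # gradient of the mean squared error w.r.t. W
    g_b = 2 * err.mean()                # gradient w.r.t. b
    W -= learning_rate * g_W            # move against the gradient
    b -= learning_rate * g_b

In the Theano code below, T.grad derives these gradients symbolically instead of by hand.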
The following compiled function updates the variables after each feedforward run:
# Symbolic gradients of the cost with respect to the model parameters
g_W = T.grad(cost=cost, wrt=W)
g_b = T.grad(cost=cost, wrt=b)

learning_rate = 0.13
index = T.lscalar()  # minibatch index

train_model = theano.function(
    inputs=[index],
    outputs=[cost, error],
    # One gradient descent step on W and b
    updates=[(W, W - learning_rate * g_W),
             (b, b - learning_rate * g_b)],
    # Select the minibatch to feed into x and y
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size]
    }
)
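As a rough usage sketch, assuming train_set_x is a Theano shared variable and with an arbitrary epoch count (neither is shown in the snippet above), training then consists of calling train_model on every minibatch index, epoch after epoch:

n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size
n_epochs = 10  # assumed value

for epoch in range(n_epochs):
    for minibatch_index in range(n_train_batches):
        minibatch_cost, minibatch_error = train_model(minibatch_index)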
The input variable...