Single-layer linear model
The simplest model is the linear model, where for each class c, the output is a linear combination of the input values:

$$o_c = \sum_j W_{j,c}\, x_j + b_c$$

This output is unbounded.
To get a probability distribution, p_i, that sums to 1, the output of the linear model is passed into a softmax function:

$$p_i = \frac{e^{o_i}}{\sum_k e^{o_k}}$$

Here o_i is the linear model's output for class i.
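The softmax step can be sketched in plain Python, without Theano, to show how unbounded scores become probabilities. The function name `softmax` and the sample scores are illustrative assumptions, and subtracting the maximum score is a common numerical-stability trick rather than something the text prescribes.

```python
import math

def softmax(scores):
    """Map a list of unbounded scores to probabilities that sum to 1."""
    # Subtracting the maximum does not change the result,
    # but it keeps exp() from overflowing on large scores.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three classes.
probs = softmax([2.0, 1.0, -1.0])
```

Larger scores receive larger probabilities, and the ordering of the classes is preserved.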

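Hence, the estimated probability of class c for an input x is rewritten with vectors:

$$p_c = \mathrm{softmax}(xW + b)_c$$

where x is a row vector of input values, W is the weight matrix, and b is the bias vector.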
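As a minimal sketch of the vector form, the following plain-Python code computes the class probabilities for a single input. It assumes W is stored with shape (n_in, n_out) and x is a row vector; the helper names `softmax` and `linear_softmax` and the toy sizes are hypothetical.

```python
import math

def softmax(scores):
    """Normalize unbounded scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear_softmax(x, W, b):
    """Return softmax(xW + b) for one input vector x.

    W is a list of n_in rows, each with n_out entries; b has n_out entries.
    """
    n_out = len(b)
    # One linear combination of the inputs per class, plus that class's bias.
    scores = [
        sum(x[j] * W[j][c] for j in range(len(x))) + b[c]
        for c in range(n_out)
    ]
    return softmax(scores)

# Toy example (hypothetical sizes): 3 input values, 2 classes.
x = [1.0, 2.0, 3.0]
W = [[0.1, -0.1], [0.2, 0.0], [0.0, 0.3]]
b = [0.0, 0.5]
probs = linear_softmax(x, W, b)

# With all-zero weights and biases, every class gets the same probability.
uniform = linear_softmax(x, [[0.0, 0.0]] * 3, [0.0, 0.0])
```

The all-zero case is worth noting: a model whose weights and biases start at zero assigns each class a probability of 1/n_out before any training.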

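In Python with Theano, this translates to:
import numpy
import theano
import theano.tensor as T

batch_size = 600
n_in = 28 * 28  # number of input values per example
n_out = 10      # number of classes

# Symbolic inputs: a batch of examples and their integer class labels.
x = T.matrix('x')
y = T.ivector('y')

# Weight matrix, initialized to zeros.
W = theano.shared(
    value=numpy.zeros(
        (n_in, n_out),
        dtype=theano.config.floatX
    ),
    name='W',
    borrow=True
)

# Bias vector, initialized to zeros.
b = theano.shared(
    value=numpy.zeros(
        (n_out,),
        dtype=theano.config.floatX
    ),
    name='b',
    borrow=True
)

# Class probabilities for each example in the batch.
model = T.nnet.softmax(T.dot(x, W) + b)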
The prediction for a given input is its most probable class, that is, the class with the maximum probability:
y_pred = T.argmax(model...
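To illustrate the prediction rule without Theano, here is a small plain-Python sketch that picks the most probable class for each row of a batch. It assumes the argmax is taken across the classes of each example; the function name `predict` and the probability values are made up for the example.

```python
def predict(prob_rows):
    """Return, for each row of class probabilities, the index of the largest one."""
    # On ties, max() keeps the first index that reaches the maximum.
    return [max(range(len(row)), key=row.__getitem__) for row in prob_rows]

# Hypothetical probabilities for two examples over three classes.
batch_probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
]
y_pred = predict(batch_probs)  # [1, 0]
```

Because softmax preserves the ordering of the linear outputs, taking the argmax of the probabilities selects the same class as taking the argmax of the raw scores.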