Single-layer linear model
The simplest model is the linear model, where for each class $c$, the output is a linear combination of the input values:
$$o_c = \sum_j W_{jc}\, x_j + b_c$$
where $x_j$ are the input values, $W$ is the weight matrix, and $b_c$ is the bias of class $c$.
This output is unbounded.
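As a minimal sketch of the linear combination described above, the following plain-Python snippet computes one score per class. The helper name `linear_outputs` and the toy values for `x`, `W`, and `b` are hypothetical and chosen only for illustration; the weight layout (one row per input, one column per class) is an assumption that matches the `(n_in, n_out)` shape used later in this section.

```python
def linear_outputs(x, W, b):
    """Return one score per class: o_c = sum_j x[j] * W[j][c] + b[c]."""
    n_out = len(b)
    return [
        sum(x_j * w_row[c] for x_j, w_row in zip(x, W)) + b[c]
        for c in range(n_out)
    ]


# Hypothetical toy values: 3 inputs, 2 classes.
x = [1.0, 2.0, 3.0]
W = [[0.5, -1.0],
     [0.0, 2.0],
     [1.0, -3.0]]
b = [0.5, 0.0]

scores = linear_outputs(x, W, b)
print(scores)  # [4.0, -6.0]
```

The two scores are neither confined to the interval [0, 1] nor do they sum to 1, which is why they cannot be read directly as probabilities.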
To get a probability distribution, $p_i$, that sums to 1, the output of the linear model is passed into a softmax function:
$$p_i = \frac{e^{o_i}}{\sum_c e^{o_c}}$$
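To make the softmax step concrete, here is a small plain-Python sketch. The function name `softmax` and the example scores are illustrative. Subtracting the maximum score before exponentiating is a common numerical safeguard added here as an assumption; it is not stated in the text and does not change the result.

```python
import math


def softmax(scores):
    """Map unbounded scores to positive values that sum to 1."""
    # Shifting by the maximum leaves the ratios unchanged
    # but keeps exp() from overflowing on large scores.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([4.0, -6.0, 0.0])
print(probs)
print(sum(probs))
```

The ordering of the scores is preserved: the largest score receives the largest probability.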
Hence, the estimated probability of class $c$ for an input $x$ can be rewritten in vector form:
$$P(Y = c \mid x) = \frac{e^{x W_c + b_c}}{\sum_k e^{x W_k + b_k}}$$
where $W_c$ denotes the column of $W$ associated with class $c$.
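In Python with Theano, this translates to:

    import numpy
    import theano
    import theano.tensor as T

    batch_size = 600
    n_in = 28 * 28   # number of input values
    n_out = 10       # number of classes

    x = T.matrix('x')    # one input per row
    y = T.ivector('y')   # integer class labels

    # Weight matrix, initialized to zeros
    W = theano.shared(
        value=numpy.zeros((n_in, n_out), dtype=theano.config.floatX),
        name='W',
        borrow=True
    )

    # Bias vector, initialized to zeros
    b = theano.shared(
        value=numpy.zeros((n_out,), dtype=theano.config.floatX),
        name='b',
        borrow=True
    )

    # Class probabilities for each row of x
    model = T.nnet.softmax(T.dot(x, W) + b)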
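One consequence of the zero initialization of `W` and `b`, worked out here as a quick check rather than taken from the text: every class score is 0 before training, so the softmax assigns the same probability to each of the `n_out = 10` classes, whatever the input:

$$P(Y = c \mid x) = \frac{e^{0}}{\sum_{k=1}^{10} e^{0}} = \frac{1}{10}$$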
The prediction for a given input is the most probable class, that is, the class with the maximum probability:

    y_pred = T.argmax(model, axis=1)
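The row-wise argmax can be illustrated without Theano. In this sketch, `argmax_rows` is a hypothetical helper and `model_output` holds made-up probabilities standing in for the output of the model on a batch of two inputs.

```python
def argmax_rows(probabilities):
    """Return, for each row, the index of its largest entry."""
    return [
        max(range(len(row)), key=row.__getitem__)
        for row in probabilities
    ]


# Made-up probabilities: 2 inputs, 3 classes, each row sums to 1.
model_output = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]

y_pred = argmax_rows(model_output)
print(y_pred)  # [0, 2]
```

Because the softmax preserves the ordering of the scores, taking the argmax of the probabilities selects the same class as taking the argmax of the raw linear outputs.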
In this model with a single linear...