Stacked recurrent networks
To stack recurrent networks, we connect the output of the hidden layer of each recurrent network to the input of the following recurrent network:
When the number of layers is one, this implementation reduces to the simple recurrent network from the previous chapter.
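In code, stacking amounts to chaining layers: the hidden-state sequence produced by layer l becomes the input sequence of layer l+1. Here is a minimal sketch of that wiring, assuming each element of layers is a function that maps an input sequence to its hidden-state sequence (the name stack and the callable-list interface are our assumptions, not part of the original code):

def stack(inputs, layers):
    # Each layer maps a (time, batch, features) input sequence to the
    # sequence of its hidden states; that sequence feeds the next layer.
    output = inputs
    for rnn_layer in layers:
        output = rnn_layer(output)
    return output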
First we implement dropout in our simple RNN model:
def model(inputs, _is_training, params, batch_size, hidden_size,
          drop_i, drop_s, init_scale, init_H_bias):
    # Dropout noise for the inputs.
    noise_i_for_H = get_dropout_noise((batch_size, hidden_size), drop_i)
    i_for_H = apply_dropout(_is_training, inputs, noise_i_for_H)
    i_for_H = linear.model(i_for_H, params, hidden_size, hidden_size,
                           init_scale, bias_init=init_H_bias)

    # Dropout noise for the recurrent hidden state, sampled once so that
    # the same mask is reused at every time step.
    noise_s = get_dropout_noise((batch_size, hidden_size), drop_s)

    def step(i_for_H_t, y_tm1, noise_s):
        # Apply the recurrent dropout mask to the previous hidden state.
        s_lm1_for_H = apply_dropout(_is_training, y_tm1, noise_s)
        # The recurrent transform uses the same linear.model signature
        # as the input transform above.
        return T.tanh(i_for_H_t + linear.model(s_lm1_for_H, params,
                                               hidden_size, hidden_size,
                                               init_scale))
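The helpers get_dropout_noise and apply_dropout are used above but not defined in this excerpt. A minimal sketch of how they might look, assuming inverted dropout and a shared MRG random stream (the names rng and floatX are assumptions, not part of the original code):

import theano
import theano.tensor as T
from theano.ifelse import ifelse
from theano.sandbox.rng_mrg import MRG_RandomStreams

floatX = theano.config.floatX
rng = MRG_RandomStreams(seed=1234)

def get_dropout_noise(shape, drop_p):
    # Bernoulli keep-mask scaled by 1 / keep_p, so expected activations
    # stay unchanged at training time (inverted dropout).
    keep_p = 1.0 - drop_p
    return T.cast(rng.binomial(size=shape, p=keep_p), floatX) / keep_p

def apply_dropout(is_training, x, noise):
    # Multiply by the mask at training time; identity at test time.
    return ifelse(is_training, noise * x, x)

Because noise_s is created once, outside step, the same recurrent dropout mask is applied at every time step of the sequence instead of being resampled per step.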