Deep transition recurrent network
Unlike a stacked recurrent network, a deep transition recurrent network increases the depth of the network along the time direction, by adding more layers, or micro-timesteps, inside the recurrent connection.
To illustrate this, let us come back to the definition of a transition/recurrent connection in a recurrent network: it takes as input the previous state $h_{t-1}$ and the input data $x_t$ at time step $t$, to predict its new state $h_t$.
In a deep transition recurrent network (figure 2), the recurrent transition is expanded into more than one layer, up to a recurrence depth L. The initial state of the transition is set to the output of the previous transition, that is, the state produced at the previous time step:
![](https://static.packt-cdn.com/products/9781786465825/graphics/B05525_10_11.jpg)
Furthermore, inside the transition, multiple states or steps are computed:
![](https://static.packt-cdn.com/products/9781786465825/graphics/B05525_10_06.jpg)
The final state is the output of the transition:
![](https://static.packt-cdn.com/products/9781786465825/graphics/B05525_10_07.jpg)
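The three steps above (initial state, intermediate micro-steps, final state) can be sketched in plain NumPy. This is only an illustrative sketch under stated assumptions, not the exact formulation shown in the equations: it assumes a `tanh` nonlinearity, one recurrent weight matrix per micro-step, and that the input data enters only at the first micro-step. The names `deep_transition_step`, `run_deep_transition`, `W_x`, `W_s`, and `b` are hypothetical.

```python
import numpy as np


def deep_transition_step(x_t, h_prev, W_x, W_s, b):
    """One time step of a deep transition RNN (illustrative sketch).

    The recurrence depth L is len(W_s). Assumptions: tanh activation,
    and the input x_t is injected only at the first micro-step.
    """
    # Initial state of the transition: output of the previous transition.
    s = h_prev
    for l, (R, bias) in enumerate(zip(W_s, b)):
        pre = s @ R + bias
        if l == 0:
            # Assumption: the input only feeds the first micro-step.
            pre = pre + x_t @ W_x
        # Intermediate state computed inside the transition.
        s = np.tanh(pre)
    # Final state of the transition: the new hidden state.
    return s


def run_deep_transition(xs, h0, W_x, W_s, b):
    """Unroll the deep transition over a sequence xs of shape (T, n_in)."""
    h = h0
    states = []
    for x_t in xs:
        h = deep_transition_step(x_t, h, W_x, W_s, b)
        states.append(h)
    return np.stack(states)


# Small usage example with random weights (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
n_in, n_hid, depth, T = 3, 4, 3, 5
W_x = rng.normal(size=(n_in, n_hid))
W_s = [rng.normal(size=(n_hid, n_hid)) for _ in range(depth)]
b = [np.zeros(n_hid) for _ in range(depth)]
xs = rng.normal(size=(T, n_in))
states = run_deep_transition(xs, np.zeros(n_hid), W_x, W_s, b)
print(states.shape)
```

With a recurrence depth of 1, this sketch reduces to an ordinary recurrent transition; larger depths insert extra nonlinear layers between two consecutive time steps without adding stacked layers above the recurrent one.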