Yoshua Bengio is a Canadian computer scientist known for his work on artificial neural networks and deep learning. His main research ambition is to understand the principles of learning that yield intelligence. Since 1999 he has co-organized the Learning Workshop with Yann LeCun, with whom he also created the International Conference on Learning Representations (ICLR). He has also organized or co-organized numerous other events, principally the deep learning workshops and symposia at NIPS and ICML since 2007.
This article discusses TwinNet (Twin Networks), a method for training an RNN to better represent the future in its internal state, by supervising its hidden states with those of a second RNN that models the sequence in reverse order.
Recurrent Neural Networks (RNNs) are the basis of state-of-the-art models for generating sequential data such as text and speech, and are usually trained by teacher forcing, which corresponds to optimizing one-step-ahead prediction. Because this training objective contains no explicit bias toward planning, the model may prefer to focus on the most recent tokens instead of capturing the subtle long-term dependencies that contribute to global coherence. Local correlations are usually stronger than long-term dependencies and thus end up dominating the learning signal. As a result, samples from RNNs tend to exhibit local coherence but lack meaningful global structure.
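To make "one-step-ahead prediction" concrete, here is a minimal sketch of teacher forcing for an RNN language model. PyTorch is used purely for illustration; the vocabulary and layer sizes are arbitrary assumptions, not details from the paper.

```python
# Minimal sketch of teacher forcing: the ground-truth prefix is fed as input
# and the model is trained to predict the next token at every step.
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden_dim = 10000, 128, 256  # illustrative sizes

embed = nn.Embedding(vocab_size, emb_dim)
rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
out_proj = nn.Linear(hidden_dim, vocab_size)
criterion = nn.CrossEntropyLoss()

def teacher_forcing_loss(tokens):
    # tokens: (batch, seq_len) integer token ids
    inputs, targets = tokens[:, :-1], tokens[:, 1:]
    hidden_states, _ = rnn(embed(inputs))      # (batch, seq_len-1, hidden_dim)
    logits = out_proj(hidden_states)           # (batch, seq_len-1, vocab_size)
    return criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
```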
Recent efforts to address this problem have involved augmenting RNNs with external memory, with unitary or hierarchical architectures, or with explicit planning mechanisms. Parallel efforts aim to prevent overfitting on strong local correlations by regularizing the states of the network, for example by applying dropout or penalizing various statistics. In this context, the paper proposes TwinNet, a simple method for regularizing a recurrent neural network that encourages it to model those aspects of the past that are predictive of the long-term future.
The paper presents a simple technique that enables generative recurrent neural networks to plan ahead. A backward RNN is trained to generate the given sequence in reverse order, and the states of the forward model are regularized to predict the cotemporal states of the backward model. The paper empirically shows that this approach achieves a 9% relative improvement on a speech recognition task, as well as a significant improvement on a COCO caption generation task.
Overall, the model is driven by the following intuitions:
(a) The backward hidden states contain a summary of the future of the sequence.
(b) To predict the future more accurately, the model will have to form a better representation of the past.
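As a rough illustration of how this pairing could look in practice, the sketch below extends the teacher-forcing example above with a twin regularizer: a forward and a backward RNN are each trained with teacher forcing, and an affine map is penalized for failing to predict the cotemporal backward state from the forward state. The layer names, the weighting, and the choice to detach the backward states are assumptions made here for illustration, not necessarily the paper's exact configuration.

```python
# Minimal PyTorch-style sketch of a TwinNet-like regularizer (illustrative only;
# sizes, the affine map g, twin_weight, and the detach() are assumed choices).
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden_dim, twin_weight = 10000, 128, 256, 0.5

embed = nn.Embedding(vocab_size, emb_dim)
fwd_rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
bwd_rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
fwd_out = nn.Linear(hidden_dim, vocab_size)
bwd_out = nn.Linear(hidden_dim, vocab_size)
g = nn.Linear(hidden_dim, hidden_dim)   # affine map from forward to backward state
criterion = nn.CrossEntropyLoss()

def twinnet_loss(tokens):
    # tokens: (batch, seq_len) integer token ids

    # Forward RNN: standard teacher-forced next-token prediction.
    h_f, _ = fwd_rnn(embed(tokens[:, :-1]))
    loss_f = criterion(fwd_out(h_f).reshape(-1, vocab_size),
                       tokens[:, 1:].reshape(-1))

    # Backward RNN: the same objective on the reversed sequence.
    rev = torch.flip(tokens, dims=[1])
    h_b, _ = bwd_rnn(embed(rev[:, :-1]))
    loss_b = criterion(bwd_out(h_b).reshape(-1, vocab_size),
                       rev[:, 1:].reshape(-1))

    # Twin penalty: pair each forward state with the backward state that is used
    # to predict the same token from the opposite direction, and penalize the
    # squared L2 distance between g(forward state) and that backward state.
    h_b_aligned = torch.flip(h_b, dims=[1])           # back to forward time order
    twin = ((g(h_f[:, :-1]) - h_b_aligned[:, 1:].detach()) ** 2).sum(-1).mean()

    return loss_f + loss_b + twin_weight * twin
```

Note that the backward network serves only as a training-time regularizer; generation still uses the forward model alone.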
The paper also demonstrates the success of the TwinNet approach experimentally, through several conditional and unconditional generation tasks that include speech recognition, image captioning, language modelling, and sequential image generation.
Overall Score: 21/30
Average Score: 7/10
The reviewers stated that the paper presents a novel approach to regularizing RNNs and reports results on different datasets, indicating a wide range of applications. However, based on the reported results, they said that further experimentation and a more extensive hyperparameter search are needed. Overall, the paper is detailed, the method is simple to implement, and positive empirical results support the described approach.
The reviewers also pointed out a few limitations, including:
The effect of TwinNet as a regularizer could be examined against other regularization strategies for comparison.