
Are Recurrent Neural Networks capable of warping time?

  • 2 min read
  • 07 May 2018

'Can recurrent neural networks warp time?' is authored by Corentin Tallec and Yann Ollivier and will be presented at ICLR 2018.

The paper explains that plain RNNs cannot account for time warpings, leaky RNNs can account for uniform time scalings but not for irregular warpings, and gated RNNs can adapt to irregular warpings.

The gating mechanism of LSTMs (and GRUs) relates to time invariance and time warping.
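
This contrast can be made concrete in a few lines of code. The sketch below is only illustrative and does not reproduce the paper's exact equations; the weight matrices W, U, Wg, Ug and bias vectors b, bg are assumed to be tensors of compatible shapes.

import torch

def plain_rnn_step(x_t, h_prev, W, U, b):
    # Plain RNN: the hidden state is fully rewritten at every step,
    # so there is no built-in notion of a time scale.
    return torch.tanh(x_t @ W + h_prev @ U + b)

def leaky_rnn_step(x_t, h_prev, W, U, b, alpha=0.1):
    # Leaky RNN: a fixed leak rate alpha acts as one global time constant.
    # It can absorb a uniform time rescaling, but not an irregular warping.
    return alpha * torch.tanh(x_t @ W + h_prev @ U + b) + (1 - alpha) * h_prev

def gated_rnn_step(x_t, h_prev, W, U, b, Wg, Ug, bg):
    # Gated RNN: the leak is computed per unit and per step from the current
    # input and state, giving a local, adaptive time constant that can
    # follow irregular warpings.
    alpha = torch.sigmoid(x_t @ Wg + h_prev @ Ug + bg)
    return alpha * torch.tanh(x_t @ W + h_prev @ U + b) + (1 - alpha) * h_prev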

What problem is the paper trying to solve?


In this paper, the authors prove that learnable gates in a recurrent model formally provide quasi-invariance to general time transformations in the input data. Further, the authors recover part of the LSTM architecture from a simple axiomatic approach. This leads to a new way of initializing gate biases in LSTMs and GRUs. Experimentally, this new chrono initialization is shown to greatly improve the learning of long-term dependencies, with minimal implementation effort.
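
As a rough illustration of what this looks like in practice: the chrono initialization prescribed in the paper draws the forget-gate biases as log(U[1, T_max − 1]) and sets the input-gate biases to their negation, where T_max is the longest dependency range expected in the data. The PyTorch sketch below applies this to an nn.LSTM; it is an illustrative implementation against PyTorch's bias layout, not the authors' own code.

import torch
import torch.nn as nn

def chrono_init(lstm: nn.LSTM, t_max: int) -> None:
    # Chrono-style initialization of an LSTM's gate biases.
    # Forget-gate biases are drawn as log(U[1, t_max - 1]) and input-gate
    # biases are set to their negation, so the cell's characteristic time
    # scales at initialization span roughly 1 to t_max steps.
    hidden = lstm.hidden_size
    for layer in range(lstm.num_layers):
        bias_ih = getattr(lstm, f"bias_ih_l{layer}")
        bias_hh = getattr(lstm, f"bias_hh_l{layer}")
        with torch.no_grad():
            bias_ih.zero_()
            bias_hh.zero_()
            # PyTorch packs the biases as [input, forget, cell, output] gates.
            forget_bias = torch.log(torch.empty(hidden).uniform_(1.0, t_max - 1.0))
            bias_ih[hidden:2 * hidden] = forget_bias   # forget gate
            bias_ih[0:hidden] = -forget_bias           # input gate

# Example: an LSTM expected to track dependencies up to roughly 1000 steps.
lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=1)
chrono_init(lstm, t_max=1000)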

Paper summary


The authors derive the self-loop feedback gating mechanism of recurrent networks from first principles, via a postulate of invariance to time warpings. Gated connections appear to regulate the local time constants in recurrent models. With this in mind, they introduce the chrono initialization, a principled way of initializing gate biases in LSTMs. Experimentally, chrono initialization is shown to bring notable benefits when facing long-term dependencies.

Key takeaways

  • In this paper, the authors show that postulating invariance to time transformations in the data (taking invariance to time warping as an axiom) necessarily leads to a gate-like mechanism in recurrent models (a small illustration of what such a warping looks like follows this list).
  • The paper provides precise prescriptions on how to initialize gate biases depending on the range of time dependencies to be captured.
  • The authors test the empirical benefits of the new initialization on both synthetic and real-world data.
  • The authors also observed a substantial improvement with long-term dependencies, and slight gains or no change when short-term dependencies dominate.
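
To make the notion of a time warping concrete, the hypothetical sketch below resamples a signal under a uniform slowdown and under an irregular but monotone warping. The warping functions are made up for illustration and do not come from the paper's experiments.

import numpy as np

def warp_sequence(x, warp):
    # Resample a 1-D sequence x at the warped positions c(t) using
    # linear interpolation; warp must be an increasing function of t.
    t = np.arange(len(x))
    warped_positions = np.clip(warp(t), 0, len(x) - 1)
    return np.interp(warped_positions, t, x)

x = np.sin(0.3 * np.arange(200))                                   # original signal
uniform = warp_sequence(x, lambda t: 0.5 * t)                      # uniform slowdown: leaky RNNs can cope
irregular = warp_sequence(x, lambda t: t + 10 * np.sin(0.05 * t))  # irregular warping: needs adaptive gates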


Reviewer comments summary


Overall Score: 25/30

Average Score: 8

According to one reviewer, the core insight of the paper is the link between the design of a recurrent network and how the network reacts to time transformations. The reviewer found this insight simple, elegant, and valuable. A minor complaint was the unnecessarily large number of paragraph breaks, which makes reading slightly jarring.
