François Chollet, creator of Keras on TensorFlow 2.0 and Keras integration, tricky design decisions in Deep Learning, and more

TensorFlow 2.0 was made available in October. One of the major highlights of this release was the integration of Keras into TensorFlow. Keras is an open-source deep-learning library that is designed to enable fast, user-friendly experimentation with deep neural networks. It serves as an interface to several deep learning libraries, most popular of which is TensorFlow, and it was integrated into TensorFlow main codebase in TensorFlow 2.0.

In September, Lex Fridman, Research scientist at MIT popularly known for his podcasts, spoke to François Chollet, who is the author of Keras on Keras, Deep Learning, and the Progress of AI. In this post, we have tried to highlight François’ views on the Keras and TensorFlow 2.0 integration, early days of Keras and the importance of design decisions for building deep learning models. We recommend the full podcast that’s available on Fridman’s YouTube channel.

Want to build Neural Networks?

[box type="shadow" align="" class="" width=""]If you want to build multiple neural network architectures such as CNN, RNN, LSTM in Keras, we recommend you to read Neural Networks with Keras Cookbook by V Kishore Ayyadevara. This book features over 70 recipes such as object detection and classification, building self-driving car applications, understanding data encoding for image, text and recommender systems and more. [/box]

Early days of Keras and how it was integrated into TensorFlow

I started working on Keras in 2015, says Chollet. At that time Caffe was the popular deep learning library, based on C++ and was popular for building Computer Vision projects. Chollet was interested in Recurrent Neural Networks (RNNs) which was a niche topic at that time. Back then, there was no good solution or reusable open-source implementation of RNNs and LSTMs, so he decided to build his own and that’s how Keras started. “It was going to be mostly around RNNs and LSTMs and the models would be defined by Python code, which was going against mainstream,” he adds.

Later, he joined Google’s research team working on image classification. At that time, he was exposed to the early internal version of Tensorflow - which was an improved version of Theano. When Tensorflow was released in 2015, he refactored Keras to run on TensorFlow. Basically he was abstracting away all the backend functionality into one module so that the same codebase could run on top of multiple backends.

A year later, the TensorFlow team requested him to integrate the Keras API into TensorFlow more tightly. They build a temporary TensorFlow-only version of Keras that was in tf.contrib for a while. Then they finally moved to TensorFlow Core in 2017.

TensorFlow 2.0 gives both usability and flexibility to Keras

Keras has been a very easy-to-use high-level interface to do deep learning. However, it lacked in flexibility - Keras framework was not the optimal way to do things compared to just writing everything from scratch. TensorFlow 2.0 offers both usability and flexibility to Keras. You have the usability of the high-level interface but you have the flexibility of the lower-level interface. You have this spectrum of workflows where you can get more or less usability and flexibility, the trade-offs depending on your needs. It's very flexible, easy to debug, and powerful but also integrates seamlessly with higher-level features up to classic Keras workflows.

“You have the same framework offering the same set of APIs that enable a spectrum of workflows that are more or less high level and are suitable for you know profiles ranging from researchers to data scientists and everything in between,” says Chollet.

Design decisions are especially important while integrating Keras with Tensorflow

“Making design decisions is as important as writing code”, claims Chollet. A lot of thought and care is taken in coming up with these decisions, taking into account the diverse user base of TensorFlow - small-scale production users, large-scale production users, startups, and researchers. Chollet says, “A lot of the time I spend on Google is actually discussing design. This includes writing design Docs, participating in design review meetings, etc.”

Making a design decision is about satisfying a set of constraints but also trying to do so in the simplest way possible because this is what can be maintained and expanded in the future. You want to design APIs that are modular and hierarchical so that they have an API surface that is as small as possible. You want this modular hierarchical architecture to reflect the way that domain experts think about the problem.

On the future of Keras and TensorFlow. What’s going to happen in TensorFlow 3.0?

Chollet says that he’s really excited about developing even higher-level APIs with Keras. He’s also excited about hyperparameter tuning by automated machine learning. He adds, “The future is not just, you know, defining a model, it's more like an automatic model.”

Limits of deep learning wrt function approximators that try to generalize from data

Chollet emphasizes that “Neural Networks don't generalize well, humans do.”

Deep Learning models are like huge parametric and differentiable models that go from an input space to an output space, trained with gradient descent. They are learning a continuous geometric morphing from an input vector space to an output space. As this is done point by point; a deep neural network can only make sense of points in space that are very close to things that it has already seen in string data. At best it can do the interpolation across points.

However, that means in order to train your network you need a dense sampling of the input, almost a point-by-point sampling which can be very expensive if you're dealing with complex real-world problems like autonomous driving or robotics. In contrast to this, you can look at very simple rules algorithms. If you have a symbolic rule it can actually apply to a very large set of inputs because it is abstract, it is not obtained by doing a point by point mapping.

Deep learning is really like point by point geometric morphings. Meanwhile, abstract rules can generalize much better. I think the future is which can combine the two.

Chollet also talks about self-improving Artificial General Intelligence, concerns about short-term and long-term threats in AI, Program synthesis, Good test for intelligence and more. The full podcast is available on Lex’s YouTube channel.

If you want to implement neural network architectures in Keras for varied real-world applications, you may go through our book Neural Networks with Keras Cookbook.