What is Meta Learning?

Meta Learning, an original concept of cognitive psychology, is now applied to machine learning techniques. If we go by the social psychology definition, meta learning is the state of being aware of and taking control of one's own learning. Similar concepts, when applied to the machine learning theory states that a meta learning algorithm uses prior experience to change certain aspects of an algorithm, such that the modified algorithm is better than the original algorithm. To explain in simple terms, meta-learning is how the algorithm learns how to learn.

Meta Learning: Making a versatile AI agent

Current AI Systems excel at mastering a single skill, playing Go, holding human-like conversations, predicting a disaster, etc. However, now that AI and machine learning is possibly being integrated in everyday tasks, we need a single AI system to solve a variety of problems. Currently, a Go Player, will not be able to navigate the roads or find new places. Or an AI navigation controller won’t be able to hold a perfect human-like conversation. What machine learning algorithms need to do is develop versatility – the capability of doing many different things. Versatility is achieved by intelligent amalgamation of Meta Learning along with related techniques such as reinforcement learning (finding suitable actions to maximize a reward), transfer learning (re-purposing a trained model for a specific task on a second related task), and active learning (learning algorithm chooses the data it wants to learn from). Such different learning techniques provides an AI agent with the brains to do multiple tasks without the need to learn every new task from scratch. Thereby making it capable of adapting intelligently to a wide variety of new, unseen situations.

Apart from creating versatile agents, recent researches also focus on using meta learning for hyperparameter and neural network optimization, fast reinforcement learning, finding good network architectures and for specific cases such as few-shot image recognition.

Using Meta Learning, AI agents learn how to learn new tasks by reusing prior experience, rather than examining each new task in isolation.

Various approaches to Meta Learning algorithms

A wide variety of approaches come under the umbrella of Meta-Learning. Let's have a quick glance at these algorithms and techniques:

Algorithm Learning (selection)

Algorithm selection or learning, selects learning algorithms on the basis of characteristics of the instance. For example, you have a set of ML algos (Random Forest, SVM, DNN), data sets as the instances and the error rate as the cost metric. Now, the goal of Algorithm Selection is to predict which machine learning algorithm will have a small error on each data set.

Hyper-parameter Optimization

Many machine learning algorithms have numerous hyper-parameters that can be optimized. The choice of selecting these hyper-parameters for learning algorithms determines how well the algorithm learns. A recent paper, "Evolving Deep Neural Networks", provides a meta learning algorithm for optimizing deep learning architectures through evolution.

Ensemble Methods

Ensemble methods usually combine several models or approaches to achieve better predictive performance. There are 3 basic types – Bagging, Boosting, and Stacked Generalization.

In Bagging, each model runs independently and then aggregates the outputs at the end without preference to any model.

Boosting refers to a group of algorithms that utilize weighted averages to make weak learners into stronger learners. Boosting is all about “teamwork”.

Stacked generalization, has a layered architecture. Each set of base-classifiers is trained on a dataset. Successive layers receive as input the predictions of the immediately preceding layer and the output is passed on to the next layer. A single classifier at the topmost level produces the final prediction.

Dynamic bias selection

In Dynamic Bias selection, we adjust the bias of the learning algorithm dynamically to suit the new problem instance. The performance of a base learner can trigger the need to explore additional hypothesis spaces, normally through small variations of the current hypothesis space. The bias selection can either be a form of data variation or a time-dependent feature.

Inductive Transfer

Inductive transfer describes learning using previous knowledge from related tasks. This is done by transferring meta-knowledge across domains or tasks; a process known as inductive transfer. The goal here is to incorporate the meta-knowledge into the new learning task rather than matching meta-features with a meta-knowledge base.

Adding Enhancements to Meta Learning algorithms

Supervised meta-learning: When the meta-learner is trained with supervised learning. In supervised learning we have both input and output variables and the algorithm learns the mapping function from the input to the output.

RL meta-learning: This algorithm talks about using standard deep RL techniques to train a recurrent neural network in such a way that the recurrent network can then implement its own Reinforcement learning procedure.

Model-agnostic meta-learning: MAML trains over a wide range of tasks, for a representation that can be quickly adapted to a new task, via a few gradient steps. The meta-learner seeks an initialization that is not only useful for adapting to various problems, but also can be adapted quickly.

The ultimate goal of any meta learning algorithm and its variations is to be fully self-referential. This means it can automatically inspect and improve every part of its own code. A regenerative meta learning algorithm, on the lines of how a lizard regenerates its limbs, would not only blur the distinction between the different variations as described above but will also lead to better future performance and versatility of machine learning algorithms.