Artificial neural networks
ANNs take advantage of the concept of DL. They are an abstract representation of the human nervous system, which contains a collection of neurons that communicate with each other through connections called axons.
Warren McCulloch and Walter Pitts proposed the first artificial neuron model in 1943 in terms of a computational model of nervous activity. This model was followed by another proposed by John von Neumann, Marvin Minsky, Frank Rosenblatt (the so-called perceptron), and many others.
The biological neurons
Look at the brain's architecture for inspiration. Neurons in the brain are called biological neurons. They are unusual–looking cells, mostly found in animal brains, consisting of cortexes. The cortex itself is composed of a cell body, containing the nucleus and most of the cell's complex components. There are many branching extensions called dendrites, plus one very long extension called the axon.
Near its extremity, the axon splits off into many branches called telodendria and at the top of these branches are minuscule structures called synaptic terminals (or simply synapses), which connect to the dendrites of other neurons. Biological neurons receive short electrical impulses called signals from other neurons, and in response, they fire their own signals:
In biology, a neuron is composed of the following:
- A cell body or soma
- One or more dendrites, whose responsibility it is to receive signals from other neurons
- An axon, which in turn conveys the signals generated by the same neuron to the other connected neurons
The neuron's activity alternates between sending a signal (active state) and rest/receiving signals from other neurons (inactive state). The transition from one phase to another is caused by the external stimuli, represented by signals that are picked up by the dendrites. Each signal has an excitatory or inhibitory effect, conceptually represented by a weight associated with the stimulus.
A neuron in idle state accumulates all the signals it has received until it reaches a certain activation threshold.
The artificial neuron
Based on the concept of biological neurons, the term and the idea of artificial neurons arose, and they have been used to build intelligent machines for DL-based predictive analytics. This is the key idea that inspired ANNs. Similarly to biological neurons, the artificial neuron consists of the following:
- One or more incoming connections, with the task of collecting numerical signals from other neurons: each connection is assigned a weight that will be used to consider each signal sent
- One or more output connections that carry the signal to the other neurons
- An activation function, which determines the numerical value of the output signal, based on the signals received from the input connections with other neurons, and suitably collected from the weights associated with each received signal, and the activation threshold of the neuron itself:
The output, that is, the signal that the neuron transmits, is calculated by applying the activation function, also called the transfer function, to the weighted sum of the inputs. These functions have a dynamic range between -1 and 1, or between 0 and 1. Many activation functions differ in terms of complexity and output. Here, we briefly present the three simplest forms:
- Step function: Once we fix the threshold value x (for example, x = 10), the function will return zero, or one if the mathematical sum of the inputs is at, above, or below the threshold value.
- Linear combination: Instead of managing a threshold value, the weighted sum of the input values is subtracted from a default value. We will have a binary outcome that will be expressed by a positive (+b) or negative (-b) output of the subtraction.
- Sigmoid: This produces a sigmoid curve, which is a curve with an S trend. Often, the sigmoid function refers to a special case of the logistic function.
From the simplest forms used in the prototyping of the first artificial neurons, we move to more complex forms that allow greater characterization of the functioning of the neuron:
- Hyperbolic tangent function
- Radial basis function
- Conic section function
- Softmax function:
Choosing proper activation functions (also weights initialization) is key to making a network perform at its best and to obtain good training. These topics are under a lot of research, and studies indicate marginal differences in terms of output quality if the training phase is carried out properly.
Note
There is no rule of thumb in the field of neural networks. It all depends on your data and in what form you want the data to be transformed, after passing through the activation function. If you want to choose a particular activation function, you need to study the graph of the function to see how the result changes with respect to the values given to it.