You're reading from Deep Learning with TensorFlow Explore neural networks and build intelligent systems with Python

Product type Paperback

Published in Mar 2018

Publisher Packt

ISBN-13 9781788831109

Length 484 pages

Edition 2nd Edition

Languages

Python

Tools

TensorFlow

Concepts

Deep Learning

Authors (2):

Giancarlo Zaccone

Md. Rezaul Karim

View More author details

Table of Contents (13) Chapters

Preface

1. Getting Started with Deep Learning

2. A First Look at TensorFlow FREE CHAPTER

3. Feed-Forward Neural Networks with TensorFlow

4. Convolutional Neural Networks

5. Optimizing TensorFlow Autoencoders

6. Recurrent Neural Networks

7. Heterogeneous and Distributed Computing

8. Advanced TensorFlow Programming

9. Recommendation Systems Using Factorization Machines

10. Reinforcement Learning

Other Books You May Enjoy

Leave a review – let other readers know what you think

Index

Artificial neural networks

ANNs take advantage of the concept of DL. They are an abstract representation of the human nervous system, which contains a collection of neurons that communicate with each other through connections called axons.

Warren McCulloch and Walter Pitts proposed the first artificial neuron model in 1943 in terms of a computational model of nervous activity. This model was followed by another proposed by John von Neumann, Marvin Minsky, Frank Rosenblatt (the so-called perceptron), and many others.

The biological neurons

Look at the brain's architecture for inspiration. Neurons in the brain are called biological neurons. They are unusual–looking cells, mostly found in animal brains, consisting of cortexes. The cortex itself is composed of a cell body, containing the nucleus and most of the cell's complex components. There are many branching extensions called dendrites, plus one very long extension called the axon.

Near its extremity, the axon splits off into many branches called telodendria and at the top of these branches are minuscule structures called synaptic terminals (or simply synapses), which connect to the dendrites of other neurons. Biological neurons receive short electrical impulses called signals from other neurons, and in response, they fire their own signals:

Figure 7: Working principles of biological neurons.

In biology, a neuron is composed of the following:

A cell body or soma
One or more dendrites, whose responsibility it is to receive signals from other neurons
An axon, which in turn conveys the signals generated by the same neuron to the other connected neurons

The neuron's activity alternates between sending a signal (active state) and rest/receiving signals from other neurons (inactive state). The transition from one phase to another is caused by the external stimuli, represented by signals that are picked up by the dendrites. Each signal has an excitatory or inhibitory effect, conceptually represented by a weight associated with the stimulus.

A neuron in idle state accumulates all the signals it has received until it reaches a certain activation threshold.

The artificial neuron

Based on the concept of biological neurons, the term and the idea of artificial neurons arose, and they have been used to build intelligent machines for DL-based predictive analytics. This is the key idea that inspired ANNs. Similarly to biological neurons, the artificial neuron consists of the following:

One or more incoming connections, with the task of collecting numerical signals from other neurons: each connection is assigned a weight that will be used to consider each signal sent
One or more output connections that carry the signal to the other neurons
An activation function, which determines the numerical value of the output signal, based on the signals received from the input connections with other neurons, and suitably collected from the weights associated with each received signal, and the activation threshold of the neuron itself:
Figure 8: Artificial neuron model.

The output, that is, the signal that the neuron transmits, is calculated by applying the activation function, also called the transfer function, to the weighted sum of the inputs. These functions have a dynamic range between -1 and 1, or between 0 and 1. Many activation functions differ in terms of complexity and output. Here, we briefly present the three simplest forms:

Step function: Once we fix the threshold value x (for example, x = 10), the function will return zero, or one if the mathematical sum of the inputs is at, above, or below the threshold value.
Linear combination: Instead of managing a threshold value, the weighted sum of the input values is subtracted from a default value. We will have a binary outcome that will be expressed by a positive (+b) or negative (-b) output of the subtraction.
Sigmoid: This produces a sigmoid curve, which is a curve with an S trend. Often, the sigmoid function refers to a special case of the logistic function.

From the simplest forms used in the prototyping of the first artificial neurons, we move to more complex forms that allow greater characterization of the functioning of the neuron:

Hyperbolic tangent function
Radial basis function
Conic section function
Softmax function:
Figure 9: The most commonly used artificial neuron model transfer functions. a. step function. b. linear function c. computed sigmoid function with values between 0 and 1. d. sigmoid function with computed values between -1 and 1.

Choosing proper activation functions (also weights initialization) is key to making a network perform at its best and to obtain good training. These topics are under a lot of research, and studies indicate marginal differences in terms of output quality if the training phase is carried out properly.

Note

There is no rule of thumb in the field of neural networks. It all depends on your data and in what form you want the data to be transformed, after passing through the activation function. If you want to choose a particular activation function, you need to study the graph of the function to see how the result changes with respect to the values given to it.