21. December 2016

Deep Learning

For the artificial intelligence the so-called deep learning bears much potential for progress

It is a method for processing information which is drawn from huge bulks of data and it detects connections between the different kinds information. For this purpose, multi-layer neural networks, deep networks, are used. The depth of the architecture allows drawing information from the data layer by layer so that the data can be presented in a simplified and more abstract way. Currently, deep learning is used mainly for image and speech recognition.

Opposed to a normal neural network a deep network consists of multiple layers between the input and output layer, the so-called hidden layers.

Each hidden layer is trained by common methods like backpropagation. The input layer is the input vector of the first hidden layer. For the following layers it is always the output layer of the previous hidden layer which is used as input vector. The output layer of the last hidden layer is then the output layer of the whole deep network. The separate training of each layer prevents problems like vanishing gradients or overfitting.

Stacked autoencoders, deep belief networks (stacked restricted Boltzman machines) and convolutional networks are the best known examples of deep networks.

Fig. architecture of the deep network