AlexNet

AlexNet is a convolutional neural network designed by Krizhevsky et al. [1]. It won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012.

Network features:

  • ≈62M parameters (commonly rounded to 60M)
  • 8 layers (5 convolutional + 3 fully connected)
  • ReLU as the activation function
  • Local Response Normalization (LRN)

Architecture

Parameters

Layer   Activation Shape   Filter Shape   Weights      Biases   Parameters
Input   3x224x224          -              0            0        0
Conv1   96x55x55           11x11          34,848       96       34,944
Pool1   96x27x27           -              0            0        0
Conv2   256x27x27          5x5            614,400      256      614,656
Pool2   256x13x13          -              0            0        0
Conv3   384x13x13          3x3            884,736      384      885,120
Conv4   384x13x13          3x3            1,327,104    384      1,327,488
Conv5   256x13x13          3x3            884,736      256      884,992
Pool3   256x6x6            -              0            0        0
FC1     4096x1x1           -              37,748,736   4,096    37,752,832
FC2     4096x1x1           -              16,777,216   4,096    16,781,312
FC3     1000x1x1           -              4,096,000    1,000    4,097,000
Total                                                           62,378,344

See [5] for an explanation of the parameter calculations.
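The table's counts can be reproduced with a short script. This is a sketch that follows the table's single-stream view of the network (the original two-GPU split is ignored): a convolutional layer has k·k·in_channels·out_channels weights plus one bias per output channel, and a fully connected layer has in_features·out_features weights plus one bias per output.

```python
# Recompute the per-layer parameter counts from the table above.
layers = [
    # (name, weights, biases)
    ("Conv1", 11 * 11 * 3 * 96,    96),    # 11x11 filters over 3 input channels
    ("Conv2", 5 * 5 * 96 * 256,    256),
    ("Conv3", 3 * 3 * 256 * 384,   384),
    ("Conv4", 3 * 3 * 384 * 384,   384),
    ("Conv5", 3 * 3 * 384 * 256,   256),
    ("FC1",   6 * 6 * 256 * 4096,  4096),  # flattened 256x6x6 feature map
    ("FC2",   4096 * 4096,         4096),
    ("FC3",   4096 * 1000,         1000),
]

total = 0
for name, weights, biases in layers:
    params = weights + biases
    total += params
    print(f"{name}: {params:,}")
print(f"Total: {total:,}")  # Total: 62,378,344
```

Pooling layers and the input layer contribute no parameters, which is why they do not appear in the sum.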

Activation Function

ReLU is used after each convolutional and fully connected layer. AlexNet was one of the first neural networks to use this activation function. The authors chose ReLU because “Deep convolutional neural networks with ReLUs train several times faster than their equivalents with tanh units” [1].
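ReLU is simply f(x) = max(0, x): negative activations are zeroed and positive ones pass through unchanged. A minimal NumPy sketch:

```python
import numpy as np

def relu(x):
    # ReLU: f(x) = max(0, x) — zero for negative inputs, identity otherwise.
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))  # [0.  0.  0.  1.5 3. ]
```

Unlike tanh, ReLU does not saturate for large positive inputs, which is the property behind the faster training quoted above.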

Dropout

In order to reduce overfitting, dropout is used after the first two fully connected layers. This technique, presented by Hinton et al. [2], randomly deactivates each neuron of a layer (here the probability of being deactivated is 0.5). The deactivated neurons take part in neither the forward pass nor backpropagation. In this way, a neuron cannot rely on the presence of any particular other neuron, which avoids complex co-adaptations on the training data. For each new training input the architecture of the network changes, but the weights are shared between all these architectures. At test time, all the neurons are used but the output is multiplied by 0.5 to take the “geometric mean of the predictive distributions produced by the exponentially-many dropout networks” [1].
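The train/test behaviour described above can be sketched in NumPy. This is the classic formulation used by AlexNet (drop at training time, scale by the keep probability at test time), not the "inverted dropout" variant common in modern frameworks:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_train(a, p=0.5):
    # Each neuron is deactivated with probability p; dropped neurons
    # output 0 and receive no gradient in a full implementation.
    mask = rng.random(a.shape) >= p
    return a * mask

def dropout_test(a, p=0.5):
    # At test time every neuron is active; outputs are multiplied by the
    # keep probability (1 - p = 0.5 here) so the expected activation
    # matches what the layer produced during training.
    return a * (1 - p)
```

Each call to `dropout_train` samples a fresh mask, which is exactly the "new architecture per training input, shared weights" behaviour described above.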

See also [3].

Local Response Normalization (LRN)

LRN, introduced in the AlexNet paper [1], is applied after ReLU to normalize activations and keep them from growing unbounded. There are two types of LRN: Inter-Channel LRN and Intra-Channel LRN; AlexNet uses the first one. For more explanations see [4].
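In Inter-Channel LRN, each activation is divided by a term that sums the squared activations at the same spatial position across n neighbouring channels. A minimal NumPy sketch using the constants from [1] (k = 2, n = 5, α = 10⁻⁴, β = 0.75):

```python
import numpy as np

def lrn_inter_channel(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    # a has shape (channels, height, width).
    # b[i] = a[i] / (k + alpha * sum_{j near i} a[j]^2) ** beta,
    # where the sum runs over up to n channels centred on channel i.
    C = a.shape[0]
    out = np.empty_like(a, dtype=float)
    half = n // 2
    for i in range(C):
        lo, hi = max(0, i - half), min(C - 1, i + half)
        sq_sum = np.sum(a[lo:hi + 1] ** 2, axis=0)
        out[i] = a[i] / (k + alpha * sq_sum) ** beta
    return out
```

Because the sum runs across channels at a fixed spatial position, strongly responding filters suppress their neighbours, a form of lateral inhibition the authors borrowed from real neurons.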

Bibliography
