AlexNet
AlexNet is a convolutional neural network introduced by Krizhevsky et al. in 2012 [1]; it won the ImageNet (ILSVRC) 2012 image classification challenge.
Network features:
- ~60M parameters (62,378,344 exactly; see the table below)
- 8 layers (5 convolutional + 3 fully connected)
- ReLU as the activation function
- Local Response Normalization (LRN)
Architecture
Parameters
Layer | Activation Shape | Filter Shape | Weights | Biases | Parameters |
---|---|---|---|---|---|
Input | 3x227x227 | - | 0 | 0 | 0 |
Conv1 | 96x55x55 | 11x11 | 34,848 | 96 | 34,944 |
Pool1 | 96x27x27 | - | 0 | 0 | 0 |
Conv2 | 256x27x27 | 5x5 | 614,400 | 256 | 614,656 |
Pool2 | 256x13x13 | - | 0 | 0 | 0 |
Conv3 | 384x13x13 | 3x3 | 884,736 | 384 | 885,120 |
Conv4 | 384x13x13 | 3x3 | 1,327,104 | 384 | 1,327,488 |
Conv5 | 256x13x13 | 3x3 | 884,736 | 256 | 884,992 |
Pool3 | 256x6x6 | - | 0 | 0 | 0 |
FC1 | 4096x1x1 | - | 37,748,736 | 4,096 | 37,752,832 |
FC2 | 4096x1x1 | - | 16,777,216 | 4,096 | 16,781,312 |
FC3 | 1000x1x1 | - | 4,096,000 | 1,000 | 4,097,000 |
Total | | | | | 62,378,344 |
See [5] for an explanation of the parameter calculations. Note that the paper states a 224x224 input, but a 227x227 input is required for the 11x11 convolution with stride 4 to produce the 55x55 Conv1 output.
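As a cross-check of the table, here is a minimal PyTorch sketch of the single-branch variant used above (an assumption on my part: the original paper's two-GPU grouped convolutions are not used, and the parameter-free LRN layers are omitted for brevity). Summing its parameters reproduces the 62,378,344 total.

```python
import torch
import torch.nn as nn

# Single-branch AlexNet sketch matching the table above
# (227x227 input, no grouped convolutions, LRN omitted).
class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),     # Conv1 -> 96x55x55
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # Pool1 -> 96x27x27
            nn.Conv2d(96, 256, kernel_size=5, padding=2),   # Conv2 -> 256x27x27
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # Pool2 -> 256x13x13
            nn.Conv2d(256, 384, kernel_size=3, padding=1),  # Conv3 -> 384x13x13
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),  # Conv4 -> 384x13x13
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),  # Conv5 -> 256x13x13
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # Pool3 -> 256x6x6
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),                   # FC1
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),                          # FC2
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),                   # FC3
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

model = AlexNet()
print(sum(p.numel() for p in model.parameters()))  # 62378344
```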
Activation Function
ReLU is used after each convolutional and fully connected layer. AlexNet was one of the first neural networks to use this activation function. The authors chose ReLU because "Deep convolutional neural networks with ReLUs train several times faster than their equivalents with tanh units" [1].
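For reference, ReLU simply computes f(x) = max(0, x) element-wise:

```python
import numpy as np

# ReLU: f(x) = max(0, x), applied element-wise.
def relu(x):
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0. 0. 0. 1.5]
```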
Dropout
To reduce overfitting, dropout is used after the first two fully connected layers. This technique, presented by Hinton et al. [2], randomly deactivates each neuron of a layer (here, the probability of being deactivated is 0.5). Deactivated neurons take part in neither the forward pass nor backpropagation, so a neuron cannot rely on the presence of any particular other neuron, which prevents complex co-adaptations on the training data. Each training input thus sees a different network architecture, but the weights are shared between all these architectures. At test time, all neurons are used and their outputs are multiplied by 0.5 to take the "geometric mean of the predictive distributions produced by the exponentially-many dropout networks" [1].
See also [3].
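A minimal NumPy sketch of this (original, non-inverted) dropout scheme, including the test-time scaling described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Dropout as described above: during training each activation is kept
# with probability p = 0.5; at test time all units are active and the
# outputs are scaled by p to approximate the ensemble average.
def dropout(x, p=0.5, train=True):
    if train:
        mask = rng.random(x.shape) < p  # 1 = kept, 0 = deactivated
        return x * mask
    return x * p  # test time: scale outputs by the keep probability

x = np.ones(8)
print(dropout(x, train=True))   # roughly half the units zeroed
print(dropout(x, train=False))  # all units scaled by 0.5
```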
Local Response Normalization (LRN)
LRN, introduced in the AlexNet paper [1], is applied after the ReLU to normalize activations and keep them from growing unboundedly. There are two types of LRN, inter-channel and intra-channel; AlexNet uses the inter-channel form. For more explanations see [4].
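In the inter-channel form, each activation is divided by a term that sums the squared activations of the n neighboring channels at the same spatial position; the paper uses k = 2, n = 5, alpha = 1e-4, beta = 0.75 [1]. A minimal NumPy sketch:

```python
import numpy as np

# Inter-channel LRN from [1]: each activation a[i, x, y] is divided by
# (k + alpha * sum of squared activations over the n neighboring
# channels at the same position) ** beta.
def lrn(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    channels = a.shape[0]
    b = np.empty_like(a)
    for i in range(channels):
        lo = max(0, i - n // 2)
        hi = min(channels - 1, i + n // 2)
        denom = (k + alpha * np.sum(a[lo:hi + 1] ** 2, axis=0)) ** beta
        b[i] = a[i] / denom
    return b

a = np.random.default_rng(0).standard_normal((96, 55, 55))  # Conv1 output
print(lrn(a).shape)  # (96, 55, 55)
```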
Bibliography
- [1] [Paper] ImageNet Classification with Deep Convolutional Neural Networks
- [2] [Paper] Improving neural networks by preventing co-adaptation of feature detectors
- [3] [Machine Learning Mastery] A Gentle Introduction to Dropout for Regularizing Deep Neural Networks
- [4] [Medium] Difference between Local Response Normalization and Batch Normalization
- [5] [LearnOpenCV] Number of Parameters and Tensor Sizes in a Convolutional Neural Network (CNN)