GoogLeNet

GoogLeNet, also called Inception-v1, is a particular incarnation of the Inception architecture, proposed by Szegedy et al. in their 2015 paper "Going Deeper with Convolutions" [1].

Network features:

  • 22 layers deep (counting only layers with parameters)
  • uses inception modules
  • uses 1x1 convolutions as bottlenecks before the 3x3 and 5x5 convolutions
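The point of the 1x1 bottlenecks is to shrink the channel count before the expensive larger convolutions. A quick back-of-the-envelope sketch (the 192 → 16 → 32 channel counts below are illustrative, chosen to match the 5x5 path of one of the paper's inception modules):

```python
# Weights needed to map 192 input channels to 32 output channels with a 5x5 conv.
in_ch, mid_ch, out_ch, k = 192, 16, 32, 5

# Direct 5x5 convolution: one 5x5 filter per (input channel, output channel) pair.
direct = in_ch * k * k * out_ch                      # 153,600 weights

# Bottleneck: 1x1 conv reduces 192 -> 16 channels, then 5x5 conv maps 16 -> 32.
bottleneck = in_ch * 1 * 1 * mid_ch + mid_ch * k * k * out_ch   # 15,872 weights

print(direct, bottleneck)   # roughly a 10x reduction in parameters
```

The same reduction applies to the compute cost, since each weight is applied at every spatial position.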

Architecture

Convolution

As explained by the authors in their paper [1], all the convolutions in the network are followed by a rectified linear activation (ReLU).

Convolution and ReLU
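This convolution-then-ReLU pattern can be sketched with a naive single-channel implementation in numpy (frameworks fuse and vectorize this, but the math is the same):

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive 2D 'valid' convolution (cross-correlation, as CNNs use)."""
    H, W = x.shape
    k = w.shape[0]
    out = np.empty((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # dot product of the filter with the k x k patch at (i, j)
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

def relu(x):
    """Rectified linear activation: clamp negatives to zero."""
    return np.maximum(x, 0.0)

x = np.random.randn(8, 8)    # toy input feature map
w = np.random.randn(3, 3)    # toy 3x3 filter
y = relu(conv2d_valid(x, w)) # every activation is now >= 0
```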

Inception module

The inception module is designed to let the CNN benefit from multi-level feature extraction by applying filters of various sizes (e.g. 1x1, 3x3, 5x5) within the same layer of the network. This allows the network to capture information at various scales and complexities. [3]

Inception module
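For the parallel branches to be mergeable, each one must preserve the spatial size of its input; with stride 1 this is done by padding each k x k convolution with (k - 1) / 2 pixels ("same" padding). A small sanity check:

```python
def out_size(n, k, stride=1):
    """Spatial output size of a conv with 'same'-style padding (k odd)."""
    pad = (k - 1) // 2
    return (n + 2 * pad - k) // stride + 1

# All three filter sizes used in the inception module keep a 28x28 input at 28x28.
sizes = {k: out_size(28, k) for k in (1, 3, 5)}
print(sizes)
```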

Depth concatenation

The outputs of all the filters and of the pooling layer are concatenated along the channel dimension before being fed to the next layer. This concatenation ensures that subsequent layers can access features extracted at different scales. [3]

Depth concatenation
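Because every branch produces feature maps with the same spatial size, the merge is a plain concatenation along the channel axis. A sketch with illustrative channel counts (the 64/128/32/32 split matches one of the inception modules in [1]):

```python
import numpy as np

# Hypothetical outputs of the four branches, channels-first (C, H, W).
b1x1  = np.random.randn(64, 28, 28)   # 1x1 convolution branch
b3x3  = np.random.randn(128, 28, 28)  # 3x3 convolution branch
b5x5  = np.random.randn(32, 28, 28)   # 5x5 convolution branch
bpool = np.random.randn(32, 28, 28)   # pooling branch (after 1x1 projection)

# Depth concatenation: stack along the channel dimension (axis 0 here).
out = np.concatenate([b1x1, b3x3, b5x5, bpool], axis=0)
print(out.shape)  # (256, 28, 28): 64 + 128 + 32 + 32 channels
```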

Global average pooling

The global average pooling layer, used at the end of the network, averages each 7x7 feature map down to 1x1. The difference between global average pooling and a fully connected layer is the number of weights: if a fully connected layer maps 1024 feature maps of size 7x7 to 1024 outputs, it needs 1024x7x7x1024 weights (cf. the figure below).

Fully connected layer

If global average pooling is used on the same block, there are no weights at all (cf. the figure below).

Global average pooling
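The comparison can be verified in a few lines: global average pooling is just a mean over the spatial dimensions, while the equivalent fully connected layer would need tens of millions of weights.

```python
import numpy as np

fmap = np.random.randn(1024, 7, 7)   # channels-first feature maps, as above

# Global average pooling: mean over the two spatial axes, no learned weights.
gap = fmap.mean(axis=(1, 2))         # shape (1024,)

# Weight count of a fully connected layer mapping 1024x7x7 to 1024 outputs.
fc_weights = 1024 * 7 * 7 * 1024
print(gap.shape, fc_weights)         # (1024,) 51380224
```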

Bibliography