ResNet

ResNet, for Residual Network, is a convolutional neural network that learns residual functions.

Network features:

  • ? parameters
  • 50 layers (49 convolutional + 1 fully connected)
  • use ReLU as activation function
  • use batch normalization for normalization
  • use bottleneck before 3x3 convolutions
  • use skip connections

Architecture

The architecture of the ResNet-50 is presented here.

Residual Learning

Unlike to networks that use convolutions without shortcut connections, the residual network learns the residual function. The figure below presents a classical convolutional block on the left and a residual block on the right.

Residual Learning

Let $x$ be the input a block and $H(x)$ its output. For the block on the left, the network is learning the output directly $H(x)$ but for the block on the right, the network is actually learning $F(x)$ which is the residual $F(x) = H(x) - x$.

For more information see [2].

Shortcut Connection

Also called skip connection, the shorcut connection is used to avoid the vanishing gradient problem [3]. This connection, represented by the right arrow on the figure below, simply perform an identity mapping: the input of the residual block is added (element-wise) to its output.

Shorcut connection

The shorcut connections propagate larger gradient to the first layers and allow them to learn as fast as the last layers. With these type of connection, it is possible to use deeper networks.

There are two types of shortcut connection: identity and projection shorcut connection. The identity shorcut connection is the one presented in the figure above. The input of the block is directly added to the output of the convolutions. The projection shorcut connection, presented in the figure below, is used when the input size of the block is different than the output of the convolutions. Since element-wise addition needs elements of the same size, a 1x1 convolution is used on the input to increase its size.

Shorcut connection

A residual block with an identity shortcut connection is called identity block and the one with a projection shortcut connection is called convolutional block.

Bibliography

results matching ""

    No results matching ""