Here, M_s denotes the spatial attention module, F represents the operation of the convolutional layer, and F' denotes the features obtained after passing through the spatial attention module. The CBAM module is integrated into each residual block of ResNet50. In the classical ResNet model, when downsampling is necessary, a convolution with a 1x1 kernel and a stride of 2 is performed. Because such a kernel reads only one of every four spatial positions, three quarters of the input activations never reach the output, which inevitably leads to information loss. This study therefore avoids the 1x1, stride-2 convolution. Figure 3 shows the new residual block structure, which is modified from the original structures (a) and (b). With the introduction of the CBAM module, the downsampling 1x1 convolution is replaced by a 3x3 convolution. When downsampling is necessary, the mapping component enlarges its convolution kernel from 1x1 to 2x2.
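The information-loss argument can be checked with a small counting sketch. The helper below is hypothetical and not part of the original study: it counts which input positions are read by at least one kernel window for a given kernel size, stride, and padding. The 56x56 input size and the padding of 1 for the 3x3 kernel are assumptions chosen for illustration; the source does not state them.

```python
def touched_fraction(size, kernel, stride, padding=0):
    """Return (output side length, fraction of input positions read).

    Models a square convolution over a size x size input and records
    every in-bounds input position covered by at least one window.
    """
    out = (size + 2 * padding - kernel) // stride + 1
    touched = set()
    for oy in range(out):
        for ox in range(out):
            for ky in range(kernel):
                for kx in range(kernel):
                    y = oy * stride - padding + ky
                    x = ox * stride - padding + kx
                    # Positions that fall in the zero padding are not real inputs.
                    if 0 <= y < size and 0 <= x < size:
                        touched.add((y, x))
    return out, len(touched) / (size * size)


# Assumed settings: 56x56 feature map, stride 2, padding 1 only for the 3x3 kernel.
for kernel, padding in [(1, 0), (2, 0), (3, 1)]:
    out, frac = touched_fraction(56, kernel, stride=2, padding=padding)
    print(f"{kernel}x{kernel} kernel, padding {padding}: "
          f"output {out}x{out}, input coverage {frac:.2f}")
```

Under these assumptions all three settings halve the spatial resolution, but only the 1x1 kernel skips input positions; the 2x2 and 3x3 kernels read every position, which is consistent with the motivation given for the modified block.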

ResNet50 with CBAM: Enhanced Feature Extraction and Downsampling

Original source: https://www.cveoy.top/t/topic/oI0g Copyright belongs to the author. Do not repost or scrape.
