Realtime Multi-Person Pose Estimation with CPM using VGG19 or MobileNet

This code defines a PyTorch model for real-time multi-person pose estimation based on Convolutional Pose Machines (CPM), with either VGG19 or MobileNet as the backbone.

The code defines several functions:

make_stages: Builds CPM stages from a dictionary.
make_vgg19_block: Builds a VGG19 block from a dictionary.
get_model: Creates the whole CPM model.
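
To illustrate what building layers from a dictionary can look like, here is a minimal sketch. The function name build_block and the dictionary layout (entries whose name contains "pool" hold [kernel, stride, padding]; all other entries are convolutions holding [in_channels, out_channels, kernel, stride, padding]) are assumptions made for this example and may differ from what make_stages and make_vgg19_block actually expect.

```python
import torch
from torch import nn


def build_block(spec):
    """Turn an ordered {layer_name: params} dict into an nn.Sequential.

    Assumed (hypothetical) format:
      names containing "pool" -> [kernel, stride, padding]
      any other name          -> [in_ch, out_ch, kernel, stride, padding]
    """
    layers = []
    for name, params in spec.items():
        if "pool" in name:
            kernel, stride, padding = params
            layers.append(nn.MaxPool2d(kernel_size=kernel, stride=stride, padding=padding))
        else:
            in_ch, out_ch, kernel, stride, padding = params
            layers.append(nn.Conv2d(in_ch, out_ch, kernel_size=kernel, stride=stride, padding=padding))
            # Each convolution is followed by a ReLU in this sketch.
            layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)


# Example spec loosely modelled on the first VGG19 layers.
spec = {
    "conv1_1": [3, 64, 3, 1, 1],
    "conv1_2": [64, 64, 3, 1, 1],
    "pool1": [2, 2, 0],
    "conv2_1": [64, 128, 3, 1, 1],
}
block = build_block(spec)
out = block(torch.zeros(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 128, 16, 16])
```

Keeping the layer description in a dictionary separates the architecture from the construction code, so a different backbone or stage layout only needs a different dictionary.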

The get_model function takes a trunk argument that selects VGG19 or MobileNet as the backbone. It then builds the model with the make_vgg19_block and make_stages functions.

The rtpose_model class is a subclass of nn.Module that defines the forward pass of the model. It takes an input x and first passes it through block0 for preprocessing. The result is then passed through the CPM stages in sequence, and the output of the last stage is the pose estimate. The forward pass also saves intermediate results so they can be used when computing the loss function.
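
The forward pass described above can be sketched roughly as follows. This is a simplified illustration, not the actual rtpose_model: the class name TinyStagedModel, the layer sizes, the single output branch, and the choice to concatenate each stage's output with the block0 features are all assumptions made for the example.

```python
import torch
from torch import nn


class TinyStagedModel(nn.Module):
    """Hypothetical stand-in for a multi-stage CPM-style model."""

    def __init__(self, in_ch=3, feat_ch=8, out_ch=4, n_stages=3):
        super().__init__()
        # block0: shared first block applied to the input image.
        self.block0 = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # The first stage sees only the block0 output.
        self.stage1 = nn.Conv2d(feat_ch, out_ch, kernel_size=1)
        # Later stages see the previous prediction plus the block0 output
        # (an assumption of this sketch).
        self.later_stages = nn.ModuleList(
            [nn.Conv2d(feat_ch + out_ch, out_ch, kernel_size=1) for _ in range(n_stages - 1)]
        )

    def forward(self, x):
        saved_for_loss = []
        features = self.block0(x)
        out = self.stage1(features)
        saved_for_loss.append(out)
        for stage in self.later_stages:
            out = stage(torch.cat([out, features], dim=1))
            saved_for_loss.append(out)
        # Final prediction plus every stage output for the loss.
        return out, saved_for_loss


model = TinyStagedModel()
prediction, saved = model(torch.zeros(2, 3, 16, 16))
print(prediction.shape, len(saved))
```

Returning the per-stage outputs alongside the final prediction lets a training loop apply the loss at every stage rather than only at the last one.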

Finally, the model weights are initialized using a normal distribution.
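
As a rough illustration of normal-distribution initialization in PyTorch, the sketch below walks over a model's convolution layers. The helper name init_normal, the standard deviation of 0.01, and the zeroed biases are assumptions for this example; the text above does not state the values actually used.

```python
import torch
from torch import nn


def init_normal(model, std=0.01):
    """Initialize every Conv2d weight from N(0, std^2) and zero its bias.

    The std value and the zero bias are assumptions of this sketch.
    """
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            nn.init.normal_(module.weight, mean=0.0, std=std)
            if module.bias is not None:
                nn.init.constant_(module.bias, 0.0)


torch.manual_seed(0)
net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3),
)
init_normal(net)
print(net[2].weight.std().item())  # close to 0.01
```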
