RuntimeError: Error(s) in loading state_dict for GPT2LMHeadModel: size mismatch for transformer.wte.weight: copying a param with shape torch.Size([21128, 768]) from checkpoint, the shape in current model is
This error occurs when the shape of a weight tensor in the saved checkpoint does not match the shape of the corresponding tensor in the model you are loading it into. Here, the "transformer.wte.weight" matrix (the token embedding table, which is tied to "lm_head.weight") has shape [21128, 768] in the checkpoint, while the current GPT2LMHeadModel was built with a different vocabulary size. In other words, the checkpoint was saved from a model with a 21128-token vocabulary, and the model you are loading it into expects a different one.
To resolve this error, make the embedding shapes match before calling load_state_dict. There are two options: resize the current model's embedding matrices to the checkpoint's vocabulary size, or modify the checkpoint's tensors (by truncating or padding them) to match the current model. The first option is usually simpler, since the checkpoint's weights were trained for that vocabulary size.
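A minimal sketch of the first option, using the Hugging Face Transformers API. This builds a small GPT-2 from a config (rather than downloading pretrained weights) purely to illustrate the resize step; the layer counts here are arbitrary, and in practice you would call resize_token_embeddings on your actual model before loading your checkpoint's state_dict:

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Build a GPT-2 with the default 50257-token vocabulary
# (small layer/head counts just to keep this example light).
config = GPT2Config(vocab_size=50257, n_embd=768, n_layer=2, n_head=2)
model = GPT2LMHeadModel(config)

# The error message shows the checkpoint's embedding is [21128, 768],
# so resize wte (and the tied lm_head) to that vocabulary size
# before loading the state_dict.
model.resize_token_embeddings(21128)

print(model.transformer.wte.weight.shape)   # now matches the checkpoint

# With the shapes aligned, loading the checkpoint should succeed, e.g.:
# state_dict = torch.load("your_checkpoint.bin", map_location="cpu")
# model.load_state_dict(state_dict)
```

Because GPT-2 ties the input embeddings and the output head, resizing once fixes both the "wte.weight" and "lm_head.weight" mismatches reported by the error.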
Original source: https://www.cveoy.top/t/topic/fFCn — copyright belongs to the author. Do not repost or scrape!