Fixing the PyTorch error: RuntimeError: The size of tensor a (4) must match the size of tensor b (128) at non-singleton dimension 2

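When running a PyTorch model, you will often run into errors such as RuntimeError: The size of tensor a (4) must match the size of tensor b (128) at non-singleton dimension 2. This one means that two tensors in an element-wise operation have shapes that cannot be broadcast together: at dimension index 2 (counting from 0), one tensor has size 4, the other has size 128, and neither is 1.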

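In this example, the error is raised on the line that uses self.pe[:x.size(0), :]. At dimension 2 (the third axis, counting from 0), x has size 4 while self.pe has size 128, so the two tensors cannot be added.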
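The mismatch can be reproduced in isolation. The sketch below is only an illustration: the layout of pe as (max_len, 1, d_model) and the input shape (10, 32, 4) are assumptions chosen to trigger the same message, not values taken from the original model.

```python
import torch

d_model = 128
# Assumed layout of the positional table: (max_len, 1, d_model)
pe = torch.zeros(5000, 1, d_model)
# Hypothetical input whose last dimension is 4 instead of d_model
x = torch.randn(10, 32, 4)

try:
    x + pe[:x.size(0), :]          # shapes (10, 32, 4) and (10, 1, 128)
    message = ""
except RuntimeError as err:
    message = str(err)

print(message)
```

Dimensions 0 and 1 are fine here (10 matches 10, and 1 broadcasts to 32); only the last dimension, 4 against 128, cannot be broadcast.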

To resolve it, work through the following steps:

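Check the input dimensions: First, check that the shape of the input batch_x is what the model expects, and adjust it if it is not. Because the mismatch is reported at dimension 2, pay particular attention to whether the last dimension of batch_x equals d_model.
Check the self.pe dimensions: Next, check that the shape of self.pe is correct and compatible with that of batch_x.
Modify the self.pe dimensions: If both of the above look right, consider changing the code. The failing line slices self.pe along its first dimension using x.size(0), which only lines up when x is laid out as (seq_len, batch_size, d_model). For batch-first input of shape (batch_size, seq_len, d_model), where batch_size is the first dimension of batch_x, seq_len is its second dimension, and d_model is the model's hidden size, store self.pe with shape (1, max_len, d_model) and slice its second dimension to seq_len. The resulting (1, seq_len, d_model) slice then broadcasts across the batch.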
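The checks above can be sketched as a small helper. The function name check_pe_input and the example shapes are hypothetical; the sketch assumes batch-first input of shape (batch_size, seq_len, d_model) and a table of shape (1, max_len, d_model).

```python
import torch

def check_pe_input(x, pe):
    """Return a list of human-readable shape problems (empty if none).

    Assumes x is (batch_size, seq_len, d_model) and pe is (1, max_len, d_model).
    """
    if x.dim() != 3:
        return [f"expected a 3-D input, got shape {tuple(x.shape)}"]
    problems = []
    if x.size(2) != pe.size(2):
        problems.append(
            f"feature size {x.size(2)} does not match d_model {pe.size(2)}"
        )
    if x.size(1) > pe.size(1):
        problems.append(
            f"seq_len {x.size(1)} exceeds max_len {pe.size(1)}"
        )
    return problems

pe = torch.zeros(1, 5000, 128)
bad = check_pe_input(torch.zeros(32, 10, 4), pe)     # last dimension is 4, not 128
good = check_pe_input(torch.zeros(32, 10, 128), pe)  # compatible shapes
print(bad)
print(good)
```

Running a check like this right before the addition makes it clear which of the two tensors has the unexpected shape.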

The modified PositionalEncoding class is as follows:

import math

import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    def __init__(self, d_model, dropout=0.1, max_len=5000):
        super(PositionalEncoding, self).__init__()
        self.dropout = nn.Dropout(p=dropout)

        # Precompute the sinusoidal table once: shape (max_len, d_model)
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even feature indices
        pe[:, 1::2] = torch.cos(position * div_term)  # odd feature indices (assumes d_model is even)
        pe = pe.unsqueeze(0)  # (1, max_len, d_model), so it broadcasts over the batch
        self.register_buffer('pe', pe)

    def forward(self, x):
        # x.shape: (batch_size, seq_len, d_model)
        x = x + self.pe[:, :x.size(1), :]  # slice is (1, seq_len, d_model)
        return self.dropout(x)

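In the modified code, self.register_buffer registers self.pe as a buffer of the module. A buffer is not a trainable parameter, but it is saved and loaded with the model's state_dict and moves with the module when you call .to(device). The forward function is changed accordingly: it slices the second dimension of self.pe to x.size(1), so the slice has shape (1, seq_len, d_model) and broadcasts over the batch dimension of x. This removes the mismatch as long as the last dimension of x equals d_model.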
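The buffer and broadcasting behaviour can be checked on a stripped-down module. This is a minimal sketch: the class name WithBuffer is hypothetical, and the table is filled with zeros rather than the sinusoidal values, since only the shapes and the registration matter here.

```python
import torch
import torch.nn as nn

class WithBuffer(nn.Module):  # hypothetical stand-in for the positional encoding
    def __init__(self, d_model=128, max_len=5000):
        super().__init__()
        # Registered as a buffer, not as a parameter
        self.register_buffer('pe', torch.zeros(1, max_len, d_model))

    def forward(self, x):
        # x: (batch_size, seq_len, d_model); the slice is (1, seq_len, d_model)
        return x + self.pe[:, :x.size(1), :]

m = WithBuffer()
x = torch.randn(32, 10, 128)
out = m(x)

print(out.shape)                  # torch.Size([32, 10, 128])
print(len(list(m.parameters())))  # 0, the buffer is not trainable
print('pe' in m.state_dict())     # True, the buffer is saved with the model
```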

With these steps, you should be able to resolve the RuntimeError: The size of tensor a (4) must match the size of tensor b (128) at non-singleton dimension 2 error and continue training the model.
