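This error is a dimension mismatch between the tensor passed to the LSTM and the initial hidden and cell states returned by zero_state. nn.LSTM treats a 2-D input as a single unbatched sequence and then expects hx and cx to be 2-D as well. zero_state returns 3-D tensors of shape [num_layers, batch_size, hidden_dim], hence "got (3-D, 3-D)". Because the LSTM is created with num_layers=2 and batch_first=True and the data is batched, the 3-D states are already the right shape. The fix is therefore to make the LSTM input 3-D, not to shrink the states to 2-D.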
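To see the mismatch in isolation, here is a minimal sketch that is not part of the original code. The sizes are made up for illustration, and try_lstm is a hypothetical helper. It assumes a PyTorch version recent enough to accept unbatched LSTM input, which is the case for any version that produces the error discussed here.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; they are assumptions, not values from the original code.
batch_size, embedding_dim, hidden_dim, num_layers = 4, 8, 16, 2

lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers=num_layers, batch_first=True)

# 3-D initial states, the same shape zero_state returns.
h0 = torch.zeros(num_layers, batch_size, hidden_dim)
c0 = torch.zeros(num_layers, batch_size, hidden_dim)

# A 2-D tensor, like the embeddings after mean pooling over the time dimension.
pooled = torch.randn(batch_size, embedding_dim)


def try_lstm(x):
    """Return the output shape on success, or the error message on failure."""
    try:
        out, _ = lstm(x, (h0, c0))
        return tuple(out.shape)
    except RuntimeError as err:
        return str(err)


result_2d = try_lstm(pooled)               # 2-D input with 3-D states: fails
result_3d = try_lstm(pooled.unsqueeze(1))  # 3-D input with 3-D states: works

print("2-D input:", result_2d)
print("3-D input:", result_3d)
```

The first call returns the error message as a string and the second returns the output shape, which shows that only the input rank needs to change.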

Here is the modified code:

import torch
import torch.nn as nn


class RNNClassifier(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, label_size, padding_idx):
        super(RNNClassifier, self).__init__()
        self.vocab_size = vocab_size
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim
        self.label_size = label_size
        self.num_layers = 2  # change the number of layers here
        self.dropout = nn.Dropout(0.5)  # add dropout layer

        # Embedding Layer
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=padding_idx)
        self.embedding_dropout = nn.Dropout(0.1)  # add embedding dropout layer

        # LSTM Layer
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers=self.num_layers, batch_first=True)
        self.lstm_dropout = nn.Dropout(0.1)  # add lstm dropout layer

        # Fully Connected Layer
        self.fc = nn.Linear(hidden_dim, label_size)
        self.fc_dropout = nn.Dropout(0.1)  # add fc dropout layer

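    def zero_state(self, batch_size):
        # Batched LSTM input needs 3-D states: [num_layers, batch_size, hidden_dim].
        # Create them on the same device as the model weights to avoid device mismatches.
        device = self.embedding.weight.device
        hidden = torch.zeros(self.num_layers, batch_size, self.hidden_dim, device=device)
        cell = torch.zeros(self.num_layers, batch_size, self.hidden_dim, device=device)
        return hidden, cell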

    def forward(self, text):
        # text shape = [batch_size, seq_len]
        # Embedding
        emb = self.embedding(text)  # shape = [batch_size, seq_len, embedding_dim]
        emb = self.embedding_dropout(emb)  # apply dropout on embedding
        emb = torch.mean(emb, dim=1)  # mean pooling over time step

        # LSTM Layer
        h0, c0 = self.zero_state(text.size(0))  # shape = [num_layers, batch_size, hidden_dim]
        output, (hn, cn) = self.lstm(emb.unsqueeze(1), (h0, c0))  # unsqueeze(1): [batch_size, embedding_dim] -> [batch_size, 1, embedding_dim]
        output = self.lstm_dropout(output)  # apply dropout on output of lstm

        # Fully Connected Layer
        output = torch.mean(output, dim=1)  # mean pooling over time step
        output = self.fc(output)  # pass through fully connected layer
        output = self.fc_dropout(output)  # apply dropout on output of fc layer

        return output

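In forward, emb is 2-D after mean pooling, so unsqueeze(1) is used to expand it to 3-D ([batch_size, 1, embedding_dim]) before it is passed to the LSTM. This matches the 3-D hidden and cell states.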
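The shape changes described above can be traced step by step. The sketch below is an illustration under assumed sizes (batch of 4, sequence length 10), not part of the original answer. A random tensor stands in for the embedding output.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; they are assumptions, not values from the original code.
batch_size, seq_len, embedding_dim, hidden_dim, num_layers = 4, 10, 8, 16, 2

# Stand-in for the embedding layer output: [batch_size, seq_len, embedding_dim].
emb = torch.randn(batch_size, seq_len, embedding_dim)

pooled = torch.mean(emb, dim=1)   # [batch_size, embedding_dim], 2-D
lstm_input = pooled.unsqueeze(1)  # [batch_size, 1, embedding_dim], 3-D

lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers=num_layers, batch_first=True)
h0 = torch.zeros(num_layers, batch_size, hidden_dim)
c0 = torch.zeros(num_layers, batch_size, hidden_dim)

output, (hn, cn) = lstm(lstm_input, (h0, c0))
final = torch.mean(output, dim=1)  # [batch_size, hidden_dim]

# Variation for comparison (an assumption, not the original author's method):
# skipping the pooling step and passing the full sequence also gives a 3-D input.
full_output, _ = lstm(emb, (h0, c0))

shapes = {
    "pooled": tuple(pooled.shape),
    "lstm_input": tuple(lstm_input.shape),
    "output": tuple(output.shape),
    "hn": tuple(hn.shape),
    "final": tuple(final.shape),
    "full_output": tuple(full_output.shape),
}
for name, shape in shapes.items():
    print(name, shape)
```

With the pooled input the LSTM sees a sequence of length 1, so the later mean over dim=1 averages a single time step. Passing the unpooled embeddings keeps all seq_len steps, which may be worth considering if the model is meant to use word order.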

Fixing the RNN classifier error: RuntimeError: For unbatched 2-D input, hx and cx should also be 2-D but got (3-D, 3-D) tensors

Original URL: https://www.cveoy.top/t/topic/o9vq. Copyright belongs to the author. Please do not repost or scrape.
