PyTorch RNN Sentiment Analysis: Implementation and Evaluation
This assignment uses PyTorch to implement Recurrent Neural Networks (RNNs) for sentiment analysis. The task is to assign each input sentence one of the sentiment labels 'positive', 'negative', or 'neutral'.
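We use the Stanford Sentiment Treebank (SST), a standard sentiment benchmark. The dataset is loaded through the torchtext package, and preprocessing builds a vocabulary and loads the training, validation, and test splits. The snippet relies on torchtext's Field and BucketIterator classes, which newer torchtext releases moved to torchtext.legacy and later removed, so it assumes an older torchtext version. This initial code snippet is provided and does not need modification.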
import copy
import torch
from torch import nn
from torch import optim
import torchtext
from torchtext import data
from torchtext import datasets
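# include_lengths=True makes batch.text a (tokens, lengths) tuple, as train() and evaluate() expect
TEXT = data.Field(sequential=True, batch_first=True, lower=True, include_lengths=True)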
LABEL = data.LabelField()
# load data splits
train_data, val_data, test_data = datasets.SST.splits(TEXT, LABEL)
# build dictionary
TEXT.build_vocab(train_data)
LABEL.build_vocab(train_data)
# hyperparameters
vocab_size = len(TEXT.vocab)
label_size = len(LABEL.vocab)
padding_idx = TEXT.vocab.stoi['<pad>']
embedding_dim = 128
hidden_dim = 128
# build iterators
train_iter, val_iter, test_iter = data.BucketIterator.splits(
    (train_data, val_data, test_data),
    batch_size=32)
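To make the preprocessing concrete, here is a minimal pure-Python sketch of what vocabulary building and batch padding amount to. It does not use torchtext: the helper names build_vocab and numericalize_batch and the toy sentences are illustrative assumptions, and the real library handles tokenization and frequency cutoffs in more detail.

```python
from collections import Counter

UNK, PAD = "<unk>", "<pad>"

def build_vocab(sentences):
    """Map each token to an integer id; the special tokens come first."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    # Most frequent tokens get the smallest ids; ties are broken alphabetically.
    itos = [UNK, PAD] + sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: i for i, tok in enumerate(itos)}

def numericalize_batch(sentences, stoi):
    """Convert sentences to equal-length id lists plus their true lengths."""
    tokenized = [s.lower().split() for s in sentences]
    lengths = [len(toks) for toks in tokenized]
    max_len = max(lengths)
    ids = [
        [stoi.get(tok, stoi[UNK]) for tok in toks] + [stoi[PAD]] * (max_len - len(toks))
        for toks in tokenized
    ]
    return ids, lengths

# Vocabulary comes from the "training" sentences only.
stoi = build_vocab(["A great movie", "Not great"])
# "boring" was never seen, so it maps to <unk>; the short sentence is padded.
ids, lengths = numericalize_batch(["a great movie", "boring"], stoi)
print(ids, lengths)
```

The true lengths are kept alongside the padded ids so that a model can ignore the padding positions.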
- Defining Training and Evaluation Functions:
The following code defines the training and evaluation functions for our sentiment analysis model.
def train(model, iterator, optimizer, criterion):
    model.train()  # training mode (enables dropout and similar layers)
    epoch_loss = 0
    epoch_acc = 0
    for batch in iterator:
        optimizer.zero_grad()
        # batch.text is a (tokens, lengths) tuple; this requires a text field built with include_lengths=True
        text, text_lengths = batch.text
        predictions = model(text, text_lengths).squeeze(1)
        loss = criterion(predictions, batch.label)
        acc = accuracy(predictions, batch.label)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        epoch_acc += acc.item()
    # average of the per-batch values over the epoch
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

def evaluate(model, iterator, criterion):
    model.eval()  # evaluation mode (disables dropout and similar layers)
    epoch_loss = 0
    epoch_acc = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for batch in iterator:
            text, text_lengths = batch.text
            predictions = model(text, text_lengths).squeeze(1)
            loss = criterion(predictions, batch.label)
            acc = accuracy(predictions, batch.label)
            epoch_loss += loss.item()
            epoch_acc += acc.item()
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

def accuracy(preds, y):
    # preds: [batch, label_size] scores; y: [batch] gold label indices
    preds = torch.argmax(preds, dim=1)
    correct = (preds == y).float()
    acc = correct.sum() / len(correct)
    return acc
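As a quick check of the accuracy function, the sketch below applies the same argmax-and-compare logic to hand-made scores for four examples and three classes. The tensor values are made up for illustration and are not SST outputs.

```python
import torch

def accuracy(preds, y):
    # preds: [batch, num_classes] scores; y: [batch] class indices
    pred_labels = torch.argmax(preds, dim=1)
    correct = (pred_labels == y).float()
    return correct.sum() / len(correct)

logits = torch.tensor([[2.0, 1.0, 0.0],   # argmax 0, label 0: correct
                       [0.0, 1.0, 2.0],   # argmax 2, label 2: correct
                       [1.0, 3.0, 0.0],   # argmax 1, label 0: wrong
                       [0.0, 0.0, 5.0]])  # argmax 2, label 2: correct
labels = torch.tensor([0, 2, 0, 2])
acc = accuracy(logits, labels)  # 3 of 4 predictions match
print(acc.item())
```

Because train and evaluate average these per-batch values, a smaller final batch carries the same weight as a full one, so the epoch figure is a close approximation of per-example accuracy and not an exact count.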
These functions will be used to train and evaluate the RNN model. Each call processes one epoch and returns the loss and accuracy averaged over its batches.
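The functions above call the model as model(text, text_lengths) and expect one score per label in return. The following is a minimal sketch of a model with that interface, assuming an embedding layer, a single-layer LSTM, and a linear classifier over the final hidden state. The class name RNNClassifier and the small synthetic sizes are hypothetical, and the LSTM is only one possible choice of recurrent layer, not necessarily the architecture the assignment intends.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

class RNNClassifier(nn.Module):
    """Hypothetical model: embedding -> LSTM -> linear layer on the last hidden state."""

    def __init__(self, vocab_size, embedding_dim, hidden_dim, label_size, padding_idx):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=padding_idx)
        self.rnn = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, label_size)

    def forward(self, text, text_lengths):
        embedded = self.embedding(text)  # [batch, seq_len, embedding_dim]
        # Packing lets the LSTM stop at each sentence's true length and skip padding.
        packed = pack_padded_sequence(
            embedded, text_lengths.cpu(), batch_first=True, enforce_sorted=False)
        _, (hidden, _) = self.rnn(packed)  # hidden: [1, batch, hidden_dim]
        return self.fc(hidden[-1])  # [batch, label_size]

# Smoke test on synthetic data (sizes are arbitrary, not the SST values).
model = RNNClassifier(vocab_size=50, embedding_dim=8, hidden_dim=16,
                      label_size=3, padding_idx=1)
text = torch.tensor([[4, 5, 6, 7], [8, 9, 1, 1]])  # second row padded with index 1
text_lengths = torch.tensor([4, 2])
logits = model(text, text_lengths)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 2]))
loss.backward()
print(logits.shape, loss.item())
```

With the real data, such a model would be constructed from vocab_size, embedding_dim, hidden_dim, label_size, and padding_idx as defined earlier, then passed to train together with an optimizer such as optim.Adam(model.parameters()) and nn.CrossEntropyLoss() as the criterion.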