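Multimodal Feature Fusion: An Eight-Class Classification Model Based on CNN and BiGRU

This project uses a CNN and a BiGRU in parallel to extract features, fuses the two sets of features, and classifies the result into eight classes with a fully connected layer. The training, validation, and test sets are txt files in which each record holds 23 feature values and one class label. The code is implemented in PyTorch and covers model definition, data loading, training, validation, and testing.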

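Dataset format:

In the txt files of the training, validation, and test sets, each sample consists of 23 feature values followed by 1 class label, 24 comma-separated values in total, as shown below: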

7,7,183,233,10,10,3,10,3,10,0,25,21,0,0,2,78,2,1,0,0,86.6685638427734,1.25,4
7,7,183,233,10,10,3,10,3,10,0,25,21,90,80,20,10,2,1,0,0,86.4980087280273,1.10,0
7,0,183,0,9,0,3,10,3,0,0,25,123,90,80,20,10,0,1,0,1,0,1.00,7

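The first 23 values are the features and the last value is the class label. There are 8 classes, labeled 0 to 7, which is the range nn.CrossEntropyLoss expects.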
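
The record layout above can be checked with a few lines of plain Python. This is a minimal sketch, not part of the original code: the helper name parse_sample is hypothetical, and it assumes every line has exactly 24 comma-separated fields.

```python
def parse_sample(line):
    """Split one comma-separated record into 23 float features and an int label."""
    values = line.strip().split(',')
    if len(values) != 24:
        raise ValueError(f'expected 24 fields, got {len(values)}')
    features = [float(v) for v in values[:23]]  # first 23 fields: feature values
    label = int(values[23])                     # last field: class label
    return features, label


# Third sample record from the listing above.
features, label = parse_sample(
    '7,0,183,0,9,0,3,10,3,0,0,25,123,90,80,20,10,0,1,0,1,0,1.00,7'
)
print(len(features), label)
```

Rejecting lines with the wrong field count early makes malformed records visible before they reach the model.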

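Model architecture:

  1. CNN model: takes the feature values converted into an image-format input and passes them through convolutional, pooling, and fully connected layers to extract image features.
  2. BiGRU model: takes the raw 23-dimensional feature vector and passes it through bidirectional GRU layers and a fully connected layer to extract sequence features.
  3. Feature fusion: concatenates the features extracted by the CNN and the BiGRU.
  4. Fully connected layer: classifies the fused features into eight classes.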
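
The text does not say how the 23 feature values are turned into an image for the CNN. One possible conversion, shown here purely as an assumption, is to min-max scale the vector to [0, 1], zero-pad it to 25 values, and reshape it into a 5x5 grid that can later be resized to the CNN input size. The function name features_to_grid and the 5x5 layout are hypothetical choices.

```python
def features_to_grid(features, side=5):
    """Min-max scale a feature vector to [0, 1], zero-pad it, and reshape to side x side.

    Assumption: this layout is illustrative only; the original text does not
    specify how features are mapped to pixels.
    """
    if len(features) > side * side:
        raise ValueError('too many features for the requested grid')
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0  # avoid division by zero for constant vectors
    scaled = [(v - lo) / span for v in features]
    padded = scaled + [0.0] * (side * side - len(scaled))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


row = [7, 0, 183, 0, 9, 0, 3, 10, 3, 0, 0, 25, 123, 90, 80, 20, 10, 0, 1, 0, 1, 0, 1.00]
grid = features_to_grid(row)
for line in grid:
    print([round(v, 3) for v in line])
```

Per-sample scaling discards the absolute magnitude of each feature, so scaling each column with statistics from the training set may be a better fit for real data.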

Code implementation:

import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data as data
import torchvision.transforms as transforms

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU(inplace=True)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU(inplace=True)
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.relu3 = nn.ReLU(inplace=True)
        
    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.pool2(x)
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.relu3(x)
        return x

class BiGRU(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(BiGRU, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(hidden_size * 2, num_classes)
        
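    def forward(self, x):
        # nn.GRU with batch_first=True expects (batch, seq_len, input_size);
        # treat a flat (batch, input_size) vector as a sequence of length 1.
        if x.dim() == 2:
            x = x.unsqueeze(1)
        # Build the initial hidden state on the input's device instead of
        # relying on a global `device` defined later in the script.
        h0 = torch.zeros(self.num_layers * 2, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.gru(x, h0)
        out = self.fc(out[:, -1, :])
        return out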

class FusionModel(nn.Module):
    def __init__(self, cnn, bigru, num_classes):
        super(FusionModel, self).__init__()
        self.cnn = cnn
        self.bigru = bigru
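        # The BiGRU branch ends in its own fc layer, so it contributes
        # bigru.fc.out_features values (not hidden_size * 2) to the fused vector.
        self.fc = nn.Linear(128 + bigru.fc.out_features, num_classes)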
        
    def forward(self, x1, x2):
        x1 = self.cnn(x1)
        x2 = self.bigru(x2)
        x = torch.cat((x1, x2), dim=1)
        x = self.fc(x)
        return x

# Hyperparameters
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
num_epochs = 10
batch_size = 32
learning_rate = 0.001

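# Load data
from pathlib import Path  # only needed by the dataset class below

# Resize and Normalize operate directly on (C, H, W) float tensors.
transform = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.Normalize((0.5,), (0.5,)),
])

# torch.utils.data.Dataset has no from_folder method, so a small custom
# dataset is used instead. TxtFeatureDataset is a hypothetical name.
class TxtFeatureDataset(data.Dataset):
    """Reads every .txt file in a folder; each line holds 23 features and 1 label."""
    def __init__(self, folder, transform=None):
        rows = []
        for path in sorted(Path(folder).glob('*.txt')):
            with open(path) as f:
                rows += [[float(v) for v in line.split(',')] for line in f if line.strip()]
        table = torch.tensor(rows, dtype=torch.float32)
        self.features = table[:, :23]
        self.labels = table[:, 23].long()
        self.transform = transform

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        feats = self.features[idx]
        # Assumption: the feature-to-image mapping is not specified. Here the
        # vector is min-max scaled to [0, 1], zero-padded to 25 values, and
        # reshaped to a 1x5x5 image before the transform resizes it.
        scaled = (feats - feats.min()) / (feats.max() - feats.min() + 1e-8)
        image = torch.cat([scaled, scaled.new_zeros(2)]).view(1, 5, 5)
        if self.transform is not None:
            image = self.transform(image)
        return image, feats, self.labels[idx]

train_dataset = TxtFeatureDataset('train_txt_folder', transform=transform)
val_dataset = TxtFeatureDataset('val_txt_folder', transform=transform)
test_dataset = TxtFeatureDataset('test_txt_folder', transform=transform)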

train_loader = data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = data.DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Initialize models
cnn = CNN().to(device)
bigru = BiGRU(input_size=23, hidden_size=64, num_layers=2, num_classes=8).to(device)
model = FusionModel(cnn, bigru, num_classes=8).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Training loop
total_step = len(train_loader)
for epoch in range(num_epochs):
    model.train()
    for i, (images, features, labels) in enumerate(train_loader):
        images = images.to(device)
        features = features.to(device)
        labels = labels.to(device)
        
        # Forward pass
        outputs = model(images, features)
        loss = criterion(outputs, labels)
        
        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        if (i+1) % 100 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{total_step}], Loss: {loss.item():.4f}')

    # Validation
    model.eval()
    with torch.no_grad():
        correct = 0
        total = 0
        for images, features, labels in val_loader:
            images = images.to(device)
            features = features.to(device)
            labels = labels.to(device)
            
            outputs = model(images, features)
            predicted = outputs.argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

        accuracy = 100 * correct / total
        print(f'Validation Accuracy: {accuracy:.2f}%')

# Testing
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, features, labels in test_loader:
        images = images.to(device)
        features = features.to(device)
        labels = labels.to(device)

        outputs = model(images, features)
        predicted = outputs.argmax(dim=1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    print(f'Test Accuracy: {accuracy:.2f}%')

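Code walkthrough:

  1. Model definition: defines the network structures of the CNN, the BiGRU, and the fusion model.
  2. Data loading: defines the data transform that converts feature values into image format, and loads the training, validation, and test sets.
  3. Training: defines the loss function and optimizer, trains the model, and prints the training loss every 100 steps.
  4. Validation: after each training epoch, evaluates the model on the validation set and prints the validation accuracy.
  5. Testing: after training finishes, evaluates the model on the test set and prints the test accuracy.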

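Notes:

  1. The code assumes the datasets are already prepared and stored in folders named 'train_txt_folder', 'val_txt_folder', and 'test_txt_folder'.
  2. The CNN expects 28x28 input images. Adjust this to the actual data; changing it also changes the input size of fc1 (32 * 7 * 7).
  3. The BiGRU input feature dimension is 23. Adjust this to the actual data.
  4. Hyperparameters such as the learning rate, batch size, and number of epochs can be tuned as needed.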
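
Note 2 matters because the first fully connected layer of the CNN is sized for a 28x28 input. The sketch below recomputes that size for other image sizes. It assumes the layer settings used in the code above: 3x3 convolutions with stride 1 and padding 1, which keep the spatial size, and two 2x2 max-pooling layers with stride 2, each of which halves it and rounds down. The helper name cnn_flat_features is hypothetical.

```python
def cnn_flat_features(image_size, out_channels=32, num_pools=2):
    """Number of values entering fc1 for a square input of the given size."""
    size = image_size
    for _ in range(num_pools):
        size //= 2  # MaxPool2d(kernel_size=2, stride=2) halves the size, rounding down
    return out_channels * size * size


print(cnn_flat_features(28))  # 1568, i.e. 32 * 7 * 7 as used by fc1
print(cnn_flat_features(32))  # 2048, i.e. 32 * 8 * 8
```

If the input size changes, the first argument of nn.Linear in fc1 needs to match the value this function returns.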

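Summary:

This project uses a CNN and a BiGRU in parallel to extract features, fuses them, and performs eight-class classification with a fully connected layer. The approach is intended to combine image features and sequence features derived from the same 23 values; whether it improves performance should be checked on the validation and test sets. The code is implemented in PyTorch and is structured so that each stage is easy to read and modify.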

Original source: https://www.cveoy.top/t/topic/lHmW. Copyright belongs to the author. Do not repost or scrape.