GCN-Based Graph Node Label Prediction: Using a CNN to Reduce Node Feature Dimensionality
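This article shows how to use a GCN model to predict node labels in a graph, with a CNN first reducing the dimensionality of each node's image features. The dataset has 42 graphs; each graph has 37 nodes, each node carries 8 labels, and the graph structure is built from the edge relations. The CNN compresses each node image into a short feature vector, which gives the GCN a much smaller input and is intended to improve its performance.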
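Data description
- Number of graphs: 42
- Nodes per graph: 37
- Image size: 40 × 40 pixels
- Labels per node: 8
- Number of edges: 61
- Node feature files: 'C:\Users\jh\Desktop\data\input\images{i}.png_{j}.png', where 'i' is the graph index (1 to 42) and 'j' is the node index (0 to 36). Each node has 8 labels, stored space-separated in the text file 'C:\Users\jh\Desktop\data\input\labels{i}{j}.txt'.
- Edge file: 'C:\Users\jh\Desktop\data\input\edges_L.csv', a CSV file with no header; the first column is the source node and the second column is the target node, for a total of 61 undirected edges.
- Label file format: every label file has the form '2 3 1 1 3 2 2 1', differing only in the numbers. There are five label classes: 0, 1, 2, 3, 4.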
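The file formats described above can be read with a few lines of standard-library Python. The following is a minimal sketch rather than the article's own loader: `parse_labels`, `read_edges`, and `to_undirected` are hypothetical helper names, and the sample strings stand in for real file contents.

```python
import csv
import io

NUM_LABELS = 8                   # labels per node, from the data description
VALID_CLASSES = {0, 1, 2, 3, 4}  # label classes, from the data description


def parse_labels(text):
    """Parse one label file's content, e.g. '2 3 1 1 3 2 2 1', into a list of ints."""
    labels = [int(token) for token in text.split()]
    if len(labels) != NUM_LABELS:
        raise ValueError(f"expected {NUM_LABELS} labels, got {len(labels)}")
    if not set(labels) <= VALID_CLASSES:
        raise ValueError(f"label outside 0-4 in {labels}")
    return labels


def read_edges(csv_text):
    """Read a header-less edge list: column 0 is the source, column 1 the target."""
    rows = csv.reader(io.StringIO(csv_text))
    return [(int(row[0]), int(row[1])) for row in rows if row]


def to_undirected(edges):
    """Return every edge in both directions, so each undirected edge is seen from both ends."""
    return edges + [(dst, src) for src, dst in edges]


# Stand-in contents; real data would come from the files described above
sample_labels = parse_labels("2 3 1 1 3 2 2 1")
sample_edges = to_undirected(read_edges("0,1\n1,2\n"))
print(sample_labels)  # [2, 3, 1, 1, 3, 2, 2, 1]
print(sample_edges)   # [(0, 1), (1, 2), (1, 0), (2, 1)]
```

Validating the label count and class range at load time makes a malformed file fail early instead of surfacing later as a shape error during training.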
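Model architecture
- CNN model: a CNN reduces each node's image features from 3 × 40 × 40 = 4800 dimensions to 5 dimensions.
- GCN model: a GCN learns from the reduced node features and predicts the labels of each node.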
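The dimensions quoted above can be checked with the standard output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1. The sketch below assumes the layer settings used in this article's CNN (3×3 convolutions with stride 1 and padding 1, 2×2 max pooling with stride 2); `out_size` is a hypothetical helper name.

```python
def out_size(n, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer on an n x n input."""
    return (n + 2 * padding - kernel) // stride + 1


size = 40                     # input images are 40 x 40
input_dims = 3 * size * size  # RGB channels x height x width

size = out_size(size, kernel=3, stride=1, padding=1)  # conv1 keeps 40 x 40
size = out_size(size, kernel=2, stride=2, padding=0)  # pooling halves it to 20 x 20
size = out_size(size, kernel=3, stride=1, padding=1)  # conv2 keeps 20 x 20
size = out_size(size, kernel=2, stride=2, padding=0)  # pooling halves it to 10 x 10

out_channels = 5
flattened_dims = out_channels * size * size  # feature count if the map is simply flattened

print(input_dims)      # 4800
print(size)            # 10
print(flattened_dims)  # 500
```

With these settings, two convolution and pooling stages leave a 5 × 10 × 10 feature map, which is 500 values when flattened. Reaching exactly 5 features per node therefore needs one more reduction step, for example global average pooling over each channel.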
Code implementation
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
from PIL import Image
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv
from torchvision import transforms
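# Define the CNN that compresses each node image into a short feature vector
class CNN(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 16, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        self.conv2 = nn.Conv2d(16, out_channels, kernel_size=3, stride=1, padding=1)
        # Global average pooling collapses each channel to one value, so the
        # output has exactly out_channels features per image
        self.global_pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # (N, 16, 20, 20) for 40 x 40 input
        x = self.pool(F.relu(self.conv2(x)))   # (N, out_channels, 10, 10)
        return self.global_pool(x).flatten(1)  # (N, out_channels)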
# Define the GCN model
class GCN(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv1 = GCNConv(in_channels, 128)
        self.conv2 = GCNConv(128, out_channels)

    def forward(self, data):
        x, edge_index = data.x, data.edge_index
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)  # raw logits, one row per node
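# Read the edge list (no header; column 0 = source node, column 1 = target node)
edges = pd.read_csv(r'C:\Users\jh\Desktop\data\input\edges_L.csv', header=None)
edges = edges.values  # convert to a NumPy array with one row per edge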
# Read the node images; the transform is built once and reused
transform = transforms.Compose([transforms.Resize((40, 40)), transforms.ToTensor()])
features = []
for i in range(1, 43):
    for j in range(37):
        # Raw f-string: in a plain f-string, '\U' starts a Unicode escape and is a syntax error
        image_path = rf'C:\Users\jh\Desktop\data\input\images{i}.png_{j}.png'
        image = Image.open(image_path).convert('RGB')
        features.append(transform(image))
# Stack the node images into one tensor of shape (42 * 37, 3, 40, 40)
x = torch.stack(features)
# Build the training and validation masks
mask_train = torch.zeros(42, 37, dtype=torch.bool)
mask_val = torch.zeros(42, 37, dtype=torch.bool)
mask_train[:, :30] = True  # the first 30 nodes of every graph are used for training
mask_val[:, 30:] = True    # the last 7 nodes of every graph are used for validation
mask_train = mask_train.view(-1)
mask_val = mask_val.view(-1)
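# Build the graph structure
# Node indices in the CSV are assumed to be 0-based (0 to 36), matching the node file numbering
edge_index = torch.tensor(edges, dtype=torch.long).t().contiguous()
# The edges are undirected, so add the reverse direction of each edge as well
edge_index = torch.cat([edge_index, edge_index.flip(0)], dim=1)
data_list = []
for i in range(42):
    # Each graph gets only its own 37 nodes
    data = Data(x=x[i * 37:(i + 1) * 37], edge_index=edge_index)
    data.mask_train = mask_train[i * 37:(i + 1) * 37]
    data.mask_val = mask_val[i * 37:(i + 1) * 37]
    data_list.append(data)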
# Create the CNN instance with 5 output channels
cnn_model = CNN(in_channels=3, out_channels=5)
cnn_model.eval()
# Run the CNN over the node images. The CNN is not trained in this script, so its
# randomly initialised weights act as a fixed feature extractor.
with torch.no_grad():
    cnn_output = []
    for i in range(42):
        x_i = x[i * 37:(i + 1) * 37]  # node images of the current graph
        output_i = cnn_model(x_i)
        output_i = output_i.reshape(output_i.size(0), -1)  # flatten to (37, num_features)
        cnn_output.append(output_i)
cnn_output = torch.cat(cnn_output, dim=0)
# Store the reduced features on each Data object
for i in range(42):
    data_list[i].x = cnn_output[i * 37:(i + 1) * 37]
# Each node has 8 label slots and each slot takes one of 5 classes (0 to 4)
num_slots, num_classes = 8, 5

def load_labels(graph_idx):
    """Read the 8 labels of all 37 nodes of one graph into a (37, 8) tensor."""
    rows = []
    for j in range(37):
        # File name pattern as given in the data description
        labels_path = rf'C:\Users\jh\Desktop\data\input\labels{graph_idx}{j}.txt'
        with open(labels_path) as f:
            rows.append([int(v) for v in f.read().split()])
    return torch.tensor(rows, dtype=torch.long)

# Load the labels once, in the same order as data_list (graphs 1 to 42)
labels_list = [load_labels(i) for i in range(1, 43)]
# Create the GCN instance: one logit per (slot, class) pair
gcn_model = GCN(in_channels=cnn_output.size(1), out_channels=num_slots * num_classes)
# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(gcn_model.parameters(), lr=0.01)
# Train the model
num_epochs = 50
for epoch in range(num_epochs):
    gcn_model.train()
    total_loss = 0
    for data, labels in zip(data_list, labels_list):
        optimizer.zero_grad()
        out = gcn_model(data).view(-1, num_slots, num_classes)  # (37, 8, 5)
        # Only the training nodes contribute to the loss
        loss = criterion(out[data.mask_train].reshape(-1, num_classes),
                         labels[data.mask_train].reshape(-1))
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    avg_loss = total_loss / len(data_list)
    print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {avg_loss:.4f}')
# Evaluate the model on the validation nodes
gcn_model.eval()
with torch.no_grad():
    total_correct = 0
    total_samples = 0
    for data, labels in zip(data_list, labels_list):
        out = gcn_model(data).view(-1, num_slots, num_classes)
        predicted = out.argmax(dim=-1)  # predicted class for every slot, shape (37, 8)
        mask = data.mask_val
        total_correct += (predicted[mask] == labels[mask]).sum().item()
        total_samples += labels[mask].numel()
accuracy = total_correct / total_samples
print(f'Validation Accuracy: {accuracy:.2f}')
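Because every node carries 8 labels with 5 possible classes each, one reasonable way to read a node's output is as 8 groups of 5 logits, with the largest logit in each group giving the predicted class. The following is a small dependency-free sketch of that decoding and of per-label accuracy, under that assumption; `decode_node` and `label_accuracy` are hypothetical helper names and the logits are made-up numbers. For a PyTorch tensor `out` of shape (num_nodes, 40), the same decoding corresponds to `out.view(-1, 8, 5).argmax(dim=-1)`.

```python
NUM_SLOTS = 8    # labels per node
NUM_CLASSES = 5  # classes 0 to 4


def decode_node(logits):
    """Split 40 logits into 8 groups of 5 and return the argmax class of each group."""
    if len(logits) != NUM_SLOTS * NUM_CLASSES:
        raise ValueError("expected one logit per (slot, class) pair")
    classes = []
    for slot in range(NUM_SLOTS):
        group = logits[slot * NUM_CLASSES:(slot + 1) * NUM_CLASSES]
        classes.append(max(range(NUM_CLASSES), key=group.__getitem__))
    return classes


def label_accuracy(predicted, target):
    """Fraction of label slots predicted correctly, over a list of nodes."""
    pairs = [(p, t) for p_row, t_row in zip(predicted, target) for p, t in zip(p_row, t_row)]
    return sum(p == t for p, t in pairs) / len(pairs)


# Made-up logits for one node: the target class gets the largest logit in each group
target = [2, 3, 1, 1, 3, 2, 2, 1]
logits = [0.0] * (NUM_SLOTS * NUM_CLASSES)
for slot, cls in enumerate(target):
    logits[slot * NUM_CLASSES + cls] = 1.0

predicted = decode_node(logits)
print(predicted)                              # [2, 3, 1, 1, 3, 2, 2, 1]
print(label_accuracy([predicted], [target]))  # 1.0
```

Rounding a sigmoid, by contrast, can only produce 0 or 1, so it cannot represent classes 2, 3, or 4.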
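Summary
This article showed how to use a GCN model to predict node labels in a graph, with a CNN reducing the dimensionality of the node features first. Compressing each node image into a short feature vector gives the GCN a much smaller input and is intended to improve its performance.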
Original article: https://www.cveoy.top/t/topic/phaf. Copyright belongs to the author. Please do not repost or scrape.