Write Python code to process the three feature maps at the output of YOLOv5s, of sizes 1×27×24×24, 1×27×12×12 and 1×27×6×6 respectively
Suppose we have already obtained the three feature maps at the output of the YOLOv5s network, called featmap1, featmap2 and featmap3.
The code below processes these three feature maps:
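The answer below assumes B = 3 anchors and C = 80 classes, so each head actually has 3 × (5 + 80) = 255 output channels. For a self-contained run, featmap1–featmap3 can be stubbed with random tensors (a hypothetical setup, not part of the original; the 24×24, 12×12 and 6×6 grids correspond to a 192×192 input at strides 8, 16 and 32):

```python
import torch

# Hypothetical stand-ins for the real YOLOv5s head outputs. With B = 3 anchors
# and C = 80 classes, each head has B * (5 + C) = 255 output channels; the
# 24x24, 12x12 and 6x6 grids correspond to a 192x192 input at strides 8/16/32.
B, C = 3, 80
featmap1 = torch.randn(1, B * (5 + C), 24, 24)  # stride 8
featmap2 = torch.randn(1, B * (5 + C), 12, 12)  # stride 16
featmap3 = torch.randn(1, B * (5 + C), 6, 6)    # stride 32
```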
import torch

def process_featmap(featmap, S, B, C, stride):
    # featmap: raw head output, shape [1, B*(5+C), S, S]
    # S: grid size (number of cells per side)
    # B: number of boxes predicted per cell
    # C: number of classes
    # stride: downsampling factor of this head (input size / S)
    bbox_attrs = 5 + C
    # split the channels per box: [1, B*(5+C), S, S] -> [1, B, 5+C, S, S],
    # then flatten to one row per predicted box: [1, S*S*B, 5+C]
    pred = featmap.view(1, B, bbox_attrs, S, S)
    pred = pred.permute(0, 3, 4, 1, 2).contiguous().view(1, S * S * B, bbox_attrs)
    # decode box coordinates, objectness and class scores
    bbox_xy = torch.sigmoid(pred[..., :2])
    bbox_wh = torch.exp(pred[..., 2:4])
    obj_conf = torch.sigmoid(pred[..., 4:5])
    class_conf = torch.sigmoid(pred[..., 5:])
    # cell offsets: one (x, y) pair per cell, repeated for the B boxes per cell
    grid_y, grid_x = torch.meshgrid(torch.arange(S), torch.arange(S), indexing="ij")
    x_y_offset = torch.stack((grid_x, grid_y), dim=-1).view(-1, 1, 2)
    x_y_offset = x_y_offset.repeat(1, B, 1).view(1, -1, 2).float()
    # map box centers back to input-image coordinates; widths/heights would
    # normally also be multiplied by the per-anchor sizes, omitted here
    bbox_xy = (bbox_xy + x_y_offset) * stride
    bbox_wh = bbox_wh * stride
    # convert (cx, cy, w, h) to top-left / bottom-right corner coordinates
    bbox_x1y1 = bbox_xy - 0.5 * bbox_wh
    bbox_x2y2 = bbox_xy + 0.5 * bbox_wh
    bbox_coords = torch.cat((bbox_x1y1, bbox_x2y2), dim=-1)
    # concatenate box coordinates, objectness and class scores into one row per box
    detections = torch.cat(
        (bbox_coords.view(-1, 4), obj_conf.view(-1, 1), class_conf.view(-1, C)),
        dim=1,
    )
    return detections

# process featmap1, featmap2 and featmap3 (strides 8, 16 and 32)
detections1 = process_featmap(featmap1, 24, 3, 80, 8)
detections2 = process_featmap(featmap2, 12, 3, 80, 16)
detections3 = process_featmap(featmap3, 6, 3, 80, 32)
That is the processing code for the three YOLOv5s output feature maps. The result for each feature map is the coordinates, confidence score and class scores of every predicted box, which can then be used for object detection and recognition.
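As a hypothetical follow-up step (the function name and threshold below are illustrative, not part of the original), the three per-scale results can be concatenated and filtered by objectness, reducing the per-class scores to a single (score, class id) pair per box:

```python
import torch

# Illustrative post-processing sketch: keep rows whose objectness exceeds a
# threshold and reduce the class scores to a single (score, class id) pair.
# `filter_detections` and `conf_thresh` are hypothetical names.
def filter_detections(detections, conf_thresh=0.5):
    # detections: [N, 5 + C] rows of (x1, y1, x2, y2, obj_conf, class scores...)
    det = detections[detections[:, 4] > conf_thresh]
    class_conf, class_id = det[:, 5:].max(dim=1)
    # final rows: (x1, y1, x2, y2, obj_conf * class_conf, class_id)
    return torch.cat(
        (det[:, :4],
         (det[:, 4] * class_conf).unsqueeze(1),
         class_id.float().unsqueeze(1)),
        dim=1,
    )

# e.g. merge all three scales first:
# all_dets = torch.cat((detections1, detections2, detections3), dim=0)
# results = filter_detections(all_dets, conf_thresh=0.5)
```

A non-maximum-suppression pass (e.g. torchvision.ops.nms) would normally follow this filtering step.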