YOLOv5 Object Detection: Processing Detections and Saving Results
This code snippet covers the inference path of YOLOv5's detection script: preparing inputs, running the model, applying non-maximum suppression (NMS), and saving the results as images and/or text files. Here's a detailed breakdown of the code:
1. **Setting Up Parameters**

```python
imgsz = check_img_size(imgsz, s=model.stride.max())  # check img_size
device = select_device(device)
half = device.type != 'cpu'  # half precision only supported on CUDA
```

This section ensures the input image size is a multiple of the model's stride, selects the appropriate device (CPU or GPU), and enables half-precision (FP16) inference when running on CUDA for efficiency.
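The stride check can be sketched in a few lines of plain Python. This is only an illustrative version that rounds up to the nearest stride multiple; the actual YOLOv5 helper does more validation:

```python
import math

def check_img_size(img_size, s=32):
    """Round img_size up to the nearest multiple of the model stride s."""
    new_size = math.ceil(img_size / s) * s
    if new_size != img_size:
        print(f'WARNING: img_size {img_size} is not a multiple of stride {s}, using {new_size}')
    return new_size
```

For example, with a stride of 32, a requested size of 641 is bumped up to 672, while 640 passes through unchanged.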
2. **Model Construction**

```python
model = attempt_load(weights, map_location=device)  # load FP32 model
```

Here, the model weights are loaded, the model is constructed, and it is moved to the selected device.
3. **Loading Images**

```python
dataset = LoadImages(source, img_size=imgsz, stride=stride)
```

The `LoadImages` class handles loading input images (or video frames) and resizing them to the model's expected size.
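The resize `LoadImages` performs is a letterbox transform: scale the image to fit while preserving aspect ratio, then pad to a stride-multiple rectangle. A simplified sketch of the resulting shape computation (the helper name and exact padding policy here are illustrative, not YOLOv5's actual implementation):

```python
def letterbox_shape(h, w, new_shape=640, stride=32):
    """Compute the padded (height, width) a letterbox resize would produce."""
    r = new_shape / max(h, w)                    # scale so the longer side fits
    new_h, new_w = round(h * r), round(w * r)    # aspect-ratio-preserving resize
    pad_h = (stride - new_h % stride) % stride   # pad up to a stride multiple
    pad_w = (stride - new_w % stride) % stride
    return new_h + pad_h, new_w + pad_w
```

For instance, a 720x1280 frame is scaled to 360x640 and then padded in height to 384, the next multiple of 32.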
4. **Prediction**

```python
for path, img, im0s, vid_cap in dataset:
    img = torch.from_numpy(img).to(device)
    img = img.half() if half else img.float()  # uint8 to fp16/32
    img /= 255.0  # 0 - 255 to 0.0 - 1.0
    if img.ndimension() == 3:
        img = img.unsqueeze(0)

    # Inference
    t1 = time_synchronized()
    pred = model(img, augment=opt.augment)[0]
```

Each image is converted to a PyTorch tensor, moved to the device, normalized from the 0-255 byte range to 0.0-1.0, given a batch dimension if needed, and passed through the model. Test-time augmentation (if enabled via `opt.augment`) is applied during this step.
5. **Processing Detections**

```python
# Apply NMS
pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres,
                           classes=opt.classes, agnostic=opt.agnostic_nms)
t2 = time_synchronized()

# Process detections
for i, det in enumerate(pred):  # detections per image
    if webcam:  # batch_size >= 1
        p, s, im0, frame = path[i], f'{i}: ', im0s[i].copy(), dataset.count
    else:
        p, s, im0, frame = path, '', im0s.copy(), getattr(dataset, 'frame', 0)
```

Non-maximum suppression (NMS) filters out overlapping predictions, and the surviving detections are processed per image. For webcam and video inputs, the current frame number is tracked.
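The core idea behind `non_max_suppression` can be sketched in plain Python: keep the highest-scoring box, drop every remaining box that overlaps it beyond the IoU threshold, and repeat. This is a minimal sketch; the real YOLOv5 function additionally handles confidence filtering, per-class offsets, and batched tensors:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thres=0.45):
    """Return indices of boxes kept after greedy non-maximum suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)               # highest-scoring remaining box
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) <= iou_thres]
    return keep
```

With two heavily overlapping boxes and one distant box, only the higher-scoring overlapping box and the distant box survive: `nms([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], [0.9, 0.8, 0.7])` returns `[0, 2]`.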
6. **Outputting Results**

```python
p = Path(p)  # to Path
save_path = str(save_dir / p.name)  # img.jpg
txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}')  # img.txt
s += '%gx%g ' % img.shape[2:]  # print string
gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
imc = im0.copy() if save_crop else im0  # for save_crop
if len(det):
    # Rescale boxes from img_size to im0 size
    det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()

    # Print results
    for c in det[:, -1].unique():
        n = (det[:, -1] == c).sum()  # detections per class
        s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string
```

The code sets the paths for saving results (image and text files), counts detections per class for the log string, and rescales bounding-box coordinates from the letterboxed inference size back to the original image size.
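The rescaling step is essentially the inverse of the letterbox transform: subtract the padding, then divide by the resize gain. A simplified single-box sketch (the function name here is illustrative; YOLOv5's `scale_coords` operates on whole tensors and also clips boxes to the image bounds):

```python
def scale_box(img_shape, box, im0_shape):
    """Map a box from letterboxed inference coords back to original-image coords.

    img_shape, im0_shape: (height, width); box: [x1, y1, x2, y2].
    """
    gain = min(img_shape[0] / im0_shape[0], img_shape[1] / im0_shape[1])
    pad_x = (img_shape[1] - im0_shape[1] * gain) / 2  # horizontal padding
    pad_y = (img_shape[0] - im0_shape[0] * gain) / 2  # vertical padding
    return [(box[0] - pad_x) / gain, (box[1] - pad_y) / gain,
            (box[2] - pad_x) / gain, (box[3] - pad_y) / gain]
```

For a 720x1280 original letterboxed to 384x640 (gain 0.5, 12 px of vertical padding), the box `[100, 62, 200, 112]` maps back to `[200, 100, 400, 200]`.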
7. **Displaying and Saving Results**

```python
# Stream results
if view_img:
    cv2.imshow(str(p), im0)
    cv2.waitKey(1)  # 1 millisecond

# Save results (image with detections)
if save_img:
    if dataset.mode == 'image':
        cv2.imwrite(save_path, im0)
    else:  # 'video' or 'stream'
        if vid_path[i] != save_path:  # new video
            vid_path[i] = save_path
            if isinstance(vid_writer[i], cv2.VideoWriter):
                vid_writer[i].release()  # release previous video writer
            if vid_cap:  # video
                fps = vid_cap.get(cv2.CAP_PROP_FPS)
                w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
            else:  # stream
                fps, w, h = 30, im0.shape[1], im0.shape[0]
                save_path += '.mp4'
            vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
        vid_writer[i].write(im0)
```

This section displays the annotated frame with OpenCV if `view_img` is enabled, and saves the results as an image or video depending on the input type. The `save_img` flag controls whether results are written to disk; for videos, a `cv2.VideoWriter` is (re)created whenever a new output file begins.
8. **Printing the Save Summary**

```python
if save_txt or save_img:
    s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
    print(f'Results saved to {save_dir}{s}')
```

The code prints the directory where results were saved and, if label files were written, how many.
9. **Model Update**

```python
if update:
    strip_optimizer(weights)  # update model (to fix SourceChangeWarning)
```

If `update` is set, `strip_optimizer` rewrites the weights file without training-only state, which resolves potential `SourceChangeWarning` issues.
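In spirit, `strip_optimizer` loads the checkpoint, drops training-only entries, and re-saves a smaller inference-only file. A rough dictionary-level sketch (the key names are typical of YOLOv5 checkpoints but illustrative here; the real function also halves the weights and resaves with `torch.save`):

```python
def strip_training_state(ckpt):
    """Drop training-only entries from a checkpoint-like dict, keeping weights."""
    for k in ('optimizer', 'training_results', 'ema', 'updates'):
        ckpt.pop(k, None)      # remove the key if present
    ckpt['epoch'] = -1         # mark as a finalized, inference-only checkpoint
    return ckpt
```

Applied to a dict like `{'model': ..., 'optimizer': ..., 'epoch': 99}`, this leaves only the model weights plus an `epoch` of -1.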
This code snippet serves as the core of YOLOv5's detection process. It demonstrates the steps involved in processing predictions, applying NMS, and saving the results in various formats. It also includes features for visualization and model updating, enhancing the overall utility of the YOLOv5 model.
Original article: https://www.cveoy.top/t/topic/oHSZ. Copyright belongs to the author; please do not repost or scrape.