Recognizing a Specified Action in Video with MediaPipe and TensorFlow
First, load the trained action-recognition model. Use OpenCV's cv2.VideoCapture to open the video file and process it frame by frame. For each frame, extract the skeleton with the same method used when building the dataset, and compute the leg and right-hand angles. Then pass the angles to the trained model to predict whether the current frame shows the target action. Finally, draw the skeleton on matching frames and save the result as a video file.
Here is the example code:
import mediapipe as mp
import cv2
import os
import pandas as pd
import math
import tensorflow as tf
import numpy as np
# Load the trained model
model = tf.keras.models.load_model('model.h5')
# Helper for saving skeleton angles to CSV (used when building the dataset;
# not called in this example)
def save_angles(angles_list, action_name, folder_name):
    filename = f'{folder_name}_{action_name}.csv'
    df = pd.DataFrame(angles_list, columns=['angle1', 'angle2', 'angle3', 'angle4', 'angle5'])
    df.to_csv(filename, index=False)
    print(f'{filename} saved successfully')
# Initialize MediaPipe
mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose
# Target action. model.predict returns class probabilities, so the comparison
# later uses a class index; set task_action_index to the index of task_action
# in the label order used during training (0 here is an assumed placeholder)
task_action = 'jumping_jacks'
task_action_index = 0
# Open the video file
cap = cv2.VideoCapture('1.mp4')
# Get the video frame rate
fps = cap.get(cv2.CAP_PROP_FPS)
# Get the video width and height
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
# Create the video writer for the output file
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter('output.mp4', fourcc, fps, (width, height))
# Process every frame of the video (create the Pose object once, outside the
# loop, instead of rebuilding it per frame)
with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
    while cap.isOpened():
        # Read a frame
        ret, image = cap.read()
        if not ret:
            break
        # Convert the frame to RGB (MediaPipe expects RGB input)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        # Run pose estimation to extract the skeleton
        results = pose.process(image)
        if results.pose_landmarks is None:
            continue
        # Get the left shoulder, left elbow and left wrist landmarks
        left_shoulder = results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_SHOULDER]
        left_elbow = results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_ELBOW]
        left_wrist = results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_WRIST]
        # Get the right shoulder, right elbow and right wrist landmarks
        right_shoulder = results.pose_landmarks.landmark[mp_pose.PoseLandmark.RIGHT_SHOULDER]
        right_elbow = results.pose_landmarks.landmark[mp_pose.PoseLandmark.RIGHT_ELBOW]
        right_wrist = results.pose_landmarks.landmark[mp_pose.PoseLandmark.RIGHT_WRIST]
        # Get the left hip, left knee and left ankle landmarks
        left_hip = results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_HIP]
        left_knee = results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_KNEE]
        left_ankle = results.pose_landmarks.landmark[mp_pose.PoseLandmark.LEFT_ANKLE]
        # Get the right hip, right knee and right ankle landmarks
        right_hip = results.pose_landmarks.landmark[mp_pose.PoseLandmark.RIGHT_HIP]
        right_knee = results.pose_landmarks.landmark[mp_pose.PoseLandmark.RIGHT_KNEE]
        right_ankle = results.pose_landmarks.landmark[mp_pose.PoseLandmark.RIGHT_ANKLE]
        # Angle between the right leg and the right hand (wrist-ankle-knee)
        angle = math.degrees(math.atan2(right_wrist.y - right_ankle.y, right_wrist.x - right_ankle.x) -
                             math.atan2(right_knee.y - right_ankle.y, right_knee.x - right_ankle.x))
        # Left-arm angle (left shoulder, left elbow, left wrist)
        angle1 = math.degrees(math.atan2(left_wrist.y - left_elbow.y, left_wrist.x - left_elbow.x) -
                              math.atan2(left_shoulder.y - left_elbow.y, left_shoulder.x - left_elbow.x))
        # Left-leg angle (left hip, left knee, left ankle)
        angle_dl = math.degrees(math.atan2(left_ankle.y - left_knee.y, left_ankle.x - left_knee.x) -
                                math.atan2(left_hip.y - left_knee.y, left_hip.x - left_knee.x))
        # Right-leg angle (right hip, right knee, right ankle)
        angle_dr = math.degrees(math.atan2(right_ankle.y - right_knee.y, right_ankle.x - right_knee.x) -
                                math.atan2(right_hip.y - right_knee.y, right_hip.x - right_knee.x))
        # Right-arm angle (right shoulder, right elbow, right wrist)
        angle_tr = math.degrees(math.atan2(right_wrist.y - right_elbow.y, right_wrist.x - right_elbow.x) -
                                math.atan2(right_shoulder.y - right_elbow.y, right_shoulder.x - right_elbow.x))
        # Collect the five angles as a single sample
        angles_list = [[angle, angle1, angle_dl, angle_dr, angle_tr]]
        # Predict the action for the current frame
        prediction = model.predict(np.array(angles_list))
        action = np.argmax(prediction)
        # Check whether this is the task action (compare class indices, not the name string)
        if action == task_action_index:
            # Draw the skeleton on the frame
            annotated_image = image.copy()
            mp_drawing.draw_landmarks(annotated_image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
            # Write the annotated frame to the output video
            out.write(cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))
        # Show the frame
        cv2.imshow('image', cv2.cvtColor(image, cv2.COLOR_RGB2BGR))
        # Press q to quit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
# Release resources
cap.release()
out.release()
cv2.destroyAllWindows()
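The five angle computations above all repeat the same two-atan2 pattern. If you prefer, they can be factored into a small helper that takes three landmarks (the middle one being the vertex) and returns the joint angle in degrees. This is a minimal sketch: calculate_angle is a hypothetical helper, and the normalization to [0, 180] is an assumption that only makes sense if the training data used the same convention.
def calculate_angle(a, b, c):
    # Angle at vertex b formed by the segments b->a and b->c, in degrees
    angle = math.degrees(math.atan2(c.y - b.y, c.x - b.x) -
                         math.atan2(a.y - b.y, a.x - b.x))
    angle = abs(angle)
    if angle > 180:  # assumed normalization to [0, 180]; match your training data
        angle = 360 - angle
    return angle

# Example: the right-arm angle then becomes
# angle_tr = calculate_angle(right_shoulder, right_elbow, right_wrist)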
Notes:
- model.h5 is the trained model file; replace it with your actual model path.
- task_action is the action to recognize; replace it with the actual action name, and set task_action_index to the matching class index.
- 1.mp4 is the video to analyze; replace it with your actual video path.
- output.mp4 is the output video; replace it with your desired output path.
- math.degrees converts radians to degrees.
- np.argmax picks the action with the highest predicted probability (a name-lookup sketch follows this list).
- cv2.waitKey(1) controls the playback speed; adjust the argument to change it.
- cv2.destroyAllWindows() closes all windows.
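If you would rather compare action names than raw indices, keep the training label order in a list and look the predicted index up. The label list below is a hypothetical example; it must match the order used when the model was trained.
# Hypothetical label order -- must match the order used during training
class_names = ['jumping_jacks', 'squats', 'lunges']

prediction = model.predict(np.array(angles_list))
action_name = class_names[int(np.argmax(prediction))]
if action_name == task_action:
    print(f'Detected task action: {action_name}')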
This article covered the basic workflow and example code for recognizing a specified action in video with MediaPipe and TensorFlow. In practice, you may need to tune the parameters and adapt the code to your specific scenario to get the best recognition results.