Both programs rely on machine learning and computer vision, combining current algorithms and frameworks to address problems in human motion recognition. Specifically, they use MediaPipe, a machine learning framework geared toward tasks such as pose estimation and gesture recognition. MediaPipe is widely used in computer vision, particularly in robotics, computer-aided diagnosis, and human-computer interaction.
The program that trains on the processed videos uses supervised learning: a known, labeled dataset trains a model to classify actions. Here the K-Nearest Neighbors (KNN) algorithm is fitted to the training dataset and then used to classify the test data. KNN is a supervised machine learning algorithm that treats each sample as a point in an n-dimensional space; the closer two samples are, the more likely they belong to the same class. To classify a new sample, the algorithm finds the k training points nearest to it (k is a constant chosen in advance) and assigns the label that is most common among those points. Several distance metrics can be used, such as Euclidean distance, cosine distance, and Manhattan distance.
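The KNN procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code of the original program: the function names, the toy feature vectors, and the labels "squat" and "jump" are all assumptions made for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train_x, train_y, query, k=3, metric=euclidean):
    """Label `query` by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: metric(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per sample, two action classes.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["squat", "squat", "squat", "jump", "jump", "jump"]

if __name__ == "__main__":
    print(knn_predict(train_x, train_y, (0.5, 0.5), k=3))
    print(knn_predict(train_x, train_y, (5.5, 5.5), k=3, metric=manhattan))
```

Passing the metric as a parameter keeps the voting logic separate from the choice of distance, so switching between Euclidean, Manhattan, and cosine distance does not require touching the classifier itself.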
In the program that extracts the training data, the OpenCV library processes the video stream so that keypoint information can be obtained. OpenCV is an open-source library widely used for image and video processing, and it provides many computer vision algorithms and tools. Specifically, the frames read with OpenCV are passed to the MediaPipe Pose model, which detects the keypoints of the human body and yields the motion data.
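The division of labor between the two libraries might look like the sketch below, which assumes the legacy `mp.solutions.pose` Python API and an arbitrary video path supplied by the caller. The function names and the choice to store x, y, z, and visibility per landmark are assumptions for illustration, not details taken from the original program.

```python
def landmarks_to_row(landmarks):
    """Flatten landmark objects (with x, y, z, visibility) into one list."""
    row = []
    for lm in landmarks:
        row.extend([lm.x, lm.y, lm.z, lm.visibility])
    return row


def extract_keypoints(video_path):
    """Return one flattened keypoint row per frame in which a pose is found."""
    # Imported here so landmarks_to_row can be used without these packages.
    import cv2
    import mediapipe as mp

    rows = []
    cap = cv2.VideoCapture(video_path)
    try:
        with mp.solutions.pose.Pose(static_image_mode=False) as pose:
            while True:
                ok, frame = cap.read()
                if not ok:  # end of stream or read failure
                    break
                # OpenCV decodes frames as BGR; MediaPipe expects RGB.
                rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                results = pose.process(rgb)
                if results.pose_landmarks:
                    rows.append(
                        landmarks_to_row(results.pose_landmarks.landmark))
    finally:
        cap.release()
    return rows
```

Frames in which no person is detected are skipped here; a real pipeline might instead record a placeholder row so that frame indices stay aligned with timestamps.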
However, locating the keypoints is only the first step in extracting human motion; further analysis and processing build on it. This program computes several derived quantities, such as the distance between two keypoints, the angle formed by keypoints, and the time interval between key actions. These quantities capture higher-level information and are combined into a complete feature vector for subsequent training and classification.
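The derived quantities mentioned above can be illustrated with a short sketch. It assumes 2D keypoints given as (x, y) tuples and a known frame rate; the function names, the sample coordinates, and the 30 fps value are hypothetical and chosen only for the example.

```python
import math


def keypoint_distance(a, b):
    """Euclidean distance between two (x, y) keypoints."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, formed by the segments b-a and b-c."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0])
        - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang)
    # Report the interior angle, which lies in [0, 180].
    return 360.0 - ang if ang > 180.0 else ang


def interval_seconds(frame_start, frame_end, fps):
    """Elapsed time between two frame indices at a given frame rate."""
    return (frame_end - frame_start) / fps


def feature_vector(shoulder, elbow, wrist, frame_start, frame_end, fps):
    """Combine distance, angle, and timing into one feature vector."""
    return [
        keypoint_distance(shoulder, wrist),
        joint_angle(shoulder, elbow, wrist),
        interval_seconds(frame_start, frame_end, fps),
    ]


if __name__ == "__main__":
    # Hypothetical arm keypoints and a key action spanning frames 30 to 90.
    print(feature_vector((1, 0), (0, 0), (0, 1), 30, 90, fps=30))
```

Vectors of this kind, one per sample, are what a distance-based classifier such as KNN would then compare.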
In summary, both programs build on machine learning, computer vision, and related domain knowledge, using current algorithms and frameworks to address problems in human motion recognition, especially expressing and recognizing the details of posture changes. Recognition accuracy has also improved significantly. In future research, there is reason to believe that these methods can be applied to a wider range of fields, such as video, health, and entertainment, promoting the development of human intelligence and technology.