Classifying Sample Data with Logistic Regression - Python Implementation

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

np.random.seed(5)  # fix the random seed so the run is reproducible

a, b = 0, 2  # value range [a, b): randint draws 0 or 1

# Generate 10 samples and their labels
X = np.random.randint(a, b, 20).reshape((10, 2))
y = np.random.randint(a, b, 5).reshape((5, 1))
y = np.concatenate([y, y], axis=0).ravel()  # flatten (10, 1) to (10,), the 1-D shape fit() expects

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=5)

# Fit a logistic regression model on the training data
model = LogisticRegression()
model.fit(X_train, y_train)

# Compute accuracy on the test set
acc = model.score(X_test, y_test)

print('accu of model is {:.2f}'.format(acc))

Output:

accu of model is 0.60

Code explanation:

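Imports: import NumPy, plus the train_test_split function and the LogisticRegression class from scikit-learn.
Set the random seed: np.random.seed(5) fixes NumPy's random seed so the generated data is the same on every run.
Generate sample data: np.random.randint(a, b, 20).reshape((10, 2)) produces 10 two-dimensional samples whose features are 0 or 1. The labels come from np.random.randint(a, b, 5).reshape((5, 1)), which yields 5 labels that are then repeated once to match the 10 samples.
Split the dataset: train_test_split(X, y, test_size=0.5, random_state=5) splits the data into training and test sets, with half of the samples (5) held out for testing and the split's random seed set to 5.
Fit the logistic regression: create a LogisticRegression model with default settings and train it with model.fit(X_train, y_train).
Compute accuracy: model.score(X_test, y_test) returns the mean accuracy on the test set, i.e. the fraction of test samples classified correctly.
Print the result: print the accuracy rounded to two decimal places.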
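
To make the accuracy step concrete, the sketch below checks that model.score gives the same number as comparing model.predict against the true labels by hand. The toy data here is an assumption made for illustration (it is not the random data from the code above): each label simply equals the first feature, and stratify=y is added so that both classes appear in the training half.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 binary 2-D samples; the label equals the first feature.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 5)
y = X[:, 0]  # 1-D label vector of shape (20,)

# stratify=y keeps both classes present in the training half.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=5, stratify=y
)

model = LogisticRegression()
model.fit(X_train, y_train)

# Three ways to obtain the same test accuracy.
y_pred = model.predict(X_test)
acc_score = model.score(X_test, y_test)       # built-in mean accuracy
acc_manual = np.mean(y_pred == y_test)        # fraction of correct predictions
acc_sklearn = accuracy_score(y_test, y_pred)  # metrics helper

print('score: {:.2f}, manual: {:.2f}, accuracy_score: {:.2f}'.format(
    acc_score, acc_manual, acc_sklearn))
```

All three values should agree, since for a classifier score is defined as the mean accuracy of predict on the given data.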

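Note: the features and labels are generated independently at random, so there is no real relationship for the model to learn, and the accuracy is essentially chance. With only 5 test samples it can only take values in steps of 0.2, and it will change if you use a different seed, sample size, or split.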
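
The effect of the seed can be seen with a small experiment. The sketch below is an illustration, not part of the original code: the helper random_label_accuracy and the sample size of 200 are assumptions, and np.random.default_rng is used instead of np.random.seed so that each run has its own generator. A larger sample is used so that each training half almost certainly contains both classes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def random_label_accuracy(seed, n_samples=200):
    """Fit logistic regression on random binary data and return test accuracy.

    Hypothetical helper: features and labels are drawn independently,
    so there is no signal for the model to learn.
    """
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_samples, 2))  # binary 2-D features
    y = rng.integers(0, 2, size=n_samples)       # binary labels, 1-D
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=seed
    )
    model = LogisticRegression().fit(X_train, y_train)
    return model.score(X_test, y_test)


# Repeat the experiment with several seeds and compare the accuracies.
accuracies = [random_label_accuracy(seed) for seed in range(5)]
for seed, acc in enumerate(accuracies):
    print('seed {}: accu of model is {:.2f}'.format(seed, acc))
print('mean accuracy: {:.2f}'.format(np.mean(accuracies)))
```

The individual accuracies should differ from seed to seed while staying near 0.5, which is what one would expect when the labels carry no information about the features.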