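Python implementation:
1. Download the anonymized credit card transactions labeled as fraudulent or genuine: https://www.kaggle.com/code/pierra/credit-card-dataset-svm-classification/input
2. Randomly split the dataset into a training set and a test set at a given ratio.
3. On the training set, implement the CFS (Correlation-based Feature Selection) algorithm to select a subset of features from the original feature space. Note: select fea…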
- Download the dataset and import the required Python libraries.
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score
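- Read the dataset, separate the label column (the last column) from the features, then randomly split the data into a training set (80%) and a test set (20%).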
data = pd.read_csv('creditcard.csv')
X = data.iloc[:, :-1]
y = data.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
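Fraud labels are typically heavily imbalanced, so a purely random split can leave the test set with a different fraud rate than the training set. The following is a minimal sketch, assuming synthetic data (`X_demo` and `y_demo` are made up for illustration and do not come from the dataset), of passing `stratify` to `train_test_split` so that both parts keep the same class ratio:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 1000 samples, 2% positives (stand-in for fraud cases)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 4))
y_demo = np.zeros(1000, dtype=int)
y_demo[:20] = 1

# stratify=y_demo keeps the positive rate equal in the train and test parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

print('Train positive rate:', y_tr.mean())
print('Test positive rate:', y_te.mean())
```

Applied to the code above, this would amount to adding `stratify=y` to the existing `train_test_split` call.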
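- Select a subset of features on the training set. Note that the code below does not implement CFS: it standardizes the features and then keeps the 10 highest-scoring ones with SelectKBest and the ANOVA F-test (f_classif), a univariate filter that ignores redundancy between features.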
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
kbest = SelectKBest(score_func=f_classif, k=10)
X_train = kbest.fit_transform(X_train, y_train)
X_test = kbest.transform(X_test)
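For comparison, CFS scores a whole feature subset rather than individual features. With k features, mean absolute feature-class correlation r_cf and mean absolute feature-feature correlation r_ff, the merit is k * r_cf / sqrt(k + k * (k - 1) * r_ff). The following is a rough sketch of that idea, not a reference implementation. It assumes Pearson correlation as the correlation measure and a simple greedy forward search that stops when the merit no longer improves. The function names `cfs_merit` and `cfs_forward_select` and the toy data are hypothetical.

```python
import numpy as np

def cfs_merit(r_cf, r_ff, subset):
    """CFS merit of a subset: high class correlation, low redundancy."""
    k = len(subset)
    mean_cf = r_cf[subset].mean()
    if k == 1:
        return mean_cf
    # Mean of the off-diagonal feature-feature correlations in the subset
    sub = r_ff[np.ix_(subset, subset)]
    mean_ff = (sub.sum() - np.trace(sub)) / (k * (k - 1))
    return k * mean_cf / np.sqrt(k + k * (k - 1) * mean_ff)

def cfs_forward_select(X, y, max_features=None):
    """Greedy forward search that adds the feature giving the best merit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    # Absolute Pearson correlations (assumption: Pearson as the measure)
    corr = np.abs(np.nan_to_num(
        np.corrcoef(np.column_stack([X, y]), rowvar=False)))
    r_ff = corr[:n_features, :n_features]
    r_cf = corr[:n_features, n_features]

    selected, best_merit = [], 0.0
    limit = max_features or n_features
    while len(selected) < limit:
        candidates = [j for j in range(n_features) if j not in selected]
        merits = [cfs_merit(r_cf, r_ff, selected + [j]) for j in candidates]
        best = int(np.argmax(merits))
        if merits[best] <= best_merit:
            break  # no remaining feature improves the merit
        best_merit = merits[best]
        selected.append(candidates[best])
    return selected, best_merit

# Hypothetical toy data: two informative features, a noisy copy, pure noise
rng = np.random.default_rng(0)
n = 5000
y_demo = rng.integers(0, 2, size=n)
informative_a = y_demo + 0.5 * rng.normal(size=n)
noisy_copy = informative_a + rng.normal(size=n)  # redundant with column 0
informative_b = y_demo + 0.5 * rng.normal(size=n)
noise = rng.normal(size=(n, 2))
X_demo = np.column_stack([informative_a, noisy_copy, informative_b, noise])

selected, merit = cfs_forward_select(X_demo, y_demo)
print('Selected feature indices:', sorted(selected))
print('Merit:', merit)
```

To use this in place of the SelectKBest step, one could call `cfs_forward_select(X_train, y_train)` on the scaled training data and then index both sets with `X_train[:, selected]` and `X_test[:, selected]`.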
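- Train a classifier on the selected features using the training set and evaluate it on the test set. This step refers to an ensemble learning model, but the code below trains a single RBF-kernel SVM (SVC), which is not an ensemble.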
classifier = SVC(kernel='rbf', random_state=0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
print('Confusion Matrix:\n', cm)
print('Accuracy:', accuracy)
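Since the step asks for an ensemble model, and accuracy alone can look high on imbalanced data even when few fraud cases are caught, a variant of the evaluation might look like the sketch below. It is an illustration under assumptions: the data is synthetic (`make_classification` with 5% positives), and the choice of `RandomForestClassifier` with `class_weight='balanced'` is one possible ensemble, not the one the original task necessarily intended.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced toy data: about 5% of samples in the positive class
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

# Random forest: an ensemble of decision trees trained on bootstrap samples
forest = RandomForestClassifier(
    n_estimators=100, class_weight='balanced', random_state=0
)
forest.fit(X_tr, y_tr)
y_hat = forest.predict(X_te)

# Per-class precision and recall show how many positives are actually found
cm_demo = confusion_matrix(y_te, y_hat, labels=[0, 1])
print('Confusion Matrix:\n', cm_demo)
print(classification_report(y_te, y_hat, digits=3, zero_division=0))
```

In the pipeline above, the same pattern would apply with `X_train`, `y_train`, `X_test` and `y_test` in place of the toy arrays.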
The complete code is as follows:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# Load the data: the last column is the label, the rest are features
data = pd.read_csv('creditcard.csv')
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

# Hold out 20% of the rows as the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Standardize features using statistics from the training set only
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Keep the 10 features with the highest ANOVA F-score (univariate filter, not CFS)
kbest = SelectKBest(score_func=f_classif, k=10)
X_train = kbest.fit_transform(X_train, y_train)
X_test = kbest.transform(X_test)

# Train an RBF-kernel SVM on the selected features
classifier = SVC(kernel='rbf', random_state=0)
classifier.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
print('Confusion Matrix:\n', cm)
print('Accuracy:', accuracy)
Original source: https://www.cveoy.top/t/topic/e3H8. Copyright belongs to the author. Do not repost or scrape.