Before modeling, import the required libraries and load the dataset:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

df14 = pd.read_csv('rock.csv')

Next, preprocess the data by converting the target variable into a binary variable (1 for 'M', 0 for everything else):

df14['class'] = df14['class'].apply(lambda x: 1 if x == 'M' else 0)
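
To make the effect of this mapping concrete, here is a minimal sketch on a small hand-made Series. The sample labels ('M' and 'R') are assumptions for illustration only, since the actual values in rock.csv are not shown here. Note that the rule sends every value other than 'M' to 0, which would include missing values or misspelled labels, so it may be worth checking the distinct values in the column first.

```python
import pandas as pd

# Hypothetical labels for illustration; the real values in rock.csv may differ.
labels = pd.Series(['M', 'R', 'M', 'R', 'R'], name='class')

# Same rule as above: 'M' -> 1, anything else -> 0.
encoded = labels.apply(lambda x: 1 if x == 'M' else 0)

# Equivalent vectorized form: compare, then cast the booleans to integers.
encoded_vectorized = (labels == 'M').astype(int)

print(labels.unique())  # distinct raw labels, worth inspecting before encoding
print(encoded.tolist())
print(encoded.tolist() == encoded_vectorized.tolist())
```

The vectorized comparison avoids a Python-level function call per row, which tends to matter only on large datasets.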

Then split the dataset into a training set and a test set (70% / 30%):

X = df14.drop('class', axis=1)
y = df14['class']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
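
If the two classes are unevenly represented, a plain random split can leave the test set with a different class ratio than the training set. The sketch below shows one possible variation using the `stratify` parameter of `train_test_split`. It runs on synthetic stand-in data (the names `X_demo`, `y_demo`, and the 30/70 class balance are made up for illustration), not on rock.csv.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 rows, 4 numeric features, imbalanced labels.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(100, 4)), columns=['f1', 'f2', 'f3', 'f4'])
y_demo = pd.Series([1] * 30 + [0] * 70, name='class')

# stratify=y_demo keeps the class proportions similar in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0, stratify=y_demo
)

print(X_tr.shape, X_te.shape)
print(y_tr.mean(), y_te.mean())  # share of class 1 in each subset
```

With the real data, the same call would take `X` and `y` from the step above and pass `stratify=y`.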

Finally, train a decision tree classifier and evaluate its accuracy on the test set:

clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
# Print the result so it is also visible when run as a script, not only in a notebook
print(accuracy_score(y_test, y_pred))
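
A single train/test split gives only one accuracy estimate, and `DecisionTreeClassifier()` without a `random_state` can produce slightly different trees between runs. As a hedged sketch of a more stable evaluation, the example below uses 5-fold cross-validation with a fixed seed. It runs on synthetic data from `make_classification`; the names `X_demo`, `y_demo`, `clf_demo` and the `max_depth=5` setting are illustrative assumptions, not tuned values for the rock dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; with the real dataset, use X and y from the steps above.
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

# random_state makes the tree reproducible; max_depth=5 is an arbitrary cap
# chosen for illustration to limit overfitting.
clf_demo = DecisionTreeClassifier(random_state=0, max_depth=5)

# 5-fold cross-validated accuracy: one score per fold.
scores = cross_val_score(clf_demo, X_demo, y_demo, cv=5, scoring='accuracy')

print(scores)
print(scores.mean(), scores.std())
```

The mean gives an overall estimate, and the standard deviation indicates how much the score depends on which rows land in the held-out fold.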

The complete code is as follows:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

df14 = pd.read_csv('rock.csv')

df14['class'] = df14['class'].apply(lambda x: 1 if x == 'M' else 0)

X = df14.drop('class', axis=1)
y = df14['class']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
# Print the result so it is also visible when run as a script, not only in a notebook
print(accuracy_score(y_test, y_pred))
