# Importing necessary libraries
import pandas as pd
import numpy as np
from sklearn.metrics import classification_report
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier

# Loading the dataset
df = pd.read_csv('dataset.csv')

# Dropping unnecessary columns
df = df.drop(['id', 'name', 'date'], axis=1)

# Encoding the target variable
le = LabelEncoder()
df['class'] = le.fit_transform(df['class'])
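
# Converting the features into a list of dictionaries
# (the target column is excluded so the label does not leak into the features)
data = df.drop('class', axis=1).to_dict('records')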

# Vectorizing the features
vec = DictVectorizer()
X = vec.fit_transform(data).toarray()

# Splitting the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, df['class'], test_size=0.2, random_state=42)

# Training the decision tree classifier
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

# Predicting the target variable for the test data
y_pred = clf.predict(X_test)

# Evaluating the performance of the model
print(classification_report(y_test, y_pred, target_names=le.classes_))
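
Because `dataset.csv` is not included here, the same steps can be tried on a small in-memory table. The following is only a sketch under assumptions: the `color` and `size` columns and their values are hypothetical stand-ins for the real features, and `stratify=y` and `random_state=42` on the classifier are additions for reproducibility, not part of the script above.

```python
import pandas as pd
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for dataset.csv: one categorical feature,
# one numeric feature, and a string-valued target column.
df = pd.DataFrame({
    "color": ["red", "green", "blue", "red", "green", "blue"] * 5,
    "size": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5] * 5,
    "class": ["a", "b", "c", "a", "b", "c"] * 5,
})

# Encode the target, then keep it out of the feature dictionaries.
le = LabelEncoder()
y = le.fit_transform(df["class"])
records = df.drop(columns=["class"]).to_dict("records")

# String features become one-hot columns; numeric features pass through.
vec = DictVectorizer(sparse=False)
X = vec.fit_transform(records)

# Stratify so that every class appears in the small test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Inspect the generated feature columns and the per-class metrics.
print(vec.feature_names_)
print(classification_report(y_test, y_pred, target_names=le.classes_))
```

Printing `vec.feature_names_` is a quick way to confirm that the target column did not end up among the features and to see how each categorical value was expanded into its own column.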