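1. Read in the data file credit_dataset_final.csv.
2. Standardize the numeric variables.
3. Convert the categorical variables to factors.
4. Sample 60% of the original data to build the model, and use the remaining data to validate it.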
# Read in the data (keep strings as character so factors are created explicitly later)
credit <- read.csv("credit_dataset_final.csv", stringsAsFactors = FALSE)
# Standardize the numeric variables (center to mean 0, scale to standard deviation 1)
num_vars <- c("LIMIT_BAL", "AGE", "BILL_AMT1", "BILL_AMT2", "BILL_AMT3", "BILL_AMT4", "BILL_AMT5", "BILL_AMT6", "PAY_AMT1", "PAY_AMT2", "PAY_AMT3", "PAY_AMT4", "PAY_AMT5", "PAY_AMT6")
credit[num_vars] <- scale(credit[num_vars])
# Convert the categorical variables to factors
cat_vars <- c("SEX", "EDUCATION", "MARRIAGE", "PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6")
credit[cat_vars] <- lapply(credit[cat_vars], factor)
# Sample 60% of the rows as the model-building (training) set; the rest form the validation set
set.seed(123)
train_index <- sample(seq_len(nrow(credit)), round(0.6 * nrow(credit)))
train_data <- credit[train_index, ]
test_data <- credit[-train_index, ]
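
The code above standardizes the numeric columns using the mean and standard deviation of the full dataset before splitting. A common variant, shown here only as a sketch and not as the original answer's method, is to split first and learn the centering and scaling values from the training rows alone, then apply them to the validation rows. The `toy` data frame and its values are hypothetical stand-ins for the real CSV; only `LIMIT_BAL`, `AGE`, and `SEX` are taken from the column names used above.

```r
# Toy data frame standing in for credit_dataset_final.csv (hypothetical values).
set.seed(123)
toy <- data.frame(
  LIMIT_BAL = c(20000, 120000, 90000, 50000, 500000,
                100000, 140000, 20000, 200000, 260000),
  AGE       = c(24, 26, 34, 37, 57, 29, 23, 28, 35, 51),
  SEX       = factor(c(2, 2, 2, 2, 1, 1, 2, 1, 2, 2))
)
num_vars <- c("LIMIT_BAL", "AGE")

# 60/40 split, same approach as in the code above.
train_index <- sample(seq_len(nrow(toy)), round(0.6 * nrow(toy)))
train_data <- toy[train_index, ]
test_data  <- toy[-train_index, ]

# Learn the centering and scaling values from the training rows only.
train_scaled <- scale(train_data[num_vars])
centers <- attr(train_scaled, "scaled:center")
scales  <- attr(train_scaled, "scaled:scale")

# Apply the same transformation to both subsets.
train_data[num_vars] <- train_scaled
test_data[num_vars]  <- scale(test_data[num_vars], center = centers, scale = scales)
```

With this variant the training columns have mean 0 and standard deviation 1, while the validation columns generally do not, because they are transformed with the training set's statistics.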