The code implementation is as follows:

import pandas as pd
import numpy as np
from datetime import datetime

# Load the dataset
data = pd.read_csv('order_data.csv')

# Convert the order date column to datetime
data['order_date'] = pd.to_datetime(data['order_date'], format='%Y-%m-%d')

# Create a new column holding the holiday flag, initialized to 0
data['is_holiday'] = np.zeros(len(data))

# Define the holiday lists, one per year
holidays_2015 = ['2015-01-01', '2015-02-18', '2015-02-19', '2015-02-20', '2015-04-04', '2015-04-05', 
                 '2015-04-06', '2015-05-01', '2015-06-20', '2015-06-21', '2015-06-22', '2015-09-03', 
                 '2015-09-04', '2015-09-05', '2015-10-01', '2015-10-02', '2015-10-03', '2015-10-04', 
                 '2015-10-05', '2015-10-06', '2015-10-07']

holidays_2016 = ['2016-01-01', '2016-02-07', '2016-02-08', '2016-02-09', '2016-04-02', '2016-04-03', 
                 '2016-04-04', '2016-05-01', '2016-06-09', '2016-06-10', '2016-06-11', '2016-09-15', 
                 '2016-09-16', '2016-09-17', '2016-10-01', '2016-10-02', '2016-10-03', '2016-10-04', 
                 '2016-10-05', '2016-10-06', '2016-10-07']

holidays_2017 = ['2017-01-01', '2017-01-27', '2017-01-28', '2017-01-29', '2017-01-30', '2017-01-31', 
                 '2017-02-01', '2017-04-02', '2017-04-03', '2017-04-04', '2017-05-01', '2017-05-28', 
                 '2017-05-29', '2017-05-30', '2017-10-01', '2017-10-02', '2017-10-03', '2017-10-04', 
                 '2017-10-05', '2017-10-06', '2017-10-07']

holidays_2018 = ['2018-01-01', '2018-02-15', '2018-02-16', '2018-02-17', '2018-02-18', '2018-02-19', 
                 '2018-02-20', '2018-04-05', '2018-04-06', '2018-04-07', '2018-04-29', '2018-04-30', 
                 '2018-05-01', '2018-06-16', '2018-06-17', '2018-06-18', '2018-09-22', '2018-09-23', 
                 '2018-09-24', '2018-10-01', '2018-10-02', '2018-10-03', '2018-10-04', '2018-10-05', 
                 '2018-10-06', '2018-10-07']

# Set the flag to 1 for every order date that appears in a holiday list
for holiday in holidays_2015 + holidays_2016 + holidays_2017 + holidays_2018:
    data.loc[data['order_date'] == datetime.strptime(holiday, '%Y-%m-%d'), 'is_holiday'] = 1

# Print the first 10 rows
print(data.head(10))

Output:

   Unnamed: 0  order_id  customer_id order_date  order_amount  is_holiday
0           0         1            1 2018-08-17        153.75         0.0
1           1         2            2 2018-08-17         79.20         0.0
2           2         3            3 2018-08-17        259.20         0.0
3           3         4            4 2018-08-17         99.00         0.0
4           4         5            5 2018-08-17         12.00         0.0
5           5         6            6 2018-08-17        120.00         0.0
6           6         7            7 2018-08-17        240.00         0.0
7           7         8            8 2018-08-17         62.50         0.0
8           8         9            9 2018-08-17        112.50         0.0
9           9        10           10 2018-08-17        120.00         0.0
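The code loads the dataset and preprocesses it: it converts the order date column order_date to datetime, then checks each date against the holiday lists for 2015, 2016, 2017, and 2018, marking holidays as 1 and all other dates as 0.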
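
The same flagging step can also be written without a Python loop. The sketch below is one possible vectorized variant, not the original author's code. It assumes a small in-memory frame named orders (hypothetical sample rows standing in for order_data.csv) and uses only a three-date excerpt of the holiday lists above.

```python
import pandas as pd

# Hypothetical sample rows standing in for order_data.csv
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "order_date": ["2018-08-17", "2018-10-01", "2017-01-27", "2016-03-15"],
})
orders["order_date"] = pd.to_datetime(orders["order_date"], format="%Y-%m-%d")

# A short excerpt of the holiday lists, not the full calendar
holiday_dates = pd.to_datetime(["2017-01-27", "2018-10-01", "2018-10-02"])

# isin() checks every order date against the holiday set in one pass;
# astype(int) stores the flag as 0/1 integers instead of 0.0/1.0 floats
orders["is_holiday"] = orders["order_date"].isin(holiday_dates).astype(int)

print(orders)
```

Building the flag from isin() avoids one boolean comparison over the whole column per holiday, and the integer dtype keeps the column from printing as 0.0 and 1.0.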

Original source: https://www.cveoy.top/t/topic/dojS. Copyright belongs to the author. Do not repost or scrape.
