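Steps for building a dataset k-anonymization system with tkinter and Python that generalizes all the data and outputs the exact values after processing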
- Import the required libraries
import pandas as pd
import tkinter as tk
from tkinter import filedialog
- Create the GUI and select the dataset
root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()
- Load the dataset
df = pd.read_csv(file_path)
- Get all attribute names and choose which attributes to generalize
column_names = df.columns
selected_columns = []
for column in column_names:
    selected = input(f"Do you want to anonymize {column}? (y/n)")
    if selected.lower() == 'y':
        selected_columns.append(column)
- Generalize the selected attributes
for column in selected_columns:
    column_data = df[column]
    column_type = column_data.dtype
    if column_type == 'int64' or column_type == 'float64':
        # Numeric column: rescale to the range 0..10 and truncate to an integer bucket.
        min_value = column_data.min()
        range_value = column_data.max() - min_value
        if range_value == 0:
            # Every value is identical, so avoid dividing by zero.
            df[column] = 0
        else:
            df[column] = ((column_data - min_value) / range_value * 10).astype(int)
    elif column_type == 'object':
        # Text column: replace each value with an integer code in order of first appearance.
        codes, unique_values = pd.factorize(column_data)
        df[column] = codes
- Write out the processed dataset
df.to_csv('anonymized_data.csv', index=False)
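The steps above coarsen the selected columns, but they do not by themselves confirm that every combination of generalized values occurs at least k times. A minimal sketch of such a check follows, assuming the generalized columns are the ones treated as quasi-identifiers. The function names `smallest_group_size` and `is_k_anonymous` and the `sample` data are hypothetical and not part of the original steps.

```python
import pandas as pd


def smallest_group_size(df, quasi_identifiers):
    """Return the size of the smallest group of rows that share
    identical values in all quasi-identifier columns."""
    if df.empty:
        return 0
    if not quasi_identifiers:
        # With no quasi-identifiers, all rows fall into a single group.
        return len(df)
    sizes = df.groupby(list(quasi_identifiers), dropna=False).size()
    return int(sizes.min())


def is_k_anonymous(df, quasi_identifiers, k):
    """True if every quasi-identifier combination appears at least k times."""
    return smallest_group_size(df, quasi_identifiers) >= k


# Hypothetical data that has already been generalized into bucket codes.
sample = pd.DataFrame({
    "age_bucket": [2, 2, 2, 7, 7],
    "city_code": [0, 0, 0, 1, 1],
    "diagnosis": ["a", "b", "c", "a", "b"],
})

print(is_k_anonymous(sample, ["age_bucket", "city_code"], 2))
print(is_k_anonymous(sample, ["age_bucket", "city_code"], 3))
```

If the check fails for the desired k, one option is to use coarser buckets for the numeric columns and run the check again before writing the CSV file.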