# This script assumes that your Excel file has a column named ColumnName.
# It adds a second column, ColumnName_Duplicate, that marks each row as 'Duplicate' or 'Unique'.
import pandas as pd
df = pd.read_excel('filename.xlsx')
# Replace 'ColumnName' with the name of the column you want to check for duplicates
duplicates = df[df.duplicated(subset=['ColumnName'], keep=False)]
# Create a new column to indicate whether a row is a duplicate or not
df['ColumnName_Duplicate'] = df.duplicated(subset=['ColumnName'], keep=False).map({True: 'Duplicate', False: 'Unique'})
# Save the updated dataframe to a new Excel file
df.to_excel('filename_updated.xlsx', index=False)
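
The script passes keep=False to duplicated(), which flags every row that shares a value, including the first occurrence. The following is a minimal sketch of how that differs from keep='first', using a small in-memory DataFrame in place of the Excel file. The sample values and the helper name label_duplicates are assumptions made for illustration and are not part of the original script.

```python
import pandas as pd

# Hypothetical sample data standing in for the contents of the Excel sheet
df = pd.DataFrame({'ColumnName': ['a', 'b', 'a', 'c', 'b', 'a']})


def label_duplicates(frame, column, keep=False):
    """Return a Series of 'Duplicate'/'Unique' labels for the given column.

    keep=False   flags every row whose value appears more than once.
    keep='first' leaves the first occurrence unflagged and flags later ones.
    """
    flags = frame.duplicated(subset=[column], keep=keep)
    return flags.map({True: 'Duplicate', False: 'Unique'})


# Flag all members of each duplicate group (the behaviour used in the script)
all_marked = label_duplicates(df, 'ColumnName', keep=False)

# Flag only the repeats, treating the first occurrence as the original
later_marked = label_duplicates(df, 'ColumnName', keep='first')

print(all_marked.tolist())
print(later_marked.tolist())
```

With keep=False, only 'c' is labelled 'Unique' because it is the only value that appears once. Use keep='first' instead if you want the first row of each group to stay labelled 'Unique', for example when you plan to delete the repeats afterwards.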