unicode_escape会导致中文乱码 python

日期: 2024-08-14

标签: 科技

在Python 3中，使用unicode_escape编码时，需要确保字符串本身是Unicode字符串。如果使用普通字符串（即字节串），则会导致中文乱码。以下是一个示例：

# 使用字节串
s = b'\u4e2d\u6587'
print(s.decode('unicode_escape'))

# 输出：中文

# 使用Unicode字符串
s = '\u4e2d\u6587'
print(s.encode('unicode_escape').decode())

# 输出：\u4e2d\u6587

可以看到，使用字节串时，解码后的结果是正确的，但是在使用unicode_escape编码的过程中，中文字符被转义了。而使用Unicode字符串时，编码后的结果是正确的，中文字符没有被转义。

原文地址: https://www.cveoy.top/t/topic/blNl 著作权归作者所有。请勿转载和采集!