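re.findall("Hotel_Review-{.*?}-Reviews", "https://www.tripadvisor.com/Hotel_Review-g60745-d6485213-Reviews-The_Verb_Hotel-Boston_Massachusetts.html") cannot extract "g60745-d6485213" because the braces in the pattern do not form a capture group. In Python's re module, "{" starts a quantifier only when it is followed by a valid repeat count such as "{3}" or "{2,5}". Here "{.*?}" is not a valid quantifier, so "{" and "}" are matched as literal characters. The URL contains no braces, so the pattern never matches and re.findall returns an empty list. Escaping the braces with "\" would not help, because that also matches literal braces. To extract part of a match, wrap that part in parentheses so it becomes a capture group.

A working regular expression and code example:

```python
import re

url = "https://www.tripadvisor.com/Hotel_Review-g60745-d6485213-Reviews-The_Verb_Hotel-Boston_Massachusetts.html"
# Each (\d+) captures one run of digits: the number after "g" and the number after "d".
regex = r"Hotel_Review-g(\d+)-d(\d+)-Reviews"
match = re.search(regex, url)
if match:
    group1 = match.group(1)  # "60745"
    group2 = match.group(2)  # "6485213"
    # Rebuild the ID in the same form it has in the URL.
    hotel_id = f"g{group1}-d{group2}"
    print(hotel_id)
else:
    print("No match found.")
```

The output is: g60745-d6485213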
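To stay with re.findall, a minimal sketch is to swap the braces for parentheses. When a pattern contains exactly one capture group, re.findall returns the text of that group for each match. The variable names below (URL, original, fixed) are illustrative and not from the original question; in Python's re module, a "{" that does not begin a valid "{m,n}" quantifier is matched as a literal brace.

```python
import re

URL = (
    "https://www.tripadvisor.com/"
    "Hotel_Review-g60745-d6485213-Reviews-The_Verb_Hotel-Boston_Massachusetts.html"
)

# Pattern from the question: the braces match literal "{" and "}",
# which do not occur in the URL, so no match is found.
original = re.findall("Hotel_Review-{.*?}-Reviews", URL)

# Parentheses form a capture group; with a single group,
# findall returns that group's text for each match.
fixed = re.findall(r"Hotel_Review-(.*?)-Reviews", URL)

print(original)  # []
print(fixed)     # ['g60745-d6485213']
```

The lazy ".*?" stops at the first "-Reviews", so the group holds only the ID. The stricter "g(\d+)-d(\d+)" form is a safer choice if the input may contain URLs that do not follow this layout.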

Extracting a TripAdvisor Hotel ID with Python Regular Expressions: Diagnosing the re.findall() Problem

Original URL: https://www.cveoy.top/t/topic/pVza. Copyright belongs to the author. Do not repost or scrape.
