from pyspark.sql import SparkSession

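# Start (or reuse) a Spark session for this job
spark = SparkSession.builder.appName('AverageScore').getOrCreate()

# Read the file as an RDD of plain strings, one per input line
lines = spark.read.text('data01.txt').rdd.map(lambda x: x[0])

# The score is the third whitespace-separated field on each line
score = lines.map(lambda x: int(x.split()[2]))

# Average = sum of all scores / number of scores
num = score.count()
total_score = score.reduce(lambda x, y: x + y)
avg = total_score / num
print(avg)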


