这两段代码的区别在于计算中位数的方式不同。第一段代码使用了 AVG 函数来计算中位数,而第二段代码使用了 PERCENTILE_CONT 函数。PERCENTILE_CONT 函数会根据指定的分位数计算出对应的数值,而 AVG 函数则是将所有数值相加再除以总数。因此,第一段代码计算的是平均决策时间,而第二段代码计算的是中位数决策时间。

使用 AVG 函数计算中位数的代码:

SELECT m.category,
       CAST(
           AVG(TIMESTAMPDIFF(SECOND, a.time, d.time)) AS UNSIGNED
       ) AS median_decision_time
FROM app a
JOIN mapping_table m ON a.package_name = m.package_name
JOIN (
    SELECT userid, package_name, MIN(time) AS time
    FROM app
    WHERE action_type = 'download'
    GROUP BY userid, package_name
) AS d ON a.userid = d.userid AND a.package_name = d.package_name
WHERE a.action_type = 'itemShow'
GROUP BY m.category;

使用 PERCENTILE_CONT 函数计算中位数的代码:

SELECT
    m.category,
    CAST(
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY TIMESTAMPDIFF(SECOND, a.time, d.time)) AS UNSIGNED
    ) AS median_decision_time
FROM app a
JOIN mapping_table m ON a.package_name = m.package_name
JOIN (
    SELECT userid, package_name, MIN(time) AS time
    FROM app
    WHERE action_type = 'download'
    GROUP BY userid, package_name
) AS d ON a.userid = d.userid AND a.package_name = d.package_name
WHERE a.action_type = 'itemShow'
GROUP BY m.category;

总结:

  • 使用 AVG 函数计算的是平均决策时间,可能受到极端值的影响。
  • 使用 PERCENTILE_CONT 函数计算的是中位数决策时间,不受极端值影响,更能反映数据集的真实情况。

建议在分析数据时根据具体情况选择合适的计算方法。

SQL 中位数计算:AVG vs PERCENTILE_CONT 的区别

原文地址: https://www.cveoy.top/t/topic/mMv8 著作权归作者所有。请勿转载和采集!

免费AI点我,无需注册和登录