SQL 中位数计算:AVG vs PERCENTILE_CONT 的区别
这两段代码的区别在于计算中位数的方式不同。第一段代码使用了 AVG 函数来计算中位数,而第二段代码使用了 PERCENTILE_CONT 函数。PERCENTILE_CONT 函数会根据指定的分位数计算出对应的数值,而 AVG 函数则是将所有数值相加再除以总数。因此,第一段代码计算的是平均决策时间,而第二段代码计算的是中位数决策时间。
使用 AVG 函数计算中位数的代码:
SELECT m.category,
CAST(
AVG(TIMESTAMPDIFF(SECOND, a.time, d.time)) AS UNSIGNED
) AS median_decision_time
FROM app a
JOIN mapping_table m ON a.package_name = m.package_name
JOIN (
SELECT userid, package_name, MIN(time) AS time
FROM app
WHERE action_type = 'download'
GROUP BY userid, package_name
) AS d ON a.userid = d.userid AND a.package_name = d.package_name
WHERE a.action_type = 'itemShow'
GROUP BY m.category;
使用 PERCENTILE_CONT 函数计算中位数的代码:
SELECT
m.category,
CAST(
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY TIMESTAMPDIFF(SECOND, a.time, d.time)) AS UNSIGNED
) AS median_decision_time
FROM app a
JOIN mapping_table m ON a.package_name = m.package_name
JOIN (
SELECT userid, package_name, MIN(time) AS time
FROM app
WHERE action_type = 'download'
GROUP BY userid, package_name
) AS d ON a.userid = d.userid AND a.package_name = d.package_name
WHERE a.action_type = 'itemShow'
GROUP BY m.category;
总结:
- 使用 AVG 函数计算的是平均决策时间,可能受到极端值的影响。
- 使用 PERCENTILE_CONT 函数计算的是中位数决策时间,不受极端值影响,更能反映数据集的真实情况。
建议在分析数据时根据具体情况选择合适的计算方法。
原文地址: https://www.cveoy.top/t/topic/mMv8 著作权归作者所有。请勿转载和采集!