ClickHouse SQL 查询优化: 错误分析和解决方案

原始 SQL 查询:

-- SELECT toStartOfDay(toDateTime(Ds)) AS datetime_day_Ds,
--   count(DISTINCT AppIdentifier) AS index_appidentifier_0
-- FROM log_iMonkey_iOS_overview
-- WHERE Ds >= toDateTime(#accumulated_access_services.start#)
--   AND Ds <= toDateTime(#accumulated_access_services.end#)
-- GROUP BY datetime_day_Ds
-- ORDER BY datetime_day_Ds DESC
-- LIMIT 5000;
SELECT ts,
  arraySum(arraySlice(total, 1, i)) as total_num,
FROM (
    SELECT groupArray(min_Ds) as t,
      groupArray(app_count) as total,
      from (
        SELECT min_Ds,
          count(DISTINCT AppIdentifier) as app_count
        FROM (
            select AppIdentifier,
              min(Ds) as min_Ds
            from log_iMonkey_iOS_overview
            GROUP BY AppIdentifier
          )
        GROUP BY min_Ds
        order by min_Ds
      )
  ) ARRAY
  arrayEnumerate(total) as i where t >= toDate(#accumulated_access_services.start#)

错误信息:

errormessage: 运行SQL失败:[400]运行异常: Code: 62. DB::Exception: Syntax error: failed at position 406 ('groupArray') (line 12, col 12): groupArray(min_Ds) as t,
      groupArray(app_count) as total,
      from (
        SELECT min_Ds,
          count(DISTINCT AppIdentifier) as app_count
        . Expected one of: token, Comma, Arrow, Dot, UUID, DoubleColon, MOD, DIV, NOT, BETWEEN, LIKE, ILIKE, NOT LIKE, NOT ILIKE, IN, NOT IN, GLOBAL IN, GLOBAL NOT IN, IS, AND, OR, QuestionMark, alias, AS, end of query. (SYNTAX_ERROR) (version 52.3.10.8)

修改后的 SQL 查询:

SELECT toDate(Ds) AS datetime_day_Ds,
  count(DISTINCT AppIdentifier) AS index_appidentifier_0
FROM log_iMonkey_iOS_overview
WHERE Ds >= toDateTime(#accumulated_access_services.start#)
  AND Ds <= toDateTime(#accumulated_access_services.end#)
GROUP BY datetime_day_Ds
ORDER BY datetime_day_Ds DESC
LIMIT 5000;

SELECT toDate(min_Ds) AS ts,
  sum(app_count) as total_num
FROM (
  SELECT AppIdentifier,
    min(Ds) as min_Ds,
    count(DISTINCT AppIdentifier) as app_count
  FROM log_iMonkey_iOS_overview
  WHERE Ds >= toDateTime(#accumulated_access_services.start#)
    AND Ds <= toDateTime(#accumulated_access_services.end#)
  GROUP BY AppIdentifier
  ORDER BY min_Ds
)
GROUP BY ts
HAVING ts >= toDate(#accumulated_access_services.start#)
ORDER BY ts;

主要修改:

  1. 第一个查询中使用了 toStartOfDayarraySlice 函数,这两个函数在第二个查询中不需要,因此去掉了这两个函数。

  2. 第二个查询中的语法错误是因为缺少了一个 SELECT 关键字。正确的写法是将第一个 SELECT 和第二个 SELECT 合并为一个查询,并在第二个 SELECT 中加上 SELECT 关键字。同时,修改了查询中的一些细节,例如将日期时间转换为日期、使用 sum 函数替代 arraySumarrayEnumerate 函数等。

优化后的 SQL 查询解释:

  • 第一个查询用于统计每个日期的 AppIdentifier 数量,并根据日期降序排列,最多返回 5000 条数据。
  • 第二个查询用于统计每个日期的 AppIdentifier 数量,并将结果按日期排序。

总结:

通过对 SQL 查询进行优化,我们解决了语法错误并简化了查询语句。同时,我们还解释了查询的优化过程,希望能够帮助读者更好地理解 SQL 查询的优化方法。

ClickHouse SQL 查询优化: 错误分析和解决方案

原文地址: https://www.cveoy.top/t/topic/ozN5 著作权归作者所有。请勿转载和采集!

免费AI点我,无需注册和登录