A cluster contains N machines, and each machine exposes M time-series metrics (CPU, memory, IO, network traffic, and so on). Modeling each curve with its own query means hand-writing far too much repetitive SQL and puts a heavy computational load on the platform. How can SQL be applied more effectively to this scenario?
After running anomaly detection over the system's N time-series curves, how can we quickly find out which of the curves are actually anomalous?
For the problem described in scenario one, we assume the following data constraints: the data is stored in a Log Service LogStore with the structure below.
```
timestamp  : unix_time_stamp
machine    : name1
metricName : cpu0
metricValue: 50
---
timestamp  : unix_time_stamp
machine    : name1
metricName : cpu1
metricValue: 50
---
timestamp  : unix_time_stamp
machine    : name1
metricName : mem
metricValue: 50
---
timestamp  : unix_time_stamp
machine    : name2
metricName : mem
metricValue: 60
```
From this LogStore, we first retrieve the time-series data for the N metrics:
```sql
* | select timestamp - timestamp % 60 as time,
           machine, metricName, avg(metricValue)
    from log
    group by time, machine, metricName
```
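The bucketing expression `timestamp - timestamp % 60` truncates a Unix timestamp down to the start of its minute, which is what aligns every series onto a shared one-minute grid. A quick check with a hypothetical timestamp:

```sql
-- 1530000037 % 60 = 37, so subtracting the remainder snaps the value
-- back to the start of its minute (hypothetical timestamp):
* | select 1530000037 - 1530000037 % 60  -- => 1530000000
```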
We now run the batch time-series anomaly-detection algorithm over this result to obtain detection results for the N metrics:
```sql
* | select machine, metricName,
           ts_predicate_aram(time, value, 5, 1, 1) as res
    from (
      select timestamp - timestamp % 60 as time,
             machine, metricName, avg(metricValue) as value
      from log
      group by time, machine, metricName
    )
    group by machine, metricName
```
The SQL above yields results with the following structure:
| machine | metricName | [[time, src, pred, upper, lower, prob]] |
| ------- | ---------- | --------------------------------------- |
We then apply a matrix-transpose operation to convert this result into the format shown further below; the SQL is as follows:
```sql
* | select machine, metricName,
           res[1] as ts, res[2] as ds, res[3] as preds,
           res[4] as uppers, res[5] as lowers, res[6] as probs
    from (
      select machine, metricName,
             array_transpose(ts_predicate_aram(time, value, 5, 1, 1)) as res
      from (
        select timestamp - timestamp % 60 as time,
               machine, metricName, avg(metricValue) as value
        from log
        group by time, machine, metricName
      )
      group by machine, metricName
    )
```
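To make the transpose step concrete, here is a minimal sketch using hypothetical literal data (`array[...]` is Presto-style syntax, on which Log Service's query dialect is based): `ts_predicate_aram` emits one inner array per time point, and `array_transpose` flips that into one inner array per column, so each column can then be pulled out with `res[i]` as above.

```sql
-- Hypothetical data: two time points with three columns [time, src, pred].
-- array_transpose turns "one array per row" into "one array per column":
--   [[1.0, 10.0, 9.8], [2.0, 12.0, 11.9]]
--   -> [[1.0, 2.0], [10.0, 12.0], [9.8, 11.9]]
select array_transpose(
  array[
    array[1.0, 10.0,  9.8],   -- values at the first time point
    array[2.0, 12.0, 11.9]    -- values at the second time point
  ]
) as res
```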
After transposing the two-dimensional array, each row's content can be split into separate columns, giving the expected format:
| machine | metricName | ts | ds | preds | uppers | lowers | probs |
| ------- | ---------- | -- | -- | ----- | ------ | ------ | ----- |
Given these batch detection results, how do we quickly filter out the curves that exhibit a particular kind of anomaly? The Log Service platform provides a filtering operation designed for anomaly-detection results:
```sql
select ts_anomaly_filter(lineName, ts, ds, preds, probs, nWatch, anomalyType)
```
Here, anomalyType specifies which kind of anomaly to report, and nWatch specifies how many of the most recent points of each series to examine; in the example below, nWatch is 5 and anomalyType is 1.
A complete example (note that nWatch and anomalyType are passed as bigint):
```sql
* | select ts_anomaly_filter(lineName, ts, ds, preds, probs,
                             cast(5 as bigint), cast(1 as bigint))
    from (
      select concat(machine, '-', metricName) as lineName,
             res[1] as ts, res[2] as ds, res[3] as preds,
             res[4] as uppers, res[5] as lowers, res[6] as probs
      from (
        select machine, metricName,
               array_transpose(ts_predicate_aram(time, value, 5, 1, 1)) as res
        from (
          select timestamp - timestamp % 60 as time,
                 machine, metricName, avg(metricValue) as value
          from log
          group by time, machine, metricName
        )
        group by machine, metricName
      )
    )
```
The result we get back from this query is a value of Row type; its fields can be extracted as follows:
```sql
* | select res.name, res.ts, res.ds, res.preds, res.probs
    from (
      select ts_anomaly_filter(lineName, ts, ds, preds, probs,
                               cast(5 as bigint), cast(1 as bigint)) as res
      from (
        select concat(machine, '-', metricName) as lineName,
               res[1] as ts, res[2] as ds, res[3] as preds,
               res[4] as uppers, res[5] as lowers, res[6] as probs
        from (
          select machine, metricName,
                 array_transpose(ts_predicate_aram(time, value, 5, 1, 1)) as res
          from (
            select timestamp - timestamp % 60 as time,
                   machine, metricName, avg(metricValue) as value
            from log
            group by time, machine, metricName
          )
          group by machine, metricName
        )
      )
    )
```
经过上述操做,就能够实现对批量异常检测的结果进行过滤处理操做,帮助用户更好的批量设置告警。