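An application of causation theory: an algorithm for verifying that a score did not arise by chance.
RCA tools can usually query and classify anomalies and run correlation analysis (causal probabilistic graphical models).
Causal Bayesian network: complex relationships can be built up from conditional relationships between pairs of variables.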
- ExplainIt! – A Declarative Root-cause Analysis Engine for Time Series Data
- Why? The above approach offers three main benefits.
  - First, the formalism is a non-parametric and declarative way of expressing dependencies between variables and defers any specific approach to the runtime system.
  - Second, the unified approach naturally lends itself to multivariate dependencies of more complex relationships beyond simple correlations between pairwise univariate metrics.
  - Third, the approach also gives us a way to reason about dependencies that might be easier to detect only when holding some variables constant.
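1. Feature family: metrics can be aggregated by host, similar to GROUP BY. For example, one feature family might be the 75th-percentile latency, another the current number of cluster jobs.
2. Ranking: given a hypothesis (X, Y, Z), produce a ranking of the $X_i$.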
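Univariate score, Z empty: for every $X_i$ in X and every $Y_j$ in Y, compute the Pearson product-moment correlation $\rho_{i,j}$, then take the mean and the maximum of the absolute values: $\mathrm{CorrMean} = \mathrm{mean}_{i,j}\,|\rho_{i,j}|$ and $\mathrm{CorrMax} = \max_{i,j}\,|\rho_{i,j}|$.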
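The univariate score can be sketched in a few lines of plain Python. This is only an illustration of the two formulas, not the paper's implementation: the function names `pearson` and `corr_scores` and the toy series are made up here, and treating a constant series as uncorrelated is an assumption of this sketch.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption of this sketch: a constant series counts as uncorrelated
    return cov / sqrt(vx * vy)

def corr_scores(X, Y):
    """X and Y are lists of series (one list per metric); returns (CorrMean, CorrMax)."""
    rhos = [abs(pearson(xi, yj)) for xi in X for yj in Y]
    return sum(rhos) / len(rhos), max(rhos)

# Toy data: the first X series tracks Y exactly, the second is unrelated to it.
X = [[1, 2, 3, 4, 5], [2, 1, 2, 1, 2]]
Y = [[2, 4, 6, 8, 10]]
corr_mean, corr_max = corr_scores(X, Y)
```

CorrMax rewards a family in which a single metric matches strongly, while CorrMean dilutes that match across every pair in the family.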
Multivariate score, Z empty: linear regression (with random projection for dimensionality reduction) plus a loss function, scored by $R^2$.
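A minimal sketch of the multivariate case, assuming NumPy, a Gaussian random projection, and squared-error loss. The helper names `r_squared` and `projected_r_squared`, the projection dimension `d`, and the synthetic data are all choices made for this illustration; the notes do not say which projection or loss the engine actually uses.

```python
import numpy as np

def r_squared(X, Y):
    """R^2 of the least-squares fit Y ~ X (with intercept), pooled over Y's columns."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    resid = Y - A @ coef
    total = Y - Y.mean(axis=0)
    return 1.0 - (resid ** 2).sum() / (total ** 2).sum()

def projected_r_squared(X, Y, d, seed=0):
    """Project X onto d random Gaussian directions, then score with R^2."""
    rng = np.random.default_rng(seed)
    P = rng.normal(size=(X.shape[1], d)) / np.sqrt(d)
    return r_squared(X @ P, Y)

# Synthetic example: Y depends on the first 3 of 50 predictors plus small noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
Y = X[:, :3] @ np.array([[1.0], [-2.0], [0.5]]) + 0.1 * rng.normal(size=(200, 1))
full = r_squared(X, Y)                    # regression on all 50 predictors
proj = projected_r_squared(X, Y, d=10)    # regression on 10 projected features
```

Because the projected features are linear combinations of the original columns, the in-sample $R^2$ after projection can never exceed the full-regression value; the projection trades some fit for a much smaller regression problem.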
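Z non-empty: regress Y ~ Z and X ~ Z to obtain the residuals $R_{Y;Z}$ and $R_{X;Z}$, then regress one residual on the other to compute $R^2(Y; X \mid Z)$.
When X has many predictors but few observations, a Ridge penalty achieves the same effect as adjusted $R^2$. See below.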
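The conditional score can be sketched as a residual-on-residual regression. The code below assumes NumPy; `residuals`, `ridge_r_squared`, `conditional_r_squared`, and the penalty parameter `lam` are names invented for this sketch, and the in-sample $R^2$ it reports is a simplification (no cross-validation).

```python
import numpy as np

def residuals(T, Z):
    """Residuals of the least-squares fit T ~ Z (with intercept)."""
    A = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(A, T, rcond=None)
    return T - A @ coef

def ridge_r_squared(X, Y, lam=0.0):
    """In-sample R^2 of a ridge fit Y ~ X on centered data; lam=0 is plain least squares."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ Yc)
    resid = Yc - Xc @ coef
    return 1.0 - (resid ** 2).sum() / (Yc ** 2).sum()

def conditional_r_squared(X, Y, Z, lam=0.0):
    """R^2(Y; X | Z): regress the Z-residual of Y on the Z-residual of X."""
    return ridge_r_squared(residuals(X, Z), residuals(Y, Z), lam)

# Synthetic example: X and Y are both driven by Z and have no direct link.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 1))
X = Z + 0.5 * rng.normal(size=(500, 1))
Y = 2 * Z + 0.5 * rng.normal(size=(500, 1))
marginal = ridge_r_squared(X, Y)               # high: X looks predictive of Y
conditional = conditional_r_squared(X, Y, Z)   # near zero once Z is held fixed
```

With `lam=0` the solve step fails when there are more predictors than observations, which is the regime where a positive `lam` is needed.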
Open question: can experiments be used to fill in the graph?
Evaluating the scoring methods:
ranking accuracy: if the cause is ranked r-th, the score is 1/r
success rate: 1 if the cause is in the top k, otherwise 0
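Both evaluation metrics are short enough to write out directly. In this sketch the function names and the metric names in `ranking` are hypothetical, and returning 0 when the cause is missing from the ranking is an assumption the notes do not state.

```python
def reciprocal_rank(ranking, cause):
    """Ranking accuracy: 1/r if the cause sits at 1-based position r."""
    for r, name in enumerate(ranking, start=1):
        if name == cause:
            return 1.0 / r
    return 0.0  # assumption of this sketch: an absent cause scores 0

def success_at_k(ranking, cause, k):
    """Success rate: 1 if the cause appears among the first k entries, else 0."""
    return 1 if cause in ranking[:k] else 0

# Hypothetical ranking produced by a scoring method; the true cause is third.
ranking = ["disk_io", "gc_pause", "network_retx", "cpu_steal"]
rr = reciprocal_rank(ranking, "network_retx")
hit_top2 = success_at_k(ranking, "network_retx", 2)
hit_top3 = success_at_k(ranking, "network_retx", 3)
```

Averaging these values over many incidents gives one number per scoring method, which makes the methods directly comparable.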
The PC/SGS algorithms use pairwise conditional independence tests to recover the full causal structure, also considering a joint set of variables.
Root-cause analysis rarely requires the full causal structure.
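Overfitting is handled with the adjusted score r_adj. The probability that a score of at least s arises purely by chance depends on n and p (observations and predictors); a score s below the resulting threshold should not be trusted.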
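One way to see the dependence on n and p is to simulate it. The sketch below assumes NumPy and uses the standard adjusted $R^2$ formula $1 - (1 - R^2)\frac{n-1}{n-p-1}$; the Monte Carlo null with independent Gaussian predictors and target, and the 95% cut-off, are choices made for this illustration rather than the procedure from the paper.

```python
import numpy as np

def adjusted_r2(r2, n, p):
    """Standard adjusted R^2 for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def null_r2_samples(n, p, trials=2000, seed=0):
    """R^2 values from regressing pure noise on p pure-noise predictors."""
    rng = np.random.default_rng(seed)
    out = np.empty(trials)
    for t in range(trials):
        X = rng.normal(size=(n, p))
        y = rng.normal(size=n)
        A = np.column_stack([np.ones(n), X])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        centered = y - y.mean()
        out[t] = 1.0 - (resid @ resid) / (centered @ centered)
    return out

n, p = 50, 20
samples = null_r2_samples(n, p)
mean_null = samples.mean()               # close to p / (n - 1) for Gaussian noise
threshold = np.quantile(samples, 0.95)   # chance alone exceeds this only 5% of the time
adj_at_mean = adjusted_r2(mean_null, n, p)
```

With 20 predictors and 50 observations, unrelated data already produces an $R^2$ of roughly 0.4 on average, while the adjusted value at that point is close to zero; a raw score below `threshold` is therefore not distinguishable from chance at this n and p.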