ES14-Metrics Aggregations

Date: 2019-12-19

1.max

Find the maximum age.

GET my_person/_search
{
  "size": 0, 
  "aggs": {
    "age_max": {
      "max": {
        "field": "age"
      }
    }
  }
}

Result

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "age_max": {
      "value": 28
    }
  }
}

2.min

Find the minimum age.

GET my_person/_search
{
  "size": 0, 
  "aggs": {
    "age_min": {
      "min": {
        "field": "age"
      }
    }
  }
}

3.avg

Compute the average age.

GET my_person/_search
{
  "size": 0, 
  "aggs": {
    "age_avg": {
      "avg": {
        "field": "age"
      }
    }
  }
}

4.sum

Compute the sum of a field, here the total salary.

GET my_person/_search
{
  "size": 0, 
  "aggs": {
    "salary_sum": {
      "sum": {
        "field": "salary"
      }
    }
  }
}

5.stats

The stats that are returned consist of: min, max, sum, count and avg.

GET my_person/_search
{
  "size": 0, 
  "aggs": {
    "salary_stats": {
      "stats": {
        "field": "salary"
      }
    }
  }
}

Result

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "salary_stats": {
      "count": 4,
      "min": 4000,
      "max": 8000,
      "avg": 5750,
      "sum": 23000
    }
  }
}

6.extended_stats

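Compared with stats, extended_stats adds several statistics: sum_of_squares, variance, std_deviation and std_deviation_bounds.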

GET my_person/_search
{
  "size": 0, 
  "aggs": {
    "salary_stats": {
      "extended_stats": {
        "field": "salary"
      }
    }
  }
}

Query result

{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "salary_stats": {
      "count": 4,
      "min": 4000,
      "max": 8000,
      "avg": 5750,
      "sum": 23000,
      "sum_of_squares": 141000000,
      "variance": 2187500,
      "std_deviation": 1479.019945774904,
      "std_deviation_bounds": {
        "upper": 8708.039891549808,
        "lower": 2791.960108450192
      }
    }
  }
}
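
The extra fields can be reproduced by hand, which makes their definitions easier to see. The following is a minimal Python sketch, not Elasticsearch's implementation. The post does not list the individual salaries, so the four values below are an assumption, chosen only because they are consistent with the count, min, max, sum and sum_of_squares shown above.

```python
import math

# Assumed salaries: not listed in the post, chosen to match the result above.
salaries = [4000, 5000, 6000, 8000]

count = len(salaries)
total = sum(salaries)
avg = total / count
sum_of_squares = sum(s * s for s in salaries)

# Population variance: mean of the squares minus the square of the mean.
variance = sum_of_squares / count - avg ** 2
std_deviation = math.sqrt(variance)

# The bounds are avg +/- sigma * std_deviation; sigma defaults to 2.
sigma = 2
upper = avg + sigma * std_deviation
lower = avg - sigma * std_deviation

print(count, total, avg, sum_of_squares, variance)
print(std_deviation, upper, lower)
```

The variance here is the population variance (divided by count, not count - 1), which is what matches the 2187500 in the result.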

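7.cardinality

Count how many distinct salary levels there are. The cardinality aggregation returns an approximate count of distinct values.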

GET my_person/_search
{
  "size": 0, 
  "aggs": {
    "salary_class": {
      "cardinality": {
        "field": "salary"
      }
    }
  }
}

Query result

{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "salary_class": {
      "value": 4
    }
  }
}

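8.value_count (document count)

Count the documents that contain a given field. value_count counts the values extracted from the field, so for a single-valued field it equals the number of documents that contain it.

Add two documents that have no salary field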

PUT my_person/my_index/5
{
  "name":"lucy",
  "age":23
}

PUT my_person/my_index/6
{
  "name":"blue",
  "age":20
}

Count the documents that contain the "salary" field

GET my_person/_search
{
  "size": 0, 
  "aggs": {
    "doc_count": {
      "value_count": {
        "field": "salary"
      }
    }
  }
}

Query result

{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "doc_count": {
      "value": 4
    }
  }
}
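
The result shows three different counts: hits.total is 6, value_count is 4, and the earlier cardinality was also 4. A small Python simulation may make the difference clearer. It is only a sketch over an in-memory list. Documents 5 and 6 come from the requests above, while the salaries of the first four documents are assumed values (the post does not list them) and are labeled as such in the code.

```python
# Documents 1-4: salaries are assumed, the post only shows their aggregates.
# Documents 5-6: taken from the PUT requests above, they have no salary.
docs = [
    {"salary": 4000},
    {"salary": 5000},
    {"salary": 6000},
    {"salary": 8000},
    {"name": "lucy", "age": 23},
    {"name": "blue", "age": 20},
]

# hits.total: every document matched by the (match-all) search.
hits_total = len(docs)

# value_count: number of salary values, i.e. documents that have the field.
value_count = sum(1 for d in docs if "salary" in d)

# cardinality: number of distinct salary values (exact here, approximate in ES).
cardinality = len({d["salary"] for d in docs if "salary" in d})

print(hits_total, value_count, cardinality)
```

If two people shared the same salary, value_count would stay the same while cardinality would drop by one.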

9.percentiles

https://www.elastic.co/guide/cn/elasticsearch/guide/current/percentiles.html

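Another approximate metric that Elasticsearch provides is percentiles. A percentile shows the value observed at a given percentage of the data. For example, the 95th percentile is the value that is greater than 95% of the data.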
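
This section does not include a request for the percentiles aggregation, so here is a hedged sketch of what one could look like for the my_person index used earlier. It builds the request body as a Python dict and prints it as JSON. The aggregation name salary_percentiles and the chosen percents are illustrative assumptions, not taken from the post.

```python
import json

# Request body for a percentiles aggregation on the salary field.
# "salary_percentiles" is a hypothetical name chosen for this example.
body = {
    "size": 0,
    "aggs": {
        "salary_percentiles": {
            "percentiles": {
                "field": "salary",
                "percents": [50, 95, 99],  # which percentiles to return
            }
        }
    },
}

payload = json.dumps(body, indent=2)
print(payload)
```

The printed JSON can be pasted after a `GET my_person/_search` line, in the same way as the other requests in this post.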

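Percentiles are often used to find outliers. In a normal distribution, the 0.13th and 99.87th percentiles represent values three standard deviations from the mean. Any data that falls outside three standard deviations is usually considered unusual, because it differs so much from the average.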
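
The 0.13 and 99.87 figures can be checked against the standard normal distribution. This short Python sketch uses only the standard library and is independent of Elasticsearch.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Share of the distribution below -3 and below +3 standard deviations,
# expressed as percentages.
low_pct = 100 * std_normal.cdf(-3)
high_pct = 100 * std_normal.cdf(3)

print(f"{low_pct:.2f} {high_pct:.2f}")  # prints "0.13 99.87"
```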

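Suppose we run a large website. An important task is making sure that user requests get a fast response, so we need to monitor the site's latency to judge whether responses are fast enough for a good user experience.

In this scenario, a commonly used metric is the average response latency. But this is not a good choice (despite being common), because the average usually hides outliers, and the median has the same problem. We could try the maximum, but that metric is easily distorted by a single outlier.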

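(Figure omitted: latency over time when relying on simple metrics such as the mean or median.)

(Figure omitted: the same data with the 99th percentile loaded.)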

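Surprising! At 9:30 a.m. the mean is only 75ms. As a system administrator, we would not give it a second look. Everything appears normal! But the 99th percentile tells us that 1% of users experience latency above 850ms, which is a very different picture. There is also a small spike at 4:48 a.m. that cannot be seen at all in the mean or median curves.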
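
The effect described above can be reproduced on a small synthetic sample. The sketch below uses made-up latencies (they are not the data behind the figures) and a simple nearest-rank percentile. Elasticsearch computes percentiles with an approximation algorithm, so values on real data may differ slightly.

```python
from statistics import mean, median


def percentile_nearest_rank(values, p):
    """Return the p-th percentile (integer p in 1..100) by nearest rank."""
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # ceil(p * n / 100) in integer math
    return ordered[rank - 1]


# Hypothetical sample: 98 fast requests, one slow one, one extreme outlier.
latencies_ms = [60] * 98 + [900, 5000]

avg_ms = mean(latencies_ms)
median_ms = median(latencies_ms)
p99_ms = percentile_nearest_rank(latencies_ms, 99)
max_ms = max(latencies_ms)

print(avg_ms, median_ms, p99_ms, max_ms)
```

In this sample the mean and median stay close to the typical request, the maximum is set entirely by the single 5000ms outlier, and the 99th percentile surfaces the slow tail without being dominated by that one request.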