ES 7.3 Study Notes

Date: 2019-11-24

Tags: es7.3, study notes

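1. Install ES and Kibana
See: https://www.cnblogs.com/kakatadage/p/9922359.html

2. Read the official documentation
See: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/index.html

3. Index operations

Note: in the examples below, test is an index name.

1. Create an index
(1) The simplest form, with every parameter left at its default

PUT /test

(2) The request body accepts three optional sections: aliases, mappings, and settings

- aliases: gives one or more indices an additional name
eg:
Add an alias to a single index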

POST /_aliases
{
    "actions": [
        {
            "add": { "index": "test1", "alias": "alias1" }
        }
    ]
}

or

PUT /test1/_alias/alias1

Add an alias to multiple indices

POST /_aliases
{
    "actions": [
        {
            "add": { "indices": ["test1", "test2"], "alias": "alias1" }
        }
    ]
}

Assign an alias when creating the index

PUT /test
{
    "aliases": {
        "test-1": {}
    }
}

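- mappings: the data structure of the documents in an index, similar to a Java class and its fields (each property is a field). Note: ES does not support renaming an existing field or changing its type.
eg:

PUT /test/_mapping
{
    "properties": {
        "email": {
            "type": "keyword"
        }
    }
}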

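Get the mapping

GET /test/_mapping

Get the mapping of a single field

GET /test/_mapping/field/fieldName

Delete a mapping field: not supported. A field cannot be removed from an existing mapping; the usual route is to create a new index with the desired mapping and reindex the data into it.

- settings: index-level configuration such as the number of shards and replicas. The create index request also accepts the following query parameters:
    -- include_type_name (whether the body includes a mapping type; default false. Note: mapping types are removed as of 7.x, for the reasons see: https://www.elastic.co/guide/en/elasticsearch/reference/7.3/removal-of-types.html#_custom_type_field)
    -- wait_for_active_shards (the number of shard copies that must be active before the operation proceeds, otherwise it waits until the timeout; default 1, meaning it waits only for the primary shard)
    -- timeout (operation timeout; default 30s)
    -- master_timeout (timeout for connecting to the master node; default 30s)
eg: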

In 7.x, number_of_shards and number_of_replicas both default to 1.

PUT /test
{
    "settings": {
        "number_of_shards": "1",
        "number_of_replicas": "1"
    },
    "mappings": {
        "properties": {
            "field1": { "type": "text" }
        }
    }
}

Note: because mapping types are removed as of 7.x, each index should hold only one kind of entity.

2. Create a mapping
(1) When creating the index (in 7.x, number_of_shards and number_of_replicas both default to 1)

PUT /test
{
    "settings": {
        "number_of_shards": "1",
        "number_of_replicas": "1"
    },
    "mappings": {
        "properties": {
            "field1": { "type": "text" }
        }
    }
}

(2) On an existing index

PUT /test/_mapping
{
    "properties": {
        "email": {
            "type": "keyword"
        }
    }
}

Note: ES does not support renaming an existing field or changing its type.

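3. Delete an index
(1) Delete a single index

DELETE /twitter

(2) Delete several or all indices. Note: this is very dangerous, so it can be disabled by setting action.destructive_requires_name to true in elasticsearch.yml; deleting an index then requires its explicit name.

DELETE /_all
or
DELETE /xx*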

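4. Get index details
(1) A single index

GET /test

(2) Several or all indices

GET /_all
or
GET /xx*

5. Check whether an index exists (returns 200 if it exists, otherwise 404)

HEAD /test

Note: test is an index name here, but the check also succeeds when no index is named test and an alias named test exists.

6. Close/open an index (a closed index is blocked for read and write operations)

POST /test/_close
POST /test/_open

Note: as with delete, _all or a wildcard pattern can close several indices at once, and that can likewise be disabled by setting action.destructive_requires_name to true. Closed indices consume a significant amount of disk space, so closing can be disabled altogether by setting cluster.indices.close.enable to false (default: true).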

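7. Shrink an index (copy an index into a new index with fewer primary shards. The target's shard count must be a factor of the source's: a source index with 8 shards can shrink to 4, 2, or 1.)
(1) Make the index read-only and relocate a copy of every shard to one node (prerequisite: a copy of every shard must reside on the same node). The first setting forces a copy of each shard onto the node named shrink_node_name; the second blocks writes.

PUT /my_source_index/_settings
{
    "settings": {
        "index.routing.allocation.require._name": "shrink_node_name",
        "index.blocks.write": true
    }
}

(2) Create the target index as a copy of the source. Setting the two values to null clears the allocation requirement and the write block that would otherwise be copied from the source.

POST /my_source_index/_shrink/my_target_index?copy_settings=true
{
    "settings": {
        "index.routing.allocation.require._name": null,
        "index.blocks.write": null
    }
}

(3) Alternatively, shrink with explicit target settings: replica count, shard count, and codec. This is used instead of the request in (2), because the target index must not already exist. best_compression only takes effect when new writes are made to the index, for example when force-merging a shard down to a single segment.

POST /my_source_index/_shrink/my_target_index?copy_settings=true
{
    "settings": {
        "index.number_of_replicas": 1,
        "index.number_of_shards": 1,
        "index.codec": "best_compression"
    },
    "aliases": {
        "my_search_indices": {}
    }
}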
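
The factor rule for shrinking can be checked before issuing the request. The following is a minimal sketch in plain Python; the function name valid_shrink_targets is hypothetical and the code only models the rule as described above, it does not talk to a cluster.

```python
def valid_shrink_targets(source_shards):
    """Return the primary shard counts a shrink target may use.

    Models the rule from the notes: the target has fewer primary shards
    than the source, and its count divides the source count evenly.
    """
    if source_shards < 1:
        raise ValueError("source_shards must be at least 1")
    return [n for n in range(source_shards - 1, 0, -1) if source_shards % n == 0]


print(valid_shrink_targets(8))  # [4, 2, 1]
```

A prime shard count such as 5 can therefore only shrink to a single shard.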

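8. Split an index (the reverse of shrink)
Split rule: an index can be split more than once, but the maximum number of shards it can reach is fixed by number_of_routing_shards when the index is created. The shard count after a split must be a factor of number_of_routing_shards, that is, number_of_routing_shards is a multiple of the new shard count.
For example, an index with 5 primary shards and number_of_routing_shards=30 can be split in the following ways:
5 → 10 → 30 (split by 2, then by 3)
5 → 15 → 30 (split by 3, then by 2)
5 → 30 (split by 6)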
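
The split paths listed above can be reproduced with a small calculation. This is a sketch in plain Python under the rule as stated in these notes (the new count is a multiple of the current count and a factor of number_of_routing_shards); valid_split_targets is a hypothetical helper, not part of any Elasticsearch client.

```python
def valid_split_targets(source_shards, routing_shards):
    """Return the shard counts an index could be split into in one step."""
    if source_shards < 1 or routing_shards % source_shards != 0:
        raise ValueError("routing_shards must be a positive multiple of source_shards")
    return [
        n
        for n in range(source_shards + 1, routing_shards + 1)
        # n must be a multiple of the current count and divide routing_shards
        if n % source_shards == 0 and routing_shards % n == 0
    ]


print(valid_split_targets(5, 30))   # [10, 15, 30]
print(valid_split_targets(10, 30))  # [30]
```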

(1) Make the source index read-only

PUT /my_source_index/_settings
{
    "settings": {
        "index.blocks.write": true 
    }
}

(2) Split

POST my_source_index/_split/my_target_index
{
    "settings": {
        "index.number_of_shards": 3
    }
}

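Note: a split requires that:
- the target index does not exist
- the source index has fewer primary shards than the target index
- the number of primary shards in the target index is a multiple of the number of primary shards in the source index
- the node handling the split has enough free disk space to hold a second copy of the existing index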

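9. Use an alias to roll over from an old index to a new one
(1) Create the index with an alias

PUT /logs-000001
{
    "aliases": {
        "logs_write": {}
    }
}

(2) Roll over when a condition is met. The conditions below are an age of 7 days, 1000 documents, and 5 GB of primary shard storage.

POST /logs_write/_rollover
{
    "conditions": {
        "max_age": "7d",
        "max_docs": 1000,
        "max_size": "5gb"
    }
}

If the index behind logs_write was created at least 7 days ago, or holds at least 1000 documents, or its primary shards total at least 5 GB, the request creates logs-000002 and points the alias at it. The conditions are evaluated only when the request is made, so it is normally issued periodically. Naming rule: if the existing index name ends with - and a number, such as logs-000001, the new index name follows the same pattern with the number incremented (logs-000002). If the name does not end with - and a number, a name for the new index must be supplied:

POST /logs_write/_rollover/my_new_index_name
{
    "conditions": {
        "max_age": "7d",
        "max_docs": 1000,
        "max_size": "5gb"
    }
}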
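
The naming rule can be illustrated with a short sketch in plain Python. next_rollover_name is a hypothetical helper; the zero-padding to six digits is an assumption taken from the Elasticsearch reference, which states that the number is padded to length 6 regardless of the old index name.

```python
import re


def next_rollover_name(index_name):
    """Return the name a rollover would generate, or None if one must be supplied."""
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        # The name does not end with "-<number>", so the request
        # has to carry an explicit target name.
        return None
    prefix, number = match.groups()
    return "{}{:06d}".format(prefix, int(number) + 1)


print(next_rollover_name("logs-000001"))  # logs-000002
print(next_rollover_name("my_index"))     # None
```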

10. Freeze/unfreeze an index (a frozen index cannot be written to)

POST /my_index/_freeze
POST /my_index/_unfreeze

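4. Document operations (basic operations by ID)
1. Insert a document
There are four request forms. The difference between _doc and _create: _doc overwrites the document if it already exists, while _create only inserts a document that does not exist yet. By default the index and its mapping are created automatically if they do not exist; automatic index creation can be turned off with the action.auto_create_index setting.

PUT /<index>/_doc/<_id>

POST /<index>/_doc/

PUT /<index>/_create/<_id>

POST /<index>/_create/<_id>

(1) Insert with an auto-generated ID (POST request)

POST my-index/_doc
{
    "age": 0
}

(2) Insert with your own ID (PUT request)

PUT my-index/_doc/2
{
    "age": 0
}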

(3) Upsert: if the document does not exist, insert the upsert body; otherwise run the script to update it

POST test/_update/2
{
    "script": {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params": {
            "count": 4
        }
    },
    "upsert": {
        "counter": 1
    }
}
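
The insert-or-script behaviour of this request can be modelled with a plain dict. This is a toy sketch in Python, not the Elasticsearch client; scripted_upsert and store are hypothetical names, and the count argument stands in for params.count.

```python
def scripted_upsert(store, doc_id, count, upsert_doc):
    """Mimic the scripted update with an upsert body against a plain dict."""
    if doc_id not in store:
        # Document missing: the upsert body is stored as-is and the
        # script is not run.
        store[doc_id] = dict(upsert_doc)
    else:
        # Document present: the equivalent of
        # ctx._source.counter += params.count
        store[doc_id]["counter"] += count
    return store[doc_id]


store = {}
scripted_upsert(store, "2", 4, {"counter": 1})  # first call inserts the upsert body
scripted_upsert(store, "2", 4, {"counter": 1})  # second call adds 4 to counter
print(store["2"])  # {'counter': 5}
```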

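(4) Insert the partial document if the record does not exist, otherwise update only the given fields (counter here). Inserting a missing document this way requires doc_as_upsert; without it, an update of a document that does not exist fails with a 404. Leave out fields that should not change: every field sent is overwritten, and a field sent as null is set to null.

POST test/_update/1
{
    "doc": {
        "counter": 7
    },
    "doc_as_upsert": true
}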

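(5) Update existing documents in bulk (here: set imageCount to 0 on every document that has no imageCount field)

POST bigdata-archive/_update_by_query
{
    "script": {
        "source": "ctx._source['imageCount']=0"
    },
    "query": {
        "bool": {
            "must_not": [
                {
                    "exists": {
                        "field": "imageCount"
                    }
                }
            ]
        }
    }
}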
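
What this request does to the data can be pictured with a toy model over a list of dicts. The sketch below is plain Python with hypothetical names (backfill_missing_field, docs); treating a null value as missing follows the exists query, while other edge cases such as empty arrays are ignored here.

```python
def backfill_missing_field(docs, field, default):
    """Set `field` to `default` on every document that lacks it; return the count."""
    updated = 0
    for doc in docs:
        # Rough stand-in for must_not + exists: absent or null counts as missing.
        if doc.get(field) is None:
            doc[field] = default
            updated += 1
    return updated


docs = [{"id": 1}, {"id": 2, "imageCount": 3}, {"id": 3, "imageCount": None}]
print(backfill_missing_field(docs, "imageCount", 0))  # 2
```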

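2. Read a document
There are four request forms: GET fetches the data, HEAD checks whether it exists. By default GET is realtime: if the document has been updated but not yet refreshed, the get API refreshes in place so that the latest version is returned; pass realtime=false to turn this off. The _source endpoint returns only the document's fields (which fields to return can be specified), while _doc returns the fields together with the document's metadata.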

GET <index>/_doc/<_id>

HEAD <index>/_doc/<_id>

GET <index>/_source/<_id>

HEAD <index>/_source/<_id>

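3. Delete a document

DELETE /<index>/_doc/<_id>

4. Update a document

POST /<index>/_update/<_id>

(1) Regular partial update. The insert requests above can also be used to overwrite existing fields, but they replace the whole document.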

POST test/_update/1
{
    "doc" : {
        "name" : "new_name"
    }
}

(2) Update with a script (the default script language is Painless)

POST test/_update/1
{
    "script": {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params": {
            "count": 4
        }
    }
}

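5. Search
1. Paginated search with conditions (for the query DSL see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/query-dsl.html). To filter, replace match_all with a query such as "match": { "fieldName": "xxx" }. from is a zero-based offset, so from 1 with size 1 returns the second hit only.

POST test/_search
{
    "query": {
        "match_all": {}
    },
    "sort": [
        {
            "counter": {
                "order": "asc"
            }
        }
    ],
    "from": 1,
    "size": 1
}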
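
Because from is an offset rather than a page number, a small conversion helps when the caller thinks in pages. A minimal sketch in plain Python; page_to_from_size is a hypothetical helper and pages are assumed to be numbered from 1.

```python
def page_to_from_size(page, page_size):
    """Translate a 1-based page number into the from/size pair of a search body."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {"from": (page - 1) * page_size, "size": page_size}


print(page_to_from_size(2, 1))  # {'from': 1, 'size': 1}
```

The printed pair matches the request above: page 2 with one hit per page.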

2. Aggregations (aggs is the keyword for an aggregation request; see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/search-aggregations.html)

POST test/_search
{
    "aggs": {
        "avg_grade": {
            "avg": {
                "field": "counter"
            }
        }
    }
}
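
As a rough picture of what the avg aggregation computes, the sketch below averages a field over a list of dicts in plain Python. avg_of_field is a hypothetical name; skipping documents without the field reflects the aggregation's default behaviour, and the sample documents are made up for illustration.

```python
def avg_of_field(docs, field):
    """Average `field` over the documents that have a value for it."""
    values = [doc[field] for doc in docs if doc.get(field) is not None]
    if not values:
        # No document carries the field, so there is nothing to average.
        return None
    return sum(values) / len(values)


print(avg_of_field([{"counter": 1}, {"counter": 5}, {"name": "x"}], "counter"))  # 3.0
```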

3. Fuzzy search with query_string (see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/query-dsl-query-string-query.html)

POST test/_search
{
    "query": {
        "query_string": {
            "query": "1",
            "fields": ["counter"]
        }
    }
}