Elasticsearch: hot/cold data separation and read/write separation

Date: 2020-04-09

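Steps

1. Configure the hot/cold cluster

Take, for example, an ES cluster of three machines running six nodes in total.

Each machine has one SSD and one SATA disk mounted. Each machine runs two ES processes, and each process uses a different type of disk.

Key configuration: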

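node.max_local_storage_nodes: 2 # allow two ES processes to run on each machine

path.data: /home/centos/es/elasticsearch-2.1.0/data/ssd # the data directory for this ES process must be set explicitly

The node tag must be specified in the startup command: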

./elasticsearch -d -Des.path.conf=/home/centos/es/elasticsearch-2.1.0/config/ssd --node.tag=ssd
./elasticsearch -d -Des.path.conf=/home/centos/es/elasticsearch-2.1.0/config/sata --node.tag=sata

After startup, the cluster has six nodes: on each machine, one tagged ssd and one tagged sata.
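To confirm the tags without a screenshot, the node info API can be queried and the nodes grouped by their tag attribute. The following is a minimal sketch, assuming the response of GET /_nodes carries each node's custom attributes under "attributes" (as in ES 2.x); the helper names and the sample node names are hypothetical.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def group_nodes_by_tag(nodes_info):
    """Map each tag value to the sorted names of the nodes carrying it."""
    groups = defaultdict(list)
    for node in nodes_info.get("nodes", {}).values():
        # Nodes started without --node.tag have no "tag" attribute.
        tag = node.get("attributes", {}).get("tag", "(untagged)")
        groups[tag].append(node.get("name", "?"))
    return {tag: sorted(names) for tag, names in groups.items()}


def fetch_nodes_info(base_url):
    """Fetch node info from a running cluster (not called in this sketch)."""
    with urlopen(base_url + "/_nodes") as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical sample shaped like a trimmed-down /_nodes response.
sample = {
    "nodes": {
        "id1": {"name": "es1-ssd", "attributes": {"tag": "ssd"}},
        "id2": {"name": "es1-sata", "attributes": {"tag": "sata"}},
    }
}
print(group_nodes_by_tag(sample))
```

Against a live cluster, group_nodes_by_tag(fetch_nodes_info("http://192.168.126.132:9200")) would be the equivalent call.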

2. Create the index template

http://192.168.126.132:9200/_template/hottest/ PUT

{
    "order": 1,
    "template": "hottest*",
    "settings": {
        "index": {
            "number_of_shards": "3",
            "number_of_replicas": "1",
            "refresh_interval": "1s",
            "routing.allocation.require.tag": "ssd"
        }
    },
    "mappings": {
        "_default_": {
            "properties": {
                "userid": {
                    "index": "not_analyzed",
                    "type": "string"
                },
                "username": {
                    "index": "not_analyzed",
                    "type": "string"
                },
                "sex": {
                    "index": "not_analyzed",
                    "type": "string"
                },
                "address": {
                    "index": "no",
                    "type": "string"
                }
            },
            "_all": {
                "enabled": false
            }
        }
    },
    "aliases": {
        "hottest": {}
    }
}

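"routing.allocation.require.tag": "ssd" makes every index created from this template allocate its shards to the ssd nodes, so new data is written there by default.

3. Insert data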

http://192.168.126.132:9200/hottest_20170805/def/100001/ PUT

{
    "userid": "100001",
    "username": "zhangsan",
    "sex": "1",
    "address": "beijing"
}

In the head plugin, all of the data is stored on the ssd nodes.
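As an alternative to checking in head, shard placement can be read from the cat shards API. This is a rough sketch, assuming the text was fetched from /_cat/shards/hottest_20170805?h=shard,prirep,state,node so that each line has exactly those columns; the function names and the node names in the sample are hypothetical.

```python
def parse_cat_shards(text):
    """Parse lines of 'shard prirep state node' into tuples.

    The node column is last because node names may contain spaces;
    unassigned shards have no node and yield None.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        shard, prirep, state = parts[:3]
        node = parts[3].strip() if len(parts) == 4 else None
        rows.append((int(shard), prirep, state, node))
    return rows


def misplaced_shards(rows, allowed_nodes):
    """Return started shards that sit on a node outside allowed_nodes."""
    return [r for r in rows if r[2] == "STARTED" and r[3] not in allowed_nodes]


# Hypothetical sample output; real node names will differ.
sample = "0 p STARTED es1-ssd\n0 r STARTED es2-ssd\n1 r UNASSIGNED\n"
print(misplaced_shards(parse_cat_shards(sample), {"es1-ssd", "es2-ssd"}))
```

An empty result means every started shard of the index is on one of the expected nodes.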

4. Periodically migrate old data to sata

http://192.168.126.132:9200/hottest_20170805/_settings/ PUT

{
    "index.routing.allocation.require.tag": "sata"
}

In the head plugin, the data has moved to the sata nodes.
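The periodic migration could be driven by a small script run from cron. The sketch below is one possible approach, not the author's tooling: it assumes indices are named hottest_YYYYMMDD (inferred from the hottest_20170805 example) and that anything older than a hypothetical keep_days window should move to sata. The list of index names is supplied by the caller.

```python
import json
import re
from datetime import date, datetime, timedelta
from urllib.request import Request, urlopen

# Assumed naming scheme: hottest_YYYYMMDD
INDEX_PATTERN = re.compile(r"^hottest_(\d{8})$")


def indices_to_migrate(index_names, today, keep_days):
    """Return the daily indices dated strictly before today - keep_days."""
    cutoff = today - timedelta(days=keep_days)
    old = []
    for name in index_names:
        match = INDEX_PATTERN.match(name)
        if not match:
            continue  # not a daily hottest index
        try:
            day = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a valid date
        if day < cutoff:
            old.append(name)
    return sorted(old)


def build_move_request(base_url, index_name, tag="sata"):
    """Build the PUT /<index>/_settings request shown in step 4."""
    body = json.dumps({"index.routing.allocation.require.tag": tag})
    return Request(
        base_url + "/" + index_name + "/_settings",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def migrate(base_url, index_names, today, keep_days):
    """Send the settings update for every index past the window."""
    for name in indices_to_migrate(index_names, today, keep_days):
        with urlopen(build_move_request(base_url, name)) as resp:
            resp.read()


# Dry run: only decide which indices would move; no request is sent.
print(indices_to_migrate(["hottest_20170805", "hottest_20170810"],
                         date(2017, 8, 12), keep_days=3))
```

Keeping the selection logic separate from the HTTP call makes it possible to dry-run the schedule before letting it change allocation settings.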

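This solves two problems:

1. A limited pool of SSD nodes can support both high-concurrency reads and writes and the storage of large data volumes.

Through configuration, the newest data is kept on SSD-backed nodes, and older data is migrated to inexpensive SATA nodes by the scheduled settings update in step 4.

2. When a user runs a large query, the heavy read I/O and aggregation work raise the cluster load and block the writing of new data. Because old data sits on separate nodes, this setup achieves a degree of read/write separation.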