Elasticsearch Study Notes

Date: 2019-12-01

Basic Concepts

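A **node** is a running instance of Elasticsearch.

A **cluster** is a set of nodes that share the same cluster.name. They work together, share data, and provide failover and scaling. A single node can also form a cluster on its own.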

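ES compared with a traditional relational database

Relational DB -> Databases -> Tables -> Rows      -> Columns
Elasticsearch -> Indices   -> Types  -> Documents -> Fields

An Elasticsearch cluster can contain multiple indices (databases), each index can contain multiple types (tables), each type contains multiple documents (rows), and each document contains multiple fields (columns). Note that an index created in 6.x can hold only a single type, and mapping types are removed altogether in later versions.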

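Document

As I understand it: in a relational database, one record is split across several tables when it is stored, and a query has to join the pieces back together. In ES a record does not need to be split. It is stored as a whole, with related data kept together in one document, which makes it easy to search, sort, and filter.

Objects and documents are usually equivalent to some degree and differ only in representation. A document specifically means the top-level structure, or **root object**, serialized as JSON, identified by a unique ID, and stored in Elasticsearch.

A document contains not only its data but also some metadata: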

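- _index: the index the document belongs to.
  1. The name must be all lowercase.
  2. It must not begin with an underscore.
  3. It must not contain commas.
- _type: each **type** has its own **mapping**, or schema definition, much like the columns of a table in a traditional database. Documents of all types are stored under the same index, but the type's mapping tells Elasticsearch how documents of that type should be indexed.
  1. The name may be uppercase or lowercase.
  2. It must not begin with an underscore and must not contain commas.
- _id: the id is just a string. Combined with _index and _type, it uniquely identifies a document in Elasticsearch. When creating a document you can supply your own _id or let Elasticsearch generate one for you.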
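
The three _index naming rules above can be checked before a request is sent. The following is a minimal sketch, not Elasticsearch's own validation: valid_index_name is a hypothetical helper name, and it covers only the three rules listed in these notes (Elasticsearch itself enforces further restrictions that are not modeled here).

```shell
# valid_index_name is a hypothetical helper, not part of Elasticsearch.
# It returns 0 if NAME passes the three _index rules from the notes, else 1.
valid_index_name() {
  name=$1
  case $name in
    "")              return 1 ;;  # an empty name is never usable
    _*)              return 1 ;;  # rule 2: must not begin with an underscore
    *,*)             return 1 ;;  # rule 3: must not contain commas
    *[[:upper:]]*)   return 1 ;;  # rule 1: must be all lowercase
  esac
  return 0
}

# Try a few sample names (the names themselves are made up for illustration).
for n in megacorp Megacorp _hidden "a,b"; do
  if valid_index_name "$n"; then
    echo "$n: ok"
  else
    echo "$n: rejected"
  fi
done
```

Only megacorp, the index name used throughout these notes, passes all three checks.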

The following example shows what a document looks like:

curl -H "content-Type:application/json" -XPOST 'http://localhost:9200/megacorp/employee/1' -d '
{
   "first_name" :  "es",
   "last_name" :   "Smith",
   "age" :         21,
   "about" :       "I love you",
   "interests":  [ "music"]
}
'
curl -H "content-Type:application/json" -XPOST 'http://localhost:9200/megacorp/employee/2' -d '
{
   "first_name" :  "ys",
   "last_name" :   "Smith",
   "age" :         28,
   "about" :       "I love you",
   "interests":  [ "game"]
}
'
curl -H "content-Type:application/json" -XPOST 'http://localhost:9200/megacorp/employee/3' -d '
{
   "first_name" :  "pinker",
   "last_name" :   "Smith",
   "age" :         18,
   "about" :       "I love yuxin",
   "interests":  [ "music" ,"game"]
}
'

The index definition returned by a request such as curl -X GET "localhost:9200/megacorp" shows the mapping that Elasticsearch generated automatically from these documents:

{
  "megacorp": {
    "aliases": {},
    "mappings": {
      "employee": {
        "properties": {
          "about": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "age": {"type": "long"},
          "first_name": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "interests": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "last_name": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}}
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1561710155126",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "IMRKdZ_xRd61FLtRXPVVCg",
        "version": {"created": "6080199"},
        "provided_name": "megacorp"
      }
    }
  }
}

Retrieving a document

curl -X GET "localhost:9200/megacorp/employee/3"

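Searching documents

Documents can be searched in the following ways:

- Simple search: a query string passed in the URL, comparable to a basic database query.
- Query DSL: Elasticsearch's JSON-based query language, which includes:
  1. match: works both for matching a specific value of a field and for full-text search. In full-text search every hit carries a _score, and results are sorted by this relevance score.
  2. Range filtering: uses the operators gt, gte, lt, and lte to select the documents you need. It is mainly applied to numeric fields.
  3. Phrase search: replace match in the full-text query with match_phrase. The text is then matched as a whole phrase rather than on individual words.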
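
As an illustration of item 2, the range filter body can be assembled by a small helper. This is a sketch under stated assumptions: range_query is a hypothetical function name, the value is assumed to be a plain number, and the field name is assumed to contain no characters that need JSON escaping.

```shell
# range_query is a hypothetical helper: print a bool/filter range query body.
# Usage: range_query FIELD OPERATOR NUMBER
range_query() {
  field=$1 op=$2 value=$3
  case $op in
    gt|gte|lt|lte) ;;  # the operators mentioned for range filtering
    *) echo "unsupported operator: $op" >&2; return 1 ;;
  esac
  printf '{"query":{"bool":{"filter":{"range":{"%s":{"%s":%s}}}}}}\n' \
    "$field" "$op" "$value"
}

# Print the body for "age less than 25".
range_query age lt 25

# To send it, assuming a node on localhost:9200 and the megacorp index:
# curl -H "Content-Type: application/json" -X GET \
#   "localhost:9200/megacorp/employee/_search" -d "$(range_query age lt 25)"
```

Rejecting unknown operators locally gives a clearer message than waiting for the server to refuse the request.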

Simple search

curl -X GET "localhost:9200/megacorp/employee/_search?q=first_name:pinker"
{"took":5,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.2876821,"hits":[{"_index":"megacorp","_type":"employee","_id":"3","_score":0.2876821,"_source":
{
   "first_name" :  "pinker",
   "last_name" :   "Smith",
   "age" :         18,
   "about" :       "I love yuxin",
   "interests":  [ "music" ,"game"]
}
}]}}

DSL

curl -H "Content-Type:application/json" -X GET "localhost:9200/megacorp/employee/_search" -d '{
"query":{
"match":{
"first_name":"pinker"
}
}
}
'

Filtered queries

The filtered query was deprecated and then removed in ES 5.0. The replacement is a bool query with must and filter clauses:

curl -H "Content-Type:application/json" -X GET "localhost:9200/megacorp/employee/_search" -d '
{
    "query" : {
        "bool" : {
            "filter" : {
                "range" : {
                    "age" : { "lt" : 25} 
                }
            },
            "must" : {
                "match" : {
                    "last_name" : "Smith"
                }
            }
        }
    }
}'

Full-text search

curl -H "Content-Type:application/json" -X GET "localhost:9200/megacorp/employee/_search" -d '{
"query":{
"match":{
"about":"yuxin"
}
}
}
'

Phrase search

Simply change the match query to a match_phrase query.
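
To make the swap concrete, here is a minimal sketch that builds both request bodies from the same inputs. text_query is a hypothetical helper name, and the field and text are assumed to contain no characters that need JSON escaping.

```shell
# text_query is a hypothetical helper: print a query body of the given kind.
# Usage: text_query KIND FIELD TEXT   (KIND is match or match_phrase)
text_query() {
  kind=$1 field=$2 text=$3
  printf '{"query":{"%s":{"%s":"%s"}}}\n' "$kind" "$field" "$text"
}

# Same field and text; only the query type differs.
text_query match about "love you"
text_query match_phrase about "love you"

# To send one of them, assuming a node on localhost:9200:
# curl -H "Content-Type: application/json" -X GET \
#   "localhost:9200/megacorp/employee/_search" \
#   -d "$(text_query match_phrase about "love you")"
```

With the three sample employees and the default analysis, the match query would also be expected to return the document whose about field is "I love yuxin", since it shares the word "love", whereas match_phrase requires the words to appear together and in order.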

Installing the analysis (word segmentation) plugin

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.8.0/elasticsearch-analysis-ik-6.8.0.zip

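Stopping ES

If Elasticsearch is running in the foreground, press Ctrl+C in its terminal.

The _shutdown API shown below exists only in versions before 2.0, where it was removed. On the 6.x releases used in these notes, stop a background node by sending SIGTERM to its process instead (kill followed by the process ID).

curl -XPOST 'http://localhost:9200/_shutdown'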

Query

curl -H "Content-Type: application/json" -XGET 'http://localhost:9200/_count?pretty' -d '
{
    "query": {
        "match_all": {}
    }
}
'

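Note: starting with 6.x, the Content-Type header shown above must be included. Without it, ES rejects the request with an error saying the content type is not supported. This is a stricter, security-motivated check in newer versions.

Verification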

curl -i -XGET 'localhost:9200/'

The -i option tells curl to include the HTTP response headers in its output.