Elasticsearch operations handbook

Date: 2019-11-12

0 _search can query several indices and types in one request

GET /index1,index2/type1,type2/_search

GET /_all/type1/_search  // equivalent to searching the type1 documents of every index

GET /_all/type1/_search?from=0&size=5  // from and size are the pagination parameters

1 Add a document and set the document ID manually

PUT /index1/type1/1
{
"content1":"abcnt地方士大夫",
"age":"abc你的"
}

2 Add a document and let Elasticsearch generate the document ID

POST /index1/type1
{
"content1":"abcnt地方士大夫",
"age":"abc你的"
}

3 Get a single document and choose which fields are returned

GET /index1/type1/1?_source=age,content1

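4 Update with a custom (external) version number; the write is allowed only if the supplied version is greater than the current one

PUT /index1/type1/1?version=5&version_type=external  (the existing _version must be less than 5)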
{
"description":"程序员一枚~~~"
}
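
The version check in item 4 can be pictured with a small in-memory model. This is only a sketch of the rule stated above (accept the write when the supplied version is greater than the stored one), not Elasticsearch code; `put_with_external_version`, `VersionConflict` and the `store` dict are hypothetical names made up for the illustration.

```python
class VersionConflict(Exception):
    """Raised when the supplied external version is not newer than the stored one."""


def put_with_external_version(store, doc_id, body, version):
    """Toy model: accept the write only if `version` exceeds the stored version."""
    current = store.get(doc_id)
    if current is not None and version <= current["_version"]:
        raise VersionConflict(
            f"stored version {current['_version']} is not lower than {version}"
        )
    store[doc_id] = {"_version": version, "_source": body}
    return store[doc_id]


# The stored document is at version 4, so a write with version=5 is accepted.
store = {"1": {"_version": 4, "_source": {"description": "old"}}}
put_with_external_version(store, "1", {"description": "new"}, version=5)

# Repeating the write with version=5 is rejected: 5 is not greater than 5.
try:
    put_with_external_version(store, "1", {"description": "again"}, version=5)
    rejected = False
except VersionConflict:
    rejected = True
```

The second call shows why replaying the same external version is safe: it fails instead of overwriting the document.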

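5 Partial update: change only some fields of a document

POST /index1/type1/1/_update?version=13 (the update is applied only if version equals the current version; if the new content is identical to the existing content the update is a no-op and the version stays unchanged; the read, merge and reindex steps all run inside the shard)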
{
"doc":{
"description":"程序员一枚~~~7778899056"
}
}

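6 Fetch several documents in one request with GET /_mget; each entry needs index, type and id (index and type can also be given in the URL, so the request body depends on how far the URL already narrows the scope)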

GET /_mget
{
"docs":[
{
"_index":"index1",
"_type":"type1",
"_id":"1",
"_version":16
},
{
"_index":"index1",
"_type":"type1",
"_id":"2"
}
]
}

GET /index1/_mget
{
"docs":[
{
"_type":"type1",
"_id":"1",
"_version":16
},
{
"_type":"type1",
"_id":"2"
}
]
}

GET /index1/type1/_mget
{
"ids":[1,2]
}

7 _search returns the first 10 hits by default (timeout=1ms sets a search timeout)

GET /index1/type1/_search?timeout=1ms

8 _search?q=xxx searches across all fields; _search?q=field:xxx searches one specific field

PUT /index3/type3/1
{
  "date":"2019-01-02",
  "name":"the little",
  "content":"Half the ideas in his talk were plagiarized from an article I wrote last month."
}

PUT /index3/type3/2
{
  "date":"2019-01-01",
  "name":"a dog",
  "content":"is the girl, women's attention and love day. July 7th Qiqiao customs, originated in the Han "
}

PUT /index3/type3/3
{
  "date":"2019-07-01",
  "name":"very tag",
  "content":"Some of our comrades love to write long articles with no substance, very much like the foot bindings of a slattern, long as well as smelly"
}

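// However, when a specific field is queried and its type is date, long or another date/numeric type, matching is by exact value
GET /index3/type3/_search?q=date:2019-01 // returns only 1 document; matched by exact value
GET /index3/type3/_search?q=2019-01 // returns 3 documents: the query runs as full text, the text is split into tokens, and any document containing one of them matches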

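9 Use a mapping to set each field's type and whether it is analyzed. A mapping can only be set manually for new fields or for an index that has not been created yet; the mapping of an existing field cannot be changed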

DELETE /index3

// Create the index and specify the field properties
PUT /index3
{ 
  "mappings": {
    "type3": {
      "properties": {
        "date":{
          "type": "date" // exact-value matching on dates is looser: ES matches on the partial date given, e.g. GET /index3/type3/_search?q=date:2019
        },
        "name":{
          "type": "keyword"
        },
        "no":{
           "type": "long"
        },
        "content":{
           "analyzer": "standard",
           "type": "text"
        }
      }
    }
  }
}

// Add data
PUT /index3/type3/1
{
  "date":"2019-01-02",
  "name":"the little",
  "content":"Half the ideas in his talk were plagiarized from an article I wrote last month.",
  "no":"123"
}

PUT /index3/type3/2
{
  "date":"2019-01-01",
  "name":"a dog",
  "content":"is the girl, women's attention and love day. July 7th Qiqiao customs, originated in the Han ",
  "no":"6867858"
}

PUT /index3/type3/3
{
  "date":"2019-07-01",
  "name":"very tag",
  "content":"Some of our comrades love to write long articles with no substance, very much like the foot bindings of a slattern, long as well as smelly",
  "no":"123"
}

GET /index3/type3/_search?q=name:"very tag"  // the field is only matched by its exact value, because its type is keyword; the quotes keep both words in the name clause

10 _search: exact matching versus analyzed full-text search

GET /index2/type2/_search
{
  "query": {
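    // inside bool, must clauses are required; the should clauses are optional here and only raise the score
    "bool": {
      // must: the match on the name field has to succeed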
      "must": [
        {
          "match": {
            "name": "ui the mark"
          }
        }
      ]
      , 
      // should: the match on the content field is an analyzed (tokenized) full-text query
      "should": [
        {
          "match": {
            "content": "bought"
          }
        }
      ]
    }
  }
}

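11 Use scroll queries instead of from/size pagination for deep paging: with deep pagination a large number of hits is sent to the coordinating node, which sorts them and only then picks out the requested page, so performance is very poor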

// scroll=100ms: scroll query whose search context is kept alive for 100ms
GET /index3/type3/_search?scroll=100ms
{
  "query": {
    // match all documents
    "match_all": {}
  },
  "sort": [
    {
      // sort by date, ascending
      "date": {
        "order": "asc"
      }
    }
  ],
  // fetch 3 documents per batch
  "size": 3
}
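
The cost argument in item 11 can be made concrete with a little arithmetic. The sketch below assumes the usual from/size behaviour, where every shard returns its own top (from + size) hits to the coordinating node; the function name and the shard count of 5 are assumptions chosen for the example.

```python
def hits_gathered_by_coordinator(num_shards, from_, size):
    """Each shard sends its own top (from_ + size) hits to the coordinating node."""
    return num_shards * (from_ + size)


# First page: 5 shards each send 10 hits.
shallow = hits_gathered_by_coordinator(num_shards=5, from_=0, size=10)

# Page starting at offset 10000: 5 shards each send 10010 hits, to return only 10.
deep = hits_gathered_by_coordinator(num_shards=5, from_=10000, size=10)

print(shallow, deep)
```

A scroll avoids this growth because each batch continues from where the previous one stopped instead of recomputing everything before the offset.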

12 Range filtering on a field with filter in the query DSL (filter does not take part in TF-IDF scoring; it only filters)

PUT /index2/type2/1
{
  "num":10,
  "name":"ui the mark",
  "content":"Mr. Johnson had never been up in an aerophane before and he had read a lot about air accidents, so one day when a"
}

PUT /index2/type2/2
{
  "num":100,
  "title":"他的名字",
  "name":"my tag",
  "content":"He bought a gallon of gas. He put the gas into a gas can. He waited until "
}

PUT /index2/type2/3
{
  "num":1000,
  "title":"这是谁的名字",
  "name":"very lit",
  "content":"happening in the world.But radio isn't lost. It is still with us. That's because a radio is very small，and it's easy to carry. You can put one in your pocket and "
}

GET /index2/type2/_search
{
  "query": {
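    // inside bool, filter narrows the result set; the should clause is optional here and only affects the score
    "bool": {
      // should: the match on the content field is an analyzed (tokenized) full-text query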
      "should": [
        {
          "match": {
            "content": "bought"
          }
        }
      ]
      ,
      "filter": {
        "range": {
          "num": {
            "gte": 10,
            "lte": 1010
          }
        }
      }
    }
  }
}

13 Use the mapping's dynamic setting to restrict the JSON content of the documents in an index

PUT /index3
{ 
  "mappings": {
    "type3": {
      "dynamic":"true",
      "properties": {
        "date":{
          "type": "date"
        },
        "name":{
          "type": "keyword"
        },
        "no":{
           "type": "long"
        },
        "content":{
           "type": "keyword"
        },
        "address":{
          // dynamic (default true): once set to strict, fields that are not declared in the mapping are rejected; dynamic can also be set on nested object properties
          "dynamic":"strict",
          "properties": {
            "city":{
              "type":"keyword"
            },
            "description":{
              "type":"text"
            }
          }
        }
      }
    }
  }
}

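14 Add an alias to an index. If the index has to be rebuilt, repoint the alias by removing and adding the indices it refers to, so the application switches over seamlessly without code changes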

POST /_aliases
{
  "actions": [
    {
      "remove": {
        "index": "index4",
        "alias": "index3_alias"
      }
    }
  ]
}

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "index3",
        "alias": "index3_alias"
      }
    }
  ]
}

PUT /index4/type3/1
{
  "a":"a"
}

GET /index3_alias/type3/1
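
The remove and add actions can also travel in one _aliases request, in which case Elasticsearch applies them as a single atomic step and the alias never points at nothing. The helper below only builds such a request body; `build_alias_swap` is a hypothetical name, the direction of the swap (index3 to index4) is just an example, and sending the body to the cluster is left out.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build one _aliases body that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = build_alias_swap("index3_alias", old_index="index3", new_index="index4")
print(json.dumps(body, indent=2))  # body for POST /_aliases
```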

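15 Once a field's type is fixed in the index mapping, values of another type cannot be indexed into it (for an index created as in item 13)

// This request fails, because the date field is mapped as a date type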
PUT /index3/type3/7
{
  "date":"asdsad",
  "name":"http litty",
  "content":"The happiest of people don’t necessarily have the best of everything;they just make the most of everything that comes along their",
  "no":"9786"
}

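16 Adjust how long it takes before an indexed document becomes searchable, that is, the refresh interval of the in-memory buffer (by default the buffer is refreshed once per second into a new segment, which makes the documents searchable; separately, a flush commits the segments to disk and clears the translog when the translog reaches its size threshold, or after 30 minutes if it has not)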

PUT /index5
{
  "settings": {
    // refresh the buffer into a searchable segment every 30s
    "refresh_interval": "30s"
  }
}

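17 When we do not care how term frequency (TF) affects the ranking of results, wrap the query or filter in constant_score. Note that term does not analyze its input; the value is matched as a whole. (filter does not take part in TF-IDF scoring and only filters. constant_score can replace a bool query that contains nothing but a filter. Because a filter skips the relevance calculation, it is considerably more efficient.)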

GET index2/type2/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "name": "my tag"
        }
      }
    }
  }
}

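18 If name is mapped as type keyword, part of its value can only be matched through "_all":"xxx": the keyword field itself supports only exact-value matching, while _all is analyzed and supports full-text matching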

GET index2/type2/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match":{
            // "name":"very" would not return anything
            "_all":"very" // only full-text search through _all finds a match
          }
        }
      ]
    }
  }
}

19 A filter can nest bool queries several levels deep

PUT /index2/type2/1
{
  "num": 1,
  "title":"你的名字",
  "name":"ui the mark",
  "content":"Mr. Johnson had never been up in an aerophane before and he had read a lot about air accidents, so one day when a"
}

PUT /index2/type2/2
{
  "num": 10,
  "title":"他的名字",
  "name":"my tag",
  "content":"He bought a gallon of gas. He put the gas into a gas can. He waited until "
}

PUT /index2/type2/3
{
  "num": 105,
  "title":"这是谁的名字",
  "name":"very lit",
  "content":"happening in the world.But radio isn't lost. It is still with us. That's because a radio is very small，and it's easy to carry. You can put one in your pocket and "
}


POST /index2/_mapping/type2
{
  "properties": {
    "name":{
      "type": "keyword"
    },
    "content":{
      "type": "text",
      "analyzer": "english"
    }
  }
}

GET /index2/type2/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must":[
            {
              "term":{
                "name":"very lit"
              }
            },
            {
              "bool":{
                "should":[
                  {
                    "match":{
                      "title": "我"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}

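20 In a bool query, if there is no must but there is a should, at least one should clause has to match; if there is a must, the should clauses need not match at all ("minimum_number_should_match": 3 requires at least 3 clauses of the should array to match)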

GET /index2/type2/_search
{
  "query": {
    "bool": {
      "minimum_number_should_match": 3, 
      "should": [
        {
          "match": {
            "title": "尼玛"
          }
        },
        {
          "match": {
            "name": "very lit"
          }
        },
        {
          "match": {
            "content":"happening"
          }
        }
      ]
    }
  }
}
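
The rule in item 20 can be written down as a small truth function. This is a simplified model of the behaviour described above, not Elasticsearch code: it ignores filter and must_not clauses, and `bool_matches` is a hypothetical name.

```python
def bool_matches(must, should, minimum_should_match=None):
    """must / should: lists of booleans, one per clause, for a single document."""
    if minimum_should_match is None:
        # Default from the text: 1 if there are should clauses and no must, else 0.
        minimum_should_match = 1 if (should and not must) else 0
    return all(must) and sum(should) >= minimum_should_match


only_should_hit = bool_matches(must=[], should=[False, True])       # one should matched
only_should_miss = bool_matches(must=[], should=[False, False])     # no should matched
with_must = bool_matches(must=[True], should=[False, False])        # should is optional
need_three = bool_matches(must=[], should=[True, True, False], minimum_should_match=3)
```

With the request above, a document is returned only when all three should clauses match, which is what the last call models.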

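21 For a single-field query, precision can be tuned by the number of terms that have to match: minimum_should_match "75%" means that at least 3 of the 4 terms in "你 名 字 d" must match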

GET /index2/type2/_search
{
  "query": {
    "match": {
      "title": {
        "query": "你 名 字 d",
        "minimum_should_match": "75%"
      }
    }
  }
}
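
How the percentage turns into a term count can be checked with integer arithmetic. The sketch assumes the documented behaviour for positive percentages, where the computed number is rounded down; `required_matches` is a hypothetical helper name.

```python
def required_matches(num_terms, percentage):
    """Terms that must match for a positive percentage, rounded down."""
    return num_terms * percentage // 100


four_terms = required_matches(4, 75)   # the query above: 你, 名, 字, d
three_terms = required_matches(3, 75)  # 2.25 rounds down
five_terms = required_matches(5, 75)   # 3.75 rounds down
print(four_terms, three_terms, five_terms)
```

Because of the rounding, the same "75%" is stricter for some query lengths than for others, which is worth keeping in mind when queries vary in length.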

22 To raise the score contribution of one particular search keyword, set the boost attribute on it

GET /index2/type2/_search
{
  "query": {
    "bool": {
      "minimum_number_should_match": 2, 
      "should": [
        {
          "match": {
            "title": {
              "query": "谁",
              "boost":5
            }
          }
        },
        {
          "match": {
            "name": "very lit"
          }
        },
        {
          "match": {
            "content":"happening"
          }
        }
      ]
    }
  }
}

23 multi_match: several fields and several query modes

GET /index2/type2/_search
{
  "query": {
    "multi_match": {
      "query": "happening like",
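      // the query terms are matched against both name and content; because the two fields are mapped differently their scores differ, so the ranking may vary
      "fields": ["name","content"],
      // best_fields: each document's score is the score of its best-matching field, so the ranking among the remaining documents is not necessarily accurate; most_fields: combines the scores of every field into one overall document score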
      "type":"best_fields"
    }
  }
}

Result
{
  "took": 71,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.5063205,
    "hits": [
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "3",
        "_score": 0.5063205,
        "_source": {
          "num": 105,
          "title": "这是谁的名字",
          "name": "happening like write",
          "content": ""
        }
      },
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "2",
        "_score": 0.41043553,
        "_source": {
          "num": 10,
          "title": "他的名字",
          "name": "yes happening like write",
          "content": "happening i like"
        }
      },
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "4",
        "_score": 0.34450945,
        "_source": {
          "num": 1000,
          "title": "个人名字",
          "name": "happening like write",
          "content": "happening like yeas and he had read a lot about"
        }
      }
    ]
  }
}
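
The difference between the two multi_match types can be sketched with made-up per-field scores. The model below is a simplification: best_fields is treated as a dis_max (best field score plus tie_breaker times the others) and most_fields as a plain sum; the function names and all scores are hypothetical.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """Score of the best field, plus tie_breaker times the remaining field scores."""
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie_breaker * others


def most_fields_score(field_scores):
    """Scores of all fields added together."""
    return sum(field_scores)


# Hypothetical [name, content] scores for two documents.
doc_a = [0.9, 0.0]  # one strong field
doc_b = [0.5, 0.5]  # two moderate fields

best_ranks_a_first = best_fields_score(doc_a) > best_fields_score(doc_b)
most_ranks_b_first = most_fields_score(doc_b) > most_fields_score(doc_a)
```

The same two documents swap places depending on the type, which is the ranking difference the comment in the request refers to.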

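24 Use match_phrase to match the whole query phrase against a field value: all terms of the phrase must appear next to each other and in the same order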

GET /index2/type2/_search
{
  "query": {
    "bool": {
      "minimum_number_should_match": 1, 
      "should": [
        {
          // match_phrase: the content field must contain the phrase "treasure because" to match
          "match_phrase": {
            "content": "treasure because"
          }
        }
      ]
    }
  }
}

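25 match_phrase with slop: during phrase matching, the individual terms of the phrase may be moved within the field value, by up to the number of positions given by slop; if the search phrase can be formed after such moves, the document also counts as a match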

GET /index2/type2/_search
{
  "query": {
    "bool": {
      "minimum_number_should_match": 1, 
      "should": [
        {
          // match_phrase: the content field must contain "treasure because", allowing for the slop below
          "match_phrase": {
            "content": {
              "query": "treasure because",
              "slop":2 // matches if treasure and because can be brought together as "treasure because" with at most 2 position moves
            }
          }
        }
      ]
    }
  }
}

Result:
{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.53484553,
    "hits": [
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "3",
        "_score": 0.53484553,
        "_source": {
          "num": 105,
          "title": "这是谁的名字",
          "name": "happening like write",
          "content": " national  treasure because  of its rare number and cute appearance. Many foreign people are so crazy about  pandas and they can’t watching these  lovely creatures all the time. Though some action"
        }
      },
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "4",
        "_score": 0.45520112,
        "_source": {
          "num": 1000,
          "title": "个人名字",
          "name": "happening like write",
          "content": "happening treasure hello like because yeas and he happening like had read a lot about happening hello like"
        }
      }
    ]
  }
}
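
Why both documents in the result match with slop 2 can be checked with a small position calculation. This is a simplified model for two-term phrases only, using whitespace splitting instead of a real analyzer; `min_slop_for_pair` is a hypothetical name and the texts are shortened from the result above.

```python
def min_slop_for_pair(text, first, second):
    """Smallest slop at which the phrase "first second" matches text, else None."""
    tokens = text.lower().split()
    best = None
    for i, tok_a in enumerate(tokens):
        if tok_a != first:
            continue
        for j, tok_b in enumerate(tokens):
            if tok_b != second or j == i:
                continue
            # In order: close the gap. Reversed order: also pay for swapping the terms.
            moves = j - i - 1 if j > i else i - j + 1
            if best is None or moves < best:
                best = moves
    return best


doc3 = "national treasure because of its rare number"
doc4 = "happening treasure hello like because yeas and he"

slop_doc3 = min_slop_for_pair(doc3, "treasure", "because")  # adjacent, in order
slop_doc4 = min_slop_for_pair(doc4, "treasure", "because")  # two words in between
print(slop_doc3, slop_doc4)
```

Document 3 needs no moves and document 4 needs exactly 2, so both pass with "slop":2, and the closer match scores higher.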

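26 Use rescoring to improve match precision while keeping the search fast: a match query is roughly 10 times as fast as match_phrase and roughly 20 times as fast as match_phrase with slop. Rescoring takes the top 300 hits of the search (users rarely browse past page 10) and re-scores and re-sorts only those with match_phrase and slop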

GET index3/type3/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "content":"hello book" // first match documents on the analyzed terms hello and book
          }
        }
      ]
    }
  },
  "rescore":{
    "window_size":300, // re-rank the top 300 hits of the query above
    "query":{
      "rescore_query":{
        "match_phrase":{
          "content":{
            "query":"hello book",
            "slop":88 // ranking follows the positions of hello and book: the closer together, the higher the score
          }
        }
      }
    }
  }
}

Result:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 1.0520453,
    "hits": [
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "1",
        "_score": 1.0520453,
        "_source": {
          "date": "2019-01-02",
          "name": "the little",
          "content": "Half the hello book ideas in his talk were plagiarized from an article I wrote last month.",
          "no": "123"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "4",
        "_score": 1.0472052,
        "_source": {
          "date": "2019-03-01",
          "name": "http litty",
          "content": "http://localhost:5601/app/kibana#/dev_tools/console?_g=() hello the book you ",
          "no": "123"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "3",
        "_score": 0.8442862,
        "_source": {
          "date": "2019-07-01",
          "name": "very tag",
          "content": "Some of our hello  comrades love book to write long articles with no substance, very much like the foot bindings of a slattern, long as well as smelly",
          "no": "123"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "5",
        "_score": 0.6407875,
        "_source": {
          "date": "2019-05-01",
          "name": "http litty",
          "content": "There are hello moments in life when you miss book someone so much that you just want to pick them from your dreams",
          "no": "564",
          "description": "描述"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "6",
        "_score": 0.52347976,
        "_source": {
          "date": "2019-06-01",
          "name": "http litty",
          "content": "The happiest of hello people don’t necessarily have the best of everything;they just make the you most of everything that comes along their book",
          "no": "9786"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "2",
        "_score": 0.1252801,
        "_source": {
          "date": "2019-02-01",
          "name": "a dog",
          "content": "is the girl,hello women's attention and love day. July 7th Qiqiao customs, originated in the Han ",
          "no": "6867858"
        }
      }
    ]
  }
}
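
The re-ranking step can be modelled in a few lines. The sketch assumes the default combination, where the original score and the rescore score are added with weights of 1, and applies it to one flat hit list (in a real cluster the window applies per shard); `rescore`, the hit list and the phrase scores are all hypothetical.

```python
def rescore(hits, rescore_fn, window_size, query_weight=1.0, rescore_query_weight=1.0):
    """hits: list of (doc_id, score) sorted by score descending."""
    window, rest = hits[:window_size], hits[window_size:]
    rescored = [
        (doc_id, query_weight * score + rescore_query_weight * rescore_fn(doc_id))
        for doc_id, score in window
    ]
    # Only the window is re-sorted; hits outside it keep their place below it.
    rescored.sort(key=lambda hit: hit[1], reverse=True)
    return rescored + rest


hits = [("a", 1.0), ("b", 0.9), ("c", 0.8)]
phrase_scores = {"b": 0.5, "c": 2.0}  # made-up match_phrase scores

reranked = rescore(hits, lambda doc_id: phrase_scores.get(doc_id, 0.0), window_size=2)
order = [doc_id for doc_id, _ in reranked]
print(order)
```

Document c would score highest on the phrase, but it sits outside the window and is never re-scored; that is the trade-off made by choosing a window_size.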

27 Phrase search plus prefix matching to implement autocomplete

GET index3/type3/_search
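{
  "query": {
    "match_phrase_prefix": { // phrase search plus prefix matching on the last term (usable for autocomplete)
      "title": {
        "query": "the yellow",
        "slop":10, // terms may shift by up to 10 positions; any doc that then matches is returned
        "max_expansions": 50 // expand the prefix to at most 50 terms (this limit matters: without it the prefix is checked against every term in the index and performance can collapse)
      }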
    }
  }
}

28 In a bool compound query: 1. filter by the conditions first, 2. then run the analyzed full-text search on the field

GET /index3/type3/_search
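{
  "query": {
    "bool": {
      "filter": [ // the name and date conditions are intersected (AND)
        {
          "term":{
            "name":"http litty"
          }
        },
        {
          "terms":{
            "date":["2019-06-01","2019-03-01"] // the listed dates form a union (OR)
          }
        }
      ],
      "must": [
        {
          "match":{ // analyzed full-text search on the content field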
            "content":"The happiest of"
          }
        }
      ]
    }
  }
}
