Elasticsearch, a Big Data Power Tool: Full-Text Queries: The match_bool_prefix Query

This is day 10 of my participation in the August writing challenge.
The Elasticsearch version used in this series is 7.4.2.

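A match_bool_prefix query:

  1. analyzes the input text into terms;
  2. then builds a bool query from those terms;
  3. uses a term query for every term except the last one;
  4. and uses a prefix query for the last term.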
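
The four steps above can be sketched in Python. This is only an illustration, not how Elasticsearch implements it: `analyze` is a hypothetical, simplified stand-in for the field's analyzer (lowercase, then split on non-alphanumeric characters), and `build_equivalent_bool` is a helper name invented for this example.

```python
import json
import re


def analyze(text):
    """Hypothetical stand-in for the field's analyzer: lowercase and split on
    non-alphanumeric characters, keeping inner apostrophes (e.g. "it's")."""
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())


def build_equivalent_bool(field, text):
    """Build the bool query that a match_bool_prefix query is equivalent to."""
    tokens = analyze(text)
    # Every term except the last becomes a term query.
    should = [{"term": {field: token}} for token in tokens[:-1]]
    # The last term becomes a prefix query.
    if tokens:
        should.append({"prefix": {field: tokens[-1]}})
    return {"query": {"bool": {"should": should}}}


if __name__ == "__main__":
    print(json.dumps(build_equivalent_bool("my_text", "food p"), indent=2))
```

The printed body can be sent to the `_search` endpoint like any other bool query.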

Here is an example of match_bool_prefix.
Test data:

POST /match_test/_doc/1
{
  "my_text": "my Favorite food is cold porridge"
}

POST /match_test/_doc/2
{
  "my_text": "when it's cold my favorite food is porridge"
}

Run the match_bool_prefix query:

POST /match_test/_search
{
  "query": {
    "match_bool_prefix":{
      "my_text": {
        "query": "food p"
      }
    }
  }
}

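Query analysis:

  1. After analysis, "food p" becomes two terms: food and p;
  2. the term food is used in a term query, and the term p is used in a prefix query;
  3. after analysis, my_text in both doc1 and doc2 contains food and a term starting with p (porridge), so both doc1 and doc2 match.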

The query is therefore equivalent to the following bool query:

POST /match_test/_search
{
    "query": {
        "bool" : {
            "should": [
                { "term": { "my_text": "food" }},
                { "prefix": { "my_text": "p"}}
            ]
        }
    }
}

Response:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 1.3147935,
    "hits" : [
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.3147935,
        "_source" : {
          "my_text" : "my Favorite food is cold porridge"
        }
      },
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.2816185,
        "_source" : {
          "my_text" : "when it's cold my favorite food is porridge"
        }
      }
    ]
  }
}
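
To see why both documents are hits, the matching logic can be simulated in a few lines of Python. This is a rough sketch under stated assumptions: `analyze` is a hypothetical, simplified stand-in for the standard analyzer, `matched_clauses` is a helper invented for this example, and scoring is ignored entirely.

```python
import re

# The two test documents indexed earlier.
DOCS = {
    "1": "my Favorite food is cold porridge",
    "2": "when it's cold my favorite food is porridge",
}


def analyze(text):
    """Hypothetical stand-in for the standard analyzer: lowercase and split on
    non-alphanumeric characters, keeping inner apostrophes."""
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())


def matched_clauses(doc_text, query_text):
    """Return the should clauses of the equivalent bool query that the document satisfies."""
    doc_tokens = set(analyze(doc_text))
    query_tokens = analyze(query_text)
    if not query_tokens:
        return []
    *terms, last = query_tokens
    # term clauses: the exact term must exist in the document.
    matched = [("term", term) for term in terms if term in doc_tokens]
    # prefix clause: any document term starting with the last query term.
    if any(token.startswith(last) for token in doc_tokens):
        matched.append(("prefix", last))
    return matched


if __name__ == "__main__":
    for doc_id, text in DOCS.items():
        print(doc_id, matched_clauses(text, "food p"))
```

Each document satisfies both the term clause on food and the prefix clause on p; since these are should clauses, a single satisfied clause would already be enough for a hit.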


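Other parameters:
match_bool_prefix supports the minimum_should_match and operator parameters; only documents that satisfy the minimum number of matching clauses are returned. An analyzer can also be specified at query time. By default the analyzer of the queried field is used; if an analyzer is specified, it is used in the analysis phase instead.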

POST /match_test/_search
{
  "query": {
    "match_bool_prefix":{
      "my_text": {
        "query": "favorite Food p",
        "minimum_should_match": 2,
        "analyzer": "standard"
      }
    }
  }
}

This is equivalent to the following bool query:

POST /match_test/_search
{
    "query": {
        "bool" : {
            "should": [
                { "term": { "my_text": "favorite" }},
                { "term": { "my_text": "food" }},
                { "prefix": { "my_text": "p"}}
            ],
            "minimum_should_match": 2
        }
    }
}

Response:

{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 1.9045854,
    "hits" : [
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.9045854,
        "_source" : {
          "my_text" : "my Favorite food is cold porridge"
        }
      },
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.8092544,
        "_source" : {
          "my_text" : "when it's cold my favorite food is porridge"
        }
      }
    ]
  }
}
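
The effect of minimum_should_match and operator can be illustrated with a small Python simulation. It is a sketch, not Elasticsearch's implementation, and it rests on several assumptions: `analyze` is a hypothetical, simplified stand-in for the standard analyzer, only integer values of minimum_should_match are handled, operator "and" is assumed to make every clause required, and the text "cold pizza" is a made-up document that is not part of the test data.

```python
import re


def analyze(text):
    """Hypothetical stand-in for the standard analyzer: lowercase and split on
    non-alphanumeric characters, keeping inner apostrophes."""
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())


def count_matches(doc_text, query_text):
    """Return (satisfied clauses, total clauses) for the equivalent bool query."""
    doc_tokens = set(analyze(doc_text))
    query_tokens = analyze(query_text)
    if not query_tokens:
        return 0, 0
    *terms, last = query_tokens
    satisfied = sum(1 for term in terms if term in doc_tokens)
    if any(token.startswith(last) for token in doc_tokens):
        satisfied += 1
    return satisfied, len(query_tokens)


def is_hit(doc_text, query_text, minimum_should_match=1, operator="or"):
    """Decide whether a document would be returned.

    Assumption: operator "and" requires every clause to match; otherwise at
    least minimum_should_match (an integer here) clauses must match.
    """
    satisfied, total = count_matches(doc_text, query_text)
    required = total if operator == "and" else minimum_should_match
    return total > 0 and satisfied >= required


if __name__ == "__main__":
    query = "favorite Food p"
    for text in (
        "my Favorite food is cold porridge",
        "when it's cold my favorite food is porridge",
        "cold pizza",  # hypothetical extra document, matches only the prefix clause
    ):
        print(text, "->", is_hit(text, query, minimum_should_match=2))
```

Both test documents satisfy all three clauses and so pass the threshold of 2, whereas the hypothetical "cold pizza" document satisfies only the prefix clause and would be filtered out.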

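Summary:

  1. match_bool_prefix analyzes the input text with the field's analyzer or with an analyzer specified by the user;
  2. every term except the last one is used in a term query, the last term is used in a prefix query, and all of these sub-queries are placed in the should list of a bool query.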