Elasticsearch, a Power Tool for Big Data: Full-Text Queries: The match_bool_prefix Query

This is day 10 of my participation in the August Writing Challenge.
The Elasticsearch version used in this article is 7.4.2.

A match_bool_prefix query:

analyzes the input into tokens;
then builds a bool query from them;
uses a term query for every token except the last one;
and uses a prefix query for the last token.
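
The steps above can be sketched in Python. This is only an illustration of the expansion, not Elasticsearch's implementation: `simple_tokenize` is a hypothetical stand-in that only roughly approximates the standard analyzer (lowercasing and splitting into words), and `to_bool_query` is a name made up for this example.

```python
import re
from typing import Any, Dict, List


def simple_tokenize(text: str) -> List[str]:
    """Hypothetical, rough approximation of the standard analyzer:
    lowercase the text and split it into word tokens."""
    return re.findall(r"\w+(?:'\w+)*", text.lower())


def to_bool_query(field: str, query: str) -> Dict[str, Any]:
    """Build the bool query that match_bool_prefix is described as expanding to."""
    tokens = simple_tokenize(query)
    if not tokens:
        return {"bool": {"should": []}}
    # Every token except the last becomes a term clause.
    should: List[Dict[str, Any]] = [{"term": {field: t}} for t in tokens[:-1]]
    # The last token becomes a prefix clause.
    should.append({"prefix": {field: tokens[-1]}})
    return {"bool": {"should": should}}


body = {"query": to_bool_query("my_text", "food p")}
print(body)
```

The resulting `body` has one term clause for `food` and one prefix clause for `p`, both inside `should`.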

An example of match_bool_prefix follows.
Test data:

POST /match_test/_doc/1
{
  "my_text": "my Favorite food is cold porridge"
}

POST /match_test/_doc/2
{
  "my_text": "when it's cold my favorite food is porridge"
}

Run a match_bool_prefix query:

POST /match_test/_search
{
  "query": {
    "match_bool_prefix":{
      "my_text": {
        "query": "food p"
      }
    }
  }
}

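Query analysis:

"food p" is analyzed into the tokens food and p;
the token food is used in a term query, and the token p is used in a prefix query;
because the analyzed my_text of both doc1 and doc2 contains food as well as a token starting with p (porridge), both doc1 and doc2 match.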
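
The matching described above can be reproduced locally with a small Python sketch. It does not talk to Elasticsearch and ignores scoring; `simple_tokenize` and `count_matching_clauses` are hypothetical helpers, and the tokenizer only roughly approximates the standard analyzer.

```python
import re
from typing import List


def simple_tokenize(text: str) -> List[str]:
    """Hypothetical, rough approximation of the standard analyzer."""
    return re.findall(r"\w+(?:'\w+)*", text.lower())


def count_matching_clauses(doc_text: str, query: str) -> int:
    """Count how many should clauses (term clauses plus the final prefix clause) match."""
    doc_tokens = simple_tokenize(doc_text)
    query_tokens = simple_tokenize(query)
    if not query_tokens:
        return 0
    *terms, last = query_tokens
    # Term clauses: the token must appear exactly in the document.
    matched = sum(1 for t in terms if t in doc_tokens)
    # Prefix clause: some document token must start with the last query token.
    if any(tok.startswith(last) for tok in doc_tokens):
        matched += 1
    return matched


docs = {
    "1": "my Favorite food is cold porridge",
    "2": "when it's cold my favorite food is porridge",
}
# A bool query with only should clauses needs at least one clause to match.
hits = [doc_id for doc_id, text in docs.items()
        if count_matching_clauses(text, "food p") >= 1]
print(hits)  # ['1', '2']
```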

It is therefore equivalent to the following bool query:

POST /match_test/_search
{
    "query": {
        "bool" : {
            "should": [
                { "term": { "my_text": "food" }},
                { "prefix": { "my_text": "p"}}
            ]
        }
    }
}

Response:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 1.3147935,
    "hits" : [
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.3147935,
        "_source" : {
          "my_text" : "my Favorite food is cold porridge"
        }
      },
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.2816185,
        "_source" : {
          "my_text" : "when it's cold my favorite food is porridge"
        }
      }
    ]
  }
}


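Other parameters:
match_bool_prefix supports the minimum_should_match and operator parameters; only documents that satisfy the minimum number of matching clauses are returned. You can also specify the analyzer to use at query time. By default the analyzer of the queried field is used; if an analyzer is specified, that analyzer is used during the analysis phase instead.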

POST /match_test/_search
{
  "query": {
    "match_bool_prefix":{
      "my_text": {
        "query": "favorite Food p",
        "minimum_should_match": 2,
        "analyzer": "standard"
      }
    }
  }
}

This is equivalent to the following bool query:

POST /match_test/_search
{
    "query": {
        "bool" : {
            "should": [
                { "term": { "my_text": "favorite" }},
                { "term": { "my_text": "food" }},
                { "prefix": { "my_text": "p"}}
            ],
            "minimum_should_match": 2
        }
    }
}

Response:

{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 1.9045854,
    "hits" : [
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.9045854,
        "_source" : {
          "my_text" : "my Favorite food is cold porridge"
        }
      },
      {
        "_index" : "match_test",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.8092544,
        "_source" : {
          "my_text" : "when it's cold my favorite food is porridge"
        }
      }
    ]
  }
}
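
The effect of minimum_should_match can be sketched the same way: count the matching clauses per document and keep only documents that reach the threshold. This is a local simulation under simplifying assumptions (integer thresholds only, no scoring). The helper names are hypothetical, and document "x" is an invented example that is not part of the test data above.

```python
import re
from typing import Dict, List


def simple_tokenize(text: str) -> List[str]:
    """Hypothetical, rough approximation of the standard analyzer."""
    return re.findall(r"\w+(?:'\w+)*", text.lower())


def filter_hits(docs: Dict[str, str], query: str, minimum_should_match: int) -> List[str]:
    """Return ids of documents matching at least `minimum_should_match` clauses."""
    query_tokens = simple_tokenize(query)
    if not query_tokens:
        return []
    *terms, last = query_tokens
    hits = []
    for doc_id, text in docs.items():
        doc_tokens = simple_tokenize(text)
        # Term clauses for all tokens but the last, prefix clause for the last.
        matched = sum(1 for t in terms if t in doc_tokens)
        matched += any(tok.startswith(last) for tok in doc_tokens)
        if matched >= minimum_should_match:
            hits.append(doc_id)
    return hits


docs = {
    "1": "my Favorite food is cold porridge",
    "2": "when it's cold my favorite food is porridge",
    "x": "my favorite drink is tea",  # invented document, matches only "favorite"
}
print(filter_hits(docs, "favorite Food p", minimum_should_match=2))  # ['1', '2']
```

Documents 1 and 2 match all three clauses, while the invented document matches only one and is filtered out.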

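Summary:

match_bool_prefix analyzes the input with the field's analyzer or with an analyzer specified by the user;
every token except the last is used in a term query, the last token is used in a prefix query, and all of these subqueries are placed in the should list of a bool query.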