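Recently we ran into a problem with an Elasticsearch query in production. We eventually confirmed that the should clause stopped taking effect when a bool query was used to combine multiple conditions, so I went through the documentation to pin down the cause.
Elasticsearch version: 5.5.0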
The query in question:
"query": {
  "bool": {  // bool query
    "should": [  // should clause
      {
        "match_phrase": {
          "name": {
            "query": "星起",
            "boost": 30,
            "slop": 5
          }
        }
      }
    ],
    "filter": {  // filter clause
      "bool": {
        "must": [
          {
            "terms": {
              "round": ["A轮"]
            }
          }
        ]
      }
    }
  }
}
The problem: when should and filter are combined in a bool query, the results match only the filter clause and ignore the should clause, so we do not get the expected intersection of should and filter.
I checked the official documentation (Bool Query | Elasticsearch Reference [5.5] | Elastic), which describes should as follows:
The clause (query) should appear in the matching document. If the bool query is in a query context and has a must or filter clause then a document will match the bool query even if none of the should queries match. In this case these clauses are only used to influence the score. If the bool query is a filter context or has neither must or filter then at least one of the should queries must match a document for it to match the bool query. This behavior may be explicitly controlled by settings the minimum_should_match parameter.
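In short: a should clause describes something that ought to appear in matching documents. If the bool query is in a query context and has a must or filter clause, a document matches whether or not any should clause matches; the should clauses only influence the score. If the bool query is in a filter context, or has neither must nor filter, at least one should clause has to match for the document to match the bool query. This can also be controlled explicitly with the minimum_should_match parameter.
According to the documentation, then, there are two ways to get the intersection in a bool query: run it in a filter context, or set the minimum_should_match parameter. Let's try both and see whether each gives the expected result.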
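The rule quoted from the 5.5 documentation can be sketched as a small decision function. This is only a toy model of the documented behavior, not Elasticsearch code: the names effective_minimum_should_match and bool_matches are hypothetical, and the model ignores percentage-style values of minimum_should_match. Other Elasticsearch versions may apply different defaults.

```python
from typing import Optional


def effective_minimum_should_match(
    has_should: bool,
    has_must_or_filter: bool,
    in_filter_context: bool,
    explicit: Optional[int] = None,
) -> int:
    """Number of should clauses that must match, per the 5.5 docs quoted above."""
    if explicit is not None:
        # An explicit minimum_should_match overrides the defaults.
        return explicit
    if not has_should:
        return 0
    if in_filter_context or not has_must_or_filter:
        # Filter context, or should is the only kind of clause: one must match.
        return 1
    # Query context with must/filter present: should only affects the score.
    return 0


def bool_matches(must_and_filter_ok: bool, should_hits: int, required: int) -> bool:
    """A document matches if must/filter pass and enough should clauses hit."""
    return must_and_filter_ok and should_hits >= required


# The original query: query context, should + filter, nothing explicit.
required = effective_minimum_should_match(True, True, False)
# A document that passes the filter but matches no should clause still matches.
print(required, bool_matches(True, 0, required))
```

Under this model the original query requires zero should matches, which is consistent with the results that matched only the filter clause.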
Using a filter context:
"query": {
  "bool": {
    "filter": {  // filter context
      "bool": {
        "should": [  // should clause
          {
            "match_phrase": {
              "name": {
                "query": "星起",
                "boost": 30,
                "slop": 5
              }
            }
          }
        ],
        "filter": {  // filter clause
          "bool": {
            "must": [
              {
                "terms": {
                  "round": ["A轮"]
                }
              }
            ]
          }
        }
      }
    }
  }
}
The test result:
"hits": {
  "total": 1,
  "max_score": null,
  "hits": [
    {
      "_index": "index_name",
      "_type": "hub/product",
      "_id": "id",
      "_score": 0.0,  // the score is 0.0 in a filter context
      "_source": {
        "round": "A轮",
        "name": "星起Starup",
        "created_at": "2015-12-25T22:20:36.210+08:00",
        "sector_name": "企业服务"
      },
      "highlight": {
        "name": ["<em>星起</em>Starup"]
      },
      "sort": []
    }
  ]
}
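The result is the intersection of the should and filter clauses. Note that the score is 0.0: in a filter context the hits are not scored by how well they match.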
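If the request body is assembled in application code, the filter-context variant amounts to nesting the existing bool body under an outer bool.filter. A minimal sketch, assuming the body is built as a plain Python dict; wrap_in_filter_context is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def wrap_in_filter_context(bool_body: dict) -> dict:
    """Nest an existing bool body under an outer bool.filter (hypothetical helper)."""
    return {"query": {"bool": {"filter": {"bool": bool_body}}}}


# The should + filter clauses from the query above. The boost is dropped here
# because nothing is scored in a filter context anyway.
inner = {
    "should": [{"match_phrase": {"name": {"query": "星起", "slop": 5}}}],
    "filter": {"bool": {"must": [{"terms": {"round": ["A轮"]}}]}},
}

body = wrap_in_filter_context(inner)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Because every hit comes back with a score of 0.0, this form fits best when the results are ordered by something other than relevance.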
Using minimum_should_match to require that at least one should clause matches, the query can be written as follows:
"query": {
  "bool": {
    "should": [  // should clause
      {
        "match_phrase": {
          "name": {
            "query": "星起",
            "boost": 30,
            "slop": 5
          }
        }
      }
    ],
    "minimum_should_match": 1,  // at least one should clause must match
    "filter": {  // filter clause
      "bool": {
        "must": [
          {
            "terms": {
              "round": ["A轮"]
            }
          }
        ]
      }
    }
  }
}
The test result:
"hits": {
  "total": 1,
  "max_score": null,
  "hits": [
    {
      "_index": "index_name",
      "_type": "hub/product",
      "_id": "id",
      "_score": 757.66394,
      "_source": {
        "round": "A轮",
        "name": "星起Starup",
        "created_at": "2015-12-25T22:20:36.210+08:00",
        "sector_name": "企业服务"
      },
      "highlight": {
        "name": ["<em>星起</em>Starup"]
      },
      "sort": [757.66394]
    }
  ]
}
The data is the intersection of the should and filter clauses, which is the expected result, and each hit also carries a relevance score.
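One way to keep minimum_should_match from being forgotten is to build the request body in code and serialize it, which also rules out stray trailing commas and comments in hand-written JSON. A minimal sketch under that assumption; build_query and its parameters are hypothetical names, and sending the payload to a cluster is left out.

```python
import json


def build_query(name: str, rounds: list, minimum_should_match: int = 1) -> dict:
    """Build the should + filter body with an explicit minimum_should_match."""
    return {
        "query": {
            "bool": {
                # Scored phrase match on the name field.
                "should": [
                    {"match_phrase": {"name": {"query": name, "boost": 30, "slop": 5}}}
                ],
                # Require at least this many should clauses to match.
                "minimum_should_match": minimum_should_match,
                # Unscored restriction on the funding round.
                "filter": {"bool": {"must": [{"terms": {"round": rounds}}]}},
            }
        }
    }


body = build_query("星起", ["A轮"])
payload = json.dumps(body, ensure_ascii=False)
print(payload)
```

The should clause stays in a query context here, so the hits keep their relevance scores.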
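As these two solutions show, Elasticsearch is fairly flexible when it comes to querying. Besides being familiar with the official documentation, you also need to take the business requirements into account to find the right fix for a problem.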