This article is based on Elasticsearch (ES) 7.1.
The tests for empty values use the following values: null, "null", "", and [ ].
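The test code is long, so the conclusion comes first. For every field type, null, "", and [ ] can be indexed, but they cannot be searched for. For some data types, "null" cannot be converted to the field's type, so indexing fails with an error; for keyword, text, and other fields that accept strings, "null" is treated as an ordinary string and can be both indexed and searched. As for why empty values cannot be searched directly, to quote Elasticsearch: The Definitive Guide: "If a field has no values, how is it stored in an inverted index?" The reality is that an empty field is not stored in the inverted index: "it isn't stored at all".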
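Note that on ES 2.x you could use the exists and missing queries to find non-null and null values respectively, the equivalents of IS NOT NULL and IS NULL in a relational database. The missing query is no longer available in 7.1, however; using it directly fails with: "no [query] registered for [missing]".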
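On 7.x, the usual replacement for the removed missing query is to wrap exists in the must_not clause of a bool query. A minimal sketch, assuming the ac_blog1 index and its views field from the tests in this article already exist:

```console
# IS NOT NULL: documents that have an indexed value for views
GET ac_blog1/_search
{
  "query": {
    "exists": { "field": "views" }
  }
}

# IS NULL: documents that have no indexed value for views
GET ac_blog1/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": { "field": "views" }
      }
    }
  }
}
```

Both queries look only at what is in the index, not at _source, which is why a field holding null or [ ] counts as missing.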
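When designing an application, you can use the null_value mapping parameter to give null values a default. It resembles DEFAULT in a relational database but is not the same; see point 3 below. Keep the following three points in mind: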
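1. In ES, null_value takes effect only when the field is explicitly set to null; an empty array such as [ ] or an empty string such as "" does not trigger it.
2. The null_value must match the field's data type. For example, a date field cannot take an arbitrary non-date string as its null_value.
3. null_value only causes the field to be stored in the inverted index under the null_value, so that the document can be searched. It does not replace the actual value in the JSON document in _source.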
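The three points above can be tried out with a sketch like the following. The index name ac_blog2 and the placeholder value "NULL" are assumptions made for illustration and are not part of the tests in this article; ?refresh is added only so that each document is searchable immediately.

```console
# Hypothetical index: author is a keyword field with a string null_value (point 2)
PUT ac_blog2
{
  "mappings": {
    "properties": {
      "author": {
        "type": "keyword",
        "null_value": "NULL"
      }
    }
  }
}

# An explicit null is indexed as the term "NULL"
POST ac_blog2/_doc?refresh
{
  "author": null
}

# An empty array is not an explicit null, so null_value does not apply (point 1)
POST ac_blog2/_doc?refresh
{
  "author": []
}

# Search for the placeholder term
GET ac_blog2/_search
{
  "query": {
    "term": { "author": "NULL" }
  }
}
```

According to the documented behavior of null_value, the term query should match only the first document, and the _source of that hit should still show "author": null (point 3).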
Create the test index:
PUT ac_blog1
{
  "mappings": {
    "properties": {
      "title":{
        "type": "text"
      },
      "body":{
        "type": "text"
      },
      "author":{
        "type": "keyword"
      },
      "views":{
        "type": "long"
      }
    }
  }
}
Index the test data:
POST ac_blog1/_doc
{
  "views":null
}
POST ac_blog1/_doc
{
  "views":[]
}
POST ac_blog1/_doc
{
  "views":""
}
As a check, fetch all the data:
GET ac_blog1/_search
{
  "query": {
    "match_all": {}
  },
  "size":100
}
Response:
{
  "took" : 355,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ac_blog1",
        "_type" : "_doc",
        "_id" : "HFBiSW0Bf1cVbYphJHEo",
        "_score" : 1.0,
        "_source" : {
          "views" : null
        }
      },
      {
        "_index" : "ac_blog1",
        "_type" : "_doc",
        "_id" : "HVBiSW0Bf1cVbYphPHEa",
        "_score" : 1.0,
        "_source" : {
          "views" : [ ]
        }
      },
      {
        "_index" : "ac_blog1",
        "_type" : "_doc",
        "_id" : "HlBiSW0Bf1cVbYphRXGX",
        "_score" : 1.0,
        "_source" : {
          "views" : ""
        }
      }
    ]
  }
}
All three documents have been indexed. Now let's try to search for them.
Testing null:
GET ac_blog1/_search
{
  "query": {
    "term": {
      "views":null
    }
  }
}
Response:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "field name is null or empty"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "field name is null or empty"
  },
  "status": 400
}
Testing [ ]:
GET ac_blog1/_search
{
  "query": {
    "term": {
      "views":[]
    }
  }
}
Response:
{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[term] query does not support array of values",
        "line": 4,
        "col": 15
      }
    ],
    "type": "parsing_exception",
    "reason": "[term] query does not support array of values",
    "line": 4,
    "col": 15
  },
  "status": 400
}
Testing "":
GET ac_blog1/_search
{
  "query": {
    "term": {
      "views":""
    }
  }
}
Response:
{
  "error": {
    "root_cause": [
      {
        "type": "query_shard_exception",
        "reason": "failed to create query: {\n \"term\" : {\n \"views\" : {\n \"value\" : \"\",\n \"boost\" : 1.0\n }\n }\n}",
        "index_uuid": "f_2YYPS6RAaew5bXcQwlzQ",
        "index": "ac_blog1"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "ac_blog1",
        "node": "oJRDxfVrQlGOJ9eqCGozDg",
        "reason": {
          "type": "query_shard_exception",
          "reason": "failed to create query: {\n \"term\" : {\n \"views\" : {\n \"value\" : \"\",\n \"boost\" : 1.0\n }\n }\n}",
          "index_uuid": "f_2YYPS6RAaew5bXcQwlzQ",
          "index": "ac_blog1",
          "caused_by": {
            "type": "number_format_exception",
            "reason": "empty String"
          }
        }
      }
    ]
  },
  "status": 400
}
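Because views is of type long, the "null" case cannot be tested on it: the request fails with an error saying that null cannot be converted to long. This is evidently handling done by ES itself rather than a feature of the underlying Lucene. Let's test with the keyword field author instead: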
POST ac_blog1/_doc
{
  "author":"null"
}
GET ac_blog1/_search
{
  "query": {
    "term": {
      "author":"null"
    }
  }
}
Response:
{
  "took" : 416,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "ac_blog1",
        "_type" : "_doc",
        "_id" : "H1BoSW0Bf1cVbYphtHF9",
        "_score" : 0.2876821,
        "_source" : {
          "author" : "null"
        }
      }
    ]
  }
}
That's all.