Elasticsearch增删改查

面向文档

document数据格式node

  1. 应用系统的数据结构都是面向对象的,复杂的
  2. 对象数据存储到数据库中,只能拆解开来,变为扁平的多张表,每次查询的时候还得还原回对象格式,至关麻烦
  3. ES是面向文档的,文档中存储的数据结构,与面向对象的数据结构是同样的,基于这种文档数据结构,es能够提供复杂的索引,全文检索,分析聚合等功能
  4. es的document用json数据格式来表达

Java数据

public class Employee {

  private String email;
  private String firstName;
  private String lastName;
  private EmployeeInfo info;
  private Date joinDate;

}

private class EmployeeInfo {
  
  private String bio; // 性格
  private Integer age;
  private String[] interests; // 兴趣爱好

}
EmployeeInfo info = new EmployeeInfo();
info.setBio("curious and modest");
info.setAge(30);
info.setInterests(new String[]{"bike", "climb"});

Employee employee = new Employee();
employee.setEmail("zhangsan@sina.com");
employee.setFirstName("san");
employee.setLastName("zhang");
employee.setInfo(info);
employee.setJoinDate(new Date());

 数据库数据

employee
id 	email 	first_name 	last_name 	join_date
001 	hangsan@sina.com 	san 	zhang 	2017/01/01
employee_info
employee_id 	bio 	age 	interests
001 	curious and modest 	30 	bike, climb

 Json数据

{
    "email":      "zhangsan@sina.com",
    "first_name": "san",
    "last_name": "zhang",
    "info": {
        "bio":         "curious and modest",
        "age":         30,
        "interests": [ "bike", "climb" ]
    },
    "join_date": "2017/01/01"
}

 集群管理

GET /_cat/health?v

 

green:每一个索引的primary shard和replica shard都是active状态的
yellow:每一个索引的primary shard都是active状态的,可是部分replica shard不是active状态,处于不可用的状态
red:不是全部索引的primary shard都是active状态的,部分索引有数据丢失了python

如今只启动动了一个es进程,至关于就只有一个node。如今es中有一个index,就是kibana本身内置创建的index。因为默认的配置是给每一个index分配5个primary shard和5个replica shard,并且primary shard和replica shard不能在同一台机器上(为了容错)。如今kibana本身创建的index是1个primary shard和1个replica shard。当前就一个node,因此只有1个primary shard被分配了和启动了,可是一个replica shard没有第二台机器去启动。只要启动第二个es进程,就会在es集群中有2个node,而后那1个replica shard就会自动分配过去,而后cluster status就会变成green状态。数据库

新增

#语法
PUT /index/type/id
{
  "json数据"
}
# 添加商品1
PUT /ecommerce/product/1
{
    "name" : "gaolujie yagao",                        #商品名称
    "desc" :  "gaoxiao meibai",                       #商品描述
    "price" :  30,								   #商品价格
    "producer" :      "gaolujie producer",            #生厂厂家
    "tags": [ "meibai", "fangzhu" ]                   #产品标签
}
#添加商品2
PUT /ecommerce/product/2
{
    "name" : "jiajieshi yagao",
    "desc" :  "youxiao fangzhu",
    "price" :  25,
    "producer" :      "jiajieshi producer",
    "tags": [ "fangzhu" ]
}
#添加商品3
PUT /ecommerce/product/3
{
    "name" : "zhonghua yagao",
    "desc" :  "caoben zhiwu",
    "price" :  40,
    "producer" :      "zhonghua producer",
    "tags": [ "qingxin" ]
}

 es会自动创建index和type,不须要提早建立,并且es默认会对document每一个field都创建倒排索引,让其能够被搜索json

查询

#语法
GET /index/type/id
GET /ecommerce/product/1

 

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

 修改

PUT /ecommerce/product/1
{
    "name" : "jiaqiangban gaolujie yagao",
    "desc" :  "gaoxiao meibai",
    "price" :  30,
    "producer" :      "gaolujie producer",
    "tags": [ "meibai", "fangzhu" ]
}

 删除

DELETE /ecommerce/product/1

 查询

query string search的由来:由于search参数都是以http请求的query string来附带的数据结构

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
    "hits": {
    "total": 3,
    "max_score": 1,
    "hits": 
  ......
  {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
       ......
}

 

took:耗费了几毫秒
timed_out:是否超时,这里是没有
_shards:数据拆成了5个分片,因此对于搜索请求,会打到全部的primary shard(或者是它的某个replica shard)
hits.total:查询结果的数量,3个document
hits.max_score:score的含义,就是document对于一个search的相关度的匹配分数,越相关,就越匹配,分数也高
hits.hits:包含了匹配搜索的document的详细数据curl

按售价降序排列工具

GET /ecommerce/product/_search?q=name:yagao&sort=price:desc

适用场景url

适用于临时的在命令行使用一些工具,好比curl,快速的发出请求,来检索想要的信息;若是查询请求很复杂,是很难去构建的在生产环境中,几乎不多使用query string search命令行

query DSL

DSL:Domain Specified Language,特定领域的语言
http request body:请求体,能够用json的格式来构建查询语法,比较方便,能够构建各类复杂的语法,比query string search确定强大多了rest

查询全部

GET /ecommerce/product/_search
{
  "query": { "match_all": {} }
}

条件查询

查询名称包含yagao的商品,同时按照价格降序排序

GET /ecommerce/product/_search
{
    "query" : {
        "match" : {
            "name" : "yagao"
        }
    },
    "sort": [
        { "price": "desc" }
    ]
}

 分页查询

GET /ecommerce/product/_search
{
  "query": { "match_all": {} },
  "from": 1,
  "size": 1
}

 指定查询

更加适合生产环境的使用,能够构建复杂的查询

GET /ecommerce/product/_search
{
  "query": { "match_all": {} },
  "_source": ["name", "price"]
  }

 query filter

过滤查询

搜索商品名称包含yagao,并且售价大于25元的商品

GET /ecommerce/product/_search
{
    "query" : {
        "bool" : {
            "must" : {
                "match" : {
                    "name" : "yagao" 
                }
            },
            "filter" : {
                "range" : {
                    "price" : { "gt" : 25 } 
                }
            }
        }
    }
}

 full-text search(全文检索)

GET /ecommerce/product/_search
{
    "query" : {
        "match" : {
            "producer" : "yagao producer"
        }
    }
}

 

producer这个字段,会先被拆解,创建倒排索引

special   4
yagao   4
producer 1,2,3,4  
gaolujie 1  
zhognhua 3  
jiajieshi 2

yagao producer 会被拆解为 yagao和producer

phrase search(短语搜索)

跟全文检索相对应,相反,全文检索会将输入的搜索串拆解开来,去倒排索引里面去一一匹配,只要能匹配上任意一个拆解后的单词,就能够做为结果返回
phrase search,要求输入的搜索串,必须在指定的字段文本中,彻底包含如出一辙的,才能够算匹配,才能做为结果返回

GET /ecommerce/product/_search
{
    "query" : {
        "match_phrase" : {
            "producer" : "yagao producer"
        }
    }
}

 highlight search(高亮搜索结果)

GET /ecommerce/product/_search
{
    "query" : {
        "match" : {
            "producer" : "producer"
        }
    },
    "highlight": {
        "fields" : {
            "producer" : {}
        }
    }
}