Original link: http://tabalt.net/blog/elasti...html
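As one of the most popular and most active full-text search engines, ElasticSearch is hard to resist: it can be integrated into a project quickly and easily to store, search, and analyze massive amounts of data. In this article we start from scratch and get hands-on with ElasticSearch.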
Open the download page on the ElasticSearch website, https://www.elastic.co/downlo... , to get the download URL for the version you need, then download, install, and start ElasticSearch with the following commands:

    cd ~/soft/
    wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.3.zip
    unzip elasticsearch-5.6.3.zip
    cd elasticsearch-5.6.3
    ./bin/elasticsearch   # add -d to run in the background as a daemon
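Note that ElasticSearch 5.6.3, downloaded in the example above, requires Java 8 or later. If Java is not installed on your machine, or the installed version is too old, update it before running ./bin/elasticsearch. ElasticSearch also has fairly high hardware requirements.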
Run curl 'http://localhost:9200/?pretty' on the command line to check whether the node started successfully. Normal output looks like this:

    {
      "name" : "8Low6xs",
      "cluster_name" : "elasticsearch",
      "cluster_uuid" : "CAMAT2P2QS-UnI32tB53_A",
      "version" : {
        "number" : "5.6.3",
        "build_hash" : "1a2f265",
        "build_date" : "2017-10-06T20:33:39.012Z",
        "build_snapshot" : false,
        "lucene_version" : "6.6.1"
      },
      "tagline" : "You Know, for Search"
    }
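ElasticSearch exposes a RESTful API over HTTP that speaks JSON. You can call it directly with curl, and it is just as easy to use from any programming language. The official clients for common languages can be found at https://www.elastic.co/guide/... .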
Request format:

    curl -X <VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'
Parameter | Description |
---|---|
VERB | HTTP method: GET, POST, PUT, HEAD, or DELETE |
PROTOCOL | http or https |
HOST | Hostname of any node in the cluster |
PORT | Port number, 9200 by default |
PATH | API endpoint path |
QUERY_STRING | Any optional query-string parameters |
BODY | JSON request body (if needed) |
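As a rough illustration of how the pieces in the table fit together, the following Python sketch assembles the same kind of request using only the standard library. The helper name `build_request` is hypothetical, and the `Content-Type` header is an addition that the article's curl commands do not send; treat both as assumptions rather than part of the original tutorial.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_request(verb, path, query=None, body=None,
                  protocol="http", host="localhost", port=9200):
    """Assemble <PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING> plus an optional JSON body."""
    url = f"{protocol}://{host}:{port}/{path.lstrip('/')}"
    if query:
        url += "?" + urlencode(query)
    # Serialize the body only when one is given (HEAD/DELETE usually have none).
    data = json.dumps(body).encode("utf-8") if body is not None else None
    # Assumption: declare the body as JSON; the article's curl examples omit this header.
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, method=verb, headers=headers)


req = build_request("GET", "_count", query={"pretty": "true"},
                    body={"query": {"match_all": {}}})
print(req.get_method(), req.full_url)
# prints: GET http://localhost:9200/_count?pretty=true
```

Building the request does not touch the network. Sending it with `urllib.request.urlopen(req)` requires a running node on localhost:9200.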
Request example:

    curl -X GET 'http://localhost:9200/_count?pretty' -d '
    {
      "query": {
        "match_all": {}
      }
    }'
Elasticsearch returns an HTTP status code (for example, 200 OK) and a JSON response body (except for HEAD requests). The curl request above returns a JSON body like the following:

    {
      "count" : 0,
      "_shards" : {
        "total" : 0,
        "successful" : 0,
        "skipped" : 0,
        "failed" : 0
      }
    }
To display the status code as well, use curl's -i option.
Elasticsearch is document-oriented: it stores whole objects, using JSON as the serialization format. An example document for a user object:

    {
      "email": "john@smith.com",
      "first_name": "John",
      "last_name": "Smith",
      "info": {
        "bio": "Eco-warrior and defender of the weak",
        "age": 25,
        "interests": [ "dolphins", "whales" ]
      },
      "join_date": "2014/05/01"
    }
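The document as actually stored also carries metadata. Common metadata elements:
Element | Description |
---|---|
_index | The index in which the document is stored |
_type | The document's object type |
_id | The document's unique identifier |
_version | The data version |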
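Note: a Type is only a virtual, logical grouping within an Index, and different Types should have similar structures. Version 6.x allows only one Type per Index, and 7.x deprecates Types ahead of their complete removal.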
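To make the split between metadata and document body concrete, here is a small Python sketch that separates the two in a GET response. The response is an abridged copy of the one returned for employee 1 in the examples further down; the helper name `split_hit` is hypothetical.

```python
import json

# Abridged response of GET /wecompany/employee/1 from the article's example.
raw = """
{
  "_index": "wecompany", "_type": "employee", "_id": "1", "_version": 1,
  "found": true,
  "_source": {"first_name": "John", "last_name": "Smith", "age": 25}
}
"""


def split_hit(hit):
    """Return (metadata, source): underscore-prefixed keys vs. the stored document."""
    source = hit.get("_source", {})
    metadata = {k: v for k, v in hit.items()
                if k.startswith("_") and k != "_source"}
    return metadata, source


metadata, source = split_hit(json.loads(raw))
print(metadata)
print(source["first_name"], source["last_name"])
```

The original JSON document sits under `_source`, while `_index`, `_type`, `_id`, and `_version` describe where and how it is stored.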
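"Index" is an overloaded term in ElasticSearch:
By default, ElasticSearch builds an inverted index (sense 3) for every field of every document in an index (sense 1), so that the documents can be searched quickly.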
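The idea of an inverted index can be sketched in a few lines of Python. This is a toy illustration only, not how ElasticSearch or Lucene implements it: it lowercases and splits on whitespace, whereas real analyzers do much more. The sample texts are the `about` fields of the employee documents used later in this article.

```python
from collections import defaultdict

docs = {
    1: "I love to go rock climbing",
    2: "I like to collect rock albums",
    3: "I like to build cabinets",
}


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


index = build_inverted_index(docs)
print(sorted(index["rock"]))      # [1, 2]
print(sorted(index["cabinets"]))  # [3]
```

Looking up a term is then a single dictionary access instead of a scan over every document, which is why searches stay fast as the data grows.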
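ElasticSearch is a distributed database that lets multiple servers work together, and each server can run multiple instances. A single instance is called a node, and a group of nodes forms a cluster. Shards are the underlying units of work: documents are stored in shards, shards are allocated to the nodes in the cluster, and each shard holds only part of the full data set.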
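How a document ends up in one particular shard can be sketched as follows. Elasticsearch documents the routing rule as shard = hash(routing) % number_of_primary_shards, where the routing value defaults to the document ID. The sketch below substitutes CRC32 for the real hash function, so the shard numbers it prints are illustrative and will not match a real cluster; `pick_shard` is a hypothetical name.

```python
import zlib

# 5 primary shards, matching the "_shards.total": 5 seen in the search responses.
NUM_PRIMARY_SHARDS = 5


def pick_shard(doc_id, num_shards=NUM_PRIMARY_SHARDS):
    """Choose a shard for a document ID.

    Assumption: CRC32 is only a stand-in hash; Elasticsearch uses its own
    hash function, so real shard assignments differ.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


for doc_id in ["1", "2", "3"]:
    print("document", doc_id, "-> shard", pick_shard(doc_id))
```

Because the result depends on the number of primary shards, the same ID always maps to the same shard only while that number stays fixed.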
We use employee information management at a company called wecompany as a running example to learn the basic operations in ElasticSearch.
Add three employee documents of type employee to the index named wecompany:

    curl -X PUT 'http://localhost:9200/wecompany/employee/1?pretty' -d '
    {
      "first_name" : "John",
      "last_name" : "Smith",
      "age" : 25,
      "about" : "I love to go rock climbing",
      "interests": [ "sports", "music" ]
    }'

    curl -X PUT 'http://localhost:9200/wecompany/employee/2?pretty' -d '
    {
      "first_name" : "Jane",
      "last_name" : "Smith",
      "age" : 32,
      "about" : "I like to collect rock albums",
      "interests": [ "music" ]
    }'

    curl -X PUT 'http://localhost:9200/wecompany/employee/3?pretty' -d '
    {
      "first_name" : "Douglas",
      "last_name" : "Fir",
      "age" : 35,
      "about": "I like to build cabinets",
      "interests": [ "forestry" ]
    }'
Get the document with ID 1:

    curl -X GET 'http://localhost:9200/wecompany/employee/1?pretty'

    {
      "_index" : "wecompany",
      "_type" : "employee",
      "_id" : "1",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "first_name" : "John",
        "last_name" : "Smith",
        "age" : 25,
        "about" : "I love to go rock climbing",
        "interests" : [ "sports", "music" ]
      }
    }
Search for employees whose last name is Smith:

    curl -X GET 'http://localhost:9200/wecompany/employee/_search?q=last_name:Smith&pretty'

    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : 2,
        "max_score" : 0.2876821,
        "hits" : [
          {
            "_index" : "wecompany",
            "_type" : "employee",
            "_id" : "2",
            "_score" : 0.2876821,
            "_source" : {
              "first_name" : "Jane",
              "last_name" : "Smith",
              "age" : 32,
              "about" : "I like to collect rock albums",
              "interests" : [ "music" ]
            }
          },
          {
            "_index" : "wecompany",
            "_type" : "employee",
            "_id" : "1",
            "_score" : 0.2876821,
            "_source" : {
              "first_name" : "John",
              "last_name" : "Smith",
              "age" : 25,
              "about" : "I love to go rock climbing",
              "interests" : [ "sports", "music" ]
            }
          }
        ]
      }
    }
Search for employees whose last name is Smith using the query DSL:

    curl -X GET 'http://localhost:9200/wecompany/employee/_search?pretty' -d '
    {
      "query" : {
        "match" : { "last_name" : "Smith" }
      }
    }'
    # returns the same result as above
A more complex search that combines conditions, finding employees whose last name is Smith and whose age is greater than 30:

    curl -X GET 'http://localhost:9200/wecompany/employee/_search?pretty' -d '
    {
      "query" : {
        "bool" : {
          "must" : {
            "match" : { "last_name" : "Smith" }
          },
          "filter": {
            "range" : { "age" : { "gt" : 30 } }
          }
        }
      }
    }'

    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : 1,
        "max_score" : 0.2876821,
        "hits" : [
          {
            "_index" : "wecompany",
            "_type" : "employee",
            "_id" : "2",
            "_score" : 0.2876821,
            "_source" : {
              "first_name" : "Jane",
              "last_name" : "Smith",
              "age" : 32,
              "about" : "I like to collect rock albums",
              "interests" : [ "music" ]
            }
          }
        ]
      }
    }
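When such queries are issued from a program rather than typed by hand, it is usually easier to build the body as a data structure and serialize it. The Python sketch below produces the same bool query as above and, to show what the two clauses mean, applies an equivalent check to the three sample employees in plain Python. The function name `smith_older_than` is hypothetical, and the plain-Python check only approximates the query (it ignores text analysis and scoring).

```python
import json


def smith_older_than(last_name, min_age):
    """Build the bool query body: match on last_name, filter on age > min_age."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"last_name": last_name}},
                "filter": {"range": {"age": {"gt": min_age}}},
            }
        }
    }


body = smith_older_than("Smith", 30)
print(json.dumps(body, indent=2))

# The same condition evaluated locally on the article's sample data.
employees = [
    {"first_name": "John", "last_name": "Smith", "age": 25},
    {"first_name": "Jane", "last_name": "Smith", "age": 32},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35},
]
matches = [e["first_name"] for e in employees
           if e["last_name"] == "Smith" and e["age"] > 30]
print(matches)  # ['Jane']
```

The `must` clause contributes to the relevance score, while the `filter` clause only includes or excludes documents, which is why Jane's score equals the one from the plain last-name search.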
Full-text search for employees who like rock climbing:

    curl -X GET 'http://localhost:9200/wecompany/employee/_search?pretty' -d '
    {
      "query" : {
        "match" : { "about" : "rock climbing" }
      }
    }'

    {
      "took" : 4,
      "timed_out" : false,
      "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : 2,
        "max_score" : 0.53484553,
        "hits" : [
          {
            "_index" : "wecompany",
            "_type" : "employee",
            "_id" : "1",
            "_score" : 0.53484553,
            "_source" : {
              "first_name" : "John",
              "last_name" : "Smith",
              "age" : 25,
              "about" : "I love to go rock climbing",
              "interests" : [ "sports", "music" ]
            }
          },
          {
            "_index" : "wecompany",
            "_type" : "employee",
            "_id" : "2",
            "_score" : 0.26742277,
            "_source" : {
              "first_name" : "Jane",
              "last_name" : "Smith",
              "age" : 32,
              "about" : "I like to collect rock albums",
              "interests" : [ "music" ]
            }
          }
        ]
      }
    }
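In addition, replacing "match" with "match_phrase" in the request above returns only results that match the exact phrase "rock climbing". Adding a "highlight" parameter at the same level as "query" marks the matched keywords in the results with <em></em> tags:

    {
      "query" : { ... },
      "highlight" : {
        "fields" : { "about" : {} }
      }
    }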
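The difference between "match" and "match_phrase" can be illustrated with a rough Python stand-in that runs on the three `about` fields from this article. It is a simplification under stated assumptions: tokens are produced by lowercasing and splitting on whitespace, and no relevance scoring is done. The function names are hypothetical.

```python
abouts = {
    "1": "I love to go rock climbing",
    "2": "I like to collect rock albums",
    "3": "I like to build cabinets",
}


def tokens(text):
    """Very rough analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match_any(text, query):
    """Stand-in for "match": at least one query term occurs in the text."""
    words = tokens(text)
    return any(term in words for term in tokens(query))


def match_phrase(text, query):
    """Stand-in for "match_phrase": the query terms occur consecutively, in order."""
    words, phrase = tokens(text), tokens(query)
    return any(words[i:i + len(phrase)] == phrase
               for i in range(len(words) - len(phrase) + 1))


print([i for i, t in abouts.items() if match_any(t, "rock climbing")])     # ['1', '2']
print([i for i, t in abouts.items() if match_phrase(t, "rock climbing")])  # ['1']
```

Document 2 is returned by the "match" query because it contains "rock", but it drops out once the two words must appear together as a phrase.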
Aggregate the employees' interests. Because interests is a text field, fielddata must first be enabled on it in the mapping before the terms aggregation can run:

    curl -X PUT 'http://localhost:9200/wecompany/_mapping/employee?pretty' -d '
    {
      "properties": {
        "interests": {
          "type": "text",
          "fielddata": true
        }
      }
    }'

    { "acknowledged" : true }

    curl -X GET 'http://localhost:9200/wecompany/employee/_search?pretty' -d '
    {
      "aggs": {
        "all_interests": {
          "terms": { "field": "interests" }
        }
      }
    }'

    {
      "took" : 33,
      "timed_out" : false,
      "_shards" : { "total" : 5, "successful" : 5, "skipped" : 0, "failed" : 0 },
      "hits" : { "total" : 3, "max_score" : 1.0, "hits" : [ ... ] },
      "aggregations" : {
        "all_interests" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            { "key" : "music", "doc_count" : 2 },
            { "key" : "forestry", "doc_count" : 1 },
            { "key" : "sports", "doc_count" : 1 }
          ]
        }
      }
    }
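What the terms aggregation computes can be reproduced on the three sample documents with a short Python sketch. It only mirrors the result shown above (one bucket per distinct term, with the number of documents containing it) and says nothing about how ElasticSearch computes it internally. The helper name `terms_buckets` is hypothetical.

```python
from collections import Counter

employees = [
    {"first_name": "John", "interests": ["sports", "music"]},
    {"first_name": "Jane", "interests": ["music"]},
    {"first_name": "Douglas", "interests": ["forestry"]},
]


def terms_buckets(docs, field):
    """Count, per distinct term, how many documents contain it."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc.get(field, [])))  # count each document once per term
    # Highest count first; ties broken alphabetically, as in the response above.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


buckets = terms_buckets(employees, "interests")
print(buckets)
# [{'key': 'music', 'doc_count': 2}, {'key': 'forestry', 'doc_count': 1}, {'key': 'sports', 'doc_count': 1}]
```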
To update the document with ID 2, simply PUT it again:

    curl -X PUT 'http://localhost:9200/wecompany/employee/2?pretty' -d '
    {
      "first_name" : "Jane",
      "last_name" : "Smith",
      "age" : 33,
      "about" : "I like to collect rock albums",
      "interests": [ "music" ]
    }'
Delete the document with ID 1:

    curl -X DELETE 'http://localhost:9200/wecompany/employee/1?pretty'
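You now have a basic grasp of how to install and use ElasticSearch and of its core concepts, but don't stop here. ElasticSearch has real depth and a rich feature set waiting to be explored. The official documentation is the most up-to-date, complete, and useful learning material, and you can find it at the following page: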
https://www.elastic.co/guide/...