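Search
Creating the data
One of Elasticsearch's most attractive features is fast, convenient search. To try it out, first create some test documents with the following commands: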
curl -XPUT "http://localhost:9200/movies/movie/1" -d'
{
  "title": "The Godfather",
  "director": "Francis Ford Coppola",
  "year": 1972,
  "genres": ["Crime", "Drama"]
}'

curl -XPUT "http://localhost:9200/movies/movie/2" -d'
{
  "title": "Lawrence of Arabia",
  "director": "David Lean",
  "year": 1962,
  "genres": ["Adventure", "Biography", "Drama"]
}'

curl -XPUT "http://localhost:9200/movies/movie/3" -d'
{
  "title": "To Kill a Mockingbird",
  "director": "Robert Mulligan",
  "year": 1962,
  "genres": ["Crime", "Drama", "Mystery"]
}'

curl -XPUT "http://localhost:9200/movies/movie/4" -d'
{
  "title": "Apocalypse Now",
  "director": "Francis Ford Coppola",
  "year": 1979,
  "genres": ["Drama", "War"]
}'

curl -XPUT "http://localhost:9200/movies/movie/5" -d'
{
  "title": "Kill Bill: Vol. 1",
  "director": "Quentin Tarantino",
  "year": 2003,
  "genres": ["Action", "Crime", "Thriller"]
}'

curl -XPUT "http://localhost:9200/movies/movie/6" -d'
{
  "title": "The Assassination of Jesse James by the Coward Robert Ford",
  "director": "Andrew Dominik",
  "year": 2007,
  "genres": ["Biography", "Crime", "Drama"]
}'
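Note that Elasticsearch also provides a general-purpose _bulk endpoint that can create multiple documents in a single request. For simplicity, the documents here are created with separate requests.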
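For comparison, here is a minimal sketch of how a request body for the _bulk endpoint could be assembled. The helper name `build_bulk_body` is hypothetical. The sketch assumes the newline-delimited format in which every document is preceded by an action line and the body ends with a newline. Only two of the six movies are included to keep it short.

```python
import json


def build_bulk_body(docs, index="movies", doc_type="movie"):
    """Build a newline-delimited body for the _bulk endpoint.

    `docs` maps a document id to its source dict. Each document produces
    two lines: an "index" action line followed by the document itself.
    Every line, including the last one, is terminated by a newline.
    """
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    return "".join(line + "\n" for line in lines)


movies = {
    1: {"title": "The Godfather", "director": "Francis Ford Coppola",
        "year": 1972, "genres": ["Crime", "Drama"]},
    2: {"title": "Lawrence of Arabia", "director": "David Lean",
        "year": 1962, "genres": ["Adventure", "Biography", "Drama"]},
}

body = build_bulk_body(movies)
print(body, end="")
```

The resulting text could be saved to a file and sent with something like `curl -XPOST "http://localhost:9200/_bulk" --data-binary @movies.ndjson`, where the file name is only an example. Newer Elasticsearch versions also expect a `Content-Type: application/x-ndjson` header on such requests.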
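Searching in Elasticsearch is done mainly through the _search endpoint. Its standard request format is <index>/<type>/_search, where both index and type are optional.
In other words, a request can be sent to /_search (all indices), to /movies/_search (all types in the movies index), or to /movies/movie/_search (only the movie type in the movies index).
The response contains each document's metadata; the original document data is stored in the _source field.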
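The optional index and type segments can be illustrated with a small URL builder. This is only a sketch: `search_url` is a hypothetical helper, and rejecting a type without an index is a simplification made here, not a rule stated in the text above.

```python
def search_url(base="http://localhost:9200", index=None, doc_type=None):
    """Return the _search URL for the given optional index and type."""
    if doc_type is not None and index is None:
        # Simplification for this sketch: a type is only accepted with an index.
        raise ValueError("doc_type requires an index")
    parts = [base.rstrip("/")]
    if index is not None:
        parts.append(index)
        if doc_type is not None:
            parts.append(doc_type)
    parts.append("_search")
    return "/".join(parts)


print(search_url())                          # every index
print(search_url(index="movies"))            # every type in "movies"
print(search_url(index="movies", doc_type="movie"))
```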
Retrieving a single document
We can also retrieve just the _source field of a document directly, as follows:
curl -XGET 'http://localhost:9200/movies/movie/1/_source'
Result:
{
  "title": "The Godfather",
  "director": "Francis Ford Coppola",
  "year": 1972,
  "genres": ["Crime", "Drama"]
}
Retrieving all documents
We can use the _search API to retrieve all documents:
curl -XGET 'http://localhost:9200/movies/movie/_search'
Result:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 1,
    "hits": [
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "5",
        "_score": 1,
        "_source": {
          "title": "Kill Bill: Vol. 1",
          "director": "Quentin Tarantino",
          "year": 2003,
          "genres": ["Action", "Crime", "Thriller"]
        }
      },
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "Lawrence of Arabia",
          "director": "David Lean",
          "year": 1962,
          "genres": ["Adventure", "Biography", "Drama"]
        }
      },
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "4",
        "_score": 1,
        "_source": {
          "title": "Apocalypse Now",
          "director": "Francis Ford Coppola",
          "year": 1979,
          "genres": ["Drama", "War"]
        }
      },
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "6",
        "_score": 1,
        "_source": {
          "title": "The Assassination of Jesse James by the Coward Robert Ford",
          "director": "Andrew Dominik",
          "year": 2007,
          "genres": ["Biography", "Crime", "Drama"]
        }
      },
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "1",
        "_score": 1,
        "_source": {
          "title": "The Godfather",
          "director": "Francis Ford Coppola",
          "year": 1972,
          "genres": ["Crime", "Drama"]
        }
      },
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "3",
        "_score": 1,
        "_source": {
          "title": "To Kill a Mockingbird",
          "director": "Robert Mulligan",
          "year": 1962,
          "genres": ["Crime", "Drama", "Mystery"]
        }
      }
    ]
  }
}
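As you can see, the hits object contains fields such as total and the hits array. The hits array holds the matching documents (all six in this case), and total gives the number of matching documents. By default, only the first 10 results are returned. We can also set the from and size parameters to fetch a specific range of documents, for example: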
curl -XGET 'http://localhost:9200/movies/movie/_search?from=1&size=2'
The result is as follows:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 1,
    "hits": [
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "Lawrence of Arabia",
          "director": "David Lean",
          "year": 1962,
          "genres": ["Adventure", "Biography", "Drama"]
        }
      },
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "4",
        "_score": 1,
        "_source": {
          "title": "Apocalypse Now",
          "director": "Francis Ford Coppola",
          "year": 1979,
          "genres": ["Drama", "War"]
        }
      }
    ]
  }
}
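Since from is a zero-based offset into the result list, from=1&size=2 skips the first hit and returns the second and third. A minimal sketch of turning a page number into these parameters follows. The helper `page_url` and its 1-based page numbering are assumptions made for illustration.

```python
from urllib.parse import urlencode


def page_url(page, size=10, base="http://localhost:9200/movies/movie/_search"):
    """Return the search URL for a 1-based page number.

    `from` is the zero-based offset of the first hit on the page,
    `size` is the number of hits per page.
    """
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    params = {"from": (page - 1) * size, "size": size}
    return base + "?" + urlencode(params)


print(page_url(1, size=2))  # hits 1-2
print(page_url(2, size=2))  # hits 3-4
print(page_url(3, size=2))  # hits 5-6
```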
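Retrieving selected fields
Sometimes we only need a few fields of a document. In that case we can use the _source parameter, separating multiple fields with commas, as shown below: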
curl -XGET 'http://localhost:9200/movies/movie/1?_source=title,director'
Result:
{
  "_index": "movies",
  "_type": "movie",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "director": "Francis Ford Coppola",
    "title": "The Godfather"
  }
}
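The comma-separated _source parameter can be built from a list of field names. This is a small sketch with a hypothetical helper `document_url`. The comma is kept unescaped only for readability.

```python
from urllib.parse import urlencode


def document_url(doc_id, fields=None, base="http://localhost:9200/movies/movie"):
    """Return the URL of one document, optionally limited to some fields."""
    url = f"{base}/{doc_id}"
    if fields:
        # safe="," keeps the separator readable instead of encoding it as %2C.
        url += "?" + urlencode({"_source": ",".join(fields)}, safe=",")
    return url


print(document_url(1))
print(document_url(1, ["title", "director"]))
```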
Query string search
A query string search takes the form q=field:value. For example, to find movies whose title field contains godfather:
curl -XGET 'http://localhost:9200/movies/movie/_search?q=title:godfather'
Result:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.25811607,
    "hits": [
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "title": "The Godfather",
          "director": "Francis Ford Coppola",
          "year": 1972,
          "genres": ["Crime", "Drama"]
        }
      }
    ]
  }
}
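When the q parameter is built programmatically, the value should be URL-encoded, because characters such as the colon, spaces, and quotes are not safe in a query string. A minimal sketch, with `query_string_url` as a hypothetical helper:

```python
from urllib.parse import urlencode


def query_string_url(field, value, base="http://localhost:9200/movies/movie/_search"):
    """Return a search URL using the q=field:value form, URL-encoded."""
    return base + "?" + urlencode({"q": f"{field}:{value}"})


print(query_string_url("title", "godfather"))
# A quoted phrase: the quotes and the space are percent-encoded as well.
print(query_string_url("title", '"kill bill"'))
```

A server decodes `%3A` back to a colon, so the encoded form is equivalent to the hand-written q=title:godfather above.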
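DSL search
The query string search above is lightweight and only suited to simple cases. Elasticsearch also provides a more powerful query DSL (Domain Specific Language) for complex search scenarios such as full-text search. The query string search above can be rewritten as a DSL search like this:
GET /movies/movie/_search
{
  "query": {
    "match": {
      "title": "godfather"
    }
  }
}
With curl: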
curl -X GET "127.0.0.1:9200/movies/movie/_search" -d '{"query": {"match": {"title": "godfather"}}}'
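The same match query can be prepared from Python with only the standard library. This is a sketch under a few assumptions: `match_request` is a hypothetical helper, POST is used instead of GET, and a Content-Type header is set because newer Elasticsearch versions expect one on requests with a body. The request is only built here, not sent.

```python
import json
import urllib.request


def match_request(field, text, url="http://localhost:9200/movies/movie/_search"):
    """Build (but do not send) a search request with a match query."""
    body = json.dumps({"query": {"match": {field: text}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = match_request("title", "godfather")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually run the search against a local node:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```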
The simplest kind of query is a full-text search. For example, suppose we want to search for the keyword godfather.
Search for documents containing "godfather":
curl -XPOST "http://localhost:9200/_search" -d'
{
  "query": {
    "query_string": {
      "query": "godfather"
    }
  }
}'
Search for "godfather" in the title field only:
curl -XPOST "http://localhost:9200/_search" -d'
{
  "query": {
    "query_string": {
      "query": "godfather",
      "fields": ["title"]
    }
  }
}'
Result:
{
  "took": 24,
  "timed_out": false,
  "_shards": {
    "total": 25,
    "successful": 25,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.25811607,
    "hits": [
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "title": "The Godfather",
          "director": "Francis Ford Coppola",
          "year": 1972,
          "genres": ["Crime", "Drama"]
        }
      }
    ]
  }
}
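Once a response like the one above has been parsed, the interesting parts are usually hits.total and the _source of each hit. A minimal sketch of pulling them out follows. `summarize_hits` is a hypothetical helper that assumes total is a plain integer, as in the responses shown in this article. Newer Elasticsearch versions return an object there instead.

```python
def summarize_hits(response):
    """Return (total, [(id, score, title), ...]) from a search response."""
    hits = response["hits"]
    rows = [
        (hit["_id"], hit["_score"], hit["_source"]["title"])
        for hit in hits["hits"]
    ]
    return hits["total"], rows


# The response shown above, written as a Python dict.
sample = {
    "took": 24,
    "timed_out": False,
    "_shards": {"total": 25, "successful": 25, "failed": 0},
    "hits": {
        "total": 1,
        "max_score": 0.25811607,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "1",
                "_score": 0.25811607,
                "_source": {
                    "title": "The Godfather",
                    "director": "Francis Ford Coppola",
                    "year": 1972,
                    "genres": ["Crime", "Drama"],
                },
            }
        ],
    },
}

total, rows = summarize_hits(sample)
print(total)
for doc_id, score, title in rows:
    print(doc_id, score, title)
```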
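Checking whether a document exists
If all you want to do is check whether a document exists, and you have no interest in its content, use the HEAD method instead of GET. A HEAD request returns no response body, only HTTP headers: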
curl -I "http://localhost:9200/movies/movie/3"
Elasticsearch returns a 200 OK status if the document exists:
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 255
If it does not exist, Elasticsearch returns 404 Not Found:
curl -I "http://localhost:9200/movies/movie/36"
HTTP/1.1 404 Not Found
content-type: application/json; charset=UTF-8
content-length: 60
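Of course, this only tells you that the document did not exist at the moment of the check. It does not mean the document will still be missing a few milliseconds later, because another process may create it in the meantime.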
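The existence check can be sketched in Python by sending a HEAD request and mapping the status code to a boolean. `document_exists` is a hypothetical helper, and its `opener` parameter exists only so the function can be exercised without a running node. As noted above, the answer is a snapshot and may be outdated by the time it is used.

```python
import urllib.error
import urllib.request


def document_exists(url, opener=urllib.request.urlopen):
    """Return True for 200 OK and False for 404 Not Found.

    Any other HTTP error is re-raised, since it says nothing about
    whether the document exists.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise


# Against a local node:
# document_exists("http://localhost:9200/movies/movie/3")
# document_exists("http://localhost:9200/movies/movie/36")
```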