Elasticsearch Search Methods

Date: 2019-12-11


Creating the search data

One of Elasticsearch's most attractive features is its fast, convenient search. To try it out, first create some test documents with the following commands:

curl -XPUT "http://localhost:9200/movies/movie/1" -d'
{
    "title": "The Godfather",
    "director": "Francis Ford Coppola",
    "year": 1972,
    "genres": ["Crime", "Drama"]
}'

curl -XPUT "http://localhost:9200/movies/movie/2" -d'
{
    "title": "Lawrence of Arabia",
    "director": "David Lean",
    "year": 1962,
    "genres": ["Adventure", "Biography", "Drama"]
}'

curl -XPUT "http://localhost:9200/movies/movie/3" -d'
{
    "title": "To Kill a Mockingbird",
    "director": "Robert Mulligan",
    "year": 1962,
    "genres": ["Crime", "Drama", "Mystery"]
}'

curl -XPUT "http://localhost:9200/movies/movie/4" -d'
{
    "title": "Apocalypse Now",
    "director": "Francis Ford Coppola",
    "year": 1979,
    "genres": ["Drama", "War"]
}'

curl -XPUT "http://localhost:9200/movies/movie/5" -d'
{
    "title": "Kill Bill: Vol. 1",
    "director": "Quentin Tarantino",
    "year": 2003,
    "genres": ["Action", "Crime", "Thriller"]
}'

curl -XPUT "http://localhost:9200/movies/movie/6" -d'
{
    "title": "The Assassination of Jesse James by the Coward Robert Ford",
    "director": "Andrew Dominik",
    "year": 2007,
    "genres": ["Biography", "Crime", "Drama"]
}'

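Note that Elasticsearch also provides a general-purpose _bulk endpoint that creates multiple documents in a single request. For simplicity, the documents here are created with separate requests.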
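As a rough sketch of what a _bulk request body looks like, the snippet below builds the newline-delimited JSON payload for the first two movies. The helper name build_bulk_body is an assumption made for illustration; the layout (an action line followed by a source line, with a trailing newline) follows the bulk API, and the _type field matches the older typed API used throughout this article.

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Return an NDJSON _bulk payload: one action line plus one source line per document.

    build_bulk_body is a hypothetical helper, not part of any Elasticsearch client.
    """
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk endpoint expects the payload to end with a newline.
    return "\n".join(lines) + "\n"

movies = {
    "1": {"title": "The Godfather", "director": "Francis Ford Coppola",
          "year": 1972, "genres": ["Crime", "Drama"]},
    "2": {"title": "Lawrence of Arabia", "director": "David Lean",
          "year": 1962, "genres": ["Adventure", "Biography", "Drama"]},
}

body = build_bulk_body("movies", "movie", movies)
print(body)
```

The resulting text could be saved to a file and posted to http://localhost:9200/_bulk, for instance with curl's --data-binary option so that the newlines are preserved.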

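Searching in Elasticsearch is done mainly through the _search endpoint. The standard request format is <index>/<type>/_search, where both index and type are optional.
In other words, a search request can be issued in any of the following ways: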

http://localhost:9200/_search... - search all indices and all types
http://localhost:9200/movies/... - search all types in the movies index
http://localhost:9200/movies/movie... - search only documents of type movie in the movies index
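The three forms above follow directly from the <index>/<type>/_search pattern. A minimal sketch, assuming a hypothetical helper named search_url, shows how the optional parts combine:

```python
def search_url(base, index=None, doc_type=None):
    """Build a _search URL; index and doc_type are both optional.

    search_url is an illustrative helper, not an Elasticsearch API.
    """
    # This sketch only covers the three forms listed above, so a type
    # without an index is rejected rather than guessed at.
    if doc_type is not None and index is None:
        raise ValueError("this sketch needs an index whenever a type is given")
    parts = [base.rstrip("/")]
    if index is not None:
        parts.append(index)
    if doc_type is not None:
        parts.append(doc_type)
    parts.append("_search")
    return "/".join(parts)

print(search_url("http://localhost:9200"))                     # all indices, all types
print(search_url("http://localhost:9200", "movies"))           # one index, all types
print(search_url("http://localhost:9200", "movies", "movie"))  # one index, one type
```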

The response contains each document's metadata; the original document data is stored in the _source field.

Retrieving a single document
The _source field of a document can also be retrieved directly, as follows:

curl -XGET 'http://localhost:9200/movies/movie/1/_source'

Response:

{
    "title": "The Godfather",
    "director": "Francis Ford Coppola",
    "year": 1972,
    "genres": ["Crime", "Drama"]
}

Retrieving all documents
The _search API can be used to retrieve all documents:

curl -XGET 'http://localhost:9200/movies/movie/_search'

Response:

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 6,
        "max_score": 1,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "5",
                "_score": 1,
                "_source": {
                    "title": "Kill Bill: Vol. 1",
                    "director": "Quentin Tarantino",
                    "year": 2003,
                    "genres": [
                        "Action",
                        "Crime",
                        "Thriller"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "title": "Lawrence of Arabia",
                    "director": "David Lean",
                    "year": 1962,
                    "genres": [
                        "Adventure",
                        "Biography",
                        "Drama"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "4",
                "_score": 1,
                "_source": {
                    "title": "Apocalypse Now",
                    "director": "Francis Ford Coppola",
                    "year": 1979,
                    "genres": [
                        "Drama",
                        "War"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "6",
                "_score": 1,
                "_source": {
                    "title": "The Assassination of Jesse James by the Coward Robert Ford",
                    "director": "Andrew Dominik",
                    "year": 2007,
                    "genres": [
                        "Biography",
                        "Crime",
                        "Drama"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "title": "The Godfather",
                    "director": "Francis Ford Coppola",
                    "year": 1972,
                    "genres": [
                        "Crime",
                        "Drama"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "3",
                "_score": 1,
                "_source": {
                    "title": "To Kill a Mockingbird",
                    "director": "Robert Mulligan",
                    "year": 1962,
                    "genres": [
                        "Crime",
                        "Drama",
                        "Mystery"
                    ]
                }
            }
        ]
    }
}

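As shown, the hits object contains fields such as total and the hits array. The hits array holds the matching documents, six in this case, and total gives the number of matching documents. By default only the first 10 results are returned. The from and size parameters can be used to fetch a specific range of documents, for example: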

curl -XGET 'http://localhost:9200/movies/movie/_search?from=1&size=2'

The response is as follows:

{
    "took": 6,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 6,
        "max_score": 1,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "title": "Lawrence of Arabia",
                    "director": "David Lean",
                    "year": 1962,
                    "genres": [
                        "Adventure",
                        "Biography",
                        "Drama"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "4",
                "_score": 1,
                "_source": {
                    "title": "Apocalypse Now",
                    "director": "Francis Ford Coppola",
                    "year": 1979,
                    "genres": [
                        "Drama",
                        "War"
                    ]
                }
            }
        ]
    }
}
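The effect of from and size can be pictured as a slice over the full result list. The sketch below is only a client-side illustration under that assumption (page_of is a hypothetical helper); it reuses the document ids in the order returned by the unpaged search earlier, which Elasticsearch does not promise to keep stable when all scores are equal.

```python
from urllib.parse import urlencode

def page_of(hits, start, size):
    """Mimic from/size: skip `start` hits, then keep at most `size` of them."""
    return hits[start:start + size]

# Document ids in the order returned by the unpaged search shown earlier.
all_ids = ["5", "2", "4", "6", "1", "3"]

query = urlencode({"from": 1, "size": 2})
page = page_of(all_ids, 1, 2)

print(query)  # from=1&size=2
print(page)   # ['2', '4']
```

With from=1 the first hit (id 5) is skipped, and size=2 keeps the next two, which matches the ids 2 and 4 in the response above.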

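Retrieving specific fields

Sometimes only a few fields of a document are needed. In that case, use the _source parameter, separating multiple fields with commas: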

curl -XGET 'http://localhost:9200/movies/movie/1?_source=title,director'

Response:

{
    "_index": "movies",
    "_type": "movie",
    "_id": "1",
    "_version": 1,
    "found": true,
    "_source": {
        "director": "Francis Ford Coppola",
        "title": "The Godfather"
    }
}
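A small sketch of how such a URL could be assembled, together with a client-side picture of what the filtering does to the document. Both helper names (source_filter_url and keep_fields) are assumptions for illustration; the actual filtering happens on the server.

```python
from urllib.parse import urlencode

def source_filter_url(doc_url, fields):
    """Append a _source parameter listing the wanted fields, comma-separated."""
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return doc_url + "?" + urlencode({"_source": ",".join(fields)}, safe=",")

def keep_fields(source, fields):
    """Client-side picture of the filtering: keep only the listed keys."""
    return {name: source[name] for name in fields if name in source}

godfather = {
    "title": "The Godfather",
    "director": "Francis Ford Coppola",
    "year": 1972,
    "genres": ["Crime", "Drama"],
}

url = source_filter_url("http://localhost:9200/movies/movie/1", ["title", "director"])
filtered = keep_fields(godfather, ["title", "director"])

print(url)
print(filtered)
```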

Query string search
A query string search takes the form q=field:value. For example, to find movies whose title field contains godfather:

curl -XGET 'http://localhost:9200/movies/movie/_search?q=title:godfather'

Response:

{
    "took": 6,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.25811607,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "1",
                "_score": 0.25811607,
                "_source": {
                    "title": "The Godfather",
                    "director": "Francis Ford Coppola",
                    "year": 1972,
                    "genres": [
                        "Crime",
                        "Drama"
                    ]
                }
            }
        ]
    }
}
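To pull the matching documents out of a response like the one above, a client only needs to walk hits.hits. The sketch below parses a trimmed copy of that response; summarize_hits is a hypothetical helper, and it assumes the response layout shown in this article, where hits.total is a plain number.

```python
import json

# Trimmed copy of the response shown above.
response_text = """
{
    "hits": {
        "total": 1,
        "max_score": 0.25811607,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "1",
                "_score": 0.25811607,
                "_source": {"title": "The Godfather", "year": 1972}
            }
        ]
    }
}
"""

def summarize_hits(response):
    """Return an (id, score, title) tuple for every hit in a search response."""
    return [
        (hit["_id"], hit["_score"], hit["_source"]["title"])
        for hit in response["hits"]["hits"]
    ]

response = json.loads(response_text)
summary = summarize_hits(response)

print(response["hits"]["total"])
print(summary)
```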

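DSL search
The query string search above is lightweight and suited only to simple cases. Elasticsearch also provides a more powerful query DSL (Domain Specific Language) for complex search scenarios such as full-text search. The query string search above can be rewritten as a DSL search: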

GET /movies/movie/_search
{
    "query" : {
        "match" : {
            "title" : "godfather"
        }
    }
}

Using curl:

curl -X GET "127.0.0.1:9200/movies/movie/_search" -d '{"query": {"match": {"title": "godfather"}}}'
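When the request body is generated by a program rather than typed by hand, building it as a data structure and serializing it avoids quoting mistakes. A minimal sketch, assuming a hypothetical helper named match_query:

```python
import json

def match_query(field, text):
    """Build the DSL body for a match query on a single field."""
    return {"query": {"match": {field: text}}}

body = json.dumps(match_query("title", "godfather"))
print(body)  # {"query": {"match": {"title": "godfather"}}}
```

The printed string is the same JSON that is passed to -d in the curl command above.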

The simplest query is a full-text search. For example, to search all fields for the keyword godfather:

curl -XPOST "http://localhost:9200/_search" -d'
{
    "query": {
        "query_string": {
            "query": "godfather"
        }
    }
}'

To search for the keyword godfather in the title field only:

curl -XPOST "http://localhost:9200/_search" -d'
{
    "query": {
        "query_string": {
            "query": "godfather",
            "fields": ["title"]
        }
    }
}'

Response:

{
    "took": 24,
    "timed_out": false,
    "_shards": {
        "total": 25,
        "successful": 25,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.25811607,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "1",
                "_score": 0.25811607,
                "_source": {
                    "title": "The Godfather",
                    "director": "Francis Ford Coppola",
                    "year": 1972,
                    "genres": [
                        "Crime",
                        "Drama"
                    ]
                }
            }
        ]
    }
}

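Checking whether a document exists
If all you want to do is check whether a document exists, and you have no interest in its content, use the HEAD method instead of GET. A HEAD request returns no response body, only HTTP headers: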

curl -I "http://localhost:9200/movies/movie/3"

Elasticsearch returns a 200 OK status if the document exists:

HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 255

If it does not exist, it returns 404 Not Found:

curl -I "http://localhost:9200/movies/movie/36"

HTTP/1.1 404 Not Found
content-type: application/json; charset=UTF-8
content-length: 60

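Of course, this only means the document did not exist at the moment of the query; it says nothing about whether it will still be missing a few milliseconds later. Another process may create the document in the meantime.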
