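Download address for the elasticsearch-ik Chinese analyzer:
https://github.com/medcl/elasticsearch-analysis-ik
Note that the ES version and the IK version must match: download the release that the version comparison table in IK's README.md lists for your ES version.
I am currently using ES 5.6.0, so I need to download the IK release listed for that version.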
After the download finishes, unzip the archive.
Copy the extracted elasticsearch folder directly into the plugins folder of ES.
When that is done, start ES.
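The unzip-and-copy steps above can also be scripted. The following is only a minimal sketch: the function name `install_ik` and both path arguments are hypothetical, and it assumes the archive unpacks to a top-level `elasticsearch` folder as described above.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_ik(zip_path, es_home):
    """Unzip the IK archive and copy its 'elasticsearch' folder into <es_home>/plugins.

    Both arguments are hypothetical paths supplied by the caller.
    """
    plugins_dir = Path(es_home) / "plugins"
    plugins_dir.mkdir(parents=True, exist_ok=True)

    with tempfile.TemporaryDirectory() as tmp:
        # Extract into a scratch directory first.
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)

        extracted = Path(tmp) / "elasticsearch"
        if not extracted.is_dir():
            raise FileNotFoundError("archive has no top-level 'elasticsearch' folder")

        target = plugins_dir / "elasticsearch"
        # copytree refuses to overwrite an existing folder, so an earlier
        # install has to be removed by hand before running this again.
        shutil.copytree(extracted, target)
    return target
```

ES still has to be restarted afterwards so that it picks up the new plugin folder.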
To verify that the IK analyzer is working, send a request with Postman or curl.
Tokenizing with ik_max_word:
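For readers who prefer a script to Postman or curl, the request can be sketched in Python. This is an illustration under assumptions: the node address `http://localhost:9200` is the Elasticsearch default and may differ on your machine, the sample text is reconstructed from the token offsets in the results, and `build_analyze_request` and `analyze` are hypothetical helper names. It posts a JSON body to the `_analyze` API.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: default local node address
SAMPLE_TEXT = "这里是开源中国社区博客"  # reconstructed from the token offsets


def build_analyze_request(analyzer, text, es_url=ES_URL):
    """Build a POST request for the _analyze API without sending it."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        es_url + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(analyzer, text, es_url=ES_URL):
    """Send the request and return the token strings (needs a running node)."""
    request = build_analyze_request(analyzer, text, es_url)
    with urllib.request.urlopen(request) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return [t["token"] for t in payload["tokens"]]
```

With ES running and the plugin loaded, `analyze("ik_max_word", SAMPLE_TEXT)` sends the verification request; passing `"ik_smart"` instead exercises the second analyzer.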
The tokenization result:
{
  "tokens": [
    { "token": "这里是", "start_offset": 0, "end_offset": 3, "type": "CN_WORD", "position": 0 },
    { "token": "这里", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 1 },
    { "token": "是", "start_offset": 2, "end_offset": 3, "type": "CN_CHAR", "position": 2 },
    { "token": "开源", "start_offset": 3, "end_offset": 5, "type": "CN_WORD", "position": 3 },
    { "token": "中国", "start_offset": 5, "end_offset": 7, "type": "CN_WORD", "position": 4 },
    { "token": "社区", "start_offset": 7, "end_offset": 9, "type": "CN_WORD", "position": 5 },
    { "token": "博客", "start_offset": 9, "end_offset": 11, "type": "CN_WORD", "position": 6 }
  ]
}
Result of tokenizing with ik_smart:
{
  "tokens": [
    { "token": "这里是", "start_offset": 0, "end_offset": 3, "type": "CN_WORD", "position": 0 },
    { "token": "开源", "start_offset": 3, "end_offset": 5, "type": "CN_WORD", "position": 1 },
    { "token": "中国", "start_offset": 5, "end_offset": 7, "type": "CN_WORD", "position": 2 },
    { "token": "社区", "start_offset": 7, "end_offset": 9, "type": "CN_WORD", "position": 3 },
    { "token": "博客", "start_offset": 9, "end_offset": 11, "type": "CN_WORD", "position": 4 }
  ]
}
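The difference between the two analyzers can be read straight off the two results. The sketch below hard-codes the token strings from both outputs (the variable and function names are hypothetical) and lists what ik_max_word emits beyond ik_smart. In this sample, ik_smart gives a coarse, non-overlapping split, while ik_max_word additionally emits the finer-grained pieces.

```python
# Token strings copied from the two results above.
IK_MAX_WORD = ["这里是", "这里", "是", "开源", "中国", "社区", "博客"]
IK_SMART = ["这里是", "开源", "中国", "社区", "博客"]


def extra_tokens(fine, coarse):
    """Return tokens present in the fine-grained output but not in the coarse one."""
    coarse_set = set(coarse)
    return [tok for tok in fine if tok not in coarse_set]


# The coarse tokens do not overlap, so joining them gives back the input text.
original = "".join(IK_SMART)
extras = extra_tokens(IK_MAX_WORD, IK_SMART)

print(original)  # 这里是开源中国社区博客
print(extras)    # ['这里', '是']
```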