Building a Chinese Full-Text Search Engine with Coreseek + Sphinx + MySQL + PHP

Overview diagram

### 1. Installation

**1. Download and unpack the installation package**

    cd /var/install
    wget http://git.oschina.net/tanjiajun/sphinx/raw/master/coreseek-3.2.14.tar.gz
    sudo tar -zxvf coreseek-3.2.14.tar.gz

**2. Install mmseg3 first (used for Chinese word segmentation)**

    cd /var/install/coreseek-3.2.14/mmseg-3.2.14
    sudo ./bootstrap
    sudo ./configure --prefix=/usr/local/mmseg3
    sudo make && sudo make install

**3. Install coreseek**

    cd /var/install/coreseek-3.2.14/csft-3.2.14
    sudo sh buildconf.sh
    sudo ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg \
        --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ \
        --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql
    sudo make && sudo make install
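Both builds go through autotools scripts (`bootstrap`, `buildconf.sh`, `configure`), so it can save time to check for the build tools before starting. The following is a minimal sketch; the tool list in `BUILD_TOOLS` is an assumption about a typical autotools build, not something the coreseek package documents, and `missing_tools` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed tool list for an autotools build; adjust it for your system.
BUILD_TOOLS="gcc g++ make autoconf automake libtoolize"

# Example: report anything missing before running ./bootstrap
# missing_tools $BUILD_TOOLS
```

If the function prints nothing, every listed tool was found; any printed name needs to be installed through your package manager first.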

### 2. Verify the installation

**1. Test mmseg3**

    cd /var/install/coreseek-3.2.14/testpack/var/test
    cat test.xml

The output at this point:
![Contents of test.xml](https://static.oschina.net/uploads/img/201610/31141405_Trx0.png)

Run mmseg segmentation:

    sudo /usr/local/mmseg3/bin/mmseg -d /usr/local/mmseg3/etc test.xml

The output at this point:
![mmseg segmentation output](https://static.oschina.net/uploads/img/201610/31141405_Trx0.png)

mmseg3 is installed successfully!

**2. Test index generation with the coreseek indexer**

    cd /var/install/coreseek-3.2.14/testpack/
    sudo /usr/local/coreseek/bin/indexer -c etc/csft.conf --all

This reports an error:
![indexer error](https://static.oschina.net/uploads/img/201610/31142036_VwtB.png)

Install libexpat or libexpat-dev (on Debian/Ubuntu the actual package names are typically `libexpat1` and `libexpat1-dev`):

    sudo apt-get install libexpat

or

    sudo apt-get install libexpat-dev

Reinstall coreseek:

    cd /var/install/coreseek-3.2.14/csft-3.2.14
    sudo make clean
    sudo make && sudo make install

Another error appears, shown below:
![Second build error](https://static.oschina.net/uploads/img/201610/31142310_YSQk.png)

Edit `src/Makefile` under `csft-3.2.14`:

    sudo vi src/Makefile

Change

    LIBS = -lm -lexpat -L/usr/local/lib

to

    LIBS = -lm -lexpat -liconv -L/usr/local/lib
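The same Makefile change can be applied without opening an editor. This is a sketch only: `add_iconv` is a hypothetical helper, and it assumes the `LIBS` line looks exactly like the one quoted above. It writes to a temporary file and moves it back, which avoids the differences between GNU and BSD `sed -i`.

```shell
#!/bin/sh
# Append -liconv after -lexpat on the LIBS line of a Makefile, at most once.
add_iconv() {
    makefile=$1
    # Nothing to do if the LIBS line already links iconv.
    if grep -q '^LIBS = .*-liconv' "$makefile"; then
        return 0
    fi
    # Rewrite into a temp file, then replace the original.
    tmp="$makefile.tmp.$$"
    sed '/^LIBS = /s/-lexpat/-lexpat -liconv/' "$makefile" > "$tmp" && mv "$tmp" "$makefile"
}

# Example (run from csft-3.2.14 with write permission to the file):
# add_iconv src/Makefile
```

Because the function checks for `-liconv` first, running it twice leaves the file unchanged the second time.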

Reinstall coreseek again:

    sudo make clean
    sudo make && sudo make install

coreseek now installs successfully. Continue testing index generation:

    cd /var/install/coreseek-3.2.14/testpack/
    sudo /usr/local/coreseek/bin/indexer -c etc/csft.conf --all

![Index generation output](https://static.oschina.net/uploads/img/201610/31142556_I6B1.png)

The index is generated successfully. Search for the keyword '网络' ("network"):

    sudo /usr/local/coreseek/bin/search -c etc/csft.conf 网络

![Search results for 网络](https://static.oschina.net/uploads/img/201610/31142714_cBQ4.png)

**coreseek installation is complete!**

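### 3. MySQL and coreseek

Here I create two MySQL data source configurations and start two search services (two searchd processes). One uses the database script and configuration bundled with coreseek as an example; the other is a configuration I adapted from that example. The two are largely the same.

**1. Create the test database:**

    create database coreseek_test;

Create the tables:
1. My own table. `desc` is a reserved word in MySQL and must be quoted with backticks, and `count_id` is declared `AUTO_INCREMENT` because the `INSERT` in the configuration further down does not supply it:

    create table sphinx_conter(
        count_id integer primary key not null auto_increment,
        max_doc_id integer not null,
        name varchar(255) null,
        `desc` varchar(255) null,
        address varchar(255) null
    );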

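2. The table bundled with coreseek. Its script is in the directory we unpacked earlier:

    /var/install/coreseek-3.2.14/testpack/var/test/documents.sql

Next, create the data source configuration files. Coreseek ships an example configuration that matches the sample table `documents`, located at

    /var/install/coreseek-3.2.14/testpack/etc/csft_mysql.conf

Copy it twice into the directory

    /usr/local/coreseek/etc

naming one copy sphinx_conter_min.conf and the other csft_mysql.conf (the default example).
My directory then looks like this:

![Listing of /usr/local/coreseek/etc](https://static.oschina.net/uploads/img/201610/31142714_cBQ4.png)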
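The import and copy steps described above can be spelled out as commands. The sketch below only prints them so they can be reviewed before being run. It assumes the `coreseek_test` database already exists and that the MySQL user is `root`, as in the configuration files that follow; `print_setup_commands` is a hypothetical helper name.

```shell
#!/bin/sh
# Paths taken from the text above.
SRC_CONF=/var/install/coreseek-3.2.14/testpack/etc/csft_mysql.conf
DOCS_SQL=/var/install/coreseek-3.2.14/testpack/var/test/documents.sql
ETC_DIR=/usr/local/coreseek/etc

# Print the commands instead of running them, so they can be reviewed first.
print_setup_commands() {
    # Load the bundled documents table into the test database.
    echo "mysql -u root -p coreseek_test < $DOCS_SQL"
    # Copy the example config twice under the two names used in this article.
    echo "sudo cp $SRC_CONF $ETC_DIR/csft_mysql.conf"
    echo "sudo cp $SRC_CONF $ETC_DIR/sphinx_conter_min.conf"
}

print_setup_commands
```

After copying, the second file still needs to be edited by hand into the sphinx_conter_min.conf contents shown in the next step.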

**2. Data source configuration**

(1) Contents of csft_mysql.conf:

    #MySQL data source configuration
    #source definition
    source mysql
    {
        type                    = mysql

        sql_host                = 127.0.0.1
        sql_user                = root
        sql_pass                = root
        sql_db                  = coreseek_test
        sql_port                = 3306
        sql_query_pre           = SET NAMES utf8

        sql_query               = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents
                                  #the first column of sql_query (id) must be an integer
                                  #title and content are string/text fields and are full-text indexed
        sql_attr_uint           = group_id      #the value read from SQL must be an integer
        sql_attr_timestamp      = date_added    #the value read from SQL must be an integer; used as a time attribute

        sql_query_info_pre      = SET NAMES utf8    #set the correct character set for command-line queries
        sql_query_info          = SELECT * FROM documents WHERE id=$id #for command-line queries, read the original row from the database
    }

    #index definition
    index mysql
    {
        source                  = mysql         #name of the corresponding source
        path                    = /usr/local/coreseek/var/data/mysql #change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
        docinfo                 = extern
        mlock                   = 0
        morphology              = none
        min_word_len            = 1
        html_strip              = 0

        #Chinese word segmentation settings, see: http://www.coreseek.cn/products-install/coreseek_mmseg/
        charset_dictpath        = /usr/local/mmseg3/etc/ #for BSD and Linux; must end with /
        #charset_dictpath       = etc/           #for Windows; must end with /, preferably an absolute path, e.g. C:/usr/local/coreseek/etc/...
        charset_type            = zh_cn.utf-8
    }

    #global indexer definition
    indexer
    {
        mem_limit               = 128M
    }

    #searchd service definition
    searchd
    {
        listen                  = 9313
        read_timeout            = 5
        max_children            = 30
        max_matches             = 1000
        seamless_rotate         = 0
        preopen_indexes         = 0
        unlink_old              = 1
        pid_file                = /var/sphinx_log/searchd_mysql.p
        log                     = /var/sphinx_log/searchd_mysql.log
        query_log               = /var/sphinx_log/query_mysql.log
    }

(2) Contents of sphinx_conter_min.conf:

    #source definition
    source sphinx_test
    {
        type                    = mysql

        sql_host                = 127.0.0.1
        sql_user                = root
        sql_pass                = root
        sql_db                  = coreseek_test
        sql_port                = 3306

        sql_query_pre           = SET NAMES utf8
        sql_query_pre           = INSERT INTO sphinx_conter (`max_doc_id`,`name`,`desc`,`address`) values(unix_timestamp(now()),'我是程序员','侧死','广州大道中国')
        sql_query               = \
            SELECT count_id, max_doc_id, name, address \
            FROM sphinx_conter
        sql_attr_uint           = count_id
        sql_attr_uint           = max_doc_id
        #sql_attr_timestamp     = date_added
        #sql_field_string       = name
        #sql_field_string       = desc
        #sql_field_string       = address
        sql_query_info          = SELECT * FROM sphinx_conter WHERE count_id=$id
        #sql_query_info         = SELECT * FROM sphinx_conter
    }

    #index definition
    index sphinx_test
    {
        source                  = sphinx_test   #name of the corresponding source
        path                    = /usr/local/coreseek/var/data/sphinx_test #change to the absolute path you actually use, e.g. /usr/local/coreseek/var/...
        docinfo                 = extern
        mlock                   = 0
        morphology              = none
        min_word_len            = 1
        html_strip              = 0

        #Chinese word segmentation settings, see: http://www.coreseek.cn/products-install/coreseek_mmseg/
        charset_dictpath        = /usr/local/mmseg3/etc/ #for BSD and Linux; must end with /
        #charset_dictpath       = etc/           #for Windows; must end with /, preferably an absolute path, e.g. C:/usr/local/coreseek/etc/...
        charset_type            = zh_cn.utf-8
    }

    #global indexer definition
    indexer
    {
        mem_limit               = 128M
    }

    #searchd service definition
    searchd
    {
        listen                  = 9312
        read_timeout            = 5
        max_children            = 30
        max_matches             = 1000
        seamless_rotate         = 0
        preopen_indexes         = 0
        unlink_old              = 1
        pid_file                = /var/sphinx_log/searchd_sphinx_test.pid
        log                     = /var/sphinx_log/searchd_sphinx_test.log
        query_log               = /var/sphinx_log/query_sphinx_test.log
    }
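Both configurations point `pid_file`, `log`, `query_log` and `path` at directories such as `/var/sphinx_log`, which may not exist on a fresh machine. As a hedged sketch, the hypothetical helper `conf_dirs` below lists the parent directories a config refers to so they can be created before indexing. It assumes the config file has one directive per line, as in the two listings above.

```shell
#!/bin/sh
# List the unique parent directories of pid_file, log, query_log and path
# in a Sphinx/coreseek config that has one directive per line.
conf_dirs() {
    awk -F= '
        { key = $1; gsub(/[ \t]/, "", key) }
        key == "pid_file" || key == "log" || key == "query_log" || key == "path" {
            val = $2
            sub(/#.*/, "", val)       # drop a trailing comment
            gsub(/[ \t]/, "", val)    # drop surrounding whitespace
            if (val != "") print val
        }
    ' "$1" | while read -r p; do dirname "$p"; done | sort -u
}

# Example: create every directory the config needs
# conf_dirs /usr/local/coreseek/etc/csft_mysql.conf |
#     while read -r d; do sudo mkdir -p "$d"; done
```

The function only prints directory names; the commented example shows one way to feed them to `mkdir -p`.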

**3. Build indexes for the configured data sources**

(1) Build the index for csft_mysql.conf:

    sudo /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate

![indexer output for csft_mysql.conf](https://static.oschina.net/uploads/img/201610/31150140_Hv8R.png)

Then start the search service:

    sudo /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf

![searchd startup output](https://static.oschina.net/uploads/img/201610/31150312_hjeC.png)

Then run a test search:

    sudo /usr/local/coreseek/bin/search -c /usr/local/coreseek/etc/csft_mysql.conf Opera

![Search results for Opera](https://static.oschina.net/uploads/img/201610/31150312_hjeC.png)


(2) Do the same for the configuration file sphinx_conter_min.conf:

    sudo /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/sphinx_conter_min.conf --all --rotate
    sudo /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/sphinx_conter_min.conf

Two search processes are now running, one on port 9312 and one on port 9313. You can check them with the command

    ps -ef | grep coreseek

![Running searchd processes](https://static.oschina.net/uploads/img/201610/31150542_s3rR.png)
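Besides `ps`, each searchd instance can be checked through the pid file named in its configuration. The following is a minimal sketch; `searchd_running` is a hypothetical helper, and the two paths are the `pid_file` values from the configs above. Since searchd was started with `sudo`, the check may need the same privileges to signal the process.

```shell
#!/bin/sh
# Succeed if the pid recorded in the given pid file belongs to a live process.
searchd_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1
    # Signal 0 only tests whether the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}

# pid_file values from the two configurations above
for f in /var/sphinx_log/searchd_mysql.p /var/sphinx_log/searchd_sphinx_test.pid; do
    if searchd_running "$f"; then
        echo "$f: running"
    else
        echo "$f: not running"
    fi
done
```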

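**4. Other useful commands:**

Run an incremental index (this re-indexes a single index by name, here `mysql`):

    sudo /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf mysql --rotate

![Incremental index output](https://static.oschina.net/uploads/img/201610/31150542_s3rR.png)

Merge indexes, assuming the configuration file defines two indexes (here named `main` and `delta`):

    /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --merge main delta --rotate --merge-dst-range deleted 0 0

These commands can be added to scheduled scripts: run the incremental index every minute, merge the indexes every 5 minutes, and rebuild all indexes from scratch at a fixed time.

![Screenshot](https://static.oschina.net/uploads/img/201610/31150841_5f7j.png)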
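The schedule described above could be wired up with a small wrapper script and three cron entries. This is a sketch under assumptions: the script name `reindex.sh`, its install path, the 03:30 rebuild time, and the index names `main` and `delta` are all illustrative (the configurations shown earlier define a single index each, so a main/delta pair would have to be configured first).

```shell
#!/bin/sh
# reindex.sh (hypothetical name): wraps the three indexing jobs described above.
INDEXER=/usr/local/coreseek/bin/indexer
CONF=/usr/local/coreseek/etc/csft_mysql.conf

# Build the indexer command line for a job; index names main/delta are assumed.
job_command() {
    case "$1" in
        delta) echo "$INDEXER -c $CONF delta --rotate" ;;
        merge) echo "$INDEXER -c $CONF --merge main delta --rotate --merge-dst-range deleted 0 0" ;;
        full)  echo "$INDEXER -c $CONF --all --rotate" ;;
        *)     echo "usage: reindex.sh delta|merge|full" >&2; return 1 ;;
    esac
}

# Run the requested job when the script is called with an argument.
if [ $# -gt 0 ]; then
    cmd=$(job_command "$1") || exit 1
    exec $cmd
fi

# Example root crontab entries (assumed install path /usr/local/bin/reindex.sh):
# * * * * *   /usr/local/bin/reindex.sh delta
# */5 * * * * /usr/local/bin/reindex.sh merge
# 30 3 * * *  /usr/local/bin/reindex.sh full
```

Keeping the command construction in one function makes it easy to print a job's command for review before scheduling it.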

At this point, the coreseek search server is fully set up!

### 4. Connecting to coreseek from PHP

Install sphinxclient (already included in the package we unpacked earlier):

    cd /var/install/coreseek-3.2.14/csft-3.2.14/api/libsphinxclient
    sudo ./configure --prefix=/usr/local/sphinxclient
    sudo make && sudo make install

Install the sphinx PHP extension:

    cd /var/install/
    sudo wget http://pecl.php.net/get/sphinx-1.3.0.tgz
    sudo tar -zxvf sphinx-1.3.0.tgz
    cd sphinx-1.3.0
    sudo phpize
    sudo ./configure --with-php-config=/usr/local/php/bin/php-config --with-sphinx=/usr/local/sphinxclient
    sudo make && sudo make install

Edit php.ini to add the line `extension=sphinx.so`, then restart PHP.
Run `php -m` to check that the sphinx extension is now listed.
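The `php -m` check can be scripted so it gives a clear yes or no. A minimal sketch, assuming the extension registers under the module name `sphinx`; `has_sphinx_module` is a hypothetical helper that reads a module list on standard input.

```shell
#!/bin/sh
# Succeed if the module list on stdin contains a line that is exactly "sphinx".
has_sphinx_module() {
    grep -qix 'sphinx'
}

# Only run the check when the php CLI is available.
if command -v php >/dev/null 2>&1; then
    if php -m | has_sphinx_module; then
        echo "sphinx extension loaded"
    else
        echo "sphinx extension missing; check extension=sphinx.so in php.ini"
    fi
fi
```

Note that the CLI and the web server may read different php.ini files, so a positive result here only confirms the CLI configuration.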


PHP usage example code:
https://git.oschina.net/tanjiajun/sphinx.git