Solr Search Basics

Date: 2019-11-19

Tags: solr, search basics

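The library and code used in this example both come from:

http://www.cnblogs.com/TerryLiang/archive/2011/04/17/2018962.html

We use C# to simulate searching, index creation, deletion, and updating.

[Screenshot: the demo application]

1) Preparation:

First, prepare an entity class, Product: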

    using System;

    // Entity class whose properties mirror the fields of a Solr document.
    public class Product
    {
        public string ID { get; set; }
        public string Name { get; set; }
        public string[] Features { get; set; }   // multi-valued field
        public float Price { get; set; }
        public int Popularity { get; set; }
        public bool InStock { get; set; }
        public DateTime Incubationdate_dt { get; set; }
    }

Next, create a deserializer class, ProductDeserializer, for this entity:

  class ProductDeserializer : IObjectDeserializer<Product>
  {
      public IEnumerable<Product> Deserialize(SolrDocumentList result)
      {
          foreach (SolrDocument doc in result)
          {
              yield return new Product()
              {
                  ID = doc["id"].ToString(),
                  Name = doc["name"].ToString(),
                  Features = (string[])((ArrayList)doc["features"]).ToArray(typeof(string)),
                  Price = (float)doc["price"],
                  Popularity = (int)doc["popularity"],
                  InStock = (bool)doc["inStock"],
                  Incubationdate_dt = (DateTime)doc["incubationdate_dt"]
              };
          }
      }
  }

Add a reference to EasyNet.Solr.dll to the project.

2) Building the search:

Initialize the Solr client:

        #region Initialization
        static List<SolrInputDocument> docs = new List<SolrInputDocument>();
        static OptimizeOptions optimizeOptions = new OptimizeOptions();
        static ISolrResponseParser<NamedList, ResponseHeader> binaryResponseHeaderParser = new BinaryResponseHeaderParser();
        static IUpdateParametersConvert<NamedList> updateParametersConvert = new BinaryUpdateParametersConvert();
        static ISolrUpdateConnection<NamedList, NamedList> solrUpdateConnection = new SolrUpdateConnection<NamedList, NamedList>() { ServerUrl = "http://localhost:8080/solr/" };
        static ISolrUpdateOperations<NamedList> updateOperations = new SolrUpdateOperations<NamedList, NamedList>(solrUpdateConnection, updateParametersConvert) { ResponseWriter = "javabin" };

        static ISolrQueryConnection<NamedList> connection = new SolrQueryConnection<NamedList>() { ServerUrl = "http://localhost:8080/solr/" };
        static ISolrQueryOperations<NamedList> operations = new SolrQueryOperations<NamedList>(connection) { ResponseWriter = "javabin" };

        static IObjectDeserializer<Product> exampleDeserializer = new ProductDeserializer();
        static ISolrResponseParser<NamedList, QueryResults<Product>> binaryQueryResultsParser = new BinaryQueryResultsParser<Product>(exampleDeserializer);
        #endregion

First we simulate a data source; a few records are hard-coded here as sample data:

            List<Product> products = new List<Product>();
            Product juzi = new Product
            {
                ID = "SOLR1000",
                Name = "浙江桔子",
                Features = new String[] { 
                    "色香味兼优", 
                    "既可鲜食，又可加工成以果汁",
                    "果实养分丰富"},
                Price = 2.0f,
                Popularity = 100,
                InStock = true,
                Incubationdate_dt = new DateTime(2006, 1, 17, 0, 0, 0, DateTimeKind.Utc)
            };
            products.Add(juzi);

            var doc = new SolrInputDocument();
            doc.Add("id", new SolrInputField("id", juzi.ID));
            doc.Add("name", new SolrInputField("name", juzi.Name));
            doc.Add("features", new SolrInputField("features", juzi.Features));
            doc.Add("price", new SolrInputField("price", juzi.Price));
            doc.Add("popularity", new SolrInputField("popularity", juzi.Popularity));
            doc.Add("inStock", new SolrInputField("inStock", juzi.InStock));
            doc.Add("incubationdate_dt", new SolrInputField("incubationdate_dt", juzi.Incubationdate_dt));

            docs.Add(doc);

            Product pingguo = new Product
            {
                ID = "SOLR1002",
                Name = "陕西苹果",
                Features = new String[] { 
                "味道甜美",
                "光泽鲜艳", 
                "养分丰富"
            },
                Price = 1.7f,
                Popularity = 50,
                InStock = true,
                Incubationdate_dt = new DateTime(2010, 1, 17, 0, 0, 0, DateTimeKind.Utc)
            };
            products.Add(pingguo);
            var doc2 = new SolrInputDocument();
            doc2.Add("id", new SolrInputField("id", pingguo.ID));
            doc2.Add("name", new SolrInputField("name", pingguo.Name));
            doc2.Add("features", new SolrInputField("features", pingguo.Features));
            doc2.Add("price", new SolrInputField("price", pingguo.Price));
            doc2.Add("popularity", new SolrInputField("popularity", pingguo.Popularity));
            doc2.Add("inStock", new SolrInputField("inStock", pingguo.InStock));
            doc2.Add("incubationdate_dt", new SolrInputField("incubationdate_dt", pingguo.Incubationdate_dt));

            docs.Add(doc2);

            dataGridView1.DataSource = products;

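The same data is also added to a List<SolrInputDocument>. SolrInputDocument is the document-exchange entity written by TerryLiang; you can find it in the source code he provides.

1. Creating the index:

Creating an index means passing the raw data to Solr, which then writes files in a specific format under the Solr directory. Solr can query these files quickly.

[Figure: index files created under the Solr directory]

Creating the index really just means using Update to POST the data to collection1. The code is as follows: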

            var result = updateOperations.Update("collection1", "/update", new UpdateOptions() { OptimizeOptions = optimizeOptions, Docs = docs });
            var header = binaryResponseHeaderParser.Parse(result);

            lbl_info.Text= string.Format("Update Status:{0} QTime:{1}", header.Status, header.QTime);
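To make "POST the data to collection1" more concrete, here is a minimal sketch of what an update request body can look like. It builds Solr's standard XML update message (`<add><doc>...`) rather than the javabin binary format that the demo sends through EasyNet.Solr, so it illustrates the idea and is not what the library actually puts on the wire. The class and method names (`SolrXmlSketch`, `BuildAddXml`) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of EasyNet.Solr.
public static class SolrXmlSketch
{
    // Builds an <add> message containing a single <doc>.
    // A multi-valued field (such as "features") is written as
    // several <field> elements that share the same name.
    public static string BuildAddXml(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var doc = new XElement("doc",
            fields.Select(f => new XElement("field",
                new XAttribute("name", f.Key),
                f.Value)));

        // DisableFormatting keeps the output on one line, without indentation.
        return new XElement("add", doc).ToString(SaveOptions.DisableFormatting);
    }
}
```

A body like this would typically be POSTed to the `/update` handler of the core with an XML content type, followed by a commit so that the new documents become searchable.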

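Once indexing succeeds, we can query the data in the Solr admin UI.

[Screenshot: query results in the Solr admin UI]

Note: every time you search in the admin UI, the URL used for the search is shown in the upper-right corner: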

http://localhost:8080/solr/collection1/select?q=*%3A*&wt=json&indent=true

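The parameters are fairly simple: q is the query string (here the URL-encoded form of *:*, which matches every document), wt selects the response format, and indent asks for a pretty-printed response. The Solr documentation describes the rest.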
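As a rough sketch of how such a URL is put together, the helper below assembles the same three parameters (`q`, `wt`, `indent`) for an arbitrary query. The names `SelectUrlSketch` and `BuildSelectUrl` are made up for illustration; only `Uri.EscapeDataString` is a real .NET API.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class SelectUrlSketch
{
    // serverUrl: e.g. "http://localhost:8080/solr/"
    // core:      e.g. "collection1"
    // query:     raw Solr query text, e.g. "*:*" or "name:apple"
    public static string BuildSelectUrl(string serverUrl, string core, string query)
    {
        // EscapeDataString percent-encodes reserved characters such as ':'.
        // Depending on the .NET version, '*' may be left as-is or encoded
        // as %2A; both forms mean the same thing to the server.
        return serverUrl.TrimEnd('/') + "/" + core + "/select"
            + "?q=" + Uri.EscapeDataString(query)
            + "&wt=json&indent=true";
    }
}
```

For example, `BuildSelectUrl("http://localhost:8080/solr/", "collection1", "name:apple")` yields a URL whose query part is `q=name%3Aapple&wt=json&indent=true`.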

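2. Creating a query

A query is simply a request submitted to the server, followed by waiting for the server to return the results. Any language will do, as long as it can send the request and receive the response; here we use the client library.

First create an ISolrQuery object and pass in the search keywords. How to build the keywords can be worked out from the Solr admin UI:

Suppose we want to find records whose name contains "苹果" (apple). In the admin UI we would enter:

[Screenshot: the query entered in the Solr admin UI]

If you want to know how Solr builds the query, tick the DebugQuery option to get debugging information:

[Screenshot: DebugQuery output]

This means the search runs only against the name field.

So in our code we need to write: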

ISolrQuery query = new SolrQuery("name:"+keyWord);

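Concatenating user input into the query string like this has security implications; handling them is left to the reader.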
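One common precaution is to backslash-escape the characters that the Lucene/Solr query parser treats as syntax before the keyword is concatenated. The sketch below follows the same character set that SolrJ's `ClientUtils.escapeQueryChars` uses; the C# class and method names here are hypothetical, and this is only one possible mitigation, not a complete answer to input validation.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of EasyNet.Solr.
public static class QueryEscapeSketch
{
    // Characters with special meaning in the standard query syntax.
    private const string Special = "\\+-!():^[]\"{}~*?|&;/";

    // Prefixes every special character and every whitespace
    // character with a backslash so it is matched literally.
    public static string EscapeQueryChars(string input)
    {
        var sb = new StringBuilder(input.Length * 2);
        foreach (char c in input)
        {
            if (Special.IndexOf(c) >= 0 || char.IsWhiteSpace(c))
            {
                sb.Append('\\');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this in place, the query could be built as `new SolrQuery("name:" + QueryEscapeSketch.EscapeQueryChars(keyWord))`, so that input such as `x OR id:*` is searched as literal text instead of being interpreted as query operators.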

Querying everything, however, is much simpler:

ISolrQuery query = SolrQuery.All;

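After sending the query to the server, we turn the data it returns back into objects and display them, which completes one query operation. The code is as follows: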

            ISolrQuery query = SolrQuery.All;
            if (!string.IsNullOrWhiteSpace(keyWord))
            {
                query = new SolrQuery("name:"+keyWord);
            }
            var result = operations.Query("collection1", "/select", query, null);
            var header = binaryResponseHeaderParser.Parse(result);

            var examples = binaryQueryResultsParser.Parse(result);

            lbl_info.Text= string.Format("Query Status:{0} QTime:{1} Total:{2}", header.Status, header.QTime, examples.NumFound);
            dataGridView1.DataSource = examples.ToList();

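3. Incremental indexing

In practice, data is frequently added or changed, so the index has to be updated promptly for the new data to be searchable. This is incremental indexing. It works the same way as the initial indexing: to update existing data, submit the new version again; to add data, submit the new documents. Documents are matched by id. This is a configuration item that can be found in D:\apache-tomcat-7.0.57\webapps\solr\solr_home\collection1\conf\schema.xml:

[Screenshot: the unique key setting in schema.xml]

You can think of it as the primary key.

The code is as follows: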

             var docs = new List<SolrInputDocument>();
             Product hetao = new Product
             {
                 ID = "SOLR1003",
                 Name = "陕西山核桃",
                 Features = new String[] { 
                "养分好吃",
                "微量元素丰富", 
                "补脑"
            },
                 Price = 1.7f,
                 Popularity = 50,
                 InStock = true,
                 Incubationdate_dt = new DateTime(2010, 1, 17, 0, 0, 0, DateTimeKind.Utc)
             };
             var doc2 = new SolrInputDocument();
             doc2.Add("id", new SolrInputField("id", hetao.ID));
             doc2.Add("name", new SolrInputField("name", hetao.Name));
             doc2.Add("features", new SolrInputField("features", hetao.Features));
             doc2.Add("price", new SolrInputField("price", hetao.Price));
             doc2.Add("popularity", new SolrInputField("popularity", hetao.Popularity));
             doc2.Add("inStock", new SolrInputField("inStock", hetao.InStock));
             doc2.Add("incubationdate_dt", new SolrInputField("incubationdate_dt", hetao.Incubationdate_dt));
             docs.Clear();
             docs.Add(doc2);

             var result = updateOperations.Update("collection1", "/update", new UpdateOptions() { OptimizeOptions = optimizeOptions, Docs = docs });
             var header = binaryResponseHeaderParser.Parse(result);

             lbl_info.Text= string.Format("Update Status:{0} QTime:{1}", header.Status, header.QTime);
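The "same id overwrites, new id adds" rule described in this section can be pictured with a plain in-memory dictionary. This is only an analogy for the behavior, written under the assumption that the unique key is `id`; it does not talk to Solr, and `UniqueKeySketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// In-memory analogy only; this does not call Solr.
public static class UniqueKeySketch
{
    // index maps a document id to its name field.
    // Submitting an id that already exists replaces the old entry;
    // submitting a new id adds an entry.
    public static void Upsert(IDictionary<string, string> index, string id, string name)
    {
        index[id] = name;
    }
}
```

Upserting SOLR1002, then SOLR1003, then SOLR1002 again leaves two entries, with SOLR1002 holding the most recently submitted value.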

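4. Deleting from the index

As with deleting from a database, deletion is of course done by primary key. Pass in the delete option, carrying the primary key name and value, and send it to the server.

The code is as follows: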

              var result = updateOperations.Update("collection1", "/update", new UpdateOptions() { OptimizeOptions = optimizeOptions, DelById = new string[] { id } });
              var header = binaryResponseHeaderParser.Parse(result);

              lbl_info.Text=string.Format("Update Status:{0} QTime:{1}", header.Status, header.QTime);
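For comparison, the equivalent request in Solr's standard XML update format is a `<delete>` message listing the ids to remove. The sketch below builds that message; as with the indexing example, the demo itself sends javabin through EasyNet.Solr, so this shows the shape of the request rather than the library's actual payload. `SolrDeleteSketch` and `BuildDeleteByIdXml` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of EasyNet.Solr.
public static class SolrDeleteSketch
{
    // Builds <delete><id>...</id><id>...</id></delete> for the given keys.
    public static string BuildDeleteByIdXml(IEnumerable<string> ids)
    {
        return new XElement("delete", ids.Select(id => new XElement("id", id)))
            .ToString(SaveOptions.DisableFormatting);
    }
}
```

As with additions, the deletion only becomes visible to searches after a commit.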

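That completes the most basic cycle of creating an index, updating it, deleting from it, and querying it. Queries in this example are not as fast as working directly in the admin UI; the reason is serialization and deserialization. As noted above, any language can query Solr as long as it can send a request and receive the response, so you can bypass this step to improve query efficiency.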
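A minimal sketch of that direct approach: request the `/select` URL with `wt=json` and read only the values you need from the response, without mapping every document to an object. It assumes a runtime where `System.Text.Json` is available (.NET Core 3.0 or later) and the usual Solr JSON layout with a top-level `response` object; the class and method names are hypothetical. Whether this is actually faster for a given workload would need to be measured.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class JsonResponseSketch
{
    // Fetches the raw JSON text for a select URL (wt=json).
    public static Task<string> FetchAsync(HttpClient http, string selectUrl)
    {
        return http.GetStringAsync(selectUrl);
    }

    // Reads response.numFound from a Solr JSON response.
    public static long ReadNumFound(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement
                .GetProperty("response")
                .GetProperty("numFound")
                .GetInt64();
        }
    }
}
```

A caller would pass the text returned by `FetchAsync` to `ReadNumFound`, or walk `response.docs` in the same way to pull out individual fields.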
