Lucene 4.3 Advanced Development: The Long Road of Practice (Part 4)

Date: 2019-12-07

Tags: lucene4.3, lucene, advanced development

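This post briefly analyzes the role of the IndexWriterConfig class. IndexWriterConfig is not a top-level base class: it has a parent class, LiveIndexWriterConfig, so let's look at the parent first. LiveIndexWriterConfig was introduced in the 4.0 line and did not exist in earlier releases. So why was it introduced?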

Let's first look at part of the source of LiveIndexWriterConfig:

  private final Analyzer analyzer;
  
  private volatile int maxBufferedDocs;
  private volatile double ramBufferSizeMB;
  private volatile int maxBufferedDeleteTerms;
  private volatile int readerTermsIndexDivisor;
  private volatile IndexReaderWarmer mergedSegmentWarmer;
  private volatile int termIndexInterval; // TODO: this should be private to the codec, not settable here

  // modified by IndexWriterConfig
  /** {@link IndexDeletionPolicy} controlling when commit
   *  points are deleted. */
  protected volatile IndexDeletionPolicy delPolicy;

  /** {@link IndexCommit} that {@link IndexWriter} is
   *  opened on. */
  protected volatile IndexCommit commit;

  /** {@link OpenMode} that {@link IndexWriter} is opened
   *  with. */
  protected volatile OpenMode openMode;

  /** {@link Similarity} to use when encoding norms. */
  protected volatile Similarity similarity;

  /** {@link MergeScheduler} to use for running merges. */
  protected volatile MergeScheduler mergeScheduler;

  /** Timeout when trying to obtain the write lock on init. */
  protected volatile long writeLockTimeout;

  /** {@link IndexingChain} that determines how documents are
   *  indexed. */
  protected volatile IndexingChain indexingChain;

  /** {@link Codec} used to write new segments. */
  protected volatile Codec codec;

  /** {@link InfoStream} for debugging messages. */
  protected volatile InfoStream infoStream;

  /** {@link MergePolicy} for selecting merges. */
  protected volatile MergePolicy mergePolicy;

  /** {@code DocumentsWriterPerThreadPool} to control how
   *  threads are allocated to {@code DocumentsWriterPerThread}. */
  protected volatile DocumentsWriterPerThreadPool indexerThreadPool;

  /** True if readers should be pooled. */
  protected volatile boolean readerPooling;

  /** {@link FlushPolicy} to control when segments are
   *  flushed. */
  protected volatile FlushPolicy flushPolicy;

  /** Sets the hard upper bound on RAM usage for a single
   *  segment, after which the segment is forced to flush. */
  protected volatile int perThreadHardLimitMB;

  /** {@link Version} that {@link IndexWriter} should emulate. */
  protected final Version matchVersion;

  /** True if segment flushes should use compound file format */
  protected volatile boolean useCompoundFile = IndexWriterConfig.DEFAULT_USE_COMPOUND_FILE_SYSTEM;

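After reading it, we notice that apart from the analyzer and the version, which are final fields, every field in this class is marked volatile. That tells us what the class is for. Besides holding the configuration shared by the whole writer, it gathers the common settings of IndexWriterConfig in one place, including the ones that can still be changed while the writer is running. "Global" here refers to visibility in JVM main memory: volatile guarantees that a write to such a field by one thread is immediately visible to every other thread that reads it. So when a setting is changed on the live configuration (the object returned by IndexWriter.getConfig()), the threads of the running IndexWriter see the new value and adjust their behavior accordingly.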
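
The visibility guarantee described above can be sketched without Lucene. The following is a minimal illustration, not Lucene's code: LiveSettings and LiveSettingsDemo are hypothetical names, and the single field only borrows the name ramBufferSizeMB from the listing. A worker thread spins until it sees the value written by the calling thread; because the field is volatile, the loop is guaranteed to terminate.

```java
// Hypothetical stand-in for a live configuration object: one volatile setting.
class LiveSettings {
    private volatile double ramBufferSizeMB = 16.0;

    double getRamBufferSizeMB() {
        return ramBufferSizeMB;
    }

    LiveSettings setRamBufferSizeMB(double mb) {
        this.ramBufferSizeMB = mb;
        return this;
    }
}

class LiveSettingsDemo {
    // Starts a worker that waits until it observes newValue, then returns what the worker saw.
    static double observeChange(LiveSettings settings, double newValue) {
        double[] seen = new double[1];
        Thread worker = new Thread(() -> {
            // Busy-wait until the write made by the other thread becomes visible.
            while (settings.getRamBufferSizeMB() != newValue) {
                Thread.yield();
            }
            seen[0] = settings.getRamBufferSizeMB();
        });
        worker.start();
        settings.setRamBufferSizeMB(newValue); // volatile write from the calling thread
        try {
            worker.join(); // join() also makes seen[0] safely readable here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(observeChange(new LiveSettings(), 64.0)); // prints 64.0
    }
}
```

Without volatile, the Java memory model would not guarantee that the worker's loop ever sees the update.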

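Its important methods include setting the maximum number of buffered documents, the RAM buffer size, the deletion and merge policies, whether to use the compound file format, and a custom scoring strategy (Similarity).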

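IndexWriterConfig is the subclass of LiveIndexWriterConfig. Most of the fields it declares itself are static, mainly constants holding default values. Its role is inherited directly from its parent: it acts as a global configuration and supplies IndexWriter with its initialization parameters.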

Next, let's focus on OpenMode, a static nested type (an enum) inside IndexWriterConfig.

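What matters most in it are its three enum constants: CREATE, APPEND and CREATE_OR_APPEND. It also has two methods, values() and valueOf(String name). The former returns all the enum constants; the latter returns the OpenMode with the given name. Note that the string must match one of the three constant names exactly, otherwise valueOf throws an IllegalArgumentException.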
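
Since values() and valueOf(String) are the standard methods every Java enum gets, their behavior can be tried without Lucene on the classpath. In the sketch below, Mode is a stand-in enum with the same three constant names as OpenMode, and parse is a hypothetical helper, not part of Lucene.

```java
import java.util.Arrays;

class OpenModeLookupDemo {
    // Stand-in with the same three constant names as IndexWriterConfig.OpenMode.
    enum Mode { CREATE, APPEND, CREATE_OR_APPEND }

    // Returns the mode with exactly this name, or the fallback if no constant matches.
    static Mode parse(String name, Mode fallback) {
        try {
            return Mode.valueOf(name);
        } catch (IllegalArgumentException e) {
            // valueOf is case-sensitive and rejects anything but an exact constant name.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(Mode.values())); // [CREATE, APPEND, CREATE_OR_APPEND]
        System.out.println(parse("APPEND", Mode.CREATE_OR_APPEND)); // APPEND
        System.out.println(parse("append", Mode.CREATE_OR_APPEND)); // CREATE_OR_APPEND
    }
}
```

Falling back to CREATE_OR_APPEND is only a choice made for this example; code reading the mode from a configuration file might prefer to let the exception propagate.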

Here is what the three constants do:

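CREATE mode: every time an index is built, the previous index in the directory is cleared first and a new one is created. The index directory does not need to exist beforehand. This mode is generally used for testing.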

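APPEND mode: newly added documents are appended to the existing index. Note that if no index exists at the given path, opening the writer throws an exception, so make sure an index has already been created before using this mode.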

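CREATE_OR_APPEND mode: this is the default mode, and also the safest and most general one. If the index does not exist, a new one is created; if it already exists, added documents are appended to it just as in APPEND mode. Nothing unexpected happens in this mode, so it is the recommended choice for building an index in most cases.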
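
The three modes come down to one decision: start a fresh index, or open the existing one. The sketch below models that decision in plain Java under stated assumptions. Mode and shouldCreate are hypothetical names, and IllegalStateException is only a placeholder for whatever exception Lucene raises when APPEND finds no index.

```java
class OpenModeDecisionDemo {
    // Stand-in with the same three constant names as IndexWriterConfig.OpenMode.
    enum Mode { CREATE, APPEND, CREATE_OR_APPEND }

    // Returns true if a fresh index would be created (replacing any existing one),
    // false if the existing index would be opened and appended to.
    static boolean shouldCreate(Mode mode, boolean indexExists) {
        switch (mode) {
            case CREATE:
                return true; // always start over, whether or not an index exists
            case APPEND:
                if (!indexExists) {
                    // Placeholder exception for this sketch only.
                    throw new IllegalStateException("no existing index to append to");
                }
                return false;
            default: // CREATE_OR_APPEND
                return !indexExists; // create only when nothing is there yet
        }
    }

    public static void main(String[] args) {
        // With Lucene on the classpath, the mode itself is selected with
        // config.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND).
        System.out.println(shouldCreate(Mode.CREATE, true));            // true
        System.out.println(shouldCreate(Mode.APPEND, true));            // false
        System.out.println(shouldCreate(Mode.CREATE_OR_APPEND, false)); // true
        System.out.println(shouldCreate(Mode.CREATE_OR_APPEND, true));  // false
    }
}
```

The table of outcomes makes it easy to see why CREATE_OR_APPEND is the safe default: it never discards an existing index and never fails on a missing one.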