[Repost] FileSwitchDirectory: How It Works and How to Use It

Date: 2020-09-11

Reposted from http://blog.csdn.net/duck_genuine/article/details/8006134html

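FileSwitchDirectory is another Directory implementation in Lucene. As the name suggests, it switches between directories per file: different kinds of index files are served by different Directory implementations. With FileSwitchDirectory you can combine the strengths of several Directory implementations in one.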

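Take MMapDirectory. It improves performance through memory-mapped files, but you want to keep paging to a minimum: when the index is too large, the mapped pages have to be swapped in and out constantly, and the advantage can turn into a drawback. NIOFSDirectory, on the other hand, uses Java NIO to improve performance under high concurrency, but high concurrency also brings a heavy IO load. FileSwitchDirectory lets you use the strengths of both.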

Implementation differences between MMapDirectory and NIOFSDirectory

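NIOFSDirectory: simply wraps its read buffer in a ByteBuffer and reads the file through it. Note that ByteBuffer.wrap produces a heap buffer backed by the byte[], not direct memory.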

@Override
protected void newBuffer(byte[] newBuffer) {
  super.newBuffer(newBuffer);
  byteBuf = ByteBuffer.wrap(newBuffer);
}
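To make the NIO approach concrete, here is a minimal sketch that uses only JDK classes. It assumes the read path boils down to a positional FileChannel.read into a ByteBuffer wrapped around a byte[]; NioReadSketch and readAt are hypothetical names for illustration, not Lucene code.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not Lucene code: positional reads through a FileChannel.
final class NioReadSketch {

    // Fills 'buffer' with bytes starting at 'position'; returns the number of bytes read.
    static int readAt(FileChannel channel, long position, byte[] buffer) throws IOException {
        ByteBuffer wrapped = ByteBuffer.wrap(buffer); // heap buffer view over the byte[]
        int total = 0;
        while (wrapped.hasRemaining()) {
            // Positional read: it does not move the channel's own position,
            // so several threads can share one channel without seeking.
            int n = channel.read(wrapped, position + total);
            if (n < 0) {
                break; // end of file
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nio-sketch", ".bin");
        try {
            Files.write(file, "hello lucene".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                byte[] buffer = new byte[6];
                int n = readAt(channel, 6, buffer);
                System.out.println(n + " bytes: " + new String(buffer, 0, n, StandardCharsets.US_ASCII));
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Because every read names its own file offset, concurrent readers do not have to synchronize on a shared file pointer, which is the property that helps under high concurrency.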

MMapDirectory: maps files into memory with mmap. By default each mapped chunk is up to 1 GB on a 64-bit JVM or 256 MB on a 32-bit system:

this.buffers[bufNr] = rafc.map(MapMode.READ_ONLY, bufferStart, bufSize);
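The mapping line above can be tried out with plain JDK classes. The following is a sketch under the assumption that a file is mapped as consecutive read-only chunks of a fixed maximum size; ChunkedMmapSketch, mapInChunks and readByte are hypothetical names, and the tiny chunk size in main is only there to force several chunks.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not Lucene code: maps a file as a series of chunks.
final class ChunkedMmapSketch {

    // Maps the whole file read-only as consecutive chunks of at most 'chunkSize' bytes.
    static MappedByteBuffer[] mapInChunks(FileChannel channel, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be > 0");
        }
        long length = channel.size();
        int chunkCount = (int) ((length + chunkSize - 1) / chunkSize); // ceiling division
        MappedByteBuffer[] buffers = new MappedByteBuffer[chunkCount];
        long bufferStart = 0;
        for (int bufNr = 0; bufNr < chunkCount; bufNr++) {
            long bufSize = Math.min(chunkSize, length - bufferStart); // last chunk may be shorter
            buffers[bufNr] = channel.map(MapMode.READ_ONLY, bufferStart, bufSize);
            bufferStart += bufSize;
        }
        return buffers;
    }

    // Reads one byte at an absolute file offset by locating the chunk that holds it.
    static byte readByte(MappedByteBuffer[] buffers, int chunkSize, long offset) {
        int bufNr = (int) (offset / chunkSize);
        int within = (int) (offset % chunkSize);
        return buffers[bufNr].get(within);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap-sketch", ".bin");
        file.toFile().deleteOnExit(); // a mapped file may not be deletable right away on some platforms
        Files.write(file, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer[] buffers = mapInChunks(channel, 4);
            System.out.println("chunks: " + buffers.length);
            System.out.println("byte at offset 9: " + readByte(buffers, 4, 9));
        }
    }
}
```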

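The idea is to use MMapDirectory for the files that take up only a small share of the index directory, so that nearly all of them can stay mapped in memory, and to leave the stored-document files, which take up most of the space, to NIOFSDirectory.

It is a nice combination.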

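FileSwitchDirectory code walkthrough

The code of FileSwitchDirectory is simple. You can think of it as a DAO-style entry point that also acts as a controller, so it contains no file-handling logic of its own.

Start with its constructor: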

public FileSwitchDirectory(Set<String> primaryExtensions, Directory primaryDir, Directory secondaryDir, boolean doClose) {  
  this.primaryExtensions = primaryExtensions;  
  this.primaryDir = primaryDir;  
  this.secondaryDir = secondaryDir;  
  this.doClose = doClose;  
  this.lockFactory = primaryDir.getLockFactory();  
}  

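First, the set of file extensions that the primary Directory handles

The primary Directory

The secondary Directory

Whether close() should also close the two wrapped directories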

So every call is delegated to the matching Directory to obtain the IndexInput or IndexOutput:

@Override  
  public IndexInput openInput(String name) throws IOException {  
    return getDirectory(name).openInput(name);  
  }  

@Override  
 public IndexOutput createOutput(String name) throws IOException {  
   return getDirectory(name).createOutput(name);  
 }  

The matching Directory is looked up from the file name:

private Directory getDirectory(String name) {  
  String ext = getExtension(name);  
  if (primaryExtensions.contains(ext)) {  
    return primaryDir;  
  } else {  
    return secondaryDir;  
  }  
}  
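The routing rule can be exercised without Lucene on the classpath. The sketch below is a stand-in rather than the real class: ExtensionRouterSketch is a hypothetical name, the two targets are plain values instead of Directory instances, and getExtension is assumed to return the text after the last dot, or an empty string when the name has no dot.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the routing logic; T plays the role of Directory.
final class ExtensionRouterSketch<T> {
    private final Set<String> primaryExtensions;
    private final T primary;
    private final T secondary;

    ExtensionRouterSketch(Set<String> primaryExtensions, T primary, T secondary) {
        this.primaryExtensions = primaryExtensions;
        this.primary = primary;
        this.secondary = secondary;
    }

    // Assumed behavior: text after the last '.', or "" when there is no '.'.
    static String getExtension(String name) {
        int i = name.lastIndexOf('.');
        return i == -1 ? "" : name.substring(i + 1);
    }

    // Listed extensions go to the primary target, everything else to the secondary one.
    T route(String name) {
        return primaryExtensions.contains(getExtension(name)) ? primary : secondary;
    }

    public static void main(String[] args) {
        Set<String> exts = new HashSet<String>(Arrays.asList("fdt", "fdx"));
        ExtensionRouterSketch<String> router =
                new ExtensionRouterSketch<String>(exts, "primary", "secondary");
        for (String name : new String[] {"_y8b.fdt", "_y8b.frq", "segments_1p"}) {
            System.out.println(name + " -> " + router.route(name));
        }
    }
}
```

Files without an extension, such as segments_1p, never match the set and therefore always land in the secondary target.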

A DirectoryFactory implementation for Solr

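import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.index.IndexFileNames;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.FileSwitchDirectory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.core.DirectoryFactory;

/**
 * Lets files with certain extensions skip memory mapping, for example fdt and fdx.
 *
 * <directoryFactory class="solr.MMapDirectoryFactoryExt">
 *   <bool name="unmap">true</bool>
 *   <lst name="filetypes">
 *     <bool name="fdt">false</bool>
 *     <bool name="fdx">false</bool>
 *   </lst>
 * </directoryFactory>
 */
public class MMapDirectoryFactoryExt extends DirectoryFactory {
    // Extensions that must not be memory-mapped
    private Set<String> nonMappedFiles = new HashSet<String>();
    // Whether to enable MMapDirectory's unmap option
    private Boolean useUnmapHack = false;

    public Directory open(String path) throws IOException {
        MMapDirectory mmapDir = new MMapDirectory(new File(path));
        mmapDir.setUseUnmap(useUnmapHack);
        // FileSwitchDirectory sends the listed extensions to its primary directory,
        // so the non-mapped extensions need the plain FSDirectory as primary and
        // everything else falls through to mmapDir.
        // FSDirectory.open picks an implementation per platform; use
        // new NIOFSDirectory(new File(path)) to force NIO.
        return new FileSwitchDirectory(nonMappedFiles, FSDirectory.open(new File(path)), mmapDir, true);
    }

    public void init(NamedList args) {
        Object unmap, namedlist;
        nonMappedFiles = new HashSet<String>();
        // "unmap" must be configured as <bool>; a <str> value arrives as a String and is ignored here
        if ((unmap = args.get("unmap")) instanceof Boolean)
            useUnmapHack = (Boolean) unmap;
        if ((namedlist = args.get("filetypes")) instanceof NamedList) {
            NamedList filetypes = (NamedList) namedlist;
            // An extension explicitly set to false is excluded from mapping
            for (String type : IndexFileNames.INDEX_EXTENSIONS) {
                Object mapped = filetypes.get(type);
                if (Boolean.FALSE.equals(mapped))
                    nonMappedFiles.add(type);
            }
        }
    }
}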

Configuration in solrconfig.xml for the new DirectoryFactory:

<directoryFactory class="solr.MMapDirectoryFactoryExt">
  <bool name="unmap">true</bool>
  <lst name="filetypes">
    <bool name="fdt">false</bool>
    <bool name="tii">false</bool>
  </lst>
</directoryFactory>

Index file sizes in production:

7.3G ./_y8b.fdt

201M ./_y8b.fdx

4.0K ./_y8b.fnm

1.8G ./_y8b.frq

76M ./_y8b.nrm

537M ./_y8b.prx

7.1M ./_y8b.tii

571M ./_y8b.tis

4.0K ./segments.gen

4.0K ./segments_1p

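The tii file is loaded into memory anyway, so it does not need to be mapped. The fdt file is too large and mainly holds the stored (forward) document data, so it can be read through NIOFSDirectory.

The frq file is also quite large, which needs to be taken into account as well.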
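To get a feel for how much data would still be memory-mapped with fdt and tii excluded, the sizes from the listing can be added up. This is a rough sketch: MappedSizeSketch is a hypothetical name, the figures are the rounded values from the listing, 1G is assumed to mean 1024M, and the two 4 KB segments files are left out as negligible.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: sums the listed file sizes that would still be mapped.
final class MappedSizeSketch {

    // Sizes in MB keyed by extension, taken from the listing (assuming 1G = 1024M).
    static Map<String, Double> listedSizesMb() {
        Map<String, Double> sizes = new LinkedHashMap<String, Double>();
        sizes.put("fdt", 7.3 * 1024);
        sizes.put("fdx", 201.0);
        sizes.put("fnm", 4.0 / 1024);
        sizes.put("frq", 1.8 * 1024);
        sizes.put("nrm", 76.0);
        sizes.put("prx", 537.0);
        sizes.put("tii", 7.1);
        sizes.put("tis", 571.0);
        return sizes;
    }

    // Total size in MB of all extensions that are not in 'nonMapped'.
    static double mappedMb(Map<String, Double> sizesMb, Set<String> nonMapped) {
        double total = 0;
        for (Map.Entry<String, Double> entry : sizesMb.entrySet()) {
            if (!nonMapped.contains(entry.getKey())) {
                total += entry.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> sizes = listedSizesMb();
        Set<String> nonMapped = new HashSet<String>(Arrays.asList("fdt", "tii"));
        double mapped = mappedMb(sizes, nonMapped);
        System.out.printf("mapped: %.1f MB, frq share: %.0f%%%n",
                mapped, 100 * sizes.get("frq") / mapped);
    }
}
```

With these figures roughly 3.2 GB remains mapped, and frq alone accounts for more than half of it, which is why it deserves a second look.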

public final void setMaxChunkSize(final int maxChunkSize) {  
  if (maxChunkSize <= 0)  
    throw new IllegalArgumentException("Maximum chunk size for mmap must be >0");  
  //System.out.println("Requested chunk size: "+maxChunkSize);  
  this.chunkSizePower = 31 - Integer.numberOfLeadingZeros(maxChunkSize);  
  assert this.chunkSizePower >= 0 && this.chunkSizePower <= 30;  
  //System.out.println("Got chunk size: "+getMaxChunkSize());  
}  

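As the code above shows, a single mapped chunk can be at most 1 GB (2^30 bytes), so a larger file such as frq has to be mapped as several chunks. Unfortunate.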
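The 1 GB cap follows directly from the arithmetic in setMaxChunkSize. The sketch below repeats that calculation in isolation; ChunkSizeSketch and effectiveChunkSize are hypothetical names, and the effective size is assumed to be 1 << chunkSizePower.

```java
// Hypothetical helper that isolates the chunk size arithmetic.
final class ChunkSizeSketch {

    // Same calculation as in setMaxChunkSize: floor(log2(maxChunkSize)).
    static int chunkSizePower(int maxChunkSize) {
        if (maxChunkSize <= 0) {
            throw new IllegalArgumentException("Maximum chunk size for mmap must be >0");
        }
        return 31 - Integer.numberOfLeadingZeros(maxChunkSize);
    }

    // Assumed effective chunk size: the largest power of two not above the request.
    static int effectiveChunkSize(int maxChunkSize) {
        return 1 << chunkSizePower(maxChunkSize);
    }

    public static void main(String[] args) {
        int[] requests = {Integer.MAX_VALUE, 1 << 30, 1 << 28, 1000};
        for (int request : requests) {
            System.out.println(request + " -> power " + chunkSizePower(request)
                    + ", chunk " + effectiveChunkSize(request) + " bytes");
        }
    }
}
```

Since the parameter is an int, the largest possible request is Integer.MAX_VALUE, which rounds down to a power of 30, that is 2^30 bytes per chunk.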