How Selector Is Implemented

This article walks through the typical usage flow of Selector to explain how it is implemented.
Let's start with a minimal Selector usage snippet:

ServerSocketChannel serverChannel = ServerSocketChannel.open();
serverChannel.configureBlocking(false);
int port = 5566;
serverChannel.socket().bind(new InetSocketAddress(port));
Selector selector = Selector.open();
serverChannel.register(selector, SelectionKey.OP_ACCEPT);
while (true) {
    int n = selector.select();
    if (n > 0) {
        Iterator<SelectionKey> iter = selector.selectedKeys().iterator();
        while (iter.hasNext()) {
            SelectionKey selectionKey = iter.next();
            ......
            iter.remove();
        }
    }
}
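
To watch the loop above run end to end, here is a minimal self-contained sketch. It is not part of the JDK sources: the class name SelectorLoopDemo, the helper acceptOne(), the loopback address, the ephemeral port and the 5-second timeout are all assumptions made for the demo.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical demo class; only the java.nio calls are real API.
final class SelectorLoopDemo {

    /** Accepts one loopback connection through a Selector and returns what select() reported. */
    static int acceptOne() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            // Port 0 lets the OS pick a free port (demo assumption).
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.register(selector, SelectionKey.OP_ACCEPT);

            // A blocking client connect; it completes through the listen backlog.
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                int n = selector.select(5000);
                Iterator<SelectionKey> iter = selector.selectedKeys().iterator();
                while (iter.hasNext()) {
                    SelectionKey key = iter.next();
                    if (key.isAcceptable()) {
                        // Accept and immediately close the server-side connection.
                        try (SocketChannel accepted = server.accept()) {
                            System.out.println("accepted: " + accepted);
                        }
                    }
                    iter.remove(); // always remove handled keys from selectedKeys
                }
                return n;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("keys reported by select(): " + acceptOne());
    }
}
```

The single pending connection makes the server channel acceptable, so select() reports one key for it.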

Instances of SocketChannel, ServerSocketChannel and Selector are all created through the SelectorProvider class, and Selector is the core of the whole NIO socket implementation.

ServerSocketChannel.open();

public static ServerSocketChannel open() throws IOException {
        return SelectorProvider.provider().openServerSocketChannel();
    }

SocketChannel.open();

public static SocketChannel open() throws IOException {
        return SelectorProvider.provider().openSocketChannel();
    }

Selector.open();

public static Selector open() throws IOException {
        return SelectorProvider.provider().openSelector();
    }

Let's take a closer look at SelectorProvider.provider():

public static SelectorProvider provider() {
        synchronized (lock) {
            if (provider != null)
                return provider;
            return AccessController.doPrivileged(
                new PrivilegedAction<>() {
                    public SelectorProvider run() {
                            if (loadProviderFromProperty())
                                return provider;
                            if (loadProviderAsService())
                                return provider;
                            provider = sun.nio.ch.DefaultSelectorProvider.create();
                            return provider;
                        }
                    });
        }
    }

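If the system property "java.nio.channels.spi.SelectorProvider" is set, the SelectorProvider named by its value is loaded and instantiated; if that fails, an error is thrown.
Otherwise, if a SelectorProvider implementation can be located as a service through the system class loader, it is used; a failure while loading it also results in an error.
If neither case applies, the system default SelectorProvider is returned, i.e. sun.nio.ch.DefaultSelectorProvider.create().
Any later call to SelectorProvider.provider() returns the result of the first call.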
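
The caching described in the last rule is easy to observe. The sketch below is only an illustration (the class ProviderLookupDemo and its helper are made-up names); it checks that two lookups return the same provider instance and prints the concrete provider class of the running JVM.

```java
import java.nio.channels.spi.SelectorProvider;

// Hypothetical demo class; SelectorProvider.provider() is the real API.
final class ProviderLookupDemo {

    /** Returns true when two consecutive lookups yield the very same provider object. */
    static boolean sameProviderOnRepeatedCalls() {
        SelectorProvider first = SelectorProvider.provider();
        SelectorProvider second = SelectorProvider.provider();
        return first == second;
    }

    public static void main(String[] args) {
        // The class name depends on the platform and JDK version.
        System.out.println(SelectorProvider.provider().getClass().getName());
        System.out.println("cached: " + sameProviderOnRepeatedCalls());
    }
}
```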

Different operating systems ship different implementations of sun.nio.ch.DefaultSelectorProvider.

Here we look at the Linux version of sun.nio.ch.DefaultSelectorProvider:

public class DefaultSelectorProvider {

    /**
     * Prevent instantiation.
     */
    private DefaultSelectorProvider() { }

    /**
     * Returns the default SelectorProvider.
     */
    public static SelectorProvider create() {
        return new sun.nio.ch.EPollSelectorProvider();
    }

}

As you can see, on Linux sun.nio.ch.DefaultSelectorProvider.create() produces a SelectorProvider of type sun.nio.ch.EPollSelectorProvider, which corresponds to epoll on Linux.

Next, let's look at Selector.open():

/**
     * Opens a selector.
     *
     * <p> The new selector is created by invoking the {@link
     * java.nio.channels.spi.SelectorProvider#openSelector openSelector} method
     * of the system-wide default {@link
     * java.nio.channels.spi.SelectorProvider} object.  </p>
     *
     * @return  A new selector
     *
     * @throws  IOException
     *          If an I/O error occurs
     */
    public static Selector open() throws IOException {
        return SelectorProvider.provider().openSelector();
    }

Once the sun.nio.ch.EPollSelectorProvider has been obtained, its openSelector() method is called to build the Selector; this creates an EPollSelectorImpl object.
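
As a quick check of this chain, the following sketch (OpenSelectorDemo is an illustrative name) opens a selector and verifies that it reports the system-wide default provider as its creator. Which implementation class gets printed depends on the platform and JDK; on Linux with the JDK version discussed here it would typically be sun.nio.ch.EPollSelectorImpl.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.nio.channels.spi.SelectorProvider;

// Hypothetical demo class; Selector.open() and Selector.provider() are real API.
final class OpenSelectorDemo {

    /** Returns true when a freshly opened selector was created by the default provider. */
    static boolean openedByDefaultProvider() throws IOException {
        try (Selector selector = Selector.open()) {
            return selector.isOpen()
                    && selector.provider() == SelectorProvider.provider();
        }
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            // Platform-dependent implementation class.
            System.out.println(selector.getClass().getName());
        }
        System.out.println("default provider: " + openedByDefaultProvider());
    }
}
```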

EPollSelectorImpl

class EPollSelectorImpl
    extends SelectorImpl
{

    // File descriptors used for interrupt
    protected int fd0;
    protected int fd1;

    // The poll object
    EPollArrayWrapper pollWrapper;

    // Maps from file descriptors to keys
    private Map<Integer,SelectionKeyImpl> fdToKey;

    EPollSelectorImpl(SelectorProvider sp) throws IOException {
        super(sp);
        long pipeFds = IOUtil.makePipe(false);
        fd0 = (int) (pipeFds >>> 32);
        fd1 = (int) pipeFds;
        try {
            pollWrapper = new EPollArrayWrapper();
            pollWrapper.initInterrupt(fd0, fd1);
            fdToKey = new HashMap<>();
        } catch (Throwable t) {
            try {
                FileDispatcherImpl.closeIntFD(fd0);
            } catch (IOException ioe0) {
                t.addSuppressed(ioe0);
            }
            try {
                FileDispatcherImpl.closeIntFD(fd1);
            } catch (IOException ioe1) {
                t.addSuppressed(ioe1);
            }
            throw t;
        }
    }

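The EPollSelectorImpl constructor does the following:
     ① Builds the EPollArrayWrapper. EPollArrayWrapper wraps the Linux epoll system calls as native methods for EPollSelectorImpl to use.
     ② Registers the interrupt (wakeup) event with epoll through the EPollArrayWrapper: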

void initInterrupt(int fd0, int fd1) {
        outgoingInterruptFD = fd1;
        incomingInterruptFD = fd0;
        epollCtl(epfd, EPOLL_CTL_ADD, fd0, EPOLLIN);
    }

③ fdToKey: builds the map from file descriptor to SelectionKeyImpl.
④ EPollSelectorImpl also holds the SelectionKeys of the channels that are registered with the selector.
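
The interrupt pipe set up in step ② is what Selector.wakeup() relies on in this implementation: writing to fd1 makes fd0 readable, which ends the pending epoll_wait. The public contract can be observed without touching any internals. The sketch below (WakeupDemo is an illustrative name) calls wakeup() before select(); per the Selector documentation, the next selection then returns immediately instead of blocking.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical demo class; wakeup() and select() are real Selector API.
final class WakeupDemo {

    /** Calls wakeup() first, then select(); returns the number of ready keys. */
    static int selectAfterWakeup() throws IOException {
        try (Selector selector = Selector.open()) {
            // No selection is in progress, so the wakeup is remembered
            // and applied to the next selection operation.
            selector.wakeup();
            // Would block forever without the wakeup, since no channel is registered.
            return selector.select();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("ready keys after wakeup: " + selectAfterWakeup());
    }
}
```

Because no channel is registered, the returned count is zero; the point is that the call returns at all.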
EPollSelectorImpl —>  SelectorImpl

public abstract class SelectorImpl
    extends AbstractSelector
{

    // The set of keys with data ready for an operation
    protected Set<SelectionKey> selectedKeys;

    // The set of keys registered with this Selector
    protected HashSet<SelectionKey> keys;

EPollArrayWrapper

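EPollArrayWrapper creates the epoll file descriptor and wraps the Linux epoll operations. It also holds the result of each selector.select(…) call, i.e. the epoll_event array filled in by epoll_wait.
EPollArrayWrapper manipulates a native array of Linux epoll_event structures.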

* typedef union epoll_data {
*     void *ptr;
*     int fd;
*     __uint32_t u32;
*     __uint64_t u64;
*  } epoll_data_t;
*
* struct epoll_event {
*     __uint32_t events;
*     epoll_data_t data;
* };

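The data member (epoll_data_t data) of an epoll_event returned by epoll_wait is the same value that was supplied when the file descriptor was registered through epoll_ctl. Here data.fd is the file descriptor we registered, so the file descriptor is directly available when an event is handled.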

EPollArrayWrapper wraps the Linux epoll system calls as native methods for EPollSelectorImpl to use.

private native int epollCreate();
    private native void epollCtl(int epfd, int opcode, int fd, int events);
    private native int epollWait(long pollAddress, int numfds, long timeout,
                                 int epfd) throws IOException;

The three native methods above correspond to the three epoll system calls on Linux: epoll_create, epoll_ctl and epoll_wait.

// The fd of the epoll driver
    private final int epfd;

     // The epoll_event array for results from epoll_wait
    private final AllocatedNativeObject pollArray;

    // Base address of the epoll_event array
    private final long pollArrayAddress;

    EPollArrayWrapper() throws IOException {
        // creates the epoll file descriptor
        epfd = epollCreate();

        // the epoll_event array passed to epoll_wait
        int allocationSize = NUM_EPOLLEVENTS * SIZE_EPOLLEVENT;
        pollArray = new AllocatedNativeObject(allocationSize, true);
        pollArrayAddress = pollArray.address();
    }

The EPollArrayWrapper constructor creates the epoll file descriptor and allocates the epoll_event array that receives the results of epoll_wait.

ServerSocketChannel.open();

This returns a ServerSocketChannelImpl object and creates the file descriptor of the server socket on Linux. ServerSocketChannelImpl:

// Our file descriptor
    private final FileDescriptor fd;

    // fd value needed for dev/poll. This value will remain valid
    // even after the value in the file descriptor object has been set to -1
    private int fdVal;

    ServerSocketChannelImpl(SelectorProvider sp) throws IOException {
        super(sp);
        this.fd =  Net.serverSocket(true);
        this.fdVal = IOUtil.fdVal(fd);
        this.state = ST_INUSE;
    }

ServerSocketChannelImpl (in fact its superclass AbstractSelectableChannel) holds all the SelectionKey objects created by registering the channel with selectors, as follows:

// Keys that have been created by registering this channel with selectors.
    // They are saved because if this channel is closed the keys must be
    // deregistered.  Protected by keyLock.
    //
    private SelectionKey[] keys = null;

serverChannel.register(selector, SelectionKey.OP_ACCEPT);

public final SelectionKey register(Selector sel, int ops,
                                       Object att)
        throws ClosedChannelException
    {
        synchronized (regLock) {
            if (!isOpen())
                throw new ClosedChannelException();
            if ((ops & ~validOps()) != 0)
                throw new IllegalArgumentException();
            if (blocking)
                throw new IllegalBlockingModeException();
            SelectionKey k = findKey(sel);
            if (k != null) {
                k.interestOps(ops);
                k.attach(att);
            }
            if (k == null) {
                // New registration
                synchronized (keyLock) {
                    if (!isOpen())
                        throw new ClosedChannelException();
                    k = ((AbstractSelector)sel).register(this, ops, att);
                    addKey(k);
                }
            }
            return k;
        }
    }

This registers the channel's interest with the Selector and stores the resulting SelectionKey in the ServerSocketChannel's own key collection.
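
The branches of register(…) shown above can be exercised from user code. The sketch below is an illustration only (RegisterDemo and its helper names are made up): registering the same channel twice with one selector returns the existing key (the findKey path), a channel still in blocking mode is rejected, and so is an operation outside validOps().

```java
import java.io.IOException;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class; only the java.nio calls are real API.
final class RegisterDemo {

    /** Second registration with the same selector reuses the key and replaces the attachment. */
    static boolean reRegisterReturnsSameKey() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            SelectionKey first = server.register(selector, SelectionKey.OP_ACCEPT);
            SelectionKey second = server.register(selector, SelectionKey.OP_ACCEPT, "attachment");
            return first == second
                    && "attachment".equals(first.attachment())
                    && server.keyFor(selector) == first;
        }
    }

    /** A channel left in blocking mode cannot be registered. */
    static boolean blockingChannelIsRejected() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            try {
                server.register(selector, SelectionKey.OP_ACCEPT);
                return false;
            } catch (IllegalBlockingModeException expected) {
                return true;
            }
        }
    }

    /** OP_READ is not among the valid operations of a ServerSocketChannel. */
    static boolean invalidOpsAreRejected() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            try {
                server.register(selector, SelectionKey.OP_READ);
                return false;
            } catch (IllegalArgumentException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(reRegisterReturnsSameKey());
        System.out.println(blockingChannelIsRejected());
        System.out.println(invalidOpsAreRejected());
    }
}
```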
👇 SelectorImpl.register

protected final SelectionKey register(AbstractSelectableChannel ch,
                                          int ops,
                                          Object attachment)
    {
        if (!(ch instanceof SelChImpl))
            throw new IllegalSelectorException();
        SelectionKeyImpl k = new SelectionKeyImpl((SelChImpl)ch, this);
        k.attach(attachment);
        synchronized (publicKeys) {
            implRegister(k);
        }
        k.interestOps(ops);
        return k;
    }

EPollSelectorImpl.implRegister

protected void implRegister(SelectionKeyImpl ski) {
        if (closed)
            throw new ClosedSelectorException();
        SelChImpl ch = ski.channel;
        int fd = Integer.valueOf(ch.getFDVal());
        fdToKey.put(fd, ski);
        pollWrapper.add(fd);
        keys.add(ski);
    }

① Puts the channel's fd (file descriptor) and its selectionKey into the fdToKey map.
② Adds the channel's fd to pollWrapper and initializes the fd's events to 0 (the initial update events are forced to 0 because the fd may have been marked as killed by a previous registration).
③ Adds the selectionKey to the SelectionKey HashSet (keys).
④ k.interestOps(int) in turn calls EPollSelectorImpl.putEventOps(…), which stores the events in the eventsLow or eventsHigh of the EPollArrayWrapper object.

SelectionKeyImpl:

public class SelectionKeyImpl
    extends AbstractSelectionKey
{

    final SelChImpl channel;                            // package-private
    public final SelectorImpl selector;

    // Index for a pollfd array in Selector that this key is registered with
    private int index;

    private volatile int interestOps;
    private int readyOps;

It maintains the association between a channel (ServerSocketChannel or SocketChannel) and a selector, together with interestOps and readyOps.
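
These pieces of state are all visible through the public SelectionKey API. A small sketch, with KeyStateDemo as an illustrative name, reads them right after registration, before any selection has taken place:

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class; only the java.nio calls are real API.
final class KeyStateDemo {

    /** Checks the state a key exposes immediately after registration. */
    static boolean freshKeyLooksAsExpected() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            SelectionKey key = server.register(selector, SelectionKey.OP_ACCEPT);
            return key.channel() == server                          // channel side of the association
                    && key.selector() == selector                   // selector side of the association
                    && key.interestOps() == SelectionKey.OP_ACCEPT  // what we asked for
                    && key.readyOps() == 0                          // nothing selected yet
                    && key.isValid();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(freshKeyLooksAsExpected());
    }
}
```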

int n = selector.select();

public int select() throws IOException {
        return select(0);
    }

This eventually calls EPollSelectorImpl.doSelect:

protected int doSelect(long timeout) throws IOException {
        if (closed)
            throw new ClosedSelectorException();
        processDeregisterQueue();
        try {
            begin();
            pollWrapper.poll(timeout);
        } finally {
            end();
        }
        processDeregisterQueue();
        int numKeysUpdated = updateSelectedKeys();
        if (pollWrapper.interrupted()) {
            // Clear the wakeup pipe
            pollWrapper.putEventOps(pollWrapper.interruptedIndex(), 0);
            synchronized (interruptLock) {
                pollWrapper.clearInterrupted();
                IOUtil.drain(fd0);
                interruptTriggered = false;
            }
        }
        return numKeysUpdated;
    }

First, processDeregisterQueue():

void processDeregisterQueue() throws IOException {
        Set var1 = this.cancelledKeys();
        synchronized(var1) {
            if (!var1.isEmpty()) {
                Iterator var3 = var1.iterator();

                while(var3.hasNext()) {
                    SelectionKeyImpl var4 = (SelectionKeyImpl)var3.next();

                    try {
                        this.implDereg(var4);
                    } catch (SocketException var12) {
                        IOException var6 = new IOException("Error deregistering key");
                        var6.initCause(var12);
                        throw var6;
                    } finally {
                        var3.remove();
                    }
                }
            }

        }
    }

    protected void implDereg(SelectionKeyImpl ski) throws IOException {
        assert (ski.getIndex() >= 0);
        SelChImpl ch = ski.channel;
        int fd = ch.getFDVal();
        fdToKey.remove(Integer.valueOf(fd));
        pollWrapper.remove(fd);
        ski.setIndex(-1);
        keys.remove(ski);
        selectedKeys.remove(ski);
        deregister((AbstractSelectionKey)ski);
        SelectableChannel selch = ski.channel();
        if (!selch.isOpen() && !selch.isRegistered())
            ((SelChImpl)selch).kill();
    }

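This method processes the set of cancelled SelectionKeys:
① Removes the cancelled selectionKey from fdToKey (the map from file descriptor to SelectionKeyImpl).
② Removes the file descriptor of the selectionKey's channel from pollWrapper.
③ Removes the selectionKey from the keys and selectedKeys sets, so that the next selector.select() no longer registers it with epoll.
④ Deregisters the selectionKey from its channel.
⑤ Finally, if the channel is already closed and is no longer registered with any other selector, kills the channel to release its resources.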
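
The deferred nature of this cleanup is visible from the public API: a cancelled key becomes invalid at once, but it stays in the selector's key set until the next selection operation processes the cancelled-key set. A minimal sketch, with CancelDemo as an illustrative name:

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class; only the java.nio calls are real API.
final class CancelDemo {

    /**
     * Returns { validAfterCancel, inKeySetBeforeSelect, inKeySetAfterSelect, registeredAfterSelect }.
     */
    static boolean[] cancelThenSelect() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            SelectionKey key = server.register(selector, SelectionKey.OP_ACCEPT);

            key.cancel();
            boolean validAfterCancel = key.isValid();                    // invalid immediately
            boolean inKeySetBeforeSelect = selector.keys().contains(key); // not yet removed

            selector.selectNow(); // the selection processes the cancelled-key set

            boolean inKeySetAfterSelect = selector.keys().contains(key);
            boolean registeredAfterSelect = server.isRegistered();
            return new boolean[] {
                validAfterCancel, inKeySetBeforeSelect, inKeySetAfterSelect, registeredAfterSelect
            };
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(java.util.Arrays.toString(cancelThenSelect()));
    }
}
```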

Next, EPollArrayWrapper.poll(timeout):

int poll(long timeout) throws IOException {
        updateRegistrations();
        updated = epollWait(pollArrayAddress, NUM_EPOLLEVENTS, timeout, epfd);
        for (int i=0; i<updated; i++) {
            if (getDescriptor(i) == incomingInterruptFD) {
                interruptedIndex = i;
                interrupted = true;
                break;
            }
        }
        return updated;
    }

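updateRegistrations() takes the events registered with this selector (stored in eventsLow or eventsHigh) and registers them with the Linux kernel by calling epollCtl(epfd, opcode, fd, events).
epollWait then calls the underlying Linux epoll_wait and returns the number of entries for which events fired while waiting.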

Now updateSelectedKeys():

private int updateSelectedKeys() {
        int entries = pollWrapper.updated;
        int numKeysUpdated = 0;
        for (int i=0; i<entries; i++) {
            int nextFD = pollWrapper.getDescriptor(i);
            SelectionKeyImpl ski = fdToKey.get(Integer.valueOf(nextFD));
            // ski is null in the case of an interrupt
            if (ski != null) {
                int rOps = pollWrapper.getEventOps(i);
                if (selectedKeys.contains(ski)) {
                    if (ski.channel.translateAndSetReadyOps(rOps, ski)) {
                        numKeysUpdated++;
                    }
                } else {
                    ski.channel.translateAndSetReadyOps(rOps, ski);
                    if ((ski.nioReadyOps() & ski.nioInterestOps()) != 0) {
                        selectedKeys.add(ski);
                        numKeysUpdated++;
                    }
                }
            }
        }
        return numKeysUpdated;
    }

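This method obtains from the EPollArrayWrapper the SelectionKeyImpl objects whose events fired, puts them into the selectedKeys set (the set of keys with triggered events, available through selector.selectedKeys()), and sets the corresponding readyOps on each SelectionKeyImpl.
Two points need attention here:
① If the SelectionKeyImpl is already in selectedKeys, numKeysUpdated is incremented only when translateAndSetReadyOps reports an operation that was not already in readyOps; events that are already recorded are not counted, so the caller is not notified of them.
② A SelectionKeyImpl is added to selectedKeys (and counted) only when it is not already in the set, so a key left over from a previous select() is never re-added.
👆 Both points explain why each SelectionKey must be removed from selectedKeys after it has been retrieved and handled: only then can the key be put back into selectedKeys when an event fires and be reported correctly to the caller.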
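
The effect of a stale key can be reproduced with a Pipe. In the sketch below (SelectedKeysDemo is an illustrative name, and the 1-second timeout is an arbitrary choice), one byte is written and never read, so the source channel stays readable the whole time. The second selection still reports nothing new because the key was left in selectedKeys; once the key is removed, the third selection reports it again.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Arrays;

// Hypothetical demo class; only the java.nio calls are real API.
final class SelectedKeysDemo {

    /** Returns the results of three selections on a pipe that stays readable. */
    static int[] selectThreeTimes() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            source.configureBlocking(false);
            SelectionKey key = source.register(selector, SelectionKey.OP_READ);
            sink.write(ByteBuffer.wrap(new byte[] {1})); // never read back

            int first = selector.select(1000);   // key is added to selectedKeys and counted
            int second = selector.selectNow();   // key still in the set, readyOps unchanged: not counted
            selector.selectedKeys().remove(key); // what iter.remove() does in the usual loop
            int third = selector.selectNow();    // key is added again and counted
            return new int[] {first, second, third};
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(selectThreeTimes()));
    }
}
```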

How epoll works

epoll is an I/O multiplexing mechanism on Linux that can efficiently handle very large numbers of socket handles, up to millions.

First, the three epoll system calls as exposed in C:

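  • int epoll_create(int size): creates an epoll object. size was originally a hint for the number of handles the kernel should expect; since Linux 2.6.8 it is ignored but must be greater than zero.
  • int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event): operates on the epoll object created by epoll_create, for example adding a socket handle to be monitored or removing one that is currently monitored.
  • int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout): returns to the user-space process as soon as an event occurs on one of the monitored handles, or when timeout (in milliseconds) expires.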

A rough look at how epoll is implemented internally:

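  1. When epoll is initialized, it registers a file system with the kernel for storing the monitored handles; epoll_create creates a file node in that file system. epoll also allocates its own kernel cache and keeps the handles in a red-black tree to support fast lookup, insertion and deletion. In addition it creates a linked list that holds the events that are ready.
  2. epoll_ctl not only puts the socket handle into the red-black tree belonging to the file object in the epoll file system, it also registers a callback with the kernel's interrupt handling, telling the kernel to put the handle on the ready list when its interrupt arrives. So when data arrives on a socket, the kernel copies it from the network card into kernel memory and then inserts the socket into the ready list.
  3. epoll_wait only checks whether the ready list contains anything: if it does, it returns; otherwise it sleeps and returns once the timeout expires.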
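
Seen from Java, step 3 is the behaviour of Selector.select(timeout), which on Linux ends in epoll_wait as described earlier. The sketch below is an illustration under that assumption (SelectTimeoutDemo and readyCount are made-up names, and the 100 ms timeout is arbitrary): with nothing ready the call sleeps until the timeout and reports zero keys, while with data already pending it reports the ready key.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class; only the java.nio calls are real API.
final class SelectTimeoutDemo {

    /**
     * Registers a pipe's read end and calls select(100).
     * When writeFirst is true, one byte is written before selecting.
     */
    static int readyCount(boolean writeFirst) throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);
            if (writeFirst) {
                sink.write(ByteBuffer.wrap(new byte[] {42}));
            }
            // Nothing ready: sleeps up to 100 ms, then returns 0.
            // Data pending: returns the number of ready keys.
            return selector.select(100);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("idle pipe:     " + readyCount(false));
        System.out.println("readable pipe: " + readyCount(true));
    }
}
```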

epoll has two working modes:

  • LT: level-triggered mode. As long as a socket is in the readable/writable state, epoll_wait returns that socket no matter when it is called.
  • ET: edge-triggered mode. epoll_wait returns a socket only when it changes from unreadable to readable or from unwritable to writable.

(Figure: reading data from a socket)

(Figure: writing data to a socket)

Finally, note that on Linux JDK NIO uses LT, while Netty's epoll transport uses ET.
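
The level-triggered behaviour of JDK NIO can be observed with a Pipe. In this sketch (LevelTriggeredDemo is an illustrative name) two bytes are written but only one is read before the next selection. Because data is still pending, the channel is reported as readable again; under edge-triggered semantics it would not be, since no new transition occurred. After the pipe is drained, nothing is reported.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Arrays;

// Hypothetical demo class; only the java.nio calls are real API.
final class LevelTriggeredDemo {

    /** Returns the ready counts observed before, during and after draining the pipe. */
    static int[] run() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);
            sink.write(ByteBuffer.wrap(new byte[] {1, 2}));

            int first = selector.select(1000);   // two bytes pending
            selector.selectedKeys().clear();
            source.read(ByteBuffer.allocate(1)); // consume only one byte

            int second = selector.selectNow();   // one byte left: reported again
            selector.selectedKeys().clear();
            source.read(ByteBuffer.allocate(1)); // drain the pipe

            int third = selector.selectNow();    // nothing left to read
            return new int[] {first, second, third};
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(run()));
    }
}
```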
