在第一章中,咱们看关于NioEventLoopGroup的初始化,咱们知道了NioEventLoopGroup对象中有一组EventLoop数组,而且数组中的每一个EventLoop对象都对应一个线程FastThreadLocalThread,那么这个线程是啥时候启动的呢?今天来继续研究下源码。。java
还记得这个方法么?就是initAndRegister方法中的register方法,这里有个if(eventLoop.inEventLoop())的逻辑判断,上一节咱们分析了,这里走else的逻辑,所以会执行eventLoop.execute方法,那么这个方法就是NioEventLoop启动的入口。咱们跟进这个execute方法,由于SingleThreadEventExecutor是NioEventLoop的子类,因此,会执行SingleThreadEventExecutor的execute方法:react
同理,依然执行的是else中的方法:首先是startThread()方法:git
而后调用doStartThread方法:github
看一下executor.execute方法,这个executor就是第一章说的ThreadPerTaskExecutor对象。所以executor就是调用的ThreadPerTaskExecutor这个类里面的:数组
以前分析过,这个newThread就是建立一个FastThreadLocalThread线程对象,所以这里就是开启一个线程。在这个线程中,将该线程对象赋值给SingleThreadEventExecutor对象的thread成员变量, thread = Thread.currentThread();至此,inEventLoop()方法将返回true了。。。而后接着执行SingleThreadEventExecutor.this.run();方法。进入该方法:app
protected void run() { for (;;) { try { switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) { case SelectStrategy.CONTINUE: continue; case SelectStrategy.SELECT: select(wakenUp.getAndSet(false)); // 'wakenUp.compareAndSet(false, true)' is always evaluated // before calling 'selector.wakeup()' to reduce the wake-up // overhead. (Selector.wakeup() is an expensive operation.) // // However, there is a race condition in this approach. // The race condition is triggered when 'wakenUp' is set to // true too early. // // 'wakenUp' is set to true too early if: // 1) Selector is waken up between 'wakenUp.set(false)' and // 'selector.select(...)'. (BAD) // 2) Selector is waken up between 'selector.select(...)' and // 'if (wakenUp.get()) { ... }'. (OK) // // In the first case, 'wakenUp' is set to true and the // following 'selector.select(...)' will wake up immediately. // Until 'wakenUp' is set to false again in the next round, // 'wakenUp.compareAndSet(false, true)' will fail, and therefore // any attempt to wake up the Selector will fail, too, causing // the following 'selector.select(...)' call to block // unnecessarily. // // To fix this problem, we wake up the selector again if wakenUp // is true immediately after selector.select(...). // It is inefficient in that it wakes up the selector for both // the first case (BAD - wake-up required) and the second case // (OK - no wake-up required). if (wakenUp.get()) { selector.wakeup(); } default: // fallthrough } cancelledKeys = 0; needsToSelectAgain = false; final int ioRatio = this.ioRatio; if (ioRatio == 100) { try { processSelectedKeys(); } finally { // Ensure we always run tasks. runAllTasks(); } } else { final long ioStartTime = System.nanoTime(); try { processSelectedKeys(); } finally { // Ensure we always run tasks. final long ioTime = System.nanoTime() - ioStartTime; runAllTasks(ioTime * (100 - ioRatio) / ioRatio); } } } catch (Throwable t) { handleLoopException(t); } // Always handle shutdown even if the loop processing threw an exception. try { if (isShuttingDown()) { closeAll(); if (confirmShutdown()) { return; } } } catch (Throwable t) { handleLoopException(t); } } }
竟然是个死循环,表面该线程就一直处于循环之中。less
该run方法主要作三件事一、首先轮询注册到reactor线程对用的selector上的全部的channel的IO事件。二、处理IO事件。三、处理异步任务队列。异步
一、检查是否有IO事件:ide
那个switch中的代码就是判断task队列中是否有任务的。oop
若是没有任务,就返回SelectStrategy.SELECT,接着执行select方法:
这个select的中的参数的意思就是将wakenUp表示是否应该唤醒正在阻塞的select操做,能够看到netty在进行一次新的loop以前,都会将wakenUp被设置成false。而后进入select方法:
private void select(boolean oldWakenUp) throws IOException { Selector selector = this.selector; try { int selectCnt = 0; long currentTimeNanos = System.nanoTime(); long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos); for (;;) { long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L; if (timeoutMillis <= 0) { if (selectCnt == 0) { selector.selectNow(); selectCnt = 1; } break; } // If a task was submitted when wakenUp value was true, the task didn't get a chance to call // Selector#wakeup. So we need to check task queue again before executing select operation. // If we don't, the task might be pended until select operation was timed out. // It might be pended until idle timeout if IdleStateHandler existed in pipeline. if (hasTasks() && wakenUp.compareAndSet(false, true)) { selector.selectNow(); selectCnt = 1; break; } int selectedKeys = selector.select(timeoutMillis); selectCnt ++; if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) { // - Selected something, // - waken up by user, or // - the task queue has a pending task. // - a scheduled task is ready for processing break; } if (Thread.interrupted()) { // Thread was interrupted so reset selected keys and break so we not run into a busy loop. // As this is most likely a bug in the handler of the user or it's client library we will // also log it. // // See https://github.com/netty/netty/issues/2426 if (logger.isDebugEnabled()) { logger.debug("Selector.select() returned prematurely because " + "Thread.currentThread().interrupt() was called. Use " + "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop."); } selectCnt = 1; break; } long time = System.nanoTime(); if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) { // timeoutMillis elapsed without anything selected. selectCnt = 1; } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 && selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) { // The selector returned prematurely many times in a row. // Rebuild the selector to work around the problem. logger.warn( "Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.", selectCnt, selector); rebuildSelector(); selector = this.selector; // Select again to populate selectedKeys. selector.selectNow(); selectCnt = 1; break; } currentTimeNanos = time; } if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) { if (logger.isDebugEnabled()) { logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.", selectCnt - 1, selector); } } } catch (CancelledKeyException e) { if (logger.isDebugEnabled()) { logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector {} - JDK bug?", selector, e); } // Harmless exception - log anyway } }
首先,看下long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);这一行代码:嗯?delayNanos是什么鬼?跟进去看一下:
等等,peekScheduledTask又是什么鬼?再进去瞅瞅。。。。
哎呀,这个scheduledTaskQueue是什么队列?
哦,原来是一个优先级队列,实际上是一个按照定时任务将要执行的时间排序的一个队列。所以peekScheduledTask队列返回的是最近要执行的一个任务。因此,这个delayNanos返回的是到以一个定时任务的时间,若是定时任务队列没有值,那么默认就是1秒,即1000000000纳秒。所以selectDeadLineNanos就表示当前时间+到第一个要执行的定时任务的时间。
下面在select方法中又是一个循环,在循环中第一句:long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;这句话表示是否当前的定时任务队列中有任务的截止事件快到了(<=0.5ms):
若是当前的定时任务中的事件快到了(还有不到0.5ms的时间,定时任务就要执行了),而后就进入if里面,selectCnt表示的是执行select的次数。若是一次都没有select过,就立马进行selector.selectNow,该方法是非阻塞的,会立马返回,并将selectCnt设置为1,而后跳出循环。若是当前的定时任务中的事件的执行离当前时间还差0.5ms以上,则继续向下执行:
在这个if中,netty会判断任务队列中是否又任务而且wekenUp标记为是否被设置为了true,若是if知足了,代表任务队列已经有了任务,要结束本次的select的操做了,一样,立马进行selector.selectNow,并并将selectCnt设置为1,跳出循环。不然的话,将继续执行。
selector.select(timeoutMillis)是一个阻塞的select,阻塞时间就是当前时间到定时任务执行前的0.5ms的这一段时间。而后将selectCnt++。这里有个问题,若是离第一个定时任务执行还有20分钟,那这个方法岂不是要阻塞接近20分钟么?是的,没错,那若是这个时候,任务队列里又了任务了怎么办:
因此当有外部线程向任务队列中放入任务的时候,selector会唤醒阻塞的select操做。
等阻塞的select执行完成后,netty会判断是否已经有IO时间或者oldWakeUp为true,或者用户主动唤醒了select,或者task队列中已经有任务了或者第一个定时任务将要被执行了,知足其中一个条件,则代表要跳出本次的select方法了。
netty会在每次进行阻塞select以前记录一下开始时时间currentTimeNanos,在select以后记录一下结束时间,判断select操做是否至少持续了timeoutMillis秒(这里将time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos改为time - currentTimeNanos >= TimeUnit.MILLISECONDS.toNanos(timeoutMillis)或许更好理解一些),
若是持续的时间大于等于timeoutMillis,说明就是一次有效的轮询,重置selectCnt标志,代表选择超时,并无IO时间。
这里有一个NIO的空轮询bug,该bug会致使Selector一直空轮询,最终致使CPU飙升100%,nio Server不可用,那么这个else部分的逻辑就是netty规避空轮询的bug。若是阻塞select返回了,并非超时返回的,那么就说明已经出现了空轮询现象,那么就进入了该else逻辑。该逻辑会判断空轮询的次数是否大于SELECTOR_AUTO_REBUILD_THRESHOLD这个数,这个数是多少呢?
默认是512次。即空轮询不能超过512次。若是超过了,那么就执行rebuildSelector方法,该方法的名字是要从新构建一个selector。的确是这样:
public void rebuildSelector() { if (!inEventLoop()) { execute(new Runnable() { @Override public void run() { rebuildSelector(); } }); return; } final Selector oldSelector = selector; //定义一个新的Selector对象 final Selector newSelector; if (oldSelector == null) { return; } try { //从新实例化该Selector对象 newSelector = openSelector(); } catch (Exception e) { logger.warn("Failed to create a new Selector.", e); return; } // Register all channels to the new Selector. int nChannels = 0; for (;;) { try { //遍历原有的selector上的key for (SelectionKey key: oldSelector.keys()) { //获取注册到selector上的NioServerSocketChannel Object a = key.attachment(); try { if (!key.isValid() || key.channel().keyFor(newSelector) != null) { continue; } int interestOps = key.interestOps(); //取消该key在旧的selector上的事件注册 key.cancel(); //将该key对应的channel注册到新的selector上 SelectionKey newKey = key.channel().register(newSelector, interestOps, a); if (a instanceof AbstractNioChannel) { // Update SelectionKey //从新绑定新key和channel的关系 ((AbstractNioChannel) a).selectionKey = newKey; } nChannels ++; } catch (Exception e) { logger.warn("Failed to re-register a Channel to the new Selector.", e); if (a instanceof AbstractNioChannel) { AbstractNioChannel ch = (AbstractNioChannel) a; ch.unsafe().close(ch.unsafe().voidPromise()); } else { @SuppressWarnings("unchecked") NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a; invokeChannelUnregistered(task, key, e); } } } } catch (ConcurrentModificationException e) { // Probably due to concurrent modification of the key set. continue; } break; } selector = newSelector; try { // time to close the old selector as everything else is registered to the new one oldSelector.close(); } catch (Throwable t) { if (logger.isWarnEnabled()) { logger.warn("Failed to close the old Selector.", t); } } logger.info("Migrated " + nChannels + " channel(s) to the new Selector."); }
而后用新的selector直接调用selectNow:
这就是Netty规避Nio空轮询的bug问题。至此NioEventLoop的线程启动(或者说netty的reactor线程)的检查是否有IO事件分析完了,下一章继续分析2和3两个知识点。