Optimizing HttpClient Usage in High-Concurrency Scenarios

Date: 2019-11-20


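1. Background

One of our services calls an HTTP-based service provided by another department, with a daily call volume in the tens of millions. We use httpclient for these calls. Because the QPS would not go up, I reviewed the business code and made some optimizations, which are recorded here.

First, a before-and-after comparison: before the optimization, the average execution time was 250 ms; afterwards it was 80 ms, a reduction of roughly two thirds. The container no longer raises thread-exhaustion alarms at every turn. Refreshing!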

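2. Analysis

The original implementation was fairly crude: for every request it initialized an httpclient, created an httpPost object, executed it, took the entity from the response, saved it as a string, and finally closed the response and the client explicitly. Let's analyze and optimize this step by step:

2.1 Overhead of repeatedly creating httpclient

httpclient is a thread-safe class. There is no need for each thread to create one on every use; keeping a single global instance is enough.

2.2 Overhead of repeatedly creating TCP connections

TCP's three-way handshake and four-way teardown are two long-winded procedures, and for high-frequency requests their cost is simply too high. Suppose each request spends 5 ms on this negotiation: for a single system at 100 QPS, that is 500 ms out of every second spent on handshakes and teardowns. We are not VIPs, so we programmers need not stand on such ceremony; switch to keep-alive so that connections are reused!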
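
The arithmetic above can be written down as a tiny helper. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and the 5 ms figure is the assumed cost from the paragraph above, not a measurement.

```java
// Rough estimate of time spent on TCP setup and teardown (illustrative only).
final class HandshakeCost {

    /** Milliseconds spent on handshakes per second when every request opens a new connection. */
    static long millisPerSecond(int qps, long handshakeMillisPerRequest) {
        return (long) qps * handshakeMillisPerRequest;
    }

    /** The same figure when each connection is reused for requestsPerConnection requests. */
    static double millisPerSecondWithReuse(int qps, long handshakeMillisPerRequest,
                                           int requestsPerConnection) {
        return (double) qps * handshakeMillisPerRequest / requestsPerConnection;
    }
}
```

With 100 QPS and 5 ms per handshake, millisPerSecond gives 500 ms; if each kept-alive connection serves 100 requests, the estimate drops to 5 ms.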

2.3 Overhead of buffering the entity a second time

The original logic used the following code:

HttpEntity entity = httpResponse.getEntity();
String response = EntityUtils.toString(entity);

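This effectively copies the whole content into a string. Any content still held by the original httpResponse (for example when the entity is buffered or has not been fully read) must also be consumed. Under high concurrency with very large content, this uses a lot of memory.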
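
When the caller only needs to extract something small from a large body, one option is to read the content stream incrementally instead of turning it into one big string. The following is a minimal sketch using only the JDK; the class name and the "count matching lines" task are hypothetical stand-ins. With httpclient the InputStream would come from entity.getContent().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative only: processes a body line by line without materializing it as one String.
final class StreamingBodyReader {

    /** Counts the lines that contain needle; closes the stream when done. */
    static long countMatchingLines(InputStream body, String needle) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(body, StandardCharsets.UTF_8))) {
            long matches = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    matches++;
                }
            }
            return matches;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only one line is held in memory at a time, so peak memory no longer grows with the size of the response.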

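3. Implementation

Based on the analysis above, there are three main things to do: first, a singleton client; second, pooled keep-alive connections; third, better handling of the response. The first needs no discussion, so let's talk about the second.

Connection caching naturally brings database connection pools to mind. httpclient4 provides PoolingHttpClientConnectionManager as a connection pool. We optimize through the following steps:

3.1 Define a keep alive strategy

This article does not go into keep-alive in depth. Just one point: whether to use keep-alive depends on the business scenario; it is not a panacea. Also, there is quite a story between keep-alive and time_wait/close_wait.

In this scenario we effectively have a small number of fixed clients hitting the server at very high frequency over long periods, so enabling keep-alive is a very good fit.

One more aside: HTTP keep-alive and TCP KEEPALIVE are not the same thing. Back to the topic, define a strategy as follows: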

ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {
    @Override
    public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
        HeaderElementIterator it = new BasicHeaderElementIterator
            (response.headerIterator(HTTP.CONN_KEEP_ALIVE));
        while (it.hasNext()) {
            HeaderElement he = it.nextElement();
            String param = he.getName();
            String value = he.getValue();
            if (value != null && param.equalsIgnoreCase
               ("timeout")) {
                return Long.parseLong(value) * 1000;
            }
        }
        return 60 * 1000; // if the server specifies no timeout, default to 60 s
    }
};
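
To make the parsing rule concrete, here is a JDK-only sketch of the same decision applied to a raw Keep-Alive header value such as "timeout=5, max=100". The class and method names are hypothetical. Unlike the strategy above, this version falls back to the default on a malformed number instead of throwing.

```java
// Illustrative only: the "timeout" extraction rule on a plain header string.
final class KeepAliveHeaderParser {

    static final long DEFAULT_KEEP_ALIVE_MILLIS = 60_000L;

    /** Returns the keep-alive duration in milliseconds for a Keep-Alive header value. */
    static long keepAliveMillis(String headerValue) {
        if (headerValue == null) {
            return DEFAULT_KEEP_ALIVE_MILLIS;
        }
        for (String element : headerValue.split(",")) {
            String[] pair = element.trim().split("=", 2);
            if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                try {
                    // The header carries seconds; callers expect milliseconds.
                    return Long.parseLong(pair[1].trim()) * 1000L;
                } catch (NumberFormatException ignored) {
                    // Malformed value: keep scanning, then fall back to the default.
                }
            }
        }
        return DEFAULT_KEEP_ALIVE_MILLIS;
    }
}
```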

3.2 Configure a PoolingHttpClientConnectionManager

PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager();
connectionManager.setMaxTotal(500);
connectionManager.setDefaultMaxPerRoute(50); // e.g. at most 50 concurrent connections per route by default; tune to the business

The limit can also be set for individual routes.
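
A per-route limit might look like the sketch below. It assumes httpclient 4.3 or later on the classpath; the host name, the port and the limit of 200 are placeholders, not values from the original setup.

```java
import org.apache.http.HttpHost;
import org.apache.http.conn.routing.HttpRoute;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

// Configuration sketch: raises the limit for one busy route above the default.
final class PerRouteLimits {

    static PoolingHttpClientConnectionManager newConnectionManager() {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(500);
        cm.setDefaultMaxPerRoute(50);
        // Hypothetical upstream host that receives most of the traffic.
        HttpRoute busyRoute = new HttpRoute(new HttpHost("api.example.com", 80));
        cm.setMaxPerRoute(busyRoute, 200);
        return cm;
    }
}
```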

3.3 Create the httpclient

httpClient = HttpClients.custom()
                .setConnectionManager(connectionManager)
                .setKeepAliveStrategy(myStrategy)
                .setDefaultRequestConfig(RequestConfig.custom().setStaleConnectionCheckEnabled(true).build())
                .build();

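Note: using setStaleConnectionCheckEnabled to evict connections that have already been closed is not recommended. A better approach is to start a dedicated thread that periodically calls closeExpiredConnections and closeIdleConnections, as shown below.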

public static class IdleConnectionMonitorThread extends Thread {
    
    private final HttpClientConnectionManager connMgr;
    private volatile boolean shutdown;
    
    public IdleConnectionMonitorThread(HttpClientConnectionManager connMgr) {
        super();
        this.connMgr = connMgr;
    }

    @Override
    public void run() {
        try {
            while (!shutdown) {
                synchronized (this) {
                    wait(5000);
                    // Close expired connections
                    connMgr.closeExpiredConnections();
                    // Optionally, close connections
                    // that have been idle longer than 30 sec
                    connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
                }
            }
        } catch (InterruptedException ex) {
            // terminate
        }
    }
    
    public void shutdown() {
        shutdown = true;
        synchronized (this) {
            notifyAll();
        }
    }
    
}
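
As an alternative to a hand-written thread, the same periodic cleanup can be driven by a ScheduledExecutorService. This is a different technique from the monitor thread above, sketched with JDK classes only. The class name is hypothetical, and the eviction task is passed in as a Runnable so that the sketch does not depend on httpclient.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative only: runs an eviction task at a fixed delay on a daemon thread.
final class IdleEvictionScheduler {

    static ScheduledExecutorService start(Runnable evictionTask, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-connection-evictor");
            t.setDaemon(true); // do not keep the JVM alive just for cleanup
            return t;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                evictionTask.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs; real code should log it.
            }
        }, period, period, unit);
        return scheduler;
    }
}
```

With httpclient, the task would call connMgr.closeExpiredConnections() followed by connMgr.closeIdleConnections(30, TimeUnit.SECONDS), and the returned scheduler would be shut down together with the client.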

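3.4 Reducing overhead when executing a method with httpclient

The thing to note here is: do not close the connection.

One workable way to get the content is to copy what is in the entity, like this:

res = EntityUtils.toString(response.getEntity(),"UTF-8");
EntityUtils.consume(response.getEntity());

However, the more recommended way is to define a ResponseHandler, which is convenient for everyone: you no longer catch exceptions and close streams yourself. Let's look at the relevant source code: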

public <T> T execute(final HttpHost target, final HttpRequest request,
            final ResponseHandler<? extends T> responseHandler, final HttpContext context)
            throws IOException, ClientProtocolException {
        Args.notNull(responseHandler, "Response handler");

        final HttpResponse response = execute(target, request, context);

        final T result;
        try {
            result = responseHandler.handleResponse(response);
        } catch (final Exception t) {
            final HttpEntity entity = response.getEntity();
            try {
                EntityUtils.consume(entity);
            } catch (final Exception t2) {
                // Log this exception. The original exception is more
                // important and will be thrown to the caller.
                this.log.warn("Error consuming content after an exception.", t2);
            }
            if (t instanceof RuntimeException) {
                throw (RuntimeException) t;
            }
            if (t instanceof IOException) {
                throw (IOException) t;
            }
            throw new UndeclaredThrowableException(t);
        }

        // Handling the response was successful. Ensure that the content has
        // been fully consumed.
        final HttpEntity entity = response.getEntity();
        EntityUtils.consume(entity); // note this call
        return result;
    }

As we can see, if we call execute with a ResponseHandler, the consume method is always called automatically at the end. The consume method looks like this:

public static void consume(final HttpEntity entity) throws IOException {
        if (entity == null) {
            return;
        }
        if (entity.isStreaming()) {
            final InputStream instream = entity.getContent();
            if (instream != null) {
                instream.close();
            }
        }
    }

As we can see, it ultimately closes the input stream.
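
For reference, a handler that returns the body as a string might look like the sketch below. It assumes httpclient 4.x on the classpath. The class name, the post helper and the choice to reject non-2xx statuses are illustrative, not taken from the original project.

```java
import java.io.IOException;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.util.EntityUtils;

// Sketch of a ResponseHandler; execute() consumes the entity afterwards in every case.
final class StringBodyHandler implements ResponseHandler<String> {

    @Override
    public String handleResponse(HttpResponse response) throws IOException {
        int status = response.getStatusLine().getStatusCode();
        if (status < 200 || status >= 300) {
            throw new ClientProtocolException("Unexpected response status: " + status);
        }
        HttpEntity entity = response.getEntity();
        return entity == null ? null : EntityUtils.toString(entity, "UTF-8");
    }

    /** Hypothetical helper: posts to url with the shared client and returns the body. */
    static String post(HttpClient httpClient, String url) throws IOException {
        HttpPost httpPost = new HttpPost(url);
        return httpClient.execute(httpPost, new StringBodyHandler());
    }
}
```

The caller never touches the response or its stream, so there is nothing to forget to close and the connection goes back to the pool.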

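4. Miscellaneous

With the steps above, we have essentially completed an httpclient setup that supports high concurrency. Below are some additional settings and reminders:

4.1 Some httpclient timeout settings

CONNECTION_TIMEOUT is the connection timeout and SO_TIMEOUT is the socket timeout; the two are different. The connection timeout is the maximum time to wait for the connection to be established before the request is sent; the socket timeout is the maximum time to wait for data.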

HttpParams params = new BasicHttpParams();
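// connection timeout
Integer CONNECTION_TIMEOUT = 2 * 1000; // connection timeout of 2 seconds; adjust to the business
Integer SO_TIMEOUT = 2 * 1000; // timeout of 2 seconds when waiting for data; adjust to the business

// Timeout in milliseconds used when retrieving a ManagedClientConnection instance from the ClientConnectionManager.
// This parameter expects a value of type java.lang.Long. If it is not set, it defaults to CONNECTION_TIMEOUT, so be sure to set it.
Long CONN_MANAGER_TIMEOUT = 500L; // as I recall, in httpclient 4.2.3 this was changed to an object so that passing a long directly raised an error, and it was later changed back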
 
params.setIntParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, CONNECTION_TIMEOUT);
params.setIntParameter(CoreConnectionPNames.SO_TIMEOUT, SO_TIMEOUT);
params.setLongParameter(ClientPNames.CONN_MANAGER_TIMEOUT, CONN_MANAGER_TIMEOUT);
// test whether the connection is still usable before submitting the request
params.setBooleanParameter(CoreConnectionPNames.STALE_CONNECTION_CHECK, true);
 
// also set the http client retry count; the default is 3, and retries are disabled here (for low-traffic projects the default is fine)
httpClient.setHttpRequestRetryHandler(new DefaultHttpRequestRetryHandler(0, false));
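
The HttpParams style above belongs to the older 4.x API, while section 3 uses the builder API. With the builder, the same three timeouts and the retry setting could be expressed roughly as follows. This is a sketch assuming httpclient 4.3 or later; the class name is made up and the values simply mirror the ones above.

```java
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.DefaultHttpRequestRetryHandler;
import org.apache.http.impl.client.HttpClients;

// Configuration sketch: builder-style equivalent of the HttpParams settings.
final class TimeoutConfig {

    static CloseableHttpClient newClient() {
        RequestConfig requestConfig = RequestConfig.custom()
                .setConnectTimeout(2 * 1000)        // time to establish the connection
                .setSocketTimeout(2 * 1000)         // time to wait for data
                .setConnectionRequestTimeout(500)   // time to wait for a pooled connection
                .build();
        return HttpClients.custom()
                .setDefaultRequestConfig(requestConfig)
                .setRetryHandler(new DefaultHttpRequestRetryHandler(0, false)) // no retries
                .build();
    }
}
```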

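4.2 If nginx is in front, nginx must also enable keep-alive on both sides

Nowadays, setups without nginx are the rarer case. By default nginx keeps long-lived connections with the client side and uses short-lived connections to the server (upstream) side. Pay attention to the keepalive_timeout and keepalive_requests parameters on the client side and the keepalive parameter on the upstream side; the meaning of these three parameters is not repeated here.

That is my entire configuration. With these settings, the time per request dropped from 250 ms to around 80 ms, a significant improvement.

The end.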