Analysis of a flood of CLOSE_WAIT connections when dotnet Core on CentOS uses HttpWebRequest for HTTP calls (solved).

Environment:

dotnet core 1.0.1

CentOS 7.2

During today's routine server inspection I found one service throwing a large number of exceptions.

The exception message was:

LockStatusPushError&&Message:One or more errors occurred. (An error occurred while sending the request. Too many open files)&InnerMessageAn error occurred while sending the request. Too many open files& at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.Wait()
at CommonHelper.HttpHelper.HttpRequest(String Url, String Method, String ContentType, Byte[] data, Encoding encoding)
at CommonHelper.HttpHelper.PostForm(String Url, Dictionary`2 para, Encoding encoding)
at CommonHelper.HttpHelper.PostForm(String Url, Dictionary`2 para)
at DeviceService.Program.LockStatusPushMethod()

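My first guess was that the program had opened too many files (sockets or pipes) and exceeded the system limit.

ulimit -n showed a limit of 65535, which is a normal value.

lsof | wc -l ran very slowly and reported more than 5 million open files on the system.

I then checked the dotnet program alone and found it accounted for more than 4 million of them.

I saved the current list of open files with lsof>>/tmp/lsof.log for later analysis.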
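As a rough cross-check of those numbers: the lsof rows shown in this post carry a thread ID column after the PID, which suggests lsof listed each descriptor once per thread, so lsof | wc -l can be far larger than the number of descriptors a process really holds. Note also that ulimit -n reports the limit of the current shell, while the limit of a running service is in /proc/<pid>/limits. The sketch below counts descriptors straight from /proc; the helper name count_fds is made up for illustration, and $$ (the current shell) stands in for the dotnet PID.

```shell
#!/usr/bin/env bash
# Count the file descriptors a process holds by listing /proc/<pid>/fd.
# count_fds is a hypothetical helper name, not part of any tool.
count_fds() {
    local pid="$1"
    # Each entry in /proc/<pid>/fd is one open descriptor of that process.
    ls "/proc/${pid}/fd" 2>/dev/null | wc -l
}

# Example: inspect the current shell; replace $$ with the dotnet PID.
fd_count="$(count_fds "$$")"
echo "pid $$ has ${fd_count} open file descriptors"

# The limit that applies to that specific process (soft and hard values).
grep 'Max open files' "/proc/$$/limits"
```

If the count for the dotnet process sits near its "Max open files" value, that matches the "Too many open files" error even though the raw lsof line count is much higher.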

 

The exported file showed that the dotnet program held a large number of sockets in the CLOSE_WAIT state, all pointing at port 80 of the HTTP server the program calls.

dotnet    12208 20425    root  216r     FIFO                0,8       0t0    2273974 pipe
dotnet    12208 20425    root  217w     FIFO                0,8       0t0    2273974 pipe
dotnet    12208 20425    root  218u     IPv4            2274459       0t0        TCP txk-web:44336->txk-web:http (CLOSE_WAIT)
dotnet    12208 20425    root  219r     FIFO                0,8       0t0    2274460 pipe
dotnet    12208 20425    root  220w     FIFO                0,8       0t0    2274460 pipe
dotnet    12208 20425    root  221u     IPv4            2271144       0t0        TCP txk-web:44340->txk-web:http (CLOSE_WAIT)
dotnet    12208 20425    root  222r     FIFO                0,8       0t0    2273977 pipe
dotnet    12208 20425    root  223w     FIFO                0,8       0t0    2273977 pipe
dotnet    12208 20425    root  224u     IPv4            2274462       0t0        TCP txk-web:44344->txk-web:http (CLOSE_WAIT)
dotnet    12208 20425    root  225r     FIFO                0,8       0t0    2271147 pipe
dotnet    12208 20425    root  226w     FIFO                0,8       0t0    2271147 pipe
dotnet    12208 20425    root  227u     IPv4            2272624       0t0        TCP txk-web:44348->txk-web:http (CLOSE_WAIT)
dotnet    12208 20425    root  228r     FIFO                0,8       0t0    2272625 pipe
dotnet    12208 20425    root  229w     FIFO                0,8       0t0    2272625 pipe
dotnet    12208 20425    root  230u     IPv4            2273985       0t0        TCP txk-web:44352->txk-web:http (CLOSE_WAIT)
dotnet    12208 20425    root  231r     FIFO                0,8       0t0    2271150 pipe
dotnet    12208 20425    root  232w     FIFO                0,8       0t0    2271150 pipe
dotnet    12208 20425    root  233u     IPv4            2272627       0t0        TCP txk-web:44356->txk-web:http (CLOSE_WAIT)
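A listing of this size is easier to read once it is grouped by TCP state. The following is a minimal sketch, assuming the saved file has the same layout as the rows above, with the state in parentheses as the last field. The function name summarize_tcp_states is made up for illustration, and the sample file only reuses three rows from this post so that the snippet runs on its own; on the server the argument would be /tmp/lsof.log.

```shell
#!/usr/bin/env bash
# Summarise TCP socket states in a saved lsof listing.
# summarize_tcp_states is a hypothetical helper name.
summarize_tcp_states() {
    local log_file="$1"
    # TCP rows end with the state in parentheses, e.g. "(CLOSE_WAIT)".
    # Rows are counted as listed, so per-thread duplicates are included.
    awk '/TCP/ && $NF ~ /^\(.*\)$/ { count[$NF]++ }
         END { for (s in count) print count[s], s }' "$log_file" | sort -rn
}

# Small sample built from rows shown in this post.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
dotnet    12208 20425    root  217w     FIFO                0,8       0t0    2273974 pipe
dotnet    12208 20425    root  218u     IPv4            2274459       0t0        TCP txk-web:44336->txk-web:http (CLOSE_WAIT)
dotnet    12208 20425    root  221u     IPv4            2271144       0t0        TCP txk-web:44340->txk-web:http (CLOSE_WAIT)
EOF

summary="$(summarize_tcp_states "$sample")"
echo "$summary"
rm -f "$sample"
```

For the sample this prints one line with a count of 2 for (CLOSE_WAIT); the pipe row is ignored.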

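That narrowed the cause down to the HTTP calls.

Next I checked the program's log and found that the HTTP endpoint it calls was returning 500 errors.

After an error the program retries the request (the business logic requires a retry). The retry interval was 100 ms, which is too short and produced too many requests in a short time.

First, what CLOSE_WAIT means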

 

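When the peer closes the connection first, or the connection is broken by a network fault, our side enters CLOSE_WAIT. Our side then has to close its own end for the connection to shut down properly.

My initial list of possible causes:

1. The program does not release resources after an exception is thrown

2. A bug in the lower layers of dotnet core

3. The nginx proxy forcibly closes my connection without sending me the packet that confirms the close

4. The HTTP request times out (very unlikely, since the HTTP endpoint is on the same machine)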

 

I started with the code. My HTTP helper method looked like this:

// Requires: using System; using System.IO; using System.Net; using System.Text;
private static byte[] HttpRequest(string Url, string Method, string ContentType, byte[] data, Encoding encoding)
{
    WebResponse response = null;
    HttpWebRequest request = null;
    byte[] result = null;
    try
    {
        request = (HttpWebRequest)WebRequest.Create(Url);
        // The header name is "User-Agent"; "UserAgent" would be sent as an unrelated custom header.
        request.Headers["User-Agent"] = @"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)";
        request.Accept = @"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
        request.Method = Method;
        request.ContentType = ContentType;
        if (data != null)
        {
            // Blocks on the async call; the using block disposes the request stream.
            using (Stream reqStream = request.GetRequestStreamAsync().Result)
            {
                reqStream.Write(data, 0, data.Length);
            }
        }
        // Blocks until the response arrives; failures surface as an AggregateException.
        using (response = request.GetResponseAsync().Result)
        using (Stream stream = response.GetResponseStream())
        using (MemoryStream buffer = new MemoryStream())
        {
            // Copy the whole body in chunks instead of one byte at a time.
            stream.CopyTo(buffer);
            result = buffer.ToArray();
        }
    }
    finally
    {
        // HttpWebRequest is not IDisposable, so Abort() is the only cleanup available.
        if (request != null)
        {
            request.Abort();
        }
        // A second Dispose() is harmless if the using block already ran.
        if (response != null)
        {
            response.Dispose();
        }
    }
    return result;
}

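Looking at the code, my first thought was that HttpWebRequest was neither wrapped in a using block nor disposed with Dispose().

When I tried, though, it turned out that the class does not implement IDisposable at all, so it cannot be released by hand.

After searching on Baidu I concluded that Abort() is the only option, so I added it to the finally block, added a Dispose() call for the WebResponse while I was there, and tried again -------- no effect.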

 

Next I edited /etc/sysctl.conf on CentOS

and added keepalive settings to see whether they would help:

net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=2
net.ipv4.tcp_keepalive_intvl=2

 

Then I reloaded the configuration with sysctl -p and tried again -------- the problem remained.

 

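After that I suspected that the program was not releasing the HttpWebRequest,

so I added GC.Collect() to the finally block of the HTTP method, hoping to force a collection -------- still no use.

 

In the end I gave up looking for the cause and simply added a delay at the retry point: if the HTTP request fails, Thread.Sleep(10000);

That worked around the problem for the time being.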
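A fixed 10 second sleep is the workaround that was actually used. A common alternative, not taken from this post, is exponential backoff: begin at the original 100 ms and double the wait after every failure up to a cap. The sketch below only computes the schedule, so the growth is easy to see. The function name backoff_delay_ms and the 100 ms / 10000 ms values are illustrative choices, and it is written in shell for brevity rather than in the service's C#.

```shell
#!/usr/bin/env bash
# Delay in milliseconds for a retry attempt (0-based): base * 2^attempt, capped.
# backoff_delay_ms is a hypothetical helper name.
backoff_delay_ms() {
    local attempt="$1" base_ms="$2" max_ms="$3"
    local delay=$(( base_ms * (1 << attempt) ))
    # Never wait longer than the cap.
    (( delay > max_ms )) && delay="$max_ms"
    echo "$delay"
}

# Print the schedule for the first eight attempts.
for attempt in 0 1 2 3 4 5 6 7; do
    echo "attempt ${attempt}: wait $(backoff_delay_ms "$attempt" 100 10000) ms"
done
```

With these values the first retries stay fast, while a longer outage settles at one request every 10 seconds instead of ten per second.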

 

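The problem was never fully solved.

If anyone knows the root cause, I would be glad to discuss it. Thanks.

 

2017.04.07 update

Today I switched to HttpClient for the HTTP calls

and found that the problem went away.

The code is below; corrections are welcome.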

// Requires: using System.Net; using System.Net.Http; using System.Threading.Tasks;
// HttpMethodEnum is the author's own enum (GET/POST), defined elsewhere in the project.
private static async Task<byte[]> HttpRequest(string Url, HttpMethodEnum HttpMethod, string ContentType, byte[] data)
{
    // A new client per call, disposed when the call finishes, as in the version that fixed the problem.
    using (HttpClient http = new HttpClient())
    {
        http.DefaultRequestHeaders.Add("User-Agent", @"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)");
        http.DefaultRequestHeaders.Add("Accept", @"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");

        HttpResponseMessage message = null;
        try
        {
            if (HttpMethod == HttpMethodEnum.POST)
            {
                // An empty body is sent when no data is supplied.
                using (HttpContent content = new ByteArrayContent(data ?? new byte[0]))
                {
                    content.Headers.Add("Content-Type", ContentType);
                    message = await http.PostAsync(Url, content);
                }
            }
            else if (HttpMethod == HttpMethodEnum.GET)
            {
                message = await http.GetAsync(Url);
            }

            // Only a 200 response yields a body; anything else returns null, as before.
            if (message == null || message.StatusCode != HttpStatusCode.OK)
            {
                return null;
            }
            // Reads the full body without relying on Stream.Length or a single Read() call.
            return await message.Content.ReadAsByteArrayAsync();
        }
        finally
        {
            // Dispose the response on every path, including non-200 results.
            message?.Dispose();
        }
    }
}