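Taking somaxconn as an example:
1. Temporary change (lost after a reboot)
step 1
echo 2048 > /proc/sys/net/core/somaxconn
step 2 (optional: the write above takes effect immediately; sysctl -p only reloads /etc/sysctl.conf)
sysctl -p
2. Permanent change: add the following to /etc/sysctl.conf
step 1
net.core.somaxconn = 2048
step 2
sysctl -p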
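Both approaches above end up changing the same file under /proc/sys. As a minimal sketch, the mapping from a dotted key to its /proc path can be written in Python. The helper names sysctl_path and read_sysctl are made up for illustration, and the sketch assumes that no component of the key contains a dot itself.

```python
def sysctl_path(key, root="/proc/sys"):
    """Map a dotted key such as 'net.core.somaxconn' to its path under /proc/sys.

    Hypothetical helper; assumes no component of the key contains a dot.
    """
    return root.rstrip("/") + "/" + key.replace(".", "/")


def read_sysctl(key, root="/proc/sys"):
    """Return the current value as a string, or None if the file cannot be read."""
    try:
        with open(sysctl_path(key, root)) as f:
            return f.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print(sysctl_path("net.core.somaxconn"))
    print(read_sysctl("net.core.somaxconn"))
```

Reading the value back this way is a quick check that a temporary or permanent change actually took effect.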
The files below live in /proc/sys/net/ipv4 or /proc/sys/net/core/ (CentOS Linux release 7.2.1511).
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 464]
reference
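Once retransmissions exceed the tcp_retries1 threshold, the main action is to refresh the route cache.
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 464]
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 464]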
For SYN segments, net.ipv4.tcp_syn_retries and net.ipv4.tcp_synack_retries bounds the number of retransmissions of SYN segments; their default value is 5 (roughly 180s).
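The "roughly 180s" figure can be checked with a little arithmetic. The sketch below assumes pure exponential backoff (each timeout twice the previous one) and ignores any upper cap on the retransmission timeout; the function name and the initial RTO values are assumptions for illustration. With an initial RTO of 3 seconds and 5 retries the total is 189 seconds, close to the quoted figure, while an initial RTO of 1 second gives a much shorter wait.

```python
def syn_give_up_seconds(retries, initial_rto=1.0):
    """Seconds from the first SYN until the connection attempt is abandoned.

    The original SYN plus `retries` retransmissions are sent, and each one
    waits twice as long as the previous one, so retries + 1 timeouts elapse.
    Simplified model: no cap on the timeout is applied.
    """
    return sum(initial_rto * 2 ** i for i in range(retries + 1))


if __name__ == "__main__":
    print(syn_give_up_seconds(5, initial_rto=3.0))  # 3+6+12+24+48+96
    print(syn_give_up_seconds(5, initial_rto=1.0))  # 1+2+4+8+16+32
```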
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 446]
Related to the FIN_WAIT_2 state.
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 455]
If there is not enough room on the queue for the new connection, the TCP delays responding to the SYN, to give the application a chance to catch up. Linux is somewhat unique in this behavior—it persists in not ignoring incoming connections if it possibly can. If the net.ipv4.tcp_abort_on_overflow system control variable is set, new incoming connections are reset with a reset segment.
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 458]
When a connection request arrives (i.e., the SYN segment), the system-wide parameter tcp_max_syn_backlog is checked (default 1000). If the number of connections in the SYN_RCVD state would exceed this threshold, the incoming connection is rejected.
TCP Timestamps Option (TSopt):
Structure:
+-------+-------+---------------------+---------------------+
|Kind=8 |  10   |   TS Value (TSval)  |TS Echo Reply (TSecr)|
+-------+-------+---------------------+---------------------+
    1       1              4                     4
The Timestamps option carries two four-byte timestamp fields. The Timestamp Value field (TSval) contains the current value of the timestamp clock of the TCP sending the option. The Timestamp Echo Reply field (TSecr) is only valid if the ACK bit is set in the TCP header; if it is valid, it echoes a timestamp value that was sent by the remote TCP in the TSval field of a Timestamps option. When TSecr is not valid, its value must be zero. The TSecr value will generally be from the most recent Timestamp option that was received; however, there are exceptions that are explained below.
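To make the 10-byte layout concrete, here is a small sketch that encodes and decodes the option with Python's struct module. The function names are made up for illustration; the field sizes and Kind/Length values are the ones given in the diagram, and network byte order is used.

```python
import struct

TS_KIND = 8     # option kind from the diagram
TS_LENGTH = 10  # total option length in bytes


def pack_tsopt(tsval, tsecr):
    """Encode a Timestamps option: kind (1), length (1), TSval (4), TSecr (4)."""
    return struct.pack("!BBII", TS_KIND, TS_LENGTH,
                       tsval & 0xFFFFFFFF, tsecr & 0xFFFFFFFF)


def unpack_tsopt(data):
    """Decode a Timestamps option and return (TSval, TSecr)."""
    kind, length, tsval, tsecr = struct.unpack("!BBII", data[:TS_LENGTH])
    if kind != TS_KIND or length != TS_LENGTH:
        raise ValueError("not a Timestamps option")
    return tsval, tsecr


if __name__ == "__main__":
    raw = pack_tsopt(123456, 0)  # TSecr is zero when the ACK bit is not set
    print(raw.hex(), unpack_tsopt(raw))
```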
Enabled by default. Purposes: 1. more accurate RTT measurement; 2. Protection Against Wrapped Sequence numbers (PAWS).
tcp_tw_reuse
By enabling net.ipv4.tcp_tw_reuse, Linux will reuse an existing connection in the TIME-WAIT state for a new outgoing connection if the new timestamp is strictly bigger than the most recent timestamp recorded for the previous connection: an outgoing connection in the TIME-WAIT state can be reused after just one second.
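The rule quoted above can be modelled in a few lines. This is a deliberately simplified sketch, not the kernel's code: the function and parameter names are hypothetical, timestamps are treated as whole seconds, and a missing recorded timestamp stands for a peer that never sent the Timestamps option.

```python
def can_reuse_time_wait(tw_reuse_enabled, last_recorded_ts, new_ts):
    """Decide whether a TIME-WAIT entry may be reused for a new outgoing connection.

    last_recorded_ts is None when no timestamp was ever recorded for the old
    connection, in which case reuse is refused in this model.
    """
    if not tw_reuse_enabled:
        return False
    if last_recorded_ts is None:
        return False
    # "strictly bigger than the most recent timestamp recorded"
    return new_ts > last_recorded_ts


if __name__ == "__main__":
    print(can_reuse_time_wait(True, 100, 100))   # same second
    print(can_reuse_time_wait(True, 100, 101))   # one second later
    print(can_reuse_time_wait(True, None, 101))  # no timestamps recorded
```

The None case illustrates why the option depends on tcp_timestamps: without a recorded timestamp there is nothing to compare against.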
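Q: What is being reused?
A: The connection, that is, the related socket data structures in the kernel.
Q: Who reuses these data structures?
A: The side in the TIME_WAIT state reuses them when it initiates the same connection again (identical TCP socket 4-tuple).
Q: What is the exact flow, and why does it depend on tcp_timestamps?
A: See the analysis below.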
Once a new connection replaces the TIME-WAIT entry [time 1], the SYN segment of the new connection is ignored (thanks to the timestamps) [time 2] and won’t be answered by a RST [time 3] but only by a retransmission of the FIN and ACK segment [time 3]. The FIN segment will then be answered with a RST (because the local connection is in the SYN-SENT state)[time 4] which will allow the transition out of the LAST-ACK state. The initial SYN segment will eventually be resent (after one second) because there was no answer and the connection will be established without apparent error, except a slight delay:
tcp_tw_recycle
Enabling this option is not recommended.
Starting from Linux 4.10 (commit 95a22caee396), Linux will randomize timestamp offsets for each connection, making this option completely broken, with or without NAT.
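This requires an understanding of the kernel's socket data structures: TODO
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 455]
net.ipv4.tcp_syncookies = 1 enables SYN cookies: when the SYN wait queue overflows, cookies are used to handle new requests, which protects against SYN flood attacks. A value of 0 disables the feature. The default is often quoted as 0, but many kernels ship with 1, so check the running value with sysctl net.ipv4.tcp_syncookies.
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 482]
Enabled by default.
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 478]
[TCP/IP Illustrated, Volume 1 (Chinese 2nd edition), p. 455]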
Each listening endpoint has a fixed-length queue of connections that have been completely accepted by TCP (i.e., the three-way handshake is complete) but not yet accepted by the application. The application specifies a limit to this queue, commonly called the backlog. This backlog must be between 0 and a system-specific maximum called net.core.somaxconn, inclusive (default 128).
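The relationship between the application's backlog and net.core.somaxconn can be sketched as a simple clamp. The function name is hypothetical, and the default of 128 is the one quoted in the passage above; on a real system the limit should be read from /proc/sys/net/core/somaxconn.

```python
def effective_backlog(requested, somaxconn=128):
    """Clamp the backlog passed to listen() into the range [0, somaxconn]."""
    return max(0, min(requested, somaxconn))


if __name__ == "__main__":
    # An application asking for 4096 only gets 128 unless somaxconn is raised.
    print(effective_backlog(4096))
    print(effective_backlog(4096, somaxconn=2048))
```

This is why raising the backlog in application code has no effect until somaxconn is raised as well.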
TODO
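# Read buffer of a single connection (the read buffer actually changes dynamically; this is an upper limit)
net.core.rmem_default = 262144
# A maximum read buffer set via setsockopt cannot exceed rmem_max
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216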
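Is everything fine once the maximum buffer limits are set? A single TCP connection may well be making full use of the network, using a large window and large buffers to sustain high throughput. On a long fat network, for example, the buffer limit may be set to tens of megabytes. But total system memory is finite: if every connection runs at full speed and uses its maximum window, 10,000 connections would consume several hundred gigabytes of memory. This limits use in high-concurrency scenarios, and fairness cannot be guaranteed either. What we want is this: when there are few concurrent connections, raise the buffer limit so that each TCP connection can run flat out; when there are many concurrent connections and system memory is short, lower the buffer limit so that each connection's buffer is as small as possible and more connections fit.
To achieve this, Linux introduced automatic tuning of memory allocation, controlled by the tcp_moderate_rcvbuf setting, as follows: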
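The "several hundred gigabytes" estimate is easy to reproduce. In the sketch below, the 32 MiB per-connection figure is only an assumed example of "tens of megabytes", and the function name is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def worst_case_buffer_gib(connections, per_connection_bytes):
    """Memory needed if every connection fills its buffer completely, in GiB."""
    return connections * per_connection_bytes / GIB


if __name__ == "__main__":
    print(worst_case_buffer_gib(10_000, 32 * MIB))  # assumed 32 MiB per connection
    print(worst_case_buffer_gib(10_000, 16777216))  # a 16777216-byte limit
```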
net.ipv4.tcp_moderate_rcvbuf = 1
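tcp_moderate_rcvbuf defaults to 1, which enables TCP memory auto-tuning. If it is set to 0, the feature is disabled (use with caution).
When a program sets SO_SNDBUF or SO_RCVBUF on a connection, the Linux kernel no longer performs auto-tuning for that connection!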
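For reference, this is what fixing the receive buffer looks like from application code, using the standard Python socket module. The helper name and the 65536-byte request are assumptions for illustration; the point is that a call like this is exactly what opts the connection out of auto-tuning.

```python
import socket


def set_fixed_rcvbuf(sock, size):
    """Pin the receive buffer of a socket and return the size the kernel reports.

    Hypothetical helper: after this call the connection uses a fixed
    receive buffer instead of an automatically tuned one.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(set_fixed_rcvbuf(s, 65536))
```

On Linux the value read back is typically double the requested one, because the kernel reserves room for bookkeeping, and the request is capped by net.core.rmem_max.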
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.tcp_wmem = 8192 65536 16777216
net.ipv4.tcp_mem = 8388608 12582912 16777216
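The tcp_rmem[3] array gives the read buffer limits for any single TCP connection. tcp_rmem[0] is the minimum limit (for example, if setsockopt is used to set a maximum read buffer smaller than 8192, the maximum read buffer is set to 8192), tcp_rmem[1] is the initial limit (note that it overrides rmem_default, which applies to all protocols), and tcp_rmem[2] is the maximum limit.
The tcp_wmem[3] array describes the write buffer and works like tcp_rmem[3], so it is not repeated here.
The tcp_mem[3] array sets the overall TCP memory usage, so its values are large (its unit is not bytes but pages, i.e. 4K, 8K or similar!). The three values define the no-pressure level, the threshold at which pressure mode turns on, and the maximum usage of TCP memory as a whole. With these three values as markers, there are four cases:
1. As soon as total TCP memory exceeds tcp_mem[2], new memory allocations fail.
2. tcp_rmem[0] and tcp_wmem[0] also have high priority: as long as condition 1 is not violated, a new allocation is guaranteed to succeed whenever the connection's memory is below these two values.
3. As long as total memory does not exceed tcp_mem[0], a new allocation is also guaranteed to succeed provided it stays within the connection's buffer limit.
4. tcp_mem[1] and tcp_mem[0] together form the switch that turns memory pressure mode on and off. In pressure mode the connection buffer limit may be reduced. Outside pressure mode the connection buffer limit may grow, up to at most tcp_rmem[2] or tcp_wmem[2].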
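The four cases above can be summarised in a small model. This is only a sketch of the rules as described here, not the kernel's allocator: the function names are hypothetical, a 4096-byte page is assumed, and the situations the text leaves open are reported as "depends".

```python
PAGE_SIZE = 4096                          # assumed page size in bytes
TCP_MEM = (8388608, 12582912, 16777216)   # pages, from the example above


def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Convert a tcp_mem value (pages) to GiB."""
    return pages * page_size / 1024 ** 3


def allocation_outcome(total_pages, conn_bytes, conn_min, conn_limit, tcp_mem=TCP_MEM):
    """Apply cases 1 to 3: return 'fail', 'ok' or 'depends'."""
    low, _pressure, high = tcp_mem
    if total_pages > high:
        return "fail"      # case 1: over the global maximum
    if conn_bytes < conn_min:
        return "ok"        # case 2: below tcp_rmem[0] / tcp_wmem[0]
    if total_pages <= low and conn_bytes < conn_limit:
        return "ok"        # case 3: no global pressure, within the connection limit
    return "depends"       # outcome not fixed by the rules above


def update_pressure(total_pages, under_pressure, tcp_mem=TCP_MEM):
    """Case 4: enter pressure mode above tcp_mem[1], leave it below tcp_mem[0]."""
    low, pressure, _high = tcp_mem
    if total_pages > pressure:
        return True
    if total_pages < low:
        return False
    return under_pressure  # between the two thresholds the state is kept


if __name__ == "__main__":
    print([pages_to_gib(p) for p in TCP_MEM])
    print(allocation_outcome(17_000_000, 0, 8192, 16777216))
    print(update_pressure(13_000_000, False))
```

With 4096-byte pages, the three example thresholds correspond to 32, 48 and 64 GiB, which shows why the tcp_mem values must be chosen relative to the machine's physical memory.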
tcp_adv_win_scale
tcp_allowed_congestion_control
tcp_app_win
tcp_autocorking
tcp_available_congestion_control
tcp_base_mss
tcp_challenge_ack_limit
tcp_congestion_control
tcp_early_retrans
tcp_ecn
tcp_fack
tcp_fastopen
tcp_fastopen_key
tcp_frto
tcp_invalid_ratelimit
tcp_keepalive_intvl
tcp_keepalive_probes
tcp_keepalive_time
tcp_limit_output_bytes
tcp_low_latency
tcp_max_orphans
tcp_max_ssthresh
tcp_max_tw_buckets
tcp_mem
tcp_min_tso_segs
tcp_moderate_rcvbuf
tcp_mtu_probing
tcp_no_metrics_save
tcp_notsent_lowat
tcp_orphan_retries
tcp_reordering
tcp_retrans_collapse
tcp_rfc1337
tcp_rmem
tcp_slow_start_after_idle
tcp_stdurg
tcp_thin_dupack
tcp_thin_linear_timeouts
tcp_tso_win_divisor
tcp_tw_recycle
tcp_window_scaling
tcp_wmem
tcp_workaround_signed_windows
udp_mem
udp_rmem_min
udp_wmem_min
xfrm4_gc_thresh