Classic TCP/IP network programming books all describe a model along these lines:

"The server listens on a well-known port and forks a number of child processes; when a new connection request arrives, the new connection is obtained via an accept() call in a child process and handled there."

It all sounds perfectly natural, but a moment's thought raises quite a few questions, for example: "the parent and child belong to two different process address spaces, so how can a port the parent is listening on be accept()ed in a child?"

There are also discussions floating around online, such as "when multiple processes call accept() on the same descriptor, a 'thundering herd' occurs."

Suddenly everything seems murky again.

Against this background, this article gets to the bottom of it by combining hands-on experiments with a read through the source code.

The server model used in this article is as follows:
int main(int argc, char *argv[])
{
    socket();
    bind();
    listen();
    fork();

    if( parent ){
        accept();
    }
    else if( child ){
        accept();
    }
    else{
        /* error */
    }

    return 0;
}
This goes one step further than the architecture described at the beginning: we call accept() in the parent process as well, just to see what happens.
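For readers who want to try this themselves, here is a minimal runnable sketch of the skeleton above. It is not the author's actual test program (tcp_server_tem); the port 54321 and the recv-until-close loop are simply chosen to match the output shown further down.

/* Minimal fork-then-accept server sketch (assumptions: port 54321,
 * one fork, both processes loop on accept on the same listenfd). */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(54321);

    if (listenfd < 0 ||
        bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, 128) < 0) {
        perror("socket/bind/listen");
        return 1;
    }

    fork();                               /* both processes keep listenfd */

    for (;;) {
        int clientfd = accept(listenfd, NULL, NULL);
        if (clientfd < 0) {
            perror("accept");
            continue;
        }
        printf("server %d accept clientsock %d\n", (int)getpid(), clientfd);

        char buf[1024];
        while (recv(clientfd, buf, sizeof(buf), 0) > 0)
            ;                             /* discard data until the peer closes */
        printf("server %d recv zero\n", (int)getpid());
        printf("server %d now do accept!\n", (int)getpid());
        close(clientfd);
    }
    return 0;
}

Both processes end up blocked in accept() on the same listening descriptor, which is exactly the situation we want to examine.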
First, start the server:
$ ps -ef | grep server
yyy       6182  3573  0 18:08 pts/3    00:00:00 ./tcp_server_tem
yyy       6183  6182  0 18:08 pts/3    00:00:00 ./tcp_server_tem
$ sudo netstat -antp | grep 54321
tcp        0      0 192.168.31.162:54321    0.0.0.0:*    LISTEN    6182/tcp_server_tem
According to ps, both the parent process (6182) and the child process (6183) have started normally, yet netstat shows only the parent (6182) listening on the designated port (54321).

If only the parent is listening on that port, how does the child manage to accept() successfully?

We know that a socket is, in essence, just a file descriptor, so let's dig into the descriptor tables owned by the two processes:
$ ls -l /proc/6182/fd
total 0
lrwx------ 1 yyy yyy 64 Feb 3 18:08 0 -> /dev/pts/3
lrwx------ 1 yyy yyy 64 Feb 3 18:08 1 -> /dev/pts/3
lrwx------ 1 yyy yyy 64 Feb 3 18:08 2 -> /dev/pts/3
lrwx------ 1 yyy yyy 64 Feb 3 18:08 3 -> socket:[50899]
$ ls -l /proc/6183/fd
total 0
lrwx------ 1 yyy yyy 64 Feb 3 18:08 0 -> /dev/pts/3
lrwx------ 1 yyy yyy 64 Feb 3 18:08 1 -> /dev/pts/3
lrwx------ 1 yyy yyy 64 Feb 3 18:08 2 -> /dev/pts/3
lrwx------ 1 yyy yyy 64 Feb 3 18:08 3 -> socket:[50899]
$ cat /proc/net/tcp | grep 50899
   1: A21FA8C0:D431 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1001  0 50899 1 0000000000000000 100 0 0 10 -1
As we can see, even though netstat does not show it, the parent and the child both hold the listening socket (the child inherited it from the parent through the descriptor copy performed at fork time), and both descriptors point to the same inode (50899). As a result, either process can operate on the very same sock object in the kernel through its own descriptor.
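As a quick cross-check (not part of the original experiment), the same fact can be observed from inside the program: fstat() on the inherited descriptor reports the same st_ino in both processes, matching the socket:[50899] entry seen under /proc.

/* Hypothetical check: after fork(), both processes fstat() the listening
 * descriptor; st_ino is the inode that /proc shows as socket:[<inode>]. */
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/socket.h>

void report(int fd)
{
    struct stat st;
    if (fstat(fd, &st) == 0)
        printf("pid %d: fd %d -> socket inode %lu\n",
               (int)getpid(), fd, (unsigned long)st.st_ino);
}

int main(void)
{
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (fork() >= 0)
        report(listenfd);   /* parent and child print the same inode */
    return 0;
}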
All right, let's start a client and see what happens (in this test the client and the server run on the same physical machine):
server 6183 accept clientsock 4
server 6183 recv zero
server 6183 now do accept!
server 6182 accept clientsock 4
server 6182 recv zero
server 6182 now do accept!
server 6183 accept clientsock 4
server 6183 recv zero
server 6183 now do accept!
server 6182 accept clientsock 4
server 6182 recv zero
server 6182 now do accept!
Remarkably, both the parent and the child can obtain new connections through accept(), and they even appear to take turns.
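The client program is not shown in the article; a minimal stand-in that simply connects, closes, and repeats (IP and port taken from the netstat output above) is enough to reproduce this alternating pattern.

/* Hypothetical test client: connect, send nothing, close, repeat.
 * Each close makes the server's recv() return zero. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    for (int i = 0; i < 4; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(54321);
        inet_pton(AF_INET, "192.168.31.162", &addr.sin_addr);

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
            perror("connect");
        close(fd);      /* server sees recv() == 0 and goes back to accept */
        sleep(1);
    }
    return 0;
}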
So how exactly is accept() implemented? To answer that, we have to dig through the protocol stack source.

The kernel version used here is 3.16.1.

In the TCP/IP stack, the implementation behind the accept system call is inet_csk_accept:
net/ipv4/inet_connection_sock.c

/*
 * This will accept the next outstanding connection.
 */
struct sock *inet_csk_accept(struct sock *sk, int flags, int *err)
{
    struct inet_connection_sock *icsk = inet_csk(sk);
    struct request_sock_queue *queue = &icsk->icsk_accept_queue;
    struct sock *newsk;
    struct request_sock *req;
    int error;

    lock_sock(sk);

    /* We need to make sure that this socket is listening,
     * and that it has something pending.
     */
    error = -EINVAL;
    if (sk->sk_state != TCP_LISTEN)
        goto out_err;

    /* Find already established connection */
    if (reqsk_queue_empty(queue)) {
        long timeo = sock_rcvtimeo(sk, flags & O_NONBLOCK);

        /* If this is a non blocking socket don't sleep */
        error = -EAGAIN;
        if (!timeo)
            goto out_err;

        error = inet_csk_wait_for_connect(sk, timeo);
        if (error)
            goto out_err;
    }
    req = reqsk_queue_remove(queue);
    newsk = req->sk;

    sk_acceptq_removed(sk);
    if (sk->sk_protocol == IPPROTO_TCP && queue->fastopenq != NULL) {
        spin_lock_bh(&queue->fastopenq->lock);
        if (tcp_rsk(req)->listener) {
            /* We are still waiting for the final ACK from 3WHS
             * so can't free req now. Instead, we set req->sk to
             * NULL to signify that the child socket is taken
             * so reqsk_fastopen_remove() will free the req
             * when 3WHS finishes (or is aborted).
             */
            req->sk = NULL;
            req = NULL;
        }
        spin_unlock_bh(&queue->fastopenq->lock);
    }
out:
    release_sock(sk);
    if (req)
        __reqsk_free(req);
    return newsk;
out_err:
    newsk = NULL;
    req = NULL;
    *err = error;
    goto out;
}
EXPORT_SYMBOL(inet_csk_accept);
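One detail worth noting before we follow the blocking path: the error = -EAGAIN / if (!timeo) branch above is what a non-blocking listening socket hits. Its timeout is zero, so when the accept queue is empty the call fails immediately instead of sleeping. From userspace that looks roughly like this (an illustrative sketch, not code from the article):

/* Illustration only: with O_NONBLOCK set on the listening socket, an empty
 * accept queue makes accept() fail with EAGAIN/EWOULDBLOCK right away
 * instead of entering inet_csk_wait_for_connect(). */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/socket.h>

void try_accept(int listenfd)
{
    int flags = fcntl(listenfd, F_GETFL, 0);
    fcntl(listenfd, F_SETFL, flags | O_NONBLOCK);

    int clientfd = accept(listenfd, NULL, NULL);
    if (clientfd < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        printf("accept queue empty, try again later\n");
}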
If there is no connection ready to be accepted at the time of the call and the socket is in blocking mode, we enter inet_csk_wait_for_connect:
/*
 * Wait for an incoming connection, avoid race conditions. This must be called
 * with the socket locked.
 */
static int inet_csk_wait_for_connect(struct sock *sk, long timeo)
{
    struct inet_connection_sock *icsk = inet_csk(sk);
    DEFINE_WAIT(wait);
    int err;

    /*
     * True wake-one mechanism for incoming connections: only
     * one process gets woken up, not the 'whole herd'.
     * Since we do not 'race & poll' for established sockets
     * anymore, the common case will execute the loop only once.
     *
     * Subtle issue: "add_wait_queue_exclusive()" will be added
     * after any current non-exclusive waiters, and we know that
     * it will always _stay_ after any new non-exclusive waiters
     * because all non-exclusive waiters are added at the
     * beginning of the wait-queue. As such, it's ok to "drop"
     * our exclusiveness temporarily when we get woken up without
     * having to remove and re-insert us on the wait queue.
     */
    for (;;) {
        prepare_to_wait_exclusive(sk_sleep(sk), &wait,
                                  TASK_INTERRUPTIBLE);
        release_sock(sk);
        if (reqsk_queue_empty(&icsk->icsk_accept_queue))
            timeo = schedule_timeout(timeo);
        lock_sock(sk);
        err = 0;
        if (!reqsk_queue_empty(&icsk->icsk_accept_queue))
            break;
        err = -EINVAL;
        if (sk->sk_state != TCP_LISTEN)
            break;
        err = sock_intr_errno(timeo);
        if (signal_pending(current))
            break;
        err = -EAGAIN;
        if (!timeo)
            break;
    }
    finish_wait(sk_sleep(sk), &wait);
    return err;
}
The work here is done with a wait queue, and the comment spells it out clearly: a "wake-one" mechanism is used, so the "whole herd" is never woken up, i.e. there is no thundering herd. Because each process blocked in accept() is added to the listening socket's wait queue with prepare_to_wait_exclusive(), an incoming connection wakes exactly one of them, which also explains why the connections in our test were handed out one at a time.
There is one more model worth mentioning: accept() first and then fork(), with the parent closing the accepted socket and the child closing the listening socket. Its drawback is the cost of a fork system call per connection, although this is mitigated by the copy-on-write implementation of modern fork(); I won't go into it any further here.
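For completeness, a minimal sketch of that fork-after-accept model might look like this (handle_client() is a hypothetical per-connection handler; listenfd is assumed to be already bound and listening):

#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical handler that serves exactly one connection. */
void handle_client(int clientfd);

/* Fork-after-accept: the parent accepts, forks, and immediately goes back
 * to accept(); each child serves a single connection and exits. */
void serve_fork_per_connection(int listenfd)
{
    for (;;) {
        int clientfd = accept(listenfd, NULL, NULL);
        if (clientfd < 0)
            continue;

        pid_t pid = fork();
        if (pid == 0) {              /* child */
            close(listenfd);         /* child does not need the listener */
            handle_client(clientfd);
            close(clientfd);
            _exit(0);
        }
        close(clientfd);             /* parent: drop its copy, keep accepting */
    }
}

In a real server the parent would also reap terminated children (for example by handling SIGCHLD) to avoid accumulating zombies.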
The New Year holiday is almost upon us, so the next article will arrive in the Year of the Monkey. Happy New Year, everyone!