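This article describes how, on a Linux system, a packet travels step by step from the network card to a process.
If you are comfortable reading English, I strongly recommend the two articles listed in the references at the end, which go into much more detail.
This article only covers physical Ethernet NICs, not virtual devices, and uses the reception of a UDP packet as the running example.
The function call chains shown here come from kernel 3.13.0. If your kernel is a different version, function names and paths may differ, but the underlying principles should be the same (or differ only slightly).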
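A NIC needs a driver in order to work. The driver is a module loaded into the kernel that connects the NIC to the kernel's network module. When it is loaded, the driver registers itself with the network module; when the corresponding NIC receives a packet, the network module calls that driver to handle the data.
The following diagram shows how a packet enters memory and starts being processed by the kernel's network module: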
                   +-----+
                   |     |                            Memory
+--------+   1     |     |  2  DMA     +--------+--------+--------+--------+
| Packet |-------->| NIC |------------>| Packet | Packet | Packet | ...... |
+--------+         |     |             +--------+--------+--------+--------+
                   |     |<--------+
                   +-----+         |
                      |            +---------------+
                      |                            |
                    3 | Raise IRQ                  | Disable IRQ
                      |                          5 |
                      |                            |
                      ↓                            |
                   +-----+                   +------------+
                   |     |  Run IRQ handler  |            |
                   | CPU |------------------>| NIC Driver |
                   |     |       4           |            |
                   +-----+                   +------------+
                                                   |
                                                6  | Raise soft IRQ
                                                   |
                                                   ↓
The soft IRQ triggers the softirq handler in the kernel's network module. The subsequent flow is as follows:
                                                     +-----+
                                             17      |     |
                                        +----------->| NIC |
                                        |            |     |
                                        |Enable IRQ  +-----+
                                        |
                                        |
                                  +------------+                                      Memory
                                  |            |        Read           +--------+--------+--------+--------+
                 +--------------->| NIC Driver |<--------------------- | Packet | Packet | Packet | ...... |
                 |                |            |          9            +--------+--------+--------+--------+
                 |                +------------+
                 |                      |    |        skb
            Poll | 8      Raise softIRQ | 6  +-----------------+
                 |                      |             10       |
                 |                      ↓                      ↓
         +---------------+  Call  +-----------+        +------------------+        +--------------------+  12  +---------------------+
         | net_rx_action |<-------| ksoftirqd |        | napi_gro_receive |------->| enqueue_to_backlog |----->| CPU input_pkt_queue |
         +---------------+   7    +-----------+        +------------------+   11   +--------------------+      +---------------------+
                                                               |                                                      | 13
                                                            14 |        + - - - - - - - - - - - - - - - - - - - - - - +
                                                               ↓        ↓
                                                    +--------------------------+    15      +------------------------+
                                                    | __netif_receive_skb_core |----------->| packet taps(AF_PACKET) |
                                                    +--------------------------+            +------------------------+
                                                               |
                                                               | 16
                                                               ↓
                                                      +-----------------+
                                                      | protocol layers |
                                                      +-----------------+
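The interplay between the hardware IRQ, the softirq and polling can be easier to follow in a toy model. The sketch below is not kernel code: ToyNic, ToyDriver and the budget value are hypothetical names and numbers made up for illustration. It only mimics the idea in the two diagrams, namely that the first packet raises an IRQ, the driver masks the IRQ and defers work to a softirq, and the poll loop drains the ring buffer in batches before the IRQ is enabled again.

```python
from collections import deque


class ToyNic:
    """Hypothetical stand-in for a NIC with a ring buffer and a maskable IRQ."""

    def __init__(self, ring_size=8):
        self.ring = deque()          # models the packet ring buffer in memory
        self.ring_size = ring_size
        self.irq_enabled = True
        self.irqs_raised = 0
        self.overruns = 0            # packets lost because the ring was full

    def receive(self, packet, driver):
        # Steps 1-2: the packet arrives and is copied into the ring buffer.
        if len(self.ring) >= self.ring_size:
            self.overruns += 1
            return
        self.ring.append(packet)
        # Step 3: raise a hardware IRQ, but only while it is not masked.
        if self.irq_enabled:
            self.irqs_raised += 1
            driver.irq_handler(self)


class ToyDriver:
    """Hypothetical driver: a short IRQ handler plus a budgeted poll routine."""

    def __init__(self):
        self.softirq_pending = False
        self.delivered = []          # packets handed to the upper layers

    def irq_handler(self, nic):
        # Steps 4-6: do as little as possible, mask the IRQ, raise a softirq.
        nic.irq_enabled = False
        self.softirq_pending = True

    def net_rx_action(self, nic, budget=4):
        # Steps 7-10: poll at most `budget` packets out of the ring buffer.
        if not self.softirq_pending:
            return 0
        handled = 0
        while nic.ring and handled < budget:
            self.delivered.append(nic.ring.popleft())
            handled += 1
        if not nic.ring:
            # Step 17: the ring is empty, so the IRQ is enabled again.
            self.softirq_pending = False
            nic.irq_enabled = True
        return handled
```

Feeding six packets into this model raises only one IRQ; two calls to net_rx_action with a budget of 4 then handle four and two packets respectively, after which the IRQ is enabled again. That batching is the point of the design: under load, the cost of an interrupt is paid once per burst rather than once per packet.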
enqueue_to_backlog is also called by netif_rx, which is the function invoked when the lo device sends a packet.
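One place to observe this stage from user space is /proc/net/softnet_stat, which has one row per CPU of hexadecimal counters. The sketch below assumes the commonly documented meaning of the first three columns: packets processed, packets dropped because the per-CPU backlog queue was full, and the number of times net_rx_action stopped with work still pending. The function name parse_softnet_stat is made up for this example, and the column layout beyond the third field varies between kernel versions, so it is ignored here.

```python
def parse_softnet_stat(text):
    """Parse the text of /proc/net/softnet_stat into one dict per row."""
    stats = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue                      # skip blank or unexpected lines
        stats.append({
            "row": len(stats),            # rows are assumed to be in CPU order
            "processed": int(fields[0], 16),
            "dropped": int(fields[1], 16),
            "time_squeeze": int(fields[2], 16),
        })
    return stats


if __name__ == "__main__":
    try:
        with open("/proc/net/softnet_stat") as f:
            for row in parse_softnet_stat(f.read()):
                print(row)
    except OSError:
        print("softnet_stat is not available on this system")
```

A steadily growing dropped value suggests that packets are being discarded before they ever reach the protocol layers.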
Because this is a UDP packet, it first enters the IP layer and then works its way down the call chain one function at a time:
          |
          |
          ↓         promiscuous mode &&
      +--------+    PACKET_OTHERHOST (set by driver)   +------------------+
      | ip_rcv |-------------------------------------->| drop this packet |
      +--------+                                       +------------------+
          |
          |
          ↓
+---------------------+
| NF_INET_PRE_ROUTING |
+---------------------+
          |
          |
          ↓
      +---------+
      |         | enabled ip forward  +------------+        +-----------------+
      | routing |-------------------->| ip_forward |------->| NF_INET_FORWARD |
      |         |                     +------------+        +-----------------+
      +---------+                                                    |
          |                                                          |
          | destination IP is local                                  ↓
          ↓                                                  +---------------+
 +------------------+                                        | dst_output_sk |
 | ip_local_deliver |                                        +---------------+
 +------------------+
          |
          |
          ↓
 +------------------+
 | NF_INET_LOCAL_IN |
 +------------------+
          |
          |
          ↓
    +-----------+
    | UDP layer |
    +-----------+
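The branching in the IP-layer diagram can be restated as a small decision function. This is only a sketch of the diagram, not of the kernel's actual routing code: ipv4_receive_path and its parameters are hypothetical, netfilter hooks are treated as always letting the packet through, and the final case (destination not local, forwarding disabled) is assumed to end in a drop, which the diagram does not show.

```python
def ipv4_receive_path(pkt_type, dst_is_local, forwarding_enabled):
    """Return the stages from the diagram that a packet would pass through."""
    path = ["ip_rcv"]
    # Frames addressed to another host (seen in promiscuous mode) stop here.
    if pkt_type == "PACKET_OTHERHOST":
        return path + ["drop"]
    path += ["NF_INET_PRE_ROUTING", "routing"]
    if dst_is_local:
        return path + ["ip_local_deliver", "NF_INET_LOCAL_IN", "UDP layer"]
    if forwarding_enabled:
        return path + ["ip_forward", "NF_INET_FORWARD", "dst_output_sk"]
    # Assumption: a non-local packet is dropped when forwarding is off.
    return path + ["drop"]


if __name__ == "__main__":
    print(ipv4_receive_path("PACKET_HOST", True, False))
    print(ipv4_receive_path("PACKET_HOST", False, True))
```

Listing the stages this way makes it easy to see which netfilter hooks a locally delivered packet passes (PRE_ROUTING, then LOCAL_IN) and which a forwarded packet passes (PRE_ROUTING, then FORWARD).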
          |
          |
          ↓
      +---------+            +-----------------------+
      | udp_rcv |----------->| __udp4_lib_lookup_skb |
      +---------+            +-----------------------+
          |
          |
          ↓
 +--------------------+      +-----------+
 | sock_queue_rcv_skb |----->| sk_filter |
 +--------------------+      +-----------+
          |
          |
          ↓
 +------------------+
 | __skb_queue_tail |
 +------------------+
          |
          |
          ↓
  +---------------+
  | sk_data_ready |
  +---------------+
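The state of each UDP socket's receive queue is visible in /proc/net/udp, so the effect of sock_queue_rcv_skb and __skb_queue_tail can be checked from user space. The sketch below assumes the usual column layout of that file: the local address as hex address:port in the second column, tx_queue:rx_queue in the fifth, and a per-socket drops counter in the thirteenth. The function name parse_proc_net_udp is invented for this example. As far as I know, drops grows when a packet cannot be queued, for example because the socket's receive buffer is full.

```python
def parse_proc_net_udp(text):
    """Extract local port, queued receive bytes and drops from /proc/net/udp."""
    sockets = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a slot number such as "42:"; skip the header.
        if len(fields) < 13 or not fields[0].endswith(":"):
            continue
        sockets.append({
            "local_port": int(fields[1].split(":")[1], 16),
            "rx_queue": int(fields[4].split(":")[1], 16),
            "drops": int(fields[12]),
        })
    return sockets


if __name__ == "__main__":
    try:
        with open("/proc/net/udp") as f:
            for entry in parse_proc_net_udp(f.read()):
                print(entry)
    except OSError:
        print("/proc/net/udp is not available on this system")
```

A socket whose rx_queue stays high and whose drops keeps growing is usually one whose application is not reading fast enough.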
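Once sk_data_ready has been called, processing of the packet is complete and it waits for the application to read it. All of the functions above run in softirq context.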
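Applications generally receive data in one of two ways. In the first, recvfrom blocks while waiting for data; when the socket is notified, recvfrom is woken up and reads the data from the receive queue. In the second, the application watches the socket with epoll or select and, once notified, calls recvfrom to read the data from the receive queue. Both approaches receive the packet correctly.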
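Both styles can be tried with a few lines of Python. This is a minimal sketch with assumptions of its own: the helper names are made up, it uses select rather than epoll so that it is not tied to Linux, and it sends over the loopback address, so the datagram does not pass through a physical NIC driver. The timeouts are there only to keep the demo from hanging.

```python
import select
import socket


def make_receiver():
    """Create a UDP socket bound to a free port on the loopback address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))      # port 0 lets the kernel choose a port
    return sock


def send_datagram(addr, payload):
    """Send one UDP datagram to addr from a throwaway socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(payload, addr)


def recv_blocking(sock, timeout=2.0):
    """Style 1: sleep inside recvfrom until the socket has data."""
    sock.settimeout(timeout)         # raises socket.timeout if nothing arrives
    data, _addr = sock.recvfrom(2048)
    return data


def recv_with_select(sock, timeout=2.0):
    """Style 2: wait for readiness with select, then call recvfrom."""
    sock.setblocking(False)
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None                  # nothing arrived within the timeout
    data, _addr = sock.recvfrom(2048)
    return data


if __name__ == "__main__":
    rx = make_receiver()
    send_datagram(rx.getsockname(), b"hello")
    print(recv_blocking(rx))
    send_datagram(rx.getsockname(), b"world")
    print(recv_with_select(rx))
    rx.close()
```

In both cases the datagram is already sitting in the socket's receive queue by the time recvfrom returns; the only difference is where the process waits.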
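Understanding the receive path helps us work out where packets can be monitored and modified and under what conditions they may be dropped, which gives us a reference point when troubleshooting network problems. Knowing where the netfilter hooks sit also helps in understanding how to use iptables, and it will make Linux virtual network devices easier to understand later on.
The next few articles will cover Linux virtual network devices and iptables.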
Monitoring and Tuning the Linux Networking Stack: Receiving Data
Illustrated Guide to Monitoring and Tuning the Linux Networking Stack: Receiving Data
NAPI