Basic knowledge of Linux Traffic Control (TC)
This post collects notes on the Linux Traffic Control (TC) mechanism.
What
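Traffic Control, TC for short, refers to the queuing mechanisms a network device uses to receive and send packets: for example, the receive rate, the send rate, and the order in which multiple packets are sent.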
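Linux implements a traffic control subsystem made of two parts:
- the traffic control framework in the kernel
- the user-space configuration tool: the tc program from the iproute2 package
The relationship is similar to that between the in-kernel netfilter framework and the user-space iptables program.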
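Traffic Control serves the following purposes:
- Shaping: controls the send rate by delaying packet transmission. Applies only to outgoing traffic (egress).
- Scheduling: decides the order in which different kinds of packets are sent, for example by reordering interactive traffic ahead of bulk-download traffic. Applies only to outgoing traffic (egress).
- Policing: decides whether to accept or drop packets based on their arrival rate. Applies to incoming traffic (ingress).
- Dropping: discards packets that exceed the configured bandwidth. Applies to both ingress and egress.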
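The difference between shaping and policing can be sketched with two short tc setups. This is only an illustration: the interface name eth0, the 1mbit rate, and the burst and latency values are assumptions, and the `TC` variable is a hypothetical dry-run switch that prints the commands unless it is set to `tc`.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set TC=tc (and run as root) to apply them for real.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"   # assumed interface name

# Shaping (egress): a token bucket filter delays packets
# so that the send rate stays around 1 Mbit/s.
shape_egress() {
  $TC qdisc add dev "$DEV" root tbf rate 1mbit burst 32kbit latency 400ms
}

# Policing (ingress): packets arriving faster than about
# 1 Mbit/s are dropped instead of being queued.
police_ingress() {
  $TC qdisc add dev "$DEV" handle ffff: ingress
  $TC filter add dev "$DEV" parent ffff: protocol ip prio 1 u32 \
      match ip src 0.0.0.0/0 police rate 1mbit burst 10k drop flowid :1
}

shape_egress
police_ingress
```

The shaper holds packets back in a queue, while the policer has no queue and can only accept or drop.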
Components
qdisc
Simply put, a qdisc is a scheduler. Every output interface needs a scheduler of some kind, and the default scheduler is a FIFO. Other qdiscs available under Linux will rearrange the packets entering the scheduler’s queue in accordance with that scheduler’s rules.
The qdisc is the major building block on which all of Linux traffic control is built, and is also called a queuing discipline.
The classful qdiscs can contain classes, and provide a handle to which to attach filters.
The classless qdiscs cannot contain classes, nor is it possible to attach a filter to a classless qdisc.
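To control how packets are received and sent, packets must be held temporarily in a queue structure. Linux abstracts this control mechanism, data structure and algorithm together, as a queuing discipline, or qdisc for short. A qdisc exposes two callbacks, enqueue and dequeue, for putting a packet on the queue and taking a packet off it, while the actual queuing algorithm stays hidden inside the qdisc.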
A qdisc has two operations:
- enqueue requests so that a packet can be queued up for later transmission
- dequeue requests so that one of the queued-up packets can be chosen for immediate transmission
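The two operations can be modeled with a toy scheduler in plain shell. It is a sketch of the idea, not the kernel implementation: the two-band priority layout and the packet names are made up for illustration, and packets are assumed to be whitespace-free tokens.

```shell
#!/bin/sh
# Toy model of a qdisc with two bands; band 0 is always served first.
band0=""   # high priority (e.g. interactive traffic)
band1=""   # low priority (e.g. bulk download)

# enqueue BAND PACKET: store a packet for later transmission.
enqueue() {
  if [ "$1" -eq 0 ]; then band0="$band0 $2"; else band1="$band1 $2"; fi
}

# dequeue: choose the next packet to transmit and store it in PKT.
# Returns non-zero when nothing is queued.
dequeue() {
  if [ -n "$band0" ]; then
    set -- $band0; PKT=$1; shift; band0="$*"
  elif [ -n "$band1" ]; then
    set -- $band1; PKT=$1; shift; band1="$*"
  else
    PKT=""; return 1
  fi
}

enqueue 1 bulk-1
enqueue 0 ssh-1
enqueue 1 bulk-2
while dequeue; do echo "$PKT"; done   # prints ssh-1, bulk-1, bulk-2
```

The caller only sees enqueue and dequeue; the reordering rule (band 0 before band 1) lives entirely inside the scheduler, which is what lets different qdiscs be swapped in behind the same interface.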
class
Classes only exist inside a classful qdisc (e.g., HTB and CBQ). Classes are immensely flexible and can always contain either multiple child classes or a single child qdisc.
Any class can also have an arbitrary number of filters attached to it, which allows the selection of a child class or the use of a filter to reclassify or drop traffic entering a particular class.
A leaf class is a terminal class in a qdisc. It contains a qdisc (default FIFO) and will never contain a child class. Any class which contains a child class is an inner class (or root class) and not a leaf class.
filter
A filter is used by a classful qdisc to determine in which class a packet will be enqueued.
Full picture
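The three elements qdisc, class, and filter can be combined into very complex tree-shaped qdisc structures, which greatly extends what traffic control can do.
In a tree-shaped qdisc, a packet that reaches the top-level qdisc is passed down recursively, level by level. For example, when the enqueue callback of a parent object (qdisc or class) is invoked, the filters attached to it are called in turn until one matches. The packet is then enqueued to the class that filter points to, which in practice means calling the enqueue function of the qdisc configured for that class. Packets that match no filter are classified into the default class.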
(Figure: tree of qdiscs, classes, and filters.)
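The classification step described above (try each filter in order, fall back to the default class) can be sketched as a small shell function. The filter table, the addresses, and the class IDs are hypothetical, and glob patterns stand in for real u32 matches, which work on prefixes and masks.

```shell
#!/bin/sh
# Hypothetical filter table: "source-pattern classid", tried in order.
FILTERS="10.0.0.5 1:11
10.0.1.* 1:12"
DEFAULT_CLASS="1:10"

# classify SRC_IP: print the classid of the first matching filter,
# or the default class when no filter matches.
classify() {
  src=$1
  while read -r pattern classid; do
    case "$src" in
      $pattern) echo "$classid"; return 0 ;;
    esac
  done <<EOF
$FILTERS
EOF
  echo "$DEFAULT_CLASS"
}

classify 10.0.0.5      # first filter matches
classify 10.0.1.77     # second filter matches
classify 192.168.1.1   # no match, default class
```

In the kernel the chosen class then hands the packet to its own qdisc's enqueue, so the same walk repeats at each level of the tree.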
Usage
handle
Every class and classful qdisc requires a unique identifier within the traffic control structure. This unique identifier is known as a handle and has two constituent members, a major number and a minor number. These numbers can be assigned arbitrarily by the user in accordance with the following rules.
The numbering of handles for classes and qdiscs
major
This parameter is completely free of meaning to the kernel. The user may use an arbitrary numbering scheme; however, all objects in the traffic control structure with the same parent must share a major handle number. Conventional numbering schemes start at 1 for objects attached directly to the root qdisc.
minor
This parameter unambiguously identifies the object as a qdisc if minor is 0. Any other value identifies the object as a class. All classes sharing a parent must have unique minor numbers.
The special handle ffff:0 is reserved for the ingress qdisc.
The handle is used as the target in classid and flowid phrases of tc filter statements. These handles are external identifiers for the objects, usable by userland applications. The kernel maintains internal identifiers for each object.
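The major:minor rules can be made concrete with a small helper that splits a handle and reports what kind of object it names. The function name `describe_handle` is made up for this sketch. It relies on tc reading both numbers as hexadecimal and expects the argument to contain a colon.

```shell
#!/bin/sh
# describe_handle MAJOR:MINOR
# Both parts are hexadecimal; an empty minor (as in "1:") means 0.
describe_handle() {
  major=${1%%:*}
  minor=${1#*:}
  [ -n "$minor" ] || minor=0
  # minor 0 identifies a qdisc, anything else a class
  if [ "$((0x$minor))" -eq 0 ]; then kind=qdisc; else kind=class; fi
  echo "major=$((0x$major)) minor=$((0x$minor)) kind=$kind"
}

describe_handle 1:      # a qdisc, same as 1:0
describe_handle 1:10    # a class; minor 0x10 is 16 in decimal
describe_handle ffff:0  # the ingress qdisc
```

Because the numbers are hexadecimal, class 1:10 and class 1:a are distinct, and 1:10 is not "class ten".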
man
See the man pages, for example man tc, man tc-u32, and man tc-htb.
Example
As a simple example, suppose we want to limit the bandwidth of an individual IP address stored in the CLIENT_IP shell variable, under the following constraints:
- device name = eth0
- total bandwidth available/allowed for the device = 1000kbps up to 1500kbps
- default bandwidth (for clients that do not fall into our filters) = 1kbps up to 2kbps
- bandwidth of CLIENT_IP = 100kbps
- maximum bandwidth of CLIENT_IP (if there is more bandwidth available) = 200kbps
Note that in tc rate syntax kbps means kilobytes per second; kilobits per second is written kbit.
The commands below would suffice:
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:1 htb rate 1000kbps ceil 1500kbps
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 1kbps ceil 2kbps
tc class add dev eth0 parent 1:1 classid 1:11 htb rate 100kbps ceil 200kbps
tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32 match ip src ${CLIENT_IP} flowid 1:11
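After applying a setup like this, it helps to inspect the resulting tree and to know how to remove it again. The sketch below wraps the usual tc show and del commands; the function names and the `TC` dry-run switch are assumptions of this example, not part of tc.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set TC=tc (and run as root) to run them for real.
TC="${TC:-echo tc}"
DEV="${DEV:-eth0}"   # assumed interface name

# show_tree: list qdiscs and classes with statistics, then the filters.
show_tree() {
  $TC -s qdisc show dev "$DEV"
  $TC -s class show dev "$DEV"
  $TC filter show dev "$DEV"
}

# teardown: deleting the root qdisc also removes its classes and filters.
teardown() {
  $TC qdisc del dev "$DEV" root
}

show_tree
teardown
```

The per-class byte and packet counters printed by the -s option are a quick way to check that traffic from CLIENT_IP really lands in class 1:11 rather than in the default class 1:10.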
References: