WO2002015520A1 - Procedes et dispositifs pour l'ordonnancement de paquets - Google Patents
Procedes et dispositifs pour l'ordonnancement de paquets Download PDFInfo
- Publication number
- WO2002015520A1 WO2002015520A1 PCT/CA2000/000937 CA0000937W WO0215520A1 WO 2002015520 A1 WO2002015520 A1 WO 2002015520A1 CA 0000937 W CA0000937 W CA 0000937W WO 0215520 A1 WO0215520 A1 WO 0215520A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- packet
- data packets
- data
- packets
- scheduhng
- Prior art date
Links
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2425—Traffic characterised by specific attributes, e.g. priority or QoS for supporting services specification, e.g. SLA
- H04L47/2433—Allocation of priorities to traffic types
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/20—Traffic policing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/56—Queue scheduling implementing delay-aware scheduling
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04Q—SELECTING
- H04Q11/00—Selecting arrangements for multiplex systems
- H04Q11/04—Selecting arrangements for multiplex systems for time-division multiplexing
- H04Q11/0428—Integrated services digital network, i.e. systems for transmission of different types of digitised signals, e.g. speech, data, telecentral, television signals
- H04Q11/0478—Provisions for broadband connections
Definitions
- This invention relates to the transmission of data over communications networks including wide area networks. More specifically, this invention relates to methods and apparatus for scheduling data packets for transmission over a data link.
- the scheduhng methods and apparatus may be used in systems for providing a plurality of differentiated services each providing a different level of Quality of Service (“QoS”) over wide area networks.
- QoS Quality of Service
- the scheduling methods and apparatus have particular application in Internet Protocol (“IP”) networks.
- Telecommunications networks are evolving toward a connectionless model from a model whereby the networks provide end-to- end connections between specific points.
- the individual connections can be tailored to provide a desired bandwidth for communications between the end points of the connections. This is not possible in a connectionless network.
- the connectionless model is desirable because it saves the overhead implicit in setting up connections between pairs of endpoints and also provides opportunities for making more efficient use of the network infrastructure through statistical gains.
- IP Internet Protocol
- end-to-end connections for carrying data packets between certain parts of the network.
- the end-to-end connections may be provided by technologies such as Asynchronous Transfer Mode (“ATM”), Time Division Multiplexing (“TDM”) and SONET/SDH.
- ATM Asynchronous Transfer Mode
- TDM Time Division Multiplexing
- SONET/SDH SONET/SDH
- a Wide Area Network is an example of a network in which the methods of the invention may be applied.
- WANs are used to provide interconnections capable of carrying many different types of data between geographically separated nodes.
- the same WAN may be used to transmit video images, voice conversations, e-mail messages, data to and from database servers, and so on.
- Some of these services place different requirements on the WAN.
- transmitting a video signal for a video conference requires fairly large bandwidth, short delay (or "latency"), small delay jitter, and reasonably small data loss ratio.
- transmitting e-mail messages or apphcation data can generally be done with lower bandwidth but can tolerate no data loss. Further, it is not usually critical that e-mail be delivered instantly. E-mail services can usually tolerate longer latencies and lower bandwidth than other services.
- a typical WAN comprises a shared network which is connected by access links to two or more geographically separated customer premises.
- Each of the customer premises may include one or more devices connected to the network. More typically each customer premise has a number of computers connected to a local area network ("LAN').
- the LAN is connected to the WAN access link at a service point.
- the service point is generally at a "demarcation" unit or “interface device” which collects data packets from the LAN which are destined for transmission over the WAN and sends those packets across the access link.
- the demarcation unit also receives data packets coming from the WAN across the access link and forwards those data packets to destinations on the LAN.
- QoS Quality of Service
- QoS Quality of Service
- Network resources which are committed to a specific link are not readily shared, even if there are times when the link is not using all of the resources which have been allocated to it.
- committing resources to specific end- to-end links reduces or eliminates the ability to achieve statistical gains.
- Statistical gains arise from the fact that it is very unlikely that every apphcation on a network will be generating a maximum amount of network traffic at the same time. If applications are not provided with dedicated end-to-end connections but share bandwidth then each apphcation can, in theory, share equally in the available bandwidth.
- bandwidth may be allocated in a manner that bears no relationship to the requirements of individual apphcations or to the relative importance of the applications.
- latencies There are similar inequities in the latencies in the delivery of data packets over the network.
- QoS refers to a set of parameters which describe the required traffic characteristics of a data connection.
- QoS refers to a set of one or more of the following interrelated parameters which describe the way that a data connection treats data packets generated by an application: Minimum Bandwidth - a minimum rate at which a data connection must be capable of forwarding data originating from the apphcation. The data connection might be incapable of forwarding data at a rate faster than the minimum bandwidth but should always be capable of forwarding data at a rate equal to the rate specified by the minimum bandwidth;
- Maximum Delay a maximum time taken for data from an apphcation to completely traverse the data connection. QoS requirements are met only if data packets traverse the data connection in a time equal to or shorter than the maximum delay; Maximum Loss - a maximum fraction of data packets from the application which may not be successfully transmitted across the data connection; and,
- Jitter - a measure of how much variation there is in the delay experienced by different packets from the apphcation being transmitted across the data connection. In an ideal case, where all packets take exactly the same amount of time to traverse the data connection, the jitter is zero. Jitter may be defined, for example, as any one of various statistical measures of the width of a distribution function which expresses the probability that a packet will experience a particular delay in traversing the data connection.
- Service providers who provide access to WANs wish to provide their customers with Service Level Agreements rather than raw bandwidth. This will permit the service providers to take advantage of statistical gain to more efficiently use the network infrastructure while maintaining levels of QoS that customers require. To do this, the service providers need a way to manage and track usage of these different services.
- Applications connected to a network generate packets of data for transmission on the network. In providing different levels of service it is necessary to be able to sort or "classify" data packets from one or more applications into different classes which will be accorded different levels of service. The data packets can then be transmitted in a way which maintains the required QoS for each apphcation. Data packets generated by one or more applications may belong to the same class.
- This invention provides methods and apparatus for scheduling the forwarding of data packets over a data link.
- the methods of the invention involve receiving classified data packets.
- the methods include selecting one of a plurality of data packets by selecting an ehgible group of data packets and determining whether data packets in the ehgible group all belong to classes having the same priority or belong to classes having different priorities. If the data packets in the eligible group belong to two or more classes having different priorities the method selects one data packet by applying a selection criterion to an ehgible sub-group containing those one or more data packets in the ehgible group which belong to classes having a highest priority.
- the method selects one data packet by applying a selection criterion to all data packets in the ehgible group.
- the method provides reduced queuing delays for packets belonging to higher priority classes.
- the selection criterion comprises a first to finish selection criterion.
- the method preferably includes maintaining a virtual time value. Selecting the eligible group preferably comprises selecting packets having a start time less than or equal to the virtual time value.
- the invention may be practised with a plurahty of scheduhng engines interlinked to form a hierarchical tree, the tree including at least a parent scheduhng engine and a plurahty of child scheduhng engines hnked to the parent scheduling engine.
- the parent scheduling engine selects one data packet from the data packets being held by the child scheduling engines.
- the data packet belonging to the high priority class is made available for selection by the parent scheduhng engine in place of the already selected data packet.
- the invention also provides apparatus for scheduhng transmission of data packets on a data hnk, the apparatus comprises: a) a memory capable of holding a plurality of data packets queued in a plurality of queues; b) means for keeping a start time, a finish time and a priority for a packet at a head of each of the queues; c) a scheduling engine adapted to select one packet from a plurahty of packets at the heads of the queues, the scheduhng engine comprising: i) a counter for maintaining a virtual time for the scheduhng engine; ii) means for comparing the start time for each packet to the virtual time for the scheduhng engine to select an eligible group of packets; hi) means for comparing the priorities of packets in the eligible group of packets and eliminating from the ehgible group packets having a priority lower than a priority for another packet in the eligible group; and, iv) means for selecting one packet from the ehg
- FIG. 1 is a schematic view of a wide area network according to the invention which comprises enterprise service point (“ESP”) devices for providing packet scheduhng functions according to the invention;
- ESP enterprise service point
- Figure 2 is a schematic view illustrating two flows in a communications network according to the invention
- Figure 3 is a dia ram illustrating the various data fields in a prior art IP data packet
- Figure 4 is a schematic view showing an example a pohcy which may be implemented with the methods and apparatus of the invention
- Figure 5 is a schematic view of apparatus for scheduhng packets according to the invention
- Figure 5A is a schematic illustration showing a structure of a scheduler according to the invention
- Figure 6 is a flow chart illustrating a method according to the invention by which leaf scheduhng engines may select and transmit packets
- Figure 6A is a flow chart iUustrating a method according to the invention by which non-leaf scheduhng engines may select and transmit packets
- Figure 7 is a diagram of a scheduler implemented by a number of hierarchically arranged scheduhng engines according to the invention.
- Figure 8 is a flow chart illustrating a simplified embodiment of the invention.
- This invention may be applied in many different situations where data packets are scheduled and dispatched.
- the following description discusses the apphcation of the invention to scheduhng onward transmission of data packets received at an Enterprise Service Point ("ESP").
- ESP Enterprise Service Point
- the invention is not hmited to use in connection with ESP devices but can be applied in almost any situation where classified data packets are scheduled and dispatched.
- FIG 1 shows a generalized view of a pair of LANs 20, 21 connected by a WAN 22.
- Each LAN 20, 21 has an Enterprise Service Point unit ("ESP") 24 which connects LANs 20, 21 to WAN 22 via an access hnk 26.
- LAN 20 may, for example, be an Ethernet network or a token ring network.
- Access hnk 26 may, for example, be an Asynchronous Transfer Mode (“ATM”) hnk.
- Each LAN has a number of connected devices 28 which are capable of generating and/or receiving data for transmission on the LAN.
- Devices 28 typically include network connected computers.
- various devices 28 on network 20 may establish connections with devices 28 on network 21 and vice versa.
- Each connection may be called a session.
- Each session comprises one or more flows.
- FIG. 2 illustrates a session between a computer 28A on network 20 and a computer 28B on network 21.
- the session comprises two flows 32 and 33.
- Flow 32 originates at computer 28A and goes to computer 28B through WAN 22.
- Flow 33 originates at computer 28B and goes to computer 28A over WAN 22.
- Computers 28A and 28B each have an address. Most typically data in a great number of flows will pass through each ESP 24 in any short period.
- Each flow consists of a series of data packets.
- the data packets may have different sizes.
- Each packet comprises a header portion which contains information about the packet and a payload or datagram.
- the packets may be Internet protocol ("IP") packets.
- FIG 3 illustrates the format of an IP packet 35 according to the currently implemented IP version 4.
- Packet 35 has a header 36 and a data payload 38.
- the header contains several fields.
- the "version” field contains an integer which identifies the version of IP being used.
- the current IP version is version 4.
- the "header length” field contains an integer which indicates the length of header 36 in 32 bit words.
- the "type of service” field contains a number which can be used to indicate a level of Quality of Service required by the packet.
- the “total length” field specifies the total length of packet 35.
- the "identification” field contains a number which identifies the data in payload 38.
- the “flags” field contains 3 bits which are used to determine whether the packet can be fragmented.
- the "time-to-live”field contains a number which is decremented as the packet is forwarded. When this number reaches zero the packet may be discarded.
- the "protocol” field indicates which upper layer protocol apphes to packet 35.
- the "header checksum” field contains a checksum which can be used to verify the integrity of header 36.
- the "source address” field contains the IP address of the sending node.
- the “destination address” field contains the IP address of the destination node.
- the “options” field may contain information related to packet 35.
- Each ESP 24 receives streams of packets from its associated LAN and from WAN 22. These packets typicahy belong to at least several different flows. The combined bandwidth of the input ports of an ESP 24 is typicahy greater than the bandwidth of any single output port of ESP 24. Therefore, ESP 24 typically represents a queuing point where packets belonging to various flows may become backlogged while waiting to be transmitted through a port of ESP 24. Backlogs may occur at any output port of ESP 24. While this invention is preferably used to manage the scheduhng of packets at ah output ports of ESP 24, the invention could be used at any one or more output ports of ESP 24.
- ESP 24 must determine which over access link 26, in which order, to make the best use of the bandwidth available in access hnk 26 and to provide guaranteed levels of service to individual flows. To do this, ESP 24 must be able to classify each packet, as it arrives, according to certain rules. ESP 24 can then identify those packets which are to be given priority access to link 26. After the packets are classified they can be scheduled for transmission.
- This invention provides methods and apparatus for scheduhng the transmission of packets for transmission over a data connection in a data communication network.
- packets transmitted via the data connection may be carried over an ATM hnk.
- Incoming packets are sorted by a classifier into classes according to a policy which includes a set of classification rules. The rules set conditions on the values of one or more parameters which characterize the packets which belong to each class.
- a packet is assigned to a class if the parameter values for that packet match the conditions set by the classification rules for the class.
- the policy also estabhshes a QoS level which will be accorded to the packets in each of the different classes.
- Data packets in some classes may be treated differently from data packets in other classes to provide guaranteed levels of QoS to applications which generate data packets in selected classes.
- ESP 24 For example, There is a pohcy for the port of ESP 24 connected to outgoing hnk 26. There may be separate pohcies classifying and scheduhng packets which are received at an ESP 24 from a data link 26 and which are destined for each one of the one or more ports of ESP 24 connected to a LAN. The methods and apparatus of the invention may also be used in other network devices which schedule the forwarding of data packets.
- any suitable classifier may be used to classify data packets for scheduling according to this invention.
- the classification methods and apparatus described in a co-pending commonly owned apphcation entitled METHODS AND APPARATUS FOR PACKET CLASSIFICATION WITH MULTI-LEVEL DATA STRUCTURE which is incorporated herein by reference, or the methods and apparatus described in METHODS AND APPARATUS FOR PACKET CLASSIFICATION WITH MULTIPLE ANSWER SETS which is incorporated herein by reference may be used to classify packets so that the packets may be scheduled by the methods and apparatus of this invention.
- ESP 24 may hold backlogged data packets which are waiting to be forwarded to a destination and which are classified in one or more of the classes.
- the relationship between different classes in a pohcy and the QoS accorded to different classes may be represented by a "classification tree" or "pohcy” tree 39 (Fig. 4).
- the leaf nodes of one or more pohcy trees 39 correspond to the individual classes identified by the classification rules of the pohcy. Other nodes of the pohcy tree may also be called classes.
- FIG. 4 schematically illustrates one possible pohcy tree 39.
- Pohcy tree 39 has a number of leaf nodes 40, 42, 44, 46.
- class 40 contains voice traffic.
- Class 40 may be termed a "real time" class because it is important to dehver packets in class 40 quickly enough to allow a real time voice conversation between two people. Packets in class 40 wih be scheduled so that each flow in class 40 will be guaranteed sufficient bandwidth to support a real time voice session. This may be done, for example, by specifying a particular minimum amount of bandwidth to be shared by the packets classified in class 40. Each flow in class 40 will be guaranteed a level of QoS sufficient for voice communication.
- Classes 42 and 44 contain flows of Hyper Text Transfer Protocol ("HTTP") packets.
- Class 42 contains HTTP flows which originate in MARKETING. MARKETING may be, for example, sources 28 associated with a company's marketing department. Other HTTP flows fall into class 44.
- classes 42 and 44 will share between themselves at least 40% of the bandwidth. 15% of the bandwidth is allocated to satisfy the flows of class 40. The other 45% of the bandwidth is allocated to class 46 which covers aU other flows. Of the bandwidth shared by classes 42 and 44, at least 30% is allocated to class 42 and at least 70% is allocated to class 44. The actual bandwidth available at a node may be greater than the minimum bandwidth allocated by pohcy 39.
- packets coming through node 42 may enjoy more than 30% of the bandwidth of node 48 which is shared between nodes 42 and 44 if there is no backlog of packets at node 44 (i.e. node 44 is not using ah of the minimum bandwidth to which it is entitled). If, for example, at some time there are no packets for transmission which are associated with node 44 then aU of the bandwidth shared by nodes 42 and 44 is available to packets associated with node 42.
- a pohcy tree typicahy has two or more levels.
- the pohcy tree 39 of Figure 4 has 3 levels. Nodes which are in the same level are ah separated from hnk 26 by the same number of nodes above them in pohcy tree 39.
- Nodes 40, 46 and 48 may be termed "second" level nodes because they are one node removed from hnk 26.
- Nodes 42 and 44 are third level nodes which are two nodes removed from hnk 26, and so on.
- lower level nodes of pohcy tree 39 are depicted as being above higher level nodes. Nodes in pohcy tree 39 are connected to one another as indicated in Figure 2 by lines 41. A higher level node connected to a lower level node by a hne 41 is said to be a child of the higher level node. A lower level node connected to a higher level node by a hne 41 is said to be a parent of the lower level node.
- the policy represented by a policy tree 39 may specify QoS by providing a desired distribution of bandwidth between different higher level nodes which depend from the same lower level node. This may be done, for example, by specifying absolute amounts of bandwidth to be provided to individual higher level nodes, specifying percentages of available bandwidth to be shared by each of two or more higher level nodes (as described above with respect to nodes 42 and 44), a combination of these measures or any equivalent measure.
- packets are classified and inserted into a scheduler which has a structure mirroring that of the pohcy tree. The packets enter the scheduler at a leaf node corresponding to the class. From there, the packets "percolate" from node to node up through the scheduler, until they reach a node corresponding to the root node of the pohcy tree. From there, the packets are sent out on the data hnk.
- Scheduler 50 schedules the transmission of the packet out an output port.
- Scheduler 50 uses the pohcy associated with the port to determine the sequence in which to send any packets which are backlogged waiting to be sent through the output port.
- a scheduler 50 receives each incoming packet 51 together with a class identifier 53 generated by a classifier 52 (step 102). Scheduler 50 then places each packet in a queue 55 (step 104). Each queue 55 is associated with a leaf class. The particular queue 55 into which a packet is inserted is determined by the classification of the packet and, possibly, by the flow to which the packet belongs. Each queue 55 may contain zero, one, or more packets. Each active flow may have its own queue or, in the alternative, the packets for two or more flows may ah be directed to a single queue.
- Queues 55 do not need to be physical queues in the sense that all packets in each queue 55 are located in sequence in the same storage device. Queues 55 are logical first in, first out (“FIFO") queues. Packets 51 are stored somewhere in a storage device accessible to scheduler 50. In Figure 5, the packets are stored in an RAM memory 64 accessible to scheduler 50. Scheduler 50 maintains a record of what packets 51 belong to each queue 55 and what is the order of packets 51 within each queue 55. Scheduler 50 selects packets which are at the heads of their respective queues 55 and a forwarder 58 associated with scheduler 50 sequentially transmits the selected packets over a data link 26. As is known in the art, data link 26 may include an adaptation layer. Each packet 51 may be transmitted on data hnk 26 as one or more data packets of the type carried by data link 26.
- the scheduler 50 of this invention preferably has a structure which mirrors that of a policy tree 39.
- Scheduler 50 has a scheduhng engine 60 corresponding to each node of policy tree 39.
- the scheduhng engines 60 are connected by data pathways 61which permit one scheduling engine to forward data packets to its parent scheduling engine. It is not necessary for data packets 51 to be physically transmitted from one scheduhng engine 60 to another. It is only necessary for information identifying individual data packets 51 to be sent from one scheduhng engine 60 to another.
- the data packet 51 in question could continue to reside in the same location in a storage device, such as RAM 64, until it is forwarded by forwarder 58.
- Each group 56 of queues 55 corresponds to a leaf class in the pohcy tree 39.
- a scheduhng engine 60 corresponding to each leaf node selects packets from the queue(s) 55 in the ⁇ roup 56 corresponding to the same leaf node for passing to the scheduling engine 60 corresponding to the parent of the leaf node (a "parent scheduhng engine”).
- leaf scheduhng engine 60A selects packets from the group 56 consisting of queues 55A, 55B, and 55C to be passed to parent scheduhng engine 60B along data path 61A.
- a child scheduhng engine 60 corresponding to a first node of a pohcy tree 39 can pass responsibihty for data packets 51 to a parent scheduhng engine 60 which corresponds to the parent node of the first node of the pohcy tree.
- a parent scheduhng engine corresponding to a first node of a policy tree can receive data packets 51 from one or more child scheduling engines which correspond to child nodes of the first node of the pohcy tree.
- a scheduling engine 60 may be a child of another scheduling engine 60 and, at the same time, may be a parent of one or more other scheduhng engines 60.
- Scheduler 50 passes responsibility for each packet 51 from one scheduhng engine 60 to another upwards through the tree in stages until the packet 51 is associated with scheduhng engine 60C which corresponds to the first level node 49 of pohcy tree 39.
- the scheduhng engine 60C associated with the first level node 49 of policy tree 39 selects packets from its child scheduhng engines to be sent out the logical output port by forwarder 58.
- Each scheduling engine 60 can pass one packet at time to its parent (lower level) scheduhng engine.
- a scheduhng engine 60 which receives packets from more than one source (e.g. which corresponds to a node in a pohcy tree which has two or more child nodes. or which corresponds to a leaf node having a plurahty of corresponding queues) interleaves packets from the different sources so that aU packets 51 will eventually be passed by the scheduling engine 60.
- Packets 51 are transmitted through a scheduhng engine 60 at a rate i?that corresponds to the bandwidth assigned to the scheduhng engine in pohcy tree 39.
- the bandwidth assigned to a parent scheduhng engine 60 must be equal to the aggregate bandwidth ahocated to the child scheduhng engines 60 of that parent scheduling engine.
- the bandwidth assigned to a leaf scheduling engine 60 is shared equally by all queues associated with the leaf scheduhng engine. Each queue is assigned a bandwidth R q of:
- R lc is the bandwidth for the leaf class and N q is the number of queues associated with the leaf class.
- a leaf scheduling engine 60 cannot fairly allocate bandwidth by simply transmitting one or more packets 51 from each active queue 55 with the number of packets 51 transmitted from each queue in a ratio equal to the proportion of bandwidth available for each one of the active queues.
- a notion of time is used to measure whether packets are being transmitted at an assigned rate. If a packet 51 of length L were transmitted at a rate R, its transmission wih be completed after an interval /given by:
- Each scheduling engine 60 maintains a virtual time Which advances by the interval /each time it passes a packet to its parent scheduhng engine (or to forwarder 58 in the case of scheduling engine 60C). Each interval is calculated from the length of the packet being passed.
- the virtual time of each scheduhng engine 60 is initialized to 0 when scheduler 50 is initiahzed.
- the virtual time of each scheduhng engine 60 is stored in an associated memory 64A as shown in Figure 5.
- each leaf scheduling engine 60 calculates a start time ⁇ and a finish time -F or packets 51 at the heads of its queues 55 (step 106).
- the start and finish times for a packet can be considered to be measures of when a packet 51 at the head of a queue 55 should ideaUy start to be transmitted and when it should finish transmission, ⁇ and .Fare used by leaf scheduling engines 60 to select which packet to transmit next.
- a packet 51 When a packet 51 first reaches the head of a queue 55, it is assigned a start time «-?and a finish time F.
- a packet 51 can reach the head of a queue 55 by being placed into an empty queue 55.
- the packet 51 is assigned the virtual time of the leaf scheduler 60 to which the queue belongs as its start time.
- the other way a packet 51 can reach the head of a queue 55 is for it to replace a previous packet 51 that has just been transmitted out of the queue.
- the start time of the packet 51 will be set to the finish time of the previous packet 51.
- the finish time for the packet 51 When the start time for a packet 51 is known then the finish time for the packet 51 will be given by the equation:
- Scheduler 50 keeps a record of for each scheduling engine 60 and also keeps records of Sand Fiox the packets at the head of each non-empty queue 55 managed by scheduler 50.
- start times Sand finish times F ⁇ or each queue are calculated on the basis of the rate Rj/N q .
- leaf schedulers 60 extract packets from queues 55 and forward those extracted packets at a rate R Ic .
- the virtual time Ffor the leaf scheduler 60 is advanced on the basis of the rate R lc .. This means that the values of Sand Fior a packet at the head of a queue 55 will tend to be in the future relative to the virtual time Vo ⁇ the associated leaf scheduhng engine 60. This gives the leaf scheduling engine 60 time to service any other queues 55.
- each leaf scheduhng engine 60 selects a next packet to be transmitted by using the start and finish times of the packets at the heads of the queues 55 associated with the leaf class.
- each leaf scheduhng engine 60 selects a group of eligible packets 51 from the group of aU packets 51 at the heads of the queues 55 in the group 56 associated with that leaf scheduhng engine 60 (step 110).
- the eligible group comprises a set of packets which are ehgible for transmission according to an eligibility criterion.
- the set of eligible packets is constructed by selecting those packets which have a start time S smaller than or equal to the virtual time Voi the scheduler 60.
- the ehgible packets are packets whose predicted start times have passed. If the scheduling engine 60 does not send a packet 51 from that queue 55 soon, the queue 55 wih not have the benefit of the bandwidth calculated by equation (1). If a packet 51 at the head of a queue 55 is not ehgible, its start time is greater than the virtual time Vo£ the scheduhng engine 60. This indicates that the queue 55 has already received the benefit of its assigned bandwidth. If there are no ehgible packets in any queue 55 associated with a leaf class (i.e.
- the virtual time Vof the scheduling engine 60 associated with the leaf class is advanced to the start time Sof the packet or packets with the earliest start time S.
- a set of ehgible packets is then identified by applying the eligibility criteria to the packets using the new virtual time V (step 110).
- the leaf scheduhng engine 60 will select for transmission the ehgible packet 51 which meets a selection criterion (step 114).
- the selection criterion is a first to finish selection criterion so that the eligible packet that has the earliest finish time Fis selected.
- An alternative, less preferable, approach is to use a selection criterion which selects for transmission the eligible packet with the earliest start time S. If two or ore packets have the same finish time (or start time), scheduling engine 60 may select one of the two or more packets at random (step 114).
- leaf scheduhng engine 60 simply selects for transmission the packet which has the smallest finish time F(or earhest start time S) without considering ehgibihty.
- finish time or start time
- the use of only finish time (or start time) provides coarsegrained control over bandwidth usage, but there will be short term fluctuations either side of the assigned bandwidth.
- leaf scheduhng engine 60 selects a packet 51
- the selected packet 51 is removed from its queue 55 and is held at leaf scheduling engine 60.
- leaf scheduling engine 60 In preferred embodiments of the invention only a single packet 51 can be held at a scheduhng engine 60.
- the selected packet will be passed to the parent of the leaf scheduhng engine 60 (step 122).
- the virtual time Vo ⁇ the leaf scheduhng engine 60 will be updated (step 125) and leaf scheduhng engine 60 wih select a new packet 51 (step 114) from a queue 55 for eventual transmission.
- scheduling engines 60 corresponding to non-leaf classes use a similar method to select a packet for transmission as shown in Figure 6A.
- Each scheduling engine 60 which corresponds to a non-leaf class selects packets 51 from among those packets 51 which are being held by its child scheduhng engine(s) 60 (step 109).
- each child scheduling engine 60 assigns new start and finish times to a packet 51 when the packet is transferred to the child scheduling engine 60. If a child scheduhng engine 60 passes a packet to its parent scheduling engine 60 and immediately receives a new packet 51 in the same operation then the new packet 51 is assigned a start time that is the same as the finish time of the previously passed packet. Otherwise, the virtual time 1 of the child scheduling engine 60 is set equal to that of the parent scheduhng engine 60 and the new packet 51 is assigned a start time equal to the newly assigned virtual time Vof e child scheduling engine 60.
- First level scheduhng engine 60C has no parent scheduhng engine 60. Scheduhng engine 60C does not need to maintain start and finish times for the packet that it is holding because forwarder 58 simply forwards the packets held by scheduling engine 60C as quickly as possible.
- R cc is the data rate assigned to the child scheduhng engine in pohcy tree 39.
- the start and finish times of packets ⁇ lheld at ah scheduhng engines 60 are stored in associated memory 64A.
- Start and finish times for a packet 51 being held at a child scheduling engine 60 are calculated on the basis of the rate R cc .
- a parent scheduling engine 60 is assigned a greater data rate R pc in pohcy tree 39 than its child scheduling engines.
- the virtual time of the parent scheduling engine 60 will advance on the basis of the rate R pc . This means that the packet's calculated start and finish times wih tend to be in the future relative to the virtual time of the parent scheduhng engine. This gives the parent class time to service other child scheduhng engines.
- Each leaf class of pohcy tree 39 has a priority.
- Each packet that passes through a leaf scheduhng engine 60 is assigned the priority of the leaf class.
- Information identifying the priority of a packet is passed to each scheduhng engine 60 which handles the packet.
- a scheduler 50 may support two or more levels of priority.
- a simple two level priority scheme as shown in the priority tree of Figure 4, designates high priority classes as "real-time” and lower priority classes as "best effort".
- a non-leaf scheduling engine 60 selects the next packet to be transmitted to its parent scheduling engine 60 from among the zero or more packets which are being held by its child scheduhng engines 60.
- the non-leaf scheduhng engine 60 uses the priority, start time, and finish time of the two or more packets to select one packet to hold and eventually transmit to its parent scheduhng engine 60.
- high priority is assigned to classes that require small transmission delays.
- Lower priorities are assigned to classes that can tolerate larger delays.
- Each parent scheduhng engine 60 selects a group of packets which are ehgible for transmission according to an ehgibility criterion.
- the set of eligible packets is constructed by identifying those packets being held by child scheduling engines 60 of the parent scheduhng engine 60 whose start times are smaller than or equal to the virtual time of the parent scheduhng engine 60 (step 110). In other words a packet is ehgible if its predicted start time has passed.
- the virtual time of the parent scheduling engine is advanced to the start time of the packet or packets being held by child scheduling engines 60 which have the earhest start time.
- the set of ehgible packets is then identified based on the new virtual time (step 110).
- the parent scheduhng engine 60 determines whether the eligible packets all have the same priority or have different priorities (step 112). If the set of ehgible packets includes packets which have two or more different priorities, parent scheduhng engine 60 identifies the highest priority assigned to one or more packets in the ehgible set. Any packet in the ehgible set which does not have the highest priority is removed from the set (step 118).
- a scheduling engine 60 could take priority into consideration while identifying eligible packets. The ehgible set would then contain only those packets which have a start time which makes them eligible to be transmitted and which also have a highest priority.
- the parent scheduling engine 60 selects one packet to pass on next to its parent scheduling engine according to a selection criterion (step 114 or 120). For example, in preferred embodiments of the invention, the scheduhng engine 60 selects for transmission the highest priority ehgible packet 51 which has the earhest finish time. A less preferable selection criterion selects the highest priority ehgible packet with the earhest start time. If two or more packets have the same finish time (or start time), the scheduling engine 60 may select one of the packets at random.
- Parent scheduhng engines 60 could use a simplified method which does not use start time to determine ehgibility.
- Figure 8 ihustrates this simplified embodiment of the invention being used in a situation where packets have one of two priority levels. Each packet may be a high priority (or "real time") packet or a low priority (or "best effort") packet.
- Simplified method 200 begins by selecting aU high priority packets which are currently queued (step 204). The method continues by passing the one high priority packet having the smallest finish time nstep 206). In the alternative, step 206 could pass the packet having the smaUest start time S.
- step 208 selects all queued low priority packets (step 208) and continues by forwarding the low priority packet with the smallest finish time nstep 210).
- step 210 could pass the packet having the smallest start time S. If there are no packets in any queue then the scheduhng engine simply waits.
- the steps of selecting and forwarding high priority packets may be performed as a single step (e.g. if there are any queued high priority packets, selecting and forwarding the queued high priority packet with the smallest finish time) as indicated by 207 and the step of selecting and forwarding the low priority packet may also be performed as a single step (e.g.
- the packet moves from a child scheduhng engine 60 to the scheduling engine which is the parent of that child scheduling engine 60 (step 122) then the virtual time of the child scheduhng engine is updated (step 125) and the child scheduhng engine will select a new packet.
- first level scheduhng engine 60C which may be termed a "root" scheduhng engine does not have a parent class that pulls packets upwards. Instead a forwarder 58 iteratively retrieves packets from root scheduling engine 60C and sends the packets out the logical output port. Each time a packet is retrieved by scheduler 58, root scheduling engine 60C selects another packet from among packets being held by its child scheduling engines for transmission. There are two main different ways of implementing scheduler
- Scheduler 50 could be a single entity that traverses pohcy tree 39, stopping at each node to provide the function of each scheduling engine 60.
- Such a scheduler 50 could be implemented as software running on a general purpose CPU or it could be implemented as a hardware device (e.g. an ASIC).
- scheduler 50 could be implemented as a set of much simpler entities, with a separate entity providing the function of each scheduhng engine 60.
- Each simple scheduling engine 60 could be implemented as a software entity running on a general purpose CPU.
- each simple scheduler could be implemented as a hardware entity and combined with other simple schedulers into a parallel processing hardware device.
- Scheduler 150 has 9 leaf scheduhng engines, 160A through 1601. Each leaf scheduling engine receives packets which have been classified in a particular class by a classifier. Scheduler 150 has 5 non-leaf scheduhng engines 160J through 160N. Each scheduhng engine uses the methods of the invention to select and hold one data packet. That one packet is then available for selection by the parent of the scheduhng engine holding the packet.
- leaf scheduhng engines 160D and 160G correspond to real time classes.
- the other leaf scheduhng engines correspond to best effort classes.
- scheduhng engines could pass a newly arrived high priority packet in place of an already selected lower priority packet.
- the virtual time ⁇ at scheduling engine 160E is updated after the higher priority packet is sent.
- the already selected lower priority packet retains its place in line and will be forwarded to scheduhng engine 160L next (as long as another higher priority packet does not arrive in the meantime).
- high priority packets can flow quickly upward through scheduler 150 along lines 137.
- This alternative embodiment of the invention provides lower latency for high priority packets at the possible expense of unfairness to lower priority packets.
- This method for expediting the scheduling of high priority data packets may be combined with the simplified method for selecting data packets, which is described above.
- each non-leaf scheduhng engine 60 may be capable of holding a packet for each of two or more priority levels supported by scheduler 50.
- each non-leaf scheduhng engine 60 would be capable of holding two packets. Since leaf scheduling engines 60 are associated with a single priority in preferred embodiments of the invention it is not necessary for leaf scheduhng engines 60 to hold more than a single packet at a time. Each scheduling engine 60 continues to have a single virtual time. Each packet that is held by a non-leaf scheduhng engine 60 has its own start and finish time.
- a parent scheduling engine 60 selects a packet from one of its child scheduhng engines 60, it initially considers only the highest priority packets being held by the child scheduling engines 60. If none of those packets are ehgible, it considers the next highest priority packets being held by the child scheduling engines 60. The parent scheduhng engine 60 continues checking for packets of ever lower priority until it finds an eligible packet. If no ehgible packets are found, but the child scheduling engines 60 are holding on to one or more packets, the virtual time of the parent scheduling engine 60 is advanced to the earliest start time of those packets being held. The selection algorithm is repeated again starting at the highest priority.
- Another advantage of preferred embodiments of this invention is that unused bandwidth in one part of a pohcy tree can be used by another part of the pohcy tree.
- a sub-tree of the pohcy tree may hold no packets.
- At the top of the sub-tree wih be a single class which does not hold a packet. Its parent class will use the bandwidth assigned to the subtree by selecting packets from its other child classes more frequently.
- leaf scheduling engine 60 responsible for selecting and forwarding packets from two or more sets of queues containing packets classified in two or more different classes. Where packets classified in the two or more different classes have different priorities then the leaf scheduling engine could be implemented in a manner similar to that described above for a non-leaf scheduling en ine. While this approach is not generally desirable it does provide a method for scheduling packets in a manner that simultaneously takes into consideration bandwidth allocation and priorities.
- the invention For example, where it is desired to forward data packets which may be classified in a high priority class or a lower priority data packets over a data link, one could practice the invention by providing a plurahty of queues each capable of holding one or more of the data packets. If there is a data packet which is classified in a high priority class at the head of any of the queues, that data packet, or another data packet at the head of a queue and classified in a class having the same high priority should be sent next. The method therefore selects one data packet from a first eligible group consisting of the one or more data packets which are at heads of the queues and are classified in the one or more equally high priority classes to forward over the data hnk.
- the method preferably apphes a first to finish selection criterion to the data packets in the first eligible group. If there are no data packets in the first eligible group but there are data packets in the queues which are classified in one or more lower priority classes, the method selects one data packet from a second ehgible group consisting of data packets which are at heads of the queues and are classified in the one or more lower priority classes to forward over the data link. Once again, the method preferably applies a first to finish selection criterion to data packets in the second eligible group. The selected data packet is then forwarded over the data link.
- Preferred implementations of the invention may include a computer system programmed to execute a method of the invention.
- the invention may also be provided in the form of a program product.
- the program product may comprise any medium which carries a set of computer-readable signals corresponding to instructions which, when run on a computer, cause the computer to execute a method of the invention.
- the program product may be distributed in any of a wide variety of forms.
- the program product may comprise, for example, physical media such as floppy diskettes, CD ROMs, DVDs, hard disk drives, flash RAM or the like or transmission-type media such as digital or analog communication links. Accordingly, the scope of the invention is to be construed in accordance with the substance defined by the following claims.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CA2000/000937 WO2002015520A1 (fr) | 2000-08-17 | 2000-08-17 | Procedes et dispositifs pour l'ordonnancement de paquets |
AU2000266748A AU2000266748A1 (en) | 2000-08-17 | 2000-08-17 | Packet scheduling methods and apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CA2000/000937 WO2002015520A1 (fr) | 2000-08-17 | 2000-08-17 | Procedes et dispositifs pour l'ordonnancement de paquets |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2002015520A1 true WO2002015520A1 (fr) | 2002-02-21 |
Family
ID=4143077
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CA2000/000937 WO2002015520A1 (fr) | 2000-08-17 | 2000-08-17 | Procedes et dispositifs pour l'ordonnancement de paquets |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2000266748A1 (fr) |
WO (1) | WO2002015520A1 (fr) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6560230B1 (en) | 1999-02-01 | 2003-05-06 | Redback Networks Inc. | Packet scheduling methods and apparatus |
WO2004042953A1 (fr) | 2002-11-07 | 2004-05-21 | Lg Electronics Inc. | Procede de deplacement d'une fenetre de reception dans un reseau a acces radio |
WO2005060174A1 (fr) * | 2003-12-15 | 2005-06-30 | Telefonaktiebolaget L M Ericsson (Publ) | Commande d'acces au support distribuee pour systemes d'acces large bande |
WO2006017753A2 (fr) | 2004-08-05 | 2006-02-16 | Cisco Technology, Inc. | Propagation de taux de programmation minimums garantis |
RU2354063C2 (ru) * | 2003-12-15 | 2009-04-27 | Телефонактиеболагет Л М Эрикссон (Пабл) | Распределенное управление доступом к среде передачи для систем широкополосного доступа |
EP1774714B1 (fr) * | 2004-08-05 | 2012-09-05 | Cisco Technology, Inc. | Planificateur hierarchique a voies de planification multiples |
US8325736B2 (en) | 2004-01-14 | 2012-12-04 | Cisco Technology, Inc. | Propagation of minimum guaranteed scheduling rates among scheduling layers in a hierarchical schedule |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0859492A2 (fr) * | 1997-02-07 | 1998-08-19 | Lucent Technologies Inc. | Système de files d'attente équitable et pondérée avec reallocation de bande passante excessive |
WO1998045976A2 (fr) * | 1997-04-04 | 1998-10-15 | Ascend Communications, Inc. | Procede et dispositif servant a programmer un paquet hierarchique |
US6075791A (en) * | 1997-10-28 | 2000-06-13 | Lucent Technologies Inc. | System for guaranteeing data transfer rates and delays in packet networks |
-
2000
- 2000-08-17 AU AU2000266748A patent/AU2000266748A1/en not_active Abandoned
- 2000-08-17 WO PCT/CA2000/000937 patent/WO2002015520A1/fr active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0859492A2 (fr) * | 1997-02-07 | 1998-08-19 | Lucent Technologies Inc. | Système de files d'attente équitable et pondérée avec reallocation de bande passante excessive |
WO1998045976A2 (fr) * | 1997-04-04 | 1998-10-15 | Ascend Communications, Inc. | Procede et dispositif servant a programmer un paquet hierarchique |
US6075791A (en) * | 1997-10-28 | 2000-06-13 | Lucent Technologies Inc. | System for guaranteeing data transfer rates and delays in packet networks |
Non-Patent Citations (1)
Title |
---|
FLOYD S ET AL: "LINK-SHARING AND RESOURCE MANAGEMENT MODELS FOR PACKET NETWORKS", IEEE / ACM TRANSACTIONS ON NETWORKING, IEEE INC. NEW YORK, US, vol. 3, no. 4, 1 August 1995 (1995-08-01), pages 365 - 386, XP000520857, ISSN: 1063-6692 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6560230B1 (en) | 1999-02-01 | 2003-05-06 | Redback Networks Inc. | Packet scheduling methods and apparatus |
WO2004042953A1 (fr) | 2002-11-07 | 2004-05-21 | Lg Electronics Inc. | Procede de deplacement d'une fenetre de reception dans un reseau a acces radio |
WO2005060174A1 (fr) * | 2003-12-15 | 2005-06-30 | Telefonaktiebolaget L M Ericsson (Publ) | Commande d'acces au support distribuee pour systemes d'acces large bande |
RU2354063C2 (ru) * | 2003-12-15 | 2009-04-27 | Телефонактиеболагет Л М Эрикссон (Пабл) | Распределенное управление доступом к среде передачи для систем широкополосного доступа |
US7639651B2 (en) | 2003-12-15 | 2009-12-29 | Telefonaktiebolaget L M Ericsson (Publ) | Distributed medium access control for broadband access systems |
KR101078987B1 (ko) * | 2003-12-15 | 2011-11-01 | 텔레폰악티에볼라겟엘엠에릭슨(펍) | 광대역 접속 시스템을 위한 분산 매체 접근 제어 |
US8325736B2 (en) | 2004-01-14 | 2012-12-04 | Cisco Technology, Inc. | Propagation of minimum guaranteed scheduling rates among scheduling layers in a hierarchical schedule |
WO2006017753A2 (fr) | 2004-08-05 | 2006-02-16 | Cisco Technology, Inc. | Propagation de taux de programmation minimums garantis |
EP1774721A4 (fr) * | 2004-08-05 | 2011-03-09 | Cisco Tech Inc | Propagation de taux de programmation minimums garantis |
EP1774714B1 (fr) * | 2004-08-05 | 2012-09-05 | Cisco Technology, Inc. | Planificateur hierarchique a voies de planification multiples |
Also Published As
Publication number | Publication date |
---|---|
AU2000266748A1 (en) | 2002-02-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6560230B1 (en) | Packet scheduling methods and apparatus | |
US6567408B1 (en) | Methods and apparatus for packet classification with multi-level data structure | |
US6529508B1 (en) | Methods and apparatus for packet classification with multiple answer sets | |
US6980555B2 (en) | Policy change characterization method and apparatus | |
US7283472B2 (en) | Priority-based efficient fair queuing for quality of service classification for packet processing | |
US8089959B2 (en) | Method and apparatus to schedule packets through a crossbar switch with delay guarantees | |
US7065569B2 (en) | System and method for remote traffic management in a communication network | |
US6449650B1 (en) | Methods and apparatus for deploying quality of service policies on a data communication network | |
US6598034B1 (en) | Rule based IP data processing | |
US8064344B2 (en) | Flow-based queuing of network traffic | |
US7564872B1 (en) | Apparatus and methods for providing event-based data communications device configuration | |
EP0980168A2 (fr) | Procédé de transfert universelle et réseaux avec commutateur distribué | |
WO2002015521A1 (fr) | Procedes et appareils destines a une classification de paquets au moyen d'une structure de donnees multi-niveau | |
WO2004045167A1 (fr) | Procede de selection d'une liaison logique pour un paquet dans un routeur | |
US6795441B1 (en) | Hierarchy tree-based quality of service classification for packet processing | |
WO2002015520A1 (fr) | Procedes et dispositifs pour l'ordonnancement de paquets | |
EP1471694B1 (fr) | Procédé de dimensionnement d'un largeur de bande dans réseaux de voix-sur-IP | |
US7339953B2 (en) | Surplus redistribution for quality of service classification for packet processing | |
US20080244075A1 (en) | High performance real-time data multiplexer | |
Kryvinska et al. | An analytical approach to the efficient real-time events/services handling in converged network environment | |
WO2002015488A1 (fr) | Procedes et systeme de classement par paquets comportant de multiples ensembles de reponses | |
WO2002015462A1 (fr) | Procedes et appareil permettant de developper des politiques de qualite de service sur un reseau de communication de donnees | |
US20040071144A1 (en) | Method and system for distributed single-stage scheduling | |
US20070133561A1 (en) | Apparatus and method for performing packet scheduling using adaptation round robin | |
Callegati et al. | Engineering Packet-Switched Networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
122 | Ep: pct application non-entry in european phase | ||
NENP | Non-entry into the national phase |
Ref country code: JP |