US9667498B2 - Self-adaptive control system for dynamic capacity management of latency-sensitive application servers
- Publication number: US9667498B2 (application US14/450,148)
- Authority: United States (US)
- Prior art keywords
- servers
- active
- server
- capacity
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0896—Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
- H04L43/0817—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
- H04L47/00—Traffic control in data switching networks
- H04L47/70—Admission control; Resource allocation
- H04L47/72—Admission control; Resource allocation using reservation actions during connection setup
- H04L47/726—Reserving resources in multiple paths to be used simultaneously
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1031—Controlling of the operation of servers by a load balancer, e.g. adding or removing servers that serve requests
Definitions
- Server clusters comprising a group of linked servers generally employ load balancing methods for workload management, with the goal of reducing response times and thereby improving performance.
- A load balancing method improves the performance of a server cluster by distributing requests initiated from clients among the available servers in the cluster.
- Various load balancing methods are known in the prior art. For example, the round robin method passes each request from a client to the next server in the cluster, eventually distributing requests evenly among all the servers.
- The least usage method balances load by tracking the utilization of each server and directing new requests to the servers with the least utilization (both policies are sketched below). While load balancing methods generally improve the performance of a server cluster, they do not address the need for power and energy conservation. As energy costs for data centers continue to rise, technology for optimizing both the energy and performance metrics of server clusters is needed.
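- For illustration only (not code from the patent), minimal Python sketches of these two prior-art policies might look like the following; the class and method names are hypothetical.

```python
import itertools

class RoundRobinBalancer:
    """Passes each request to the next server in the cluster in turn."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastUsageBalancer:
    """Tracks per-server utilization and directs new requests to the
    server with the least utilization."""
    def __init__(self, servers):
        self.utilization = {s: 0.0 for s in servers}

    def report(self, server, value):
        # Update with a fresh utilization measurement for one server.
        self.utilization[server] = value

    def pick(self):
        return min(self.utilization, key=self.utilization.get)
```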
- FIG. 1 is a block diagram illustrating an architecture of an example server cluster implementing a self-adaptive control system including a centralized controller for dynamic server capacity management for performance and power optimization in the server cluster.
- FIG. 2 is a block diagram illustrating active and inactive servers in a server pool in accordance with a first embodiment.
- FIG. 3 is a block diagram illustrating active and inactive servers in a server pool in accordance with a second embodiment.
- FIG. 4 is a block diagram illustrating a self-adaptive control system for dynamic server capacity management for performance and power optimization in a server cluster.
- FIG. 5 is a block diagram illustrating aggregation of resource utilization information from active servers in a server cluster by a centralized controller of a self-adaptive control system for dynamic server capacity management.
- FIG. 6 is a block diagram illustrating example components of a centralized controller of a self-adaptive control system for dynamic server capacity management.
- FIG. 7 is a logic flow diagram illustrating an example method of determining an optimal number of active servers by a centralized controller of a self-adaptive control system for dynamic server capacity management.
- FIG. 8 is a graphical diagram plotting a measured variable against a controlled variable for a server-type to model the relationship between the measured variable and the controlled variable for the server-type.
- FIG. 9 is a graphical diagram illustrating a normalized number of idle servers in a 24-hour window for an example server cluster deploying a centralized controller of a self-adaptive control system for dynamic server capacity management.
- FIG. 10 is a graphical diagram illustrating normalized power consumption for an example server cluster with and without a centralized controller of a self-adaptive control system for dynamic server capacity management.
- FIG. 11 shows a diagrammatic representation of a computer system within which a set of instructions, for causing the computer system to perform any one or more of the methodologies discussed herein, can be executed.
- Web applications, such as social networking applications, are very sensitive to server latency or response time. Enough server capacity is therefore needed to guarantee good response times, particularly during peak hours, so as not to degrade the user experience.
- To cope with peak demand, data center operators typically provision a fixed number of servers per cluster to meet the estimated peak workload.
- Most web applications, however, have time-varying workloads, with daily workload cycles driven by user access patterns.
- As a result, many of the servers in a cluster operate at medium to high CPU utilization only about 20-30% of the time on average; the rest of the time, many of them operate at low CPU utilization.
- Low CPU utilization is very inefficient. For example, a particular type of server can consume about 60 W of power when idle (i.e., at 0 requests per second (RPS)), yet its power consumption jumps to about 130 W at low CPU utilization (small RPS), while at medium to high CPU utilization it consumes only slightly more, about 150 W (medium-level RPS). The average server utilization of a cluster with a fixed number of servers can therefore be low, resulting in wasted power.
- It is consequently preferable to avoid running a server at low CPU utilization.
- Dynamic capacity management, which seeks to match the number of active servers to the current level of workload, can reduce some of this wasted power and thereby increase the efficiency of a data center.
- Existing dynamic capacity management techniques are based on heuristics that rely on empirical evaluation and manual tuning to estimate or predict changes in the request rate (e.g., RPS) or workload. These techniques usually work well for applications that are relatively static.
- For web applications like social networking applications, however, the application behavior and the underlying systems evolve dynamically. For example, some social networking applications push new code multiple times a day, and the underlying system or hardware can be upgraded from time to time. These changes can alter the response time and CPU utilization characteristics of the servers.
- The dynamic capacity management technology disclosed herein utilizes classic control theory to adapt efficiently to continuous changes in request rates and in application and system behavior, scaling cluster capacity by allocating or deallocating servers to accept request traffic as needed.
- The disclosed technology works effectively for server clusters of various sizes, including very large scale server clusters.
- The self-adaptive control system includes a centralized controller that uses current information relating to an operating parameter (e.g., latency, CPU utilization, request queue length) aggregated from a number of active servers in a cluster, together with historical information relating to that parameter, to predict a change in workload and determine the optimal number of active servers needed to handle the change efficiently while reducing latency and maximizing energy savings. Based on the optimal number of active servers, the self-adaptive control system can scale the server capacity up or down to obtain energy and power savings. A load balancing system can then distribute traffic among the active servers using any load balancing method (e.g., round-robin, weighted round-robin, random).
- The centralized controller deployed in a cluster continuously maintains just the right amount of server capacity to adapt to time-varying workloads (e.g., workload surges or drops) and changing application and system behaviors (e.g., caused by software or hardware changes), and in doing so optimizes both the latency and the efficiency characteristics of the cluster.
- FIG. 1 is a block diagram illustrating an architecture of an example server cluster implementing a self-adaptive control system including a centralized controller for dynamic server capacity management for performance and power optimization in the server cluster.
- A core component of the architecture of the self-adaptive control system 100 is a centralized controller 105 that implements the decision logic.
- The centralized controller 105 can be an add-on to a typical load balancing system 110, which directs traffic to a pool of web servers 115. It can be implemented within the load balancing system 110 or as a separate component that communicates with it.
- The server pool 115 represents a server cluster and includes a number of servers (e.g., application servers or web servers) that can be classified into active servers, which take traffic, and inactive servers, which do not. The load balancing system 110 sends request traffic to all active servers in the server pool 115.
- The centralized controller 105 forms a feedback control loop that starts by collecting utilization information 120 (e.g., latency, CPU utilization, request queue length) from all active servers in the server pool 115. The collected information is used as a feedback signal fed to the centralized controller 105.
- The centralized controller 105 then decides on the optimal active pool size 125 and passes the decision to the load balancing system 110. The number of active servers in the server pool 115 is scaled up or down to match the optimal active pool size, and the load balancing system 110 then sends traffic 130 to the server pool 115, concentrating the traffic on only the active servers (one pass of this loop is sketched below).
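- As a rough sketch of one pass of this loop (the controller, load_balancer, and server_pool interfaces are assumptions for illustration, not the patented implementation):

```python
def control_loop_pass(controller, load_balancer, server_pool):
    """One iteration of the feedback loop of FIG. 1 (a sketch)."""
    # Collect utilization information 120 from all active servers.
    feedback = [s.utilization() for s in server_pool.active()]
    # Decide the optimal active pool size 125 from the feedback signal.
    n_active = controller.optimal_active_pool_size(feedback)
    # Scale the active pool up or down and route traffic 130 only to it.
    server_pool.resize_active(n_active)
    load_balancer.set_targets(server_pool.active())
```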
- FIG. 2 is a block diagram illustrating active and inactive servers in a server pool in accordance with a first embodiment.
- The server pool 215 can include a group of active servers 220 and a group of inactive servers 225.
- The optimal size of the active server pool (N_ACTIVE) is determined by the centralized controller 105 of FIG. 1.
- The rest of the servers in the pool 215 are inactive servers 225 that can be placed on hot standby (e.g., kept in an idle state), placed on cold standby (e.g., turned off or powered down, for example by switching to a sleep mode), or loaned out to other applications for asynchronous jobs.
- When a server receives no traffic, the operating system can perform very efficient power optimization, so by concentrating traffic on only the active servers and leaving the inactive servers on hot standby, significant energy savings can be achieved. For example, as described before, a typical type of server consumes about 60 W of power when completely idle, about 130 W at low CPU utilization, and only slightly more (~150 W) at medium CPU utilization. It is therefore more power efficient to run one server at, for example, 40% utilization and one at idle (or in deep sleep mode, powered off, or repurposed to run asynchronous jobs) than to run both servers at 20% utilization; the arithmetic below makes this concrete. It should be noted that in various embodiments, various measurement thresholds or ranges can be established for low, medium, and high CPU utilization.
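- Concretely, with the figures above: two servers at 20% utilization draw about 2 × 130 W = 260 W, while one server at 40% utilization plus one idle server draws about 150 W + 60 W = 210 W, a saving of roughly 50 W (about 19%) for the same total load.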
- Placing inactive servers into deep sleep mode or even turning them off (cold standby) gives more energy savings than placing them on hot standby.
- Unlike hot standby servers, which can wake up almost instantaneously, cold standby servers can take some time to wake up completely.
- In the second embodiment, the server pool 315 can include a group of active servers 320 of optimal size N_ACTIVE, a group of inactive servers 330 on hot standby, and a group of inactive servers 325 on cold standby or loaned out for asynchronous jobs.
- The group of inactive servers 330 on hot standby acts as a buffer that enables the cluster to scale its capacity rapidly when needed.
- The buffer can be elastic, with an optimal size N_BUFFER decided by the centralized controller 105 of FIG. 1 based on the current trend of workload change.
- FIG. 4 is a block diagram illustrating a self-adaptive control system for dynamic server capacity management for performance and power optimization in a server cluster.
- The self-adaptive control system 400 is a closed-loop feedback control system that determines the optimal number of active servers necessary to adapt to varying workloads and to application and system behavior changes, while maximizing energy savings or efficiency gains and avoiding over-concentration of traffic in a way that could hurt response times and the user experience.
- The self-adaptive control system 400 includes a transformed controller 430 (e.g., the centralized controller 105, 505, 605) having a controller 405 and a controlled system 415.
- The controlled system 415 is an active server in the cluster.
- The dynamics of the controlled system 415 are first modeled to determine the relationship between the measured process variable (y) and the control variable (u).
- The output y can be either a latency or a proxy such as CPU utilization.
- The transformed (external) control signal u is the normalized percentage change in per-server requests per second (RPS).
- The internal control signal (s) is the inverse of the percentage change in active server pool size.
- The disturbance input (δ) accounts for all unmeasurable workload or application changes.
- The error signal (e) is the control error: the difference between the measured variable (y) and a reference (target or set point) (y_ref) to which the measured variable should converge.
- The controlled system 415 can be modeled by determining the relationship between the measured variable y and the control signal u.
- The correlation between y and u can be inferred or estimated from empirical data collected from the cluster.
- The graphical diagram of FIG. 8 can be generated by plotting the measured variable (CPU utilization y) against the controlled variable (RPS u) for one type of server, to model the relationship between the two for that server type. In FIG. 8, line 805 shows the estimated piece-wise linear model of the controlled system 415, which can be expressed by equation (1) below.
- y = h*x + c (1)
- In equation (1), x is the normalized RPS, y is the CPU utilization, h is the slope of the linear model, and c is a constant (a sketch of fitting this model from empirical samples follows below).
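- As an illustration of this modeling step (not code from the patent), a least-squares fit of equation (1); the sample points are assumed for illustration, not measured data:

```python
import numpy as np

# Assumed (normalized RPS, CPU utilization %) samples for one server type.
samples = np.array([
    [0.1, 18.0], [0.2, 26.0], [0.3, 34.0],
    [0.4, 42.0], [0.5, 50.0], [0.6, 58.0],
])
x, y = samples[:, 0], samples[:, 1]
# Least-squares fit of equation (1): slope h and intercept c.
h, c = np.polyfit(x, y, deg=1)
print(f"estimated model: y = {h:.1f}*x + {c:.1f}")  # y = 80.0*x + 10.0
```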
- In discrete time, the CPU utilization at the (k+1)-th control period depends on the CPU utilization and request rate at the k-th control period, giving equation (2): y_{k+1} = y_k + h*u_k (2).
- With the controlled system modeled, the controller 405 can be designed.
- In some embodiments, the controller 405 is based on PI (Proportional-Integral) control theory; in other embodiments, it can be designed as a P (Proportional) controller or a PID (Proportional-Integral-Derivative) controller.
- K_i and K_p are the control gains for the PI controller 405. The control error is the difference between the measured output and the reference: e_{k+1} = y_{k+1} − y_ref (3).
- The control signal to be applied to the controlled system is thus the sum of the previous control signal, a P-term proportional to the error, and an I-term proportional to the integral of the error: u_{k+1} = u_k + K_p*(e_{k+1} − e_k) + K_i*e_{k+1} (4).
- The control parameter values K_i and K_p can be selected to meet certain design constraints (e.g., fast response time, no oscillation), and an example set of values can then be chosen for the target cluster and server type (a minimal controller sketch follows below).
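- A minimal Python sketch of such a PI controller, implementing equations (3) and (4) directly; the gain values and the 50% utilization set point are illustrative assumptions, not values from the patent. Note that with the error defined as e = y − y_ref and a plant whose CPU utilization rises with per-server RPS (h > 0), stabilizing gains are negative.

```python
class PIController:
    """Incremental PI law: u[k+1] = u[k] + Kp*(e[k+1] - e[k]) + Ki*e[k+1]."""

    def __init__(self, kp, ki, y_ref):
        self.kp, self.ki = kp, ki
        self.y_ref = y_ref      # set point, e.g. target CPU utilization (%)
        self.u = 0.0            # previous control signal u[k]
        self.e_prev = 0.0       # previous control error e[k]

    def update(self, y):
        e = y - self.y_ref                                   # equation (3)
        self.u += self.kp * (e - self.e_prev) + self.ki * e  # equation (4)
        self.e_prev = e
        return self.u   # fractional (normalized) change in per-server RPS

# Illustrative use: CPU at 44% against a 50% target; negative gains give
# negative feedback, so u comes out positive (each server can take more RPS).
controller = PIController(kp=-0.005, ki=-0.002, y_ref=50.0)
u_next = controller.update(y=44.0)   # 0.042 -> raise per-server RPS by 4.2%
```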
- FIG. 5 is a block diagram illustrating aggregation of resource utilization information from active servers in a server cluster by a centralized controller of a self-adaptive control system for dynamic server capacity management.
- In some embodiments, the centralized controller 505 can be implemented on a master server (server 0 or node 0) that is otherwise like any of the other servers or nodes 1-N in the cluster. The master server may not be turned off and can periodically query any of the active servers among servers 1-N for resource utilization information.
- Each of the servers 1-N can have a resource utilization monitor to measure one or more operating parameters, for example CPU utilization, latency or response time, disk utilization, network utilization, and/or the like. A power monitor 510 (e.g., a power meter) can measure the power drawn by the servers from the power supply 515.
- Alternatively, the centralized controller 505 can be implemented on a dedicated machine that manages servers 1-N in the cluster, and can receive periodic reports including resource utilization information from the active servers.
- The resource utilization information from the resource utilization monitors of the active servers and the average power consumption of the servers in the cluster are examples of feedback 520 collected by the centralized controller 505. The power 525 drawn by servers 1-N from the power supply 515 can be used to determine the efficiency of the cluster during a time period.
- The centralized controller 505 uses the feedback and control theory (e.g., the PI or PID control theory described with reference to FIG. 4) to determine how many servers to turn up or turn down, and sends a control signal 530 to turn up or turn down one or more servers in the cluster.
- FIG. 6 is a block diagram illustrating example components of a centralized controller of a self-adaptive control system for dynamic server capacity management.
- The centralized controller 605 implements the decision logic or algorithm based on control theory, and can be implemented either on a dedicated machine (e.g., a dedicated server) or on one of the servers in the cluster.
- The centralized controller 605 can include a resource utilization information aggregator 610, a decision engine 615, a server state manager 620 having an inactive server state manager 625, and a power consumption calculator 630. More or fewer components may be present in other embodiments.
- The resource utilization information aggregator 610 can aggregate resource utilization information from the active servers in a cluster. In some embodiments, it queries each active server for this information; alternatively, it can receive periodic reports from each active server.
- The collected resource utilization information can include, for example, request rates, CPU utilization, power consumption, latency, disk utilization, network utilization, or any other metric measured by each active server, and can also include the total power consumption measured by a power meter (e.g., power meter 510 of FIG. 5).
- The decision engine 615 implements a control algorithm based on the self-adaptive control system described in detail with respect to FIG. 4. Using equation (4), it can determine the percent change in per-server requests per second for the current control cycle (u_{k+1}).
- For example, suppose the cluster has 20 active servers and the total request rate coming into the system is 5000 RPS; the per-server request rate is then 250 RPS (i.e., 5000/20).
- If the controller output indicates that each server can process 275 RPS while meeting a CPU utilization target of 50%, the decision engine 615 determines the required capacity to be 18 servers (i.e., 5000/275 ≈ 18). As the current capacity is 20 servers and the required capacity is 18, the decision engine 615 can determine that 2 of the active servers can be turned down for energy savings without compromising latency, as computed in the sketch below. The decision engine 615 then repeats the same process during the next control cycle.
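- The capacity arithmetic from this example, as a sketch (the function name and interface are illustrative, not from the patent):

```python
def required_capacity(total_rps, rps_per_server, u_next):
    """Servers needed if each server absorbs a u_next fractional RPS change."""
    new_rps_per_server = rps_per_server * (1.0 + u_next)
    return round(total_rps / new_rps_per_server)

current_active = 20
# 250 RPS/server now; controller output allows a +10% change (275 RPS/server).
needed = required_capacity(total_rps=5000, rps_per_server=250.0, u_next=0.10)
surplus = current_active - needed   # 20 - 18 = 2 servers can be turned down
```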
- The server state manager 620 can receive the output of the decision engine 615 and, in response, adjust the current capacity to meet the required capacity, turning down active servers if the current capacity is greater than required or turning up additional servers if it is lower.
- In the example above, the server state manager 620 can place the two surplus active servers on hot standby or cold standby, or even loan them out to other applications that are not latency-sensitive, so that only 18 servers remain active in the cluster.
- The decision engine 615 can also determine how many of the inactive servers should be placed on hot standby. The buffer size for the number of servers on hot standby can be determined by the decision engine 615 based on the current trend of workload changes.
- The decision engine 615 can estimate the total load (RPS) based on an estimate of the time it takes for a server on cold standby to transition to hot standby. That estimate depends on whether the server is powered off, in deep sleep, or being used for asynchronous jobs. For example, if the server is in deep sleep, the estimate is the time required to wake up the server; if it is running asynchronous jobs, the estimate is the time required to quit the job, clean up, and get ready to accept traffic.
- The decision engine 615 can then estimate the total load, for example, 2 minutes into the future and translate the estimated total load into the number of servers to be turned on (see the sketch below).
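- A sketch of this buffer-sizing step under assumed inputs (a measured load trend in RPS per minute, an estimated cold-to-hot transition time, and a target per-server RPS; all names and values are illustrative):

```python
import math

def hot_standby_buffer_size(rps_trend_per_min, wakeup_minutes, rps_per_server):
    """Servers to keep on hot standby to absorb the load growth expected
    while cold-standby servers are still waking up."""
    projected_extra_rps = max(0.0, rps_trend_per_min * wakeup_minutes)
    return math.ceil(projected_extra_rps / rps_per_server)

# Load rising ~200 RPS/min and cold servers needing ~2 minutes to wake up:
n_buffer = hot_standby_buffer_size(200.0, 2.0, 275.0)   # -> 2 servers
```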
- The inactive server state manager 625 can use the buffer size determined by the decision engine 615 to allocate or deallocate one or more inactive servers to or from hot standby, maintaining the required number of inactive servers on hot standby.
- In some embodiments, the centralized controller 605 can include a power consumption calculator 630 that determines the average power consumption of the cluster during the last control cycle; one simple way to compute such an average is sketched below.
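- Shown here as an assumption rather than the patent's exact formula, the average of periodic power-meter readings taken during the cycle could serve:

```python
def average_power_watts(readings):
    """Mean of power-meter samples (watts) collected during one control cycle."""
    return sum(readings) / len(readings)

avg = average_power_watts([2100.0, 2240.0, 2180.0])  # assumed sample readings
```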
- The centralized controller 605 can be coupled to one or more database tables.
- Historical resource usage data can be stored in a database table 635, enabling the centralized controller 605 to retrieve it for use in determining the optimal number of active servers. The centralized controller 605 can also access other database tables to store and/or retrieve power consumption data, control parameters, modeling data, and other logged data.
- FIG. 7 is a logic flow diagram illustrating an example method of determining an optimal number of active servers by a centralized controller of a self-adaptive control system for dynamic server capacity management.
- The example method starts at block 705 at the beginning of a control cycle.
- A centralized controller (e.g., centralized controller 605 of FIG. 6) determines the current resource utilization of the server cluster. It can make that determination by querying the active servers in the cluster for current resource utilization information such as response time or latency, CPU utilization, etc. Each active server in the cluster includes a resource utilization monitor to measure these metrics.
- The centralized controller then determines the change in resource utilization based on the current resource utilization and a target resource utilization.
- Next, the centralized controller determines the percent change in per-server request rate (e.g., RPS) based at least in part on the change in resource utilization and control theory (e.g., PI or PID control theory). For example, equation (4) described above can be used to determine the change in per-server request rate.
- The centralized controller then determines the optimal number of active servers based, at least in part, on the percent change in per-server request rate.
- The centralized controller issues commands to turn up or turn down one or more servers in the cluster so that the current number of active servers matches the optimal number.
- A load balancer then routes traffic to the active servers using a load balancing method.
- The centralized controller repeats the process in the next control cycle 735. In this manner, it continuously adjusts the active server pool size to adapt to changes in request rates and in system and application behavior. The sketch below ties these steps together.
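- A sketch of one such control cycle, assuming the PIController from the earlier sketch and a hypothetical cluster interface (none of these names come from the patent):

```python
def run_control_cycle(controller, cluster):
    """One control cycle of FIG. 7 (a sketch): measure, compute the PI
    update, resize the active pool, then wait for the next cycle."""
    # Determine current resource utilization across the active servers.
    cpu_now = cluster.mean_active_cpu_utilization()
    # PI update: utilization error -> fractional change in per-server RPS.
    u_next = controller.update(cpu_now)
    # Translate the per-server RPS change into an optimal pool size.
    new_rps_per_server = cluster.rps_per_server() * (1.0 + u_next)
    n_optimal = round(cluster.total_rps() / new_rps_per_server)
    # Turn servers up or down to match; the load balancer then routes
    # traffic to exactly the active servers.
    cluster.resize_active(n_optimal)
```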
- FIG. 9 is a graphical diagram illustrating a normalized number of idle servers in a 24-hour window for an example server cluster deploying a centralized controller of a self-adaptive control system for dynamic server capacity management.
- In this example, the inactive servers are left running idle (i.e., powered on but receiving no traffic); those servers could also be put into deep-sleep mode or even powered off to provide more energy savings. Nonetheless, as shown in FIG. 9, even leaving inactive servers idle results in significant energy savings.
- The y-axis is the normalized number of servers put into inactive mode during a 24-hour cycle, and the x-axis is time.
- The numbers are normalized by the maximum number of idle servers. Around the peak hour for the cluster (e.g., noon), none of the servers could be put into power-saving mode; at other times, the centralized controller could place as many as 100 additional servers into inactive mode, providing significant energy savings.
- FIG. 10 is a graphical diagram illustrating normalized power consumption for an example server cluster with and without a centralized controller of a self-adaptive control system for dynamic server capacity management.
- The y-axis is the power consumption normalized to the daily maximum power draw, and the x-axis is time.
- The line with reference numeral 1010 is the base case without the centralized controller, and the line with reference numeral 1020 shows the power draw with the centralized controller. FIG. 10 shows that with the centralized controller, the cluster uses about 27% less power around midnight. As expected, the power saving is 0% around the cluster's peak hours (e.g., around noon). The average power saving over a 24-hour cycle can be about 10-15% for different clusters; in a system with a large number of clusters, this amounts to a significant amount of energy saved.
- FIG. 11 shows a diagrammatic representation of a computer system within which a set of instructions, for causing the computer system to perform any one or more of the methodologies discussed herein, can be executed.
- The computer system 1100 can represent, for example, the centralized controller (e.g., 505, 605), any of the servers in the cluster, or the load balancing system (e.g., 110).
- The computer system 1100 generally includes a processor 1105, main memory 1110, non-volatile memory 1115, and a network interface device 1120. Various common components (e.g., cache memory) are omitted for illustrative simplicity.
- The computer system 1100 is intended to illustrate a hardware device on which any of the components depicted in the examples of FIGS. 1-6 (and any other components described in this specification) and the methods described in the example of FIG. 7 can be implemented.
- The computer system 1100 can be of any applicable known or convenient type, and its components can be coupled together via a bus 1125 or through some other known or convenient device.
- The processor 1105 may be, for example, a conventional microprocessor such as an Intel Pentium microprocessor or a Motorola PowerPC microprocessor.
- The terms "computer system-readable (storage) medium" and "computer-readable (storage) medium" include any type of device that is accessible by the processor.
- The memory 1110 is coupled to the processor 1105 by, for example, a bus 1125 such as a PCI bus, SCSI bus, or the like.
- The memory 1110 can include, by way of example but not limitation, random access memory (RAM), such as dynamic RAM (DRAM) and static RAM (SRAM), and can be local, remote, or distributed.
- The bus 1125 also couples the processor 1105 to the non-volatile memory 1115 and drive unit.
- The non-volatile memory 1115 is often a magnetic floppy or hard disk, a magnetic-optical disk, an optical disk, a read-only memory (ROM) such as a CD-ROM, EPROM, or EEPROM, a magnetic or optical card, an SD card, or another form of storage for large amounts of data. Some of this data is often written, by a direct memory access process, into memory during execution of software in the computer system 1100.
- The non-volatile memory 1115 can be local, remote, or distributed, and can even be optional, because systems can be created with all applicable data available in memory. A typical computer system will usually include at least a processor, memory, and a device (e.g., a bus) coupling the memory to the processor.
- Software is typically stored in the non-volatile memory 1115 and/or the drive unit 1145. Indeed, for large programs it may not even be possible to store the entire program in memory. Nevertheless, it should be understood that for software to run, it is moved, if necessary, to a computer-readable location appropriate for processing; for illustrative purposes, that location is referred to as the memory 1110 in this disclosure. Even when software is moved to the memory for execution, the processor will typically make use of hardware registers to store values associated with the software, as well as a local cache, which ideally serves to speed up execution.
- A software program is assumed to be stored at any known or convenient location (from non-volatile storage to hardware registers) when the software program is referred to as "implemented in a computer-readable medium." A processor is considered "configured to execute a program" when at least one value associated with the program is stored in a register readable by the processor.
- The bus 1125 also couples the processor to the network interface device 1120.
- The interface can include one or more of a modem or network interface; a modem or network interface can be considered part of the computer system 1100. The interface can include an analog modem, ISDN modem, cable modem, token ring interface, satellite transmission interface (e.g., "direct PC"), or other interface for coupling a computer system to other computer systems.
- The interface can also include one or more input and/or output (I/O) devices 1135. The I/O devices can include, by way of example but not limitation, a keyboard, a mouse or other pointing device, disk drives, printers, a scanner, speakers, DVD/CD-ROM drives, and other input and/or output devices, including a display device.
- The display device 1130 can include, by way of example but not limitation, a cathode ray tube (CRT), a liquid crystal display (LCD), an LED display, a projected display (such as a heads-up display device), a touchscreen, or some other applicable known or convenient display device. The display device 1130 can be used to display text and graphics. For simplicity, it is assumed that controllers for any devices not depicted in the example of FIG. 11 reside in the interface.
- The computer system 1100 can be controlled by operating system software that includes a file management system, such as a disk operating system.
- One example of operating system software with associated file management system software is the family of operating systems known as Windows® from Microsoft Corporation of Redmond, Wash., and their associated file management systems. Another example is the Linux operating system and its associated file management system.
- The file management system is typically stored in the non-volatile memory 1115 and/or drive unit 1145 and causes the processor to execute the various acts required by the operating system to input and output data and to store data in memory, including storing files on the non-volatile memory 1115 and/or drive unit 1145.
- The computer system can operate as a standalone device or may be connected (e.g., networked) to other computer systems. In a networked deployment, it may operate in the capacity of a server or a client in a client-server network environment, or as a peer in a peer-to-peer (or distributed) network environment.
- The computer system may be a server computer (e.g., a database server), a client computer, a personal computer (PC), a tablet PC, a laptop computer, a set-top box (STB), a personal digital assistant (PDA), a cellular telephone, an iPhone, a Blackberry, a processor, a telephone, a web appliance, a network router, switch or bridge, or any computer system capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that computer system.
- While the computer system-readable medium or computer system-readable storage medium 1150 is shown in an exemplary embodiment to be a single medium, the terms "computer system-readable medium" and "computer system-readable storage medium" should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions.
- The terms shall also be taken to include any medium that is capable of storing, encoding, or carrying a set of instructions for execution by the computer system and that causes the computer system to perform any one or more of the methodologies of the presently disclosed technique and innovation.
- In general, the routines executed to implement the embodiments of the disclosure may be implemented as part of an operating system or as a specific application, component, program, object, module, or sequence of instructions referred to as "computer programs."
- The computer programs typically comprise one or more instructions set at various times in various memory and storage devices in a computer that, when read and executed by one or more processing units or processors, cause the computer to perform operations to execute elements involving the various aspects of the disclosure.
- Examples of computer system-readable storage media include, but are not limited to, recordable-type media such as volatile and non-volatile memory devices, floppy and other removable disks, hard disk drives, optical disks (e.g., Compact Disk Read-Only Memory (CD-ROMs), Digital Versatile Disks (DVDs), etc.), and SD cards, among others.
- The words "comprise," "comprising," and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to."
- The terms "connected," "coupled," or any variant thereof mean any connection or coupling, either direct or indirect, between two or more elements; the coupling or connection between the elements can be physical, logical, or a combination thereof.
- The words "herein," "above," "below," and words of similar import, when used in this application, refer to this application as a whole and not to any particular portions of it.
- Where the context permits, words in the above Detailed Description using the singular or plural number may also include the plural or singular number, respectively.
- The word "or," in reference to a list of two or more items, covers all of the following interpretations: any of the items in the list, all of the items in the list, and any combination of the items in the list.
Claims (18)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/450,148 US9667498B2 (en) | 2013-12-20 | 2014-08-01 | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
US15/493,532 US10212220B2 (en) | 2013-12-20 | 2017-04-21 | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361919363P | 2013-12-20 | 2013-12-20 | |
US14/450,148 US9667498B2 (en) | 2013-12-20 | 2014-08-01 | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/493,532 Continuation US10212220B2 (en) | 2013-12-20 | 2017-04-21 | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
Publications (2)
Publication Number | Publication Date |
---|---|
US20150180719A1 US20150180719A1 (en) | 2015-06-25 |
US9667498B2 true US9667498B2 (en) | 2017-05-30 |
Family
ID=53401334
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/450,148 Active 2035-01-01 US9667498B2 (en) | 2013-12-20 | 2014-08-01 | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
US15/493,532 Active US10212220B2 (en) | 2013-12-20 | 2017-04-21 | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/493,532 Active US10212220B2 (en) | 2013-12-20 | 2017-04-21 | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
Country Status (1)
Country | Link |
---|---|
US (2) | US9667498B2 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170277607A1 (en) * | 2016-03-23 | 2017-09-28 | GM Global Technology Operations LLC | Fault-tolerance pattern and switching protocol for multiple hot and cold standby redundancies |
US9979617B1 (en) * | 2014-05-15 | 2018-05-22 | Amazon Technologies, Inc. | Techniques for controlling scaling behavior of resources |
US10212220B2 (en) | 2013-12-20 | 2019-02-19 | Facebook, Inc. | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
US10489269B2 (en) * | 2016-07-22 | 2019-11-26 | Walmart Apollo, Llc | Systems, devices, and methods for generating terminal resource recommendations |
US11108685B2 (en) | 2019-06-27 | 2021-08-31 | Bank Of America Corporation | Intelligent delivery of data packets within a network transmission path based on time intervals |
US11553047B2 (en) * | 2018-11-30 | 2023-01-10 | International Business Machines Corporation | Dynamic connection capacity management |
US11797287B1 (en) | 2021-03-17 | 2023-10-24 | Amazon Technologies, Inc. | Automatically terminating deployment of containerized applications |
US11853807B1 (en) * | 2020-12-01 | 2023-12-26 | Amazon Technologies, Inc. | Cluster scaling based on task state information |
US11989586B1 (en) | 2021-06-30 | 2024-05-21 | Amazon Technologies, Inc. | Scaling up computing resource allocations for execution of containerized applications |
US11995466B1 (en) | 2021-06-30 | 2024-05-28 | Amazon Technologies, Inc. | Scaling down computing resource allocations for execution of containerized applications |
US12190144B1 (en) | 2020-06-22 | 2025-01-07 | Amazon Technologies, Inc. | Predelivering container image layers for future execution of container images |
Families Citing this family (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2016081119A (en) * | 2014-10-10 | 2016-05-16 | 富士通株式会社 | Information processing system, control method thereof, and control program of control apparatus |
US10282236B2 (en) * | 2015-04-21 | 2019-05-07 | International Business Machines Corporation | Dynamic load balancing for data allocation to servers |
US10171572B2 (en) * | 2016-01-20 | 2019-01-01 | International Business Machines Corporation | Server pool management |
CN107707583B (en) * | 2016-08-08 | 2020-11-17 | 环旭电子股份有限公司 | Cloud data transmission system and dynamic distribution method thereof |
US10102085B2 (en) * | 2016-08-25 | 2018-10-16 | GM Global Technology Operations LLC | Coordinated multi-mode allocation and runtime switching for systems with dynamic fault-tolerance requirements |
CN107819605A (en) * | 2016-09-14 | 2018-03-20 | 北京百度网讯科技有限公司 | Method and apparatus for the switching server in server cluster |
US10182033B1 (en) * | 2016-09-19 | 2019-01-15 | Amazon Technologies, Inc. | Integration of service scaling and service discovery systems |
US10135916B1 (en) | 2016-09-19 | 2018-11-20 | Amazon Technologies, Inc. | Integration of service scaling and external health checking systems |
US10409647B2 (en) | 2016-11-04 | 2019-09-10 | International Business Machines Corporation | Management of software applications based on social activities relating thereto |
CN106713055B (en) * | 2017-02-27 | 2019-06-14 | 电子科技大学 | An energy-saving deployment method for virtual CDN |
US20190182980A1 (en) * | 2017-12-07 | 2019-06-13 | Facebook, Inc. | Server rack placement in a data center |
EP3502890A1 (en) * | 2017-12-22 | 2019-06-26 | Bull SAS | Method for managing resources of a computer cluster by means of historical data |
CN108508743B (en) * | 2018-06-25 | 2021-06-01 | 长沙理工大学 | Novel quasi-PI predictive control method of time-lag system |
EP3612011A1 (en) * | 2018-08-14 | 2020-02-19 | ABB Schweiz AG | Method of controlling cooling in a data centre |
CN109873718A (en) * | 2019-01-23 | 2019-06-11 | 平安科技(深圳)有限公司 | A kind of container self-adapting stretching method, server and storage medium |
US11470176B2 (en) * | 2019-01-29 | 2022-10-11 | Cisco Technology, Inc. | Efficient and flexible load-balancing for clusters of caches under latency constraint |
US11169855B2 (en) * | 2019-12-03 | 2021-11-09 | Sap Se | Resource allocation using application-generated notifications |
CN113407297B (en) * | 2020-03-17 | 2023-12-26 | 中国移动通信集团浙江有限公司 | Container management method and device and computing equipment |
WO2022018466A1 (en) * | 2020-07-22 | 2022-01-27 | Citrix Systems, Inc. | Determining server utilization using upper bound values |
US11711282B2 (en) | 2020-12-16 | 2023-07-25 | Capital One Services, Llc | TCP/IP socket resiliency and health management |
US11632432B2 (en) | 2021-06-09 | 2023-04-18 | International Business Machines Corporation | Dynamic overflow processing in a multi-user computing environment |
CN114039854A (en) * | 2021-10-26 | 2022-02-11 | 北京航天科工世纪卫星科技有限公司 | Satellite dynamic bandwidth self-adaptive adjusting method based on PID algorithm |
CN114327023B (en) * | 2021-12-30 | 2023-08-15 | 上海道客网络科技有限公司 | Energy saving method, system, computer medium and electronic equipment of Kubernetes cluster |
CN118093301A (en) * | 2022-11-28 | 2024-05-28 | 中兴通讯股份有限公司 | Method and device for regulating temperature of server cluster |
US12164966B1 (en) * | 2023-07-12 | 2024-12-10 | Snowflake Inc. | Dynamic task allocation and datastore scaling |
CN117376423B (en) * | 2023-12-08 | 2024-03-12 | 西南民族大学 | Deep learning reasoning service scheduling method, system, equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100050171A1 (en) * | 2008-08-21 | 2010-02-25 | Vmware, Inc. | Resource management system and apparatus |
US20110208875A1 (en) * | 2010-02-24 | 2011-08-25 | Crescendo Networks Ltd. | Reducing energy consumption of servers |
US20140059367A1 (en) * | 2010-11-04 | 2014-02-27 | International Business Machines Corporation | Saving power by managing the state of inactive computing devices |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9667498B2 (en) | 2013-12-20 | 2017-05-30 | Facebook, Inc. | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
- 2014-08-01: US application 14/450,148 filed, granted as US9667498B2 (Active)
- 2017-04-21: US application 15/493,532 filed, granted as US10212220B2 (Active)
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100050171A1 (en) * | 2008-08-21 | 2010-02-25 | Vmware, Inc. | Resource management system and apparatus |
US20110208875A1 (en) * | 2010-02-24 | 2011-08-25 | Crescendo Networks Ltd. | Reducing energy consumption of servers |
US20140059367A1 (en) * | 2010-11-04 | 2014-02-27 | International Business Machines Corporation | Saving power by managing the state of inactive computing devices |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10212220B2 (en) | 2013-12-20 | 2019-02-19 | Facebook, Inc. | Self-adaptive control system for dynamic capacity management of latency-sensitive application servers |
US9979617B1 (en) * | 2014-05-15 | 2018-05-22 | Amazon Technologies, Inc. | Techniques for controlling scaling behavior of resources |
US20170277607A1 (en) * | 2016-03-23 | 2017-09-28 | GM Global Technology Operations LLC | Fault-tolerance pattern and switching protocol for multiple hot and cold standby redundancies |
US9952948B2 (en) * | 2016-03-23 | 2018-04-24 | GM Global Technology Operations LLC | Fault-tolerance pattern and switching protocol for multiple hot and cold standby redundancies |
US10489269B2 (en) * | 2016-07-22 | 2019-11-26 | Walmart Apollo, Llc | Systems, devices, and methods for generating terminal resource recommendations |
US11553047B2 (en) * | 2018-11-30 | 2023-01-10 | International Business Machines Corporation | Dynamic connection capacity management |
US11792275B2 (en) | 2018-11-30 | 2023-10-17 | International Business Machines Corporation | Dynamic connection capacity management |
US11108685B2 (en) | 2019-06-27 | 2021-08-31 | Bank Of America Corporation | Intelligent delivery of data packets within a network transmission path based on time intervals |
US12190144B1 (en) | 2020-06-22 | 2025-01-07 | Amazon Technologies, Inc. | Predelivering container image layers for future execution of container images |
US11853807B1 (en) * | 2020-12-01 | 2023-12-26 | Amazon Technologies, Inc. | Cluster scaling based on task state information |
US11797287B1 (en) | 2021-03-17 | 2023-10-24 | Amazon Technologies, Inc. | Automatically terminating deployment of containerized applications |
US11989586B1 (en) | 2021-06-30 | 2024-05-21 | Amazon Technologies, Inc. | Scaling up computing resource allocations for execution of containerized applications |
US11995466B1 (en) | 2021-06-30 | 2024-05-28 | Amazon Technologies, Inc. | Scaling down computing resource allocations for execution of containerized applications |
Also Published As
Publication number | Publication date |
---|---|
US20170223100A1 (en) | 2017-08-03 |
US10212220B2 (en) | 2019-02-19 |
US20150180719A1 (en) | 2015-06-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: FACEBOOK, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WU, QIANG;KUMAR, SANJEEV;KADLOOR, SACHIN;REEL/FRAME:036386/0599 Effective date: 20141106 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
|
AS | Assignment |
Owner name: META PLATFORMS, INC., CALIFORNIA Free format text: CHANGE OF NAME;ASSIGNOR:FACEBOOK, INC.;REEL/FRAME:058175/0211 Effective date: 20211028 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |