Reducing the Annual Cost of a Telecommunications Data Center

By Paul Bemis and Liz Marshall, Applied Math Modeling Inc., Concord, NH

March, 2011

Applied Math Modeling White Paper

Introduction

The facilities managers for a large internet service provider have known for a while that one of their data centers is over-cooled. Over-cooling translates into unnecessary energy consumption and expense, so the managers knew that some changes to the data center were needed. Several options were possible, such as shutting down one or more of the cooling units. Many questions arose, however. For example, what would be the consequences of shutting down a computer room air conditioner (CRAC)? Would it be possible to shut down two? If so, which two? Could the supply temperatures be increased? To answer these questions, the operators decided to use computational fluid dynamics (CFD), a tool that uses airflow predictions to demonstrate how effectively the cooling air reaches, and removes heat from, the equipment in the room. Using CFD-based modeling techniques for quantifying the efficiency of the data center, different energy-saving strategies can be compared before physical changes to the room are made.

Problem Description

The CFD modeling is done using CoolSim software from Applied Math Modeling. The raised-floor data center is 4720 sq. ft. in size (80 ft x 59 ft) and makes use of a ceiling plenum return. The 2 ft supply plenum, 15 ft room height, and 5 ft ceiling plenum combine to form a space that is 22 ft high. (See Figure 1.) The data center contains ten rows of equipment with either 17 or 21 racks per row. The heat loads in the racks vary from 10 W up to 7.8 kW, as shown in Figure 2. The total IT heat load in the room is 363 kW with a density of 76.9 W/sq. ft.

Five Liebert DS105AU CRACs are positioned on each of two opposing walls, for a total of ten CRACs in the room. These direct expansion (DX) cooling units are controlled in zones, each of which consists of two opposing CRACs (Figure 3). The data center was originally designed for a 1 MW heat load, but in its current use, the IT load is only about one-third of that value (363 kW). Assuming a 20°F temperature rise across all racks, 57,020 CFM of cooling air is needed for the present heat load. Each CRAC delivers 14,500 CFM, so with all ten CRACs operating, a total of 145,000 CFM is generated, which is about 2.5 times the needed amount. Thus in normal operating mode, two zones are disabled, so that only six of the ten CRACs are in use. The disabled cooling zones, 2 and 4, are shown in Figure 3. For this configuration, the six active CRACs supply 87,000 CFM of cooling air, which is about 50% more than required for the heat load. The total cooling capacity of the six CRACs exceeds the heat load by approximately the same amount.

Measurements of supply air temperatures place the range between 50°F and 68°F. Return temperatures are also available for comparison with values predicted by the CFD simulation. Once a CFD model is created and validated, alternative energy optimization scenarios can be investigated, including disabling additional CRACs, hot or cold aisle containment, and modification to the CRAC cooling parameters. In this study, the first and third of these options will be considered.
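The airflow figures quoted for the room can be checked with a sensible-heat balance on the air. The sketch below is illustrative only: it assumes the common HVAC rule of thumb Q [BTU/h] = 1.08 x CFM x ΔT [°F] for standard air, which the paper does not state, and the function names are hypothetical. With this constant the demand comes out within about 1% of the 57,020 CFM quoted in the text; the small difference presumably reflects slightly different air properties.

```python
BTU_PER_HR_PER_WATT = 3.412   # unit conversion: 1 W = 3.412 BTU/h
SENSIBLE_HEAT_FACTOR = 1.08   # BTU/h per (CFM * degF); standard-air assumption


def required_cfm(heat_load_kw, delta_t_f):
    """Airflow (CFM) needed to remove heat_load_kw with a delta_t_f rise across the racks."""
    btu_per_hr = heat_load_kw * 1000.0 * BTU_PER_HR_PER_WATT
    return btu_per_hr / (SENSIBLE_HEAT_FACTOR * delta_t_f)


def supplied_cfm(active_cracs, cfm_per_crac=14_500):
    """Total supply airflow for a given number of running CRACs."""
    return active_cracs * cfm_per_crac


demand = required_cfm(363.0, 20.0)
for n in (10, 6):
    supply = supplied_cfm(n)
    print(f"{n} CRACs: {supply:,} CFM supplied, {supply / demand:.2f}x the demand")
```

Comparing the supply-to-demand ratio for ten and six CRACs shows why two zones are already disabled in normal operation.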

Preliminary Results for the Baseline Case

The first CFD model created for this study (the baseline case) corresponds to the data center operating in normal mode, with six CRACs operational, as shown in Figure 3. Boundary conditions for the simulation include the heat load and flow rate associated with each rack and the supply temperature and flow rate associated with each CRAC. The measured supply and return temperatures are shown in Table 1, along with the predicted return temperatures from the CFD model. In all but one case, the predicted temperatures are below the measured values. Often, when the CFD model under-predicts the return temperature on every CRAC, it means that either the heat loads are under-represented or the CRAC flow rates are too high. In this data center, it could be one of these factors or a combination of both, but the effect is small, since the error is below 5% in all cases. A validation of the preliminary model, such as this, is an important step if modifications are to be made. A demonstration that the base model accurately captures the physics to within an acceptable margin of error means that it can be used to correctly predict trends if one or more changes are made.

Contours of rack inlet temperature for the baseline case are shown in Figure 4. The temperatures all fall below the ASHRAE recommended maximum value of 80.6°F. The maximum rack inlet temperature is a good metric to follow when comparing cooling strategies. For an over-cooled data center, however, the minimum rack inlet temperature is also important to follow. According to the ASHRAE guidelines, the rack inlet temperature should not go below 64.4°F, although the allowed minimum value is 59°F. For the baseline case, at least half of the racks have inlet temperatures that are too cold.

Figure 3: Five cooling zones, each of which consists of a pair of opposing CRACs; under normal operating conditions, Zones 2 and 4 are shut down. Zone 1 contains CRACs 1 and 2, Zone 2 CRACs 3 and 4, Zone 3 CRACs 5 and 6, Zone 4 CRACs 7 and 8, and Zone 5 CRACs 9 and 10.

Table 1: Measured supply and return temperatures and predicted return temperatures for the six active CRACs

| CRAC # | Measured Supply Temperature (°F) | Measured Return Temperature (°F) | Predicted Return Temperature (°F) | Error (%) |
| 1 | 51.6 | 70.0 | 67 | 4.3 |
| 2 | 50.7 | 69.8 | 67 | 4.0 |
| 5 | 53.5 | 71.2 | 69 | 3.1 |
| 6 | 53.8 | 71.6 | 69 | 3.6 |
| 9 | 67.9 | 75.4 | 74 | 1.9 |
| 10 | 62.4 | 73.7 | 74 | 2.3 |

Data Center Metrics

PUE and DCIE

A number of metrics have been defined in recent years that can be used to gauge the efficiency of a data center. Metrics can also be used to test whether changes to the data center bring about reduced (or increased) power demands. One of the most popular metrics is the Power Usage Effectiveness, or PUE, defined as the ratio of total facility power to total IT power:

$$\mathrm{PUE} = \frac{\text{Total Facility Power}}{\text{Total IT Power}} \qquad (1)$$

The Data Center Infrastructure Efficiency, or DCIE, is the reciprocal of the PUE. The total facility power includes that needed to run the CRACs (chillers and fans), IT equipment, battery backup systems, lighting, and any other heat-producing devices. Thus PUE is always greater than 1, but values that are close to 1 are better than those that are not. A typical value is 1.8, a good value is 1.4, and an excellent value is 1.2.

COP

The largest contributor to the total facility power is the cooling system, which comprises the heat exchangers (chillers, condensers and cooling fluid pumps, for example) and fans. The heat exchanger portion of the CRAC is a heat pump, whose job is to move heat from one location (inside the room) to another (outside). Heat pumps are rated by their coefficient of performance, or COP. The COP is the ratio of the heat moved by the pump to the work done by the pump to perform this task:

$$\mathrm{COP} = \frac{\text{Heat Moved}}{\text{Work Done}} \qquad (2)$$

The work done by the pump encompasses the heat exchanger work and does not include the CRAC fans. The COP can also be expressed as a power ratio, making use of the rate at which heat is moved (in Watts, say) or work is done (again, in Watts).

Using more practical terms, the COP is the ratio of the total room heat load to the power needed to run the chillers, condensers and other heat rejection equipment. For data center cooling equipment, COP values range from 2 to 5, with larger numbers corresponding to better heat pumps. Note that an alternative definition of COP could be made for the data center as a whole, rather than just for the heat rejection system. In this alternative definition, the work done would include the power used to run the CRAC fans. For the purposes of this paper, the traditional definition of COP is used.

Figure 4: Rack inlet temperatures for the baseline case, in which the CRACs in Zones 2 and 4 are inactive

Return Temperature Index™

The Return Temperature Index (RTI), a trademark of ANCIS Inc. (www.ancis.us), is a percentage based on the ratio of the total demand air flow rate to the total supply air flow rate:

$$\mathrm{RTI} = \frac{\text{Total Demand Air Flow}}{\text{Total Supply Air Flow}} \times 100\% \qquad (3)$$

Alternatively, it can be computed using the ratio of the average temperature drop across the CRACs to the average temperature rise across the racks. In either case, a value of 100% indicates a perfectly balanced airflow configuration, where the supply equals the demand. Configurations with RTI < 100% have excess cooling airflow, so short-circuiting across the CRACs exists. Configurations with RTI > 100% have a deficit of cooling air, so there is recirculation from the rack exhausts to the rack inlets. It is best to have RTI values that are less than, but close to, 100%.

Rack Cooling Index®

The Rack Cooling Index (RCI), a registered trademark of ANCIS Inc., is computed using the average number of degrees that the rack inlet temperature falls above (or below) the ASHRAE recommended temperature range (64.4°F to 80.6°F). One index is defined for temperatures above the range (RCI_HI) and another for temperatures below the range (RCI_LO). For the high side:

$$\mathrm{RCI}_{HI} = \left[ 1 - \frac{\sum_{i=1}^{n} \left( T_i - T_{R\_HI} \right)}{N \left( T_{A\_HI} - T_{R\_HI} \right)} \right] \times 100\% \qquad (4)$$

where

T_R_HI is the ASHRAE recommended maximum temperature (80.6°F)

T_A_HI is the ASHRAE allowed maximum temperature (90°F)

T_i is the maximum inlet temperature on the ith rack

n is the number of racks with T_i > T_R_HI

N is the total number of racks in the sample

The index on the low side is similarly defined:

$$\mathrm{RCI}_{LO} = \left[ 1 - \frac{\sum_{i=1}^{n} \left( T_{R\_LO} - T_i \right)}{N \left( T_{R\_LO} - T_{A\_LO} \right)} \right] \times 100\% \qquad (5)$$

where

T_R_LO is the ASHRAE recommended minimum temperature (64.4°F)

T_A_LO is the ASHRAE allowed minimum temperature (59°F)

T_i is the minimum inlet temperature on the ith rack

n is the number of racks with T_i < T_R_LO

N is the total number of racks in the sample
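The two index definitions can be sketched in a few lines of code. This is a minimal illustration, assuming the usual form of the index: the summed excursion beyond the recommended limit, divided by N times the gap between the allowable and recommended limits, subtracted from 1 and expressed as a percentage. The function names and the sample temperatures are hypothetical and are not taken from the CFD results.

```python
def rci_hi(max_inlet_temps_f, t_rec_max=80.6, t_all_max=90.0):
    """RCI_HI in percent; each entry is one rack's maximum inlet temperature (degF)."""
    total_racks = len(max_inlet_temps_f)
    # Sum of degrees above the recommended maximum, over the racks that exceed it.
    over = sum(t - t_rec_max for t in max_inlet_temps_f if t > t_rec_max)
    return (1.0 - over / (total_racks * (t_all_max - t_rec_max))) * 100.0


def rci_lo(min_inlet_temps_f, t_rec_min=64.4, t_all_min=59.0):
    """RCI_LO in percent; each entry is one rack's minimum inlet temperature (degF)."""
    total_racks = len(min_inlet_temps_f)
    # Sum of degrees below the recommended minimum, over the racks that fall short.
    under = sum(t_rec_min - t for t in min_inlet_temps_f if t < t_rec_min)
    return (1.0 - under / (total_racks * (t_rec_min - t_all_min))) * 100.0


# Hypothetical over-cooled room: three of four racks sit below the allowable minimum.
sample = [55.0, 57.0, 58.0, 66.0]
print(f"RCI_HI = {rci_hi(sample):.1f}%, RCI_LO = {rci_lo(sample):.1f}%")
```

In this made-up sample no rack exceeds the recommended maximum, so RCI_HI is 100%, while the average shortfall below the recommended minimum exceeds the 5.4°F gap between the recommended and allowable minimums, so RCI_LO is negative.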

Metrics for the Baseline Data Center

Using the metrics defined above, the baseline data center configuration can now be evaluated using a combination of measurements and CFD results. Because the cooling system is controlled in five separate zones, the facility managers have been able to measure the electric power needed to run the heat rejection system (the CRAC power minus the fans). The measured value, 269.1 kW, is a snapshot of one day's power demand for the three normally functioning zones. They have also determined that each CRAC uses 8 kW to run its fan, so the total CRAC fan power for the six active CRACs is 48 kW. Combining these, the total measured cooling power is 269.1 + 48 = 317.1 kW. The total rack heat load in the room is 363 kW, and this includes PDUs, which are rack mounted. If 5% of this value is assumed for additional support infrastructure (lights, etc.), the total IT heat load in the room is 363 + 18.2 = 381.2 kW.

Taking the most conservative approach described above, the CRAC fan power will be included in the room heat load. Assuming that all of the CRAC fan power will eventually be converted to heat, the total room heat load becomes 381.2 + 48 = 429.2 kW.

The ratio of the total room heat load to the power needed to run the heat rejection system (269.1 kW) is the COP:

$$\mathrm{COP} = \frac{429.2}{269.1} = 1.59 \qquad (6)$$

This value is low, indicating that the data center could support more equipment for the amount of power being delivered to the cooling system. Alternatively, it suggests that shutting down one or more of the CRACs is an option to be considered.

To calculate the PUE, the total facility power is needed. This is simply the total cooling power (317.1 kW) plus the total room heat load (429.2 kW), or 746.3 kW.

Dividing the total facility power by the total IT heat load (381.2 kW), the PUE is:

$$\mathrm{PUE} = \frac{746.3}{381.2} = 1.96 \qquad (7)$$

The return temperature index can be computed using the boundary conditions used for the CRACs and IT equipment. The total supply air flow from the CRACs is 87,000 CFM. The demand air flow from the IT load (363 kW) is 57,020 CFM. Assuming the additional 5% of heat load, the demand air flow should be adjusted by 5%, bringing the total to 59,871 CFM. The ratio of the adjusted demand air flow to the supply air flow is:

$$\mathrm{RTI} = \frac{59{,}871}{87{,}000} \times 100\% = 69\% \qquad (8)$$

Consistent with earlier calculations, the RTI also suggests that the data center is over-cooled. The degree to which it is over-cooled is indicated by the high and low rack cooling indices. An analysis of the CFD results using Eq. (4) and (5) yields

$$\mathrm{RCI}_{HI} = 100\% \qquad (9)$$

and

$$\mathrm{RCI}_{LO} < 0\% \qquad (10)$$

A value of 100% for RCI_HI means that no racks have inlet temperatures above the recommended maximum value. A value less than 0 for RCI_LO indicates that the average number of degrees below the recommended minimum value is greater than the number of degrees between the recommended and allowable minimum values. In other words, the inlet temperatures on the whole are much too cold. The metrics calculated for the baseline case are summarized in Table 2.

Estimating the Baseline Data Center Costs

Before considering changes to the data center, the cost of running the facility in its present state is estimated. To determine the cost, the total facility power is needed along with the cost of electricity. Using 746.3 kW as the total facility power and $0.09 as the cost per kWh, the estimated annual cost of running the data center is about $588,300, which is within 10% of the actual cost. While this value is not based on the CFD analysis, a similar calculation can be done for proposed modifications to the data center. Thus while a CFD analysis can be used to judge the efficacy of each design, the companion energy calculation can be done to estimate the cost savings.
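The annual cost estimate is a one-line calculation: facility power times hours per year times the electricity rate. The sketch below assumes a constant load for all 8,760 hours of the year, which is what the quoted figure implies; the function name is illustrative.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 h, assuming the load is constant all year


def annual_cost_usd(facility_power_kw, rate_usd_per_kwh=0.09):
    """Estimated yearly electricity cost for a constant facility power draw."""
    return facility_power_kw * HOURS_PER_YEAR * rate_usd_per_kwh


baseline_cost = annual_cost_usd(746.3)
print(f"Baseline annual cost: ${baseline_cost:,.0f}")
```

The same function can be applied to the total facility power of any proposed configuration to estimate its savings relative to the baseline.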

Modifying the Design

Disabling Zones

As a first step, each of the three active zones is disabled in a series of trials. These trials are solved concurrently on separate nodes at CoolSim's remote simulation facility (RSF) using the CRAC Failure Analysis model. Trial 1 has Zones 1, 2, and 4 disabled, Trial 2 has Zones 2, 3, and 4 disabled, and Trial 3 has Zones 2, 4, and 5 disabled. For each of these trials, the maximum rack inlet temperature is, at most, 75°F, well below the ASHRAE recommended value of 80.6°F. Trial 1 has the highest rack inlet temperature, and contours for all of the racks for this case are shown in Figure 5. Note that when the left two zones are shut down, the temperature on that side of the room increases. Pathlines of the supply air in the plenum (Figure 6) show that jets from the opposing CRACs collide and deflect the cooling air to the left side of the room, keeping the rack temperatures in range. These trials illustrate that the simplest modification to the data center, shutting down one of the zones, will not adversely impact the equipment.

The data center metrics computed for Trial 1 show a great deal of improvement in energy efficiency and an associated cost savings. Because the amount of power needed to run the cooling system and CRAC fans is two-thirds of the earlier value, the total cooling power is reduced to 211.4 kW and the COP is increased to 2.3. The total facility power is reduced to 624.6 kW, leading to a decrease in the PUE to 1.64. The return temperature index increases from 69% to 103%. Ideally, the RTI should be below 100%, but because an additional 5% of infrastructure equipment is included in the total heat load, the demand air flow rate is assumed to have a corresponding increase, which may be too much. (Additional heat from overhead lamps may be lost through the ceiling, for example.) The RCI_HI index remains at 100%, indicating that there are still no racks with temperatures above the recommended value. The RCI_LO index remains below 0, but only slightly. Thus while the rack inlet temperatures are not as cold as before, they are still colder than they need to be. Owing to the drop in the total facility power, the cost to run the data center also drops. The new annual cost is estimated to be $492,400, representing a savings of about $95,900. These results are summarized in Table 2.
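The metrics for the two-zone and three-zone-disabled cases can be reproduced from the quantities given in the text. The sketch below is a hedged illustration rather than the authors' calculation: it assumes, as the text does, that heat rejection power and fan power scale with the number of active zones, and the class and function names are hypothetical. Small differences from the quoted values come from rounding (the text rounds 5% of 363 kW to 18.2 kW).

```python
from dataclasses import dataclass


@dataclass
class ZoneMetrics:
    heat_rejection_kw: float
    fan_kw: float
    room_heat_kw: float
    facility_kw: float
    cop: float
    pue: float
    rti_pct: float


def metrics_for_active_zones(active_zones,
                             rack_heat_kw=363.0,
                             support_fraction=0.05,
                             heat_rejection_kw_per_zone=269.1 / 3,  # measured for 3 zones
                             fan_kw_per_crac=8.0,
                             cfm_per_crac=14_500,
                             rack_demand_cfm=57_020):
    """COP, PUE and RTI when `active_zones` zones (two CRACs each) are running."""
    cracs = 2 * active_zones
    it_kw = rack_heat_kw * (1.0 + support_fraction)
    heat_rejection_kw = heat_rejection_kw_per_zone * active_zones
    fan_kw = fan_kw_per_crac * cracs
    room_heat_kw = it_kw + fan_kw                 # fan power assumed to end up as heat
    cooling_kw = heat_rejection_kw + fan_kw
    facility_kw = cooling_kw + room_heat_kw       # same bookkeeping as in the text
    demand_cfm = rack_demand_cfm * (1.0 + support_fraction)
    supply_cfm = cracs * cfm_per_crac
    return ZoneMetrics(
        heat_rejection_kw=heat_rejection_kw,
        fan_kw=fan_kw,
        room_heat_kw=room_heat_kw,
        facility_kw=facility_kw,
        cop=room_heat_kw / heat_rejection_kw,
        pue=facility_kw / it_kw,
        rti_pct=100.0 * demand_cfm / supply_cfm,
    )


for zones in (3, 2):
    m = metrics_for_active_zones(zones)
    print(f"{zones} active zones: COP {m.cop:.2f}, PUE {m.pue:.2f}, "
          f"RTI {m.rti_pct:.0f}%, facility power {m.facility_kw:.1f} kW")
```

Calling the function with three active zones corresponds to the baseline (Trial 0), and with two active zones to Trial 1.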

Increasing the Supply Temperatures

One of the dominant factors in reducing data center energy consumption is air supply temperature. For every 1.8°F increase in supply air temperature, the efficiency of the heat pump improves by 3.5% (Design Considerations for Datacom Equipment Centers, Atlanta: ASHRAE, 2005). Further, by increasing the supply air temperature, the window of "free cooling" opens, since air-side or water-side economizers can be used on more days of the year. Economizers improve the efficiency of the cooling system by making use of the reservoir of outside air in the heat rejection process. If the temperature difference between the supply air and outside air is reduced, the chillers and condensers in the heat rejection system can be augmented or even replaced by economizers, resulting in huge gains in the COP.

Because the data center is initially over-cooled, it is a prime candidate for increased supply temperature. Thus, as a second modification, all of the supply temperatures are increased to 65°F. Recall that in the original configuration, measured temperatures were used for the CRAC boundary conditions and all but two were below 60°F. Increasing all of the supply temperatures to 65°F should alleviate the problems suggested by the RCI_LO index and improve the COP, which will save a significant amount of power. To properly assess such a proposed change, a CFD analysis is needed to determine if hot spots will form, impacting the performance at the upper end of the recommended range. Contours of the rack inlet temperatures for Trial 0 of this scenario (with Zones 2 and 4 disabled) are shown in Figure 7. The minimum and maximum values for the contours are shown in the key on the left. Because the range (65°F to 78°F) falls within the ASHRAE recommended range (64.4°F to 80.6°F), all racks satisfy the condition and the RCI_HI and RCI_LO values are both 100%. The average supply temperature for the baseline case with only two zones disabled is 57°F. Increasing the average supply temperature to 65°F (an 8°F increase) corresponds to an increase of about 15% in the COP, so the new value for this configuration is 1.84.

Table 2: Metrics and estimated annual costs for the baseline supply temperatures with two zones disabled (Trial 0) and with one additional zone disabled (Trial 1)

| | Baseline Case Trial 0 | Baseline Case Trial 1 |
| IT Heat Load (kW) | 363 | 363 |
| Total IT Heat Load (kW) | 381.2 | 381.2 |
| CRAC Cooling Power (kW) | 269.1 | 179.4 |
| CRAC Fan Power (kW) | 48 | 32 |
| Total Room Heat Load (kW) | 429.2 | 413.2 |
| Total Cooling Power (kW) | 317.1 | 211.4 |
| Total Facility Power (kW) | 746.3 | 624.6 |
| COP | 1.59 | 2.30 |
| PUE | 1.96 | 1.64 |
| Total Supply Air Flow (CFM) | 87,000 | 58,000 |
| Total Demand Air Flow (CFM) | 59,871 | 59,871 |
| RTI (%) | 69 | 103 |
| RCI_HI (%) | 100 | 100 |
| RCI_LO (%) | <0 | <0 |
| Cost of Electricity ($/kWh) | 0.09 | 0.09 |
| Annual Cost ($) | 588,300 | 492,400 |
| Savings ($) | | 95,900 |

The previous analysis showed, however, that disabling an additional zone results in potential savings of about $95,900 a year. Thus a CRAC failure analysis should be done with the 65°F supply temperature boundary condition to make sure that the rack inlet temperatures aren't too high if one of the zones is disabled. In Figure 8, the rack inlet temperatures are shown for the trial where the maximum rack inlet temperature is highest. It is again Trial 1, in which Zones 1, 2, and 4 are disabled. Based on the maximum value shown in the figure, some of the racks have temperatures above the ASHRAE recommended maximum of 80.6°F. A calculation of RCI_HI supports this finding, with a value of 97.3%. RCI values between 95% and 100% are considered good for a data center. The value suggests that the average deviation in temperature above the recommended value is small, and this is borne out by the detailed results: all racks have inlet temperatures that are well below the ASHRAE allowable maximum value (90°F). As expected, RCI_LO has a value of 100%. With 60°F as the average supply temperature for Trial 1 in the baseline case, the increase in supply temperature for this case (5°F) corresponds to an increase in the COP to 2.53.

Increasing the supply temperatures to 68°F results in RCI_HI and RCI_LO indices of 100% for Trial 0. Furthermore, the COP increases to 1.94. For Trial 1, RCI_LO remains at 100%, but RCI_HI drops to 84%. Even so, none of the rack inlet temperatures goes above the ASHRAE allowable value. The COP increases to 2.66 for this scenario.
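The COP values quoted for the raised supply temperatures, and the facility power that follows from them, can be approximated as below. This sketch assumes the 3.5% improvement per 1.8°F is applied linearly to the unrounded baseline COP (not compounded), and that the heat rejection power is the room heat load divided by the COP; both are inferences that appear to reproduce the quoted numbers, not statements from the paper. Function names are hypothetical.

```python
COP_GAIN_PER_STEP = 0.035  # 3.5% per step, per the ASHRAE (2005) figure cited in the text
STEP_F = 1.8               # one step is a 1.8 degF rise in supply temperature


def adjusted_cop(base_cop, base_supply_f, new_supply_f):
    """Linear (non-compounded) COP scaling with supply temperature; an assumption."""
    steps = (new_supply_f - base_supply_f) / STEP_F
    return base_cop * (1.0 + COP_GAIN_PER_STEP * steps)


def facility_power_kw(room_heat_kw, fan_kw, cop):
    """Heat rejection power + fan power + room heat load, as tallied in the text."""
    heat_rejection_kw = room_heat_kw / cop
    return heat_rejection_kw + fan_kw + room_heat_kw


# Baseline quantities from the text for each trial.
cases = {
    "Trial 0": dict(room_heat_kw=429.2, fan_kw=48.0, heat_rejection_kw=269.1, base_supply_f=57.0),
    "Trial 1": dict(room_heat_kw=413.2, fan_kw=32.0, heat_rejection_kw=179.4, base_supply_f=60.0),
}

for name, c in cases.items():
    base_cop = c["room_heat_kw"] / c["heat_rejection_kw"]
    for supply_f in (65.0, 68.0):
        cop = adjusted_cop(base_cop, c["base_supply_f"], supply_f)
        power = facility_power_kw(c["room_heat_kw"], c["fan_kw"], cop)
        print(f"{name} at {supply_f:.0f}F supply: COP {cop:.2f}, facility power {power:.1f} kW")
```

Multiplying each facility power by 8,760 hours and $0.09 per kWh then gives the corresponding annual cost.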

The total facility power can be computed for each of these cases, and from it, the annual cost of running the data center. A summary of COP values and associated costs for the various trials discussed in this section is presented in Table 3. Comparison of the Trial 0 results shows that between $28,500 and $37,300 can be saved by increasing the supply temperatures. Comparison of the Trial 1 results shows that an additional $12,500 to $19,000 can be saved by increasing the supply temperatures when one of the zones is also disabled. Applying the savings computed in Tables 2 and 3, the annual cost of the data center could be cut by about $108,400 ($95,900 + $12,500) by disabling one of the zones and increasing the supply temperature to 65°F.

Figure 8: Rack inlet temperatures corresponding to 65°F CRAC supply temperatures for Trial 1, where Zones 1, 2, and 4 are disabled

Table 3: COP values, total facility power, and annual costs for increased supply temperatures with two zones disabled (Trial 0) and three zones disabled (Trial 1)

| | Baseline Trial 0 | Supply 65°F Trial 0 | Supply 68°F Trial 0 |
| Average T_SUPPLY (°F) | 57 | 65 | 68 |
| COP | 1.59 | 1.84 | 1.94 |
| Total Facility Power (kW) | 746.3 | 710.0 | 698.8 |
| Annual Cost ($) | 588,300 | 559,800 | 551,000 |
| Savings ($) | | 28,500 | 37,300 |

| | Baseline Trial 1 | Supply 65°F Trial 1 | Supply 68°F Trial 1 |
| Average T_SUPPLY (°F) | 60 | 65 | 68 |
| COP | 2.3 | 2.53 | 2.66 |
| Total Facility Power (kW) | 624.6 | 608.7 | 600.4 |
| Annual Cost ($) | 492,400 | 479,900 | 473,400 |
| Savings ($) | | 12,500 | 19,000 |

Summary

Computational fluid dynamics and data center metrics have been used to study a data center for which a number of measurements were available. The ten CRACs in the room are controlled using five zones, with two CRACs in each zone. Because the heat load is less than the original planned value, the data center currently operates with only three
