Looks are deceiving, and in the data center they can be confusing. If you are doing a good job, your Power Usage Effectiveness (PUE) might actually get worse!
Energy consumption in the data center is the result of two interacting components: the cumulative IT load from all the information technology hardware, combined with the facility power and cooling overhead. If IT does a good job of reducing equipment load by switching off orphan servers, deleting unnecessary data and switching off unused storage, facilities will actually look worse, because the load will drop and the power and cooling equipment will run less efficiently. It sounds bad, but it is actually good: the total energy bill will go down, and so will CO2 emissions.
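The arithmetic behind this apparent paradox is easy to sketch. The numbers below are illustrative assumptions, not measurements from any particular facility:

```python
def pue(it_load_kw, overhead_kw):
    """Power Usage Effectiveness: total facility power divided by IT power.
    A perfect facility would score 1.0; lower is better."""
    return (it_load_kw + overhead_kw) / it_load_kw

# Before the cleanup: 1,000 kW of IT load, 800 kW of power/cooling overhead.
before = pue(1000, 800)   # 1.80

# After IT switches off orphan servers and unused storage, the load drops
# to 700 kW, but overhead falls only slightly because lightly loaded
# power and cooling equipment runs less efficiently (assumed 750 kW).
after = pue(700, 750)     # about 2.07 -- the PUE metric looks worse

# Yet total utility draw still fell: 1,800 kW before vs. 1,450 kW after.
```

The metric worsens precisely because the denominator (IT load) shrank faster than the numerator, even though the actual energy bill went down.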
Because IT refreshes all of its hardware on a three- to five-year cycle while buildings have depreciable lives of 25 years, IT has a far greater and more immediate chance to affect data center energy use by buying more efficient hardware than anything facilities can do. Moreover, whatever cumulative energy savings might accrue from facility energy-efficiency initiatives over the facility's life could be more than offset by the cost of a single data center failure that affects IT availability. This is why experienced data center facility managers are inherently risk averse.
There are five facility initiatives that can improve reliability and reduce facility overhead. Because they simultaneously address both reliability and energy consumption, the inherent risk associated with these initiatives is often justified. All five initiatives involve cooling, because the efficiency of mechanical systems is fundamentally different from that of electrical systems.
Electrical systems are linear in the sense that, once installed, their efficiency can't be greatly changed except by changes in the actual load, which move the system up or down its efficiency curve. This means the only ways to improve electrical efficiency are to increase the load or to shut the systems down and install components that operate more efficiently. Most data centers are unwilling or unable to take the construction risk of replacing otherwise perfectly good components or systems with more energy-efficient equipment.
Significant mechanical-system efficiency improvement can be accomplished very differently than in electrical systems. Computer-room cooling efficiency can be dramatically improved by adjustment rather than construction. While some risk is involved, the magnitude of this risk can be controlled or limited to certain areas. And because of the inherent non-linearity of temperature and relative humidity, cooling stability is actually greatly improved by increasing energy efficiency.
This is like a car engine running on only three cylinders versus the fuel-efficiency benefits of getting it to run on all six. With only three cylinders working, the engine would run rough and belch tremendous amounts of smoke, so you would intuitively know something is dramatically wrong. Inefficient cooling gives no warnings other than hot spots, and the most intuitive solutions, such as adding more cooling, are likely to make things worse at great expense and construction risk.
The following five facility initiatives will increase cooling reliability and stability, reduce operating expenses and cut capital costs by tuning up the computer-room cooling investment you already own:
–Correctly set the temperature and relative-humidity control points on the cooling units. This is the lowest-hanging fruit. The only environmental conditions in a computer room that count are at the air intake of the computer hardware. Hot exhaust air from the hardware is actually desirable. Cold intake air, by contrast, is bad for reliability because it causes water to condense inside the hardware. Many sites have air colder than 59°F, which wastes energy and can result in condensation. A common myth is that colder temperatures provide thermal ride-through in the event of a power failure. This myth is unsupported by science: as heat densities have risen, heavily loaded computer rooms will now overheat within minutes, so they require uninterruptible cooling, not too-cold temperatures. The cold aisle should be cold, but coldest is not best.
Virtually all cooling units control temperature and relative humidity based on their return air, and the return air should be hot, not cold. It is common to find cooling units where the control point is set for the wrong return condition; the user must compensate for the difference between supply and return. The temperature of the air leaving the cooling unit should be within a few degrees of the desired hardware air-intake temperature: for a good equipment intake temperature of 72°F, the desired leaving temperature would be about 70°F. Because the unit senses return air, its setpoint should be the cold-aisle temperature plus the temperature difference between the cold and hot aisles.
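The setpoint arithmetic can be sketched in a few lines. The 20°F aisle difference below is an illustrative assumption, not a universal value:

```python
def return_air_setpoint(desired_intake_f, aisle_delta_f):
    """Return-air setpoint = desired cold-aisle (hardware intake)
    temperature plus the measured cold-to-hot aisle difference,
    because the unit senses return air, not supply air."""
    return desired_intake_f + aisle_delta_f

# 72°F desired intake with an assumed 20°F cold-to-hot aisle rise:
setpoint = return_air_setpoint(72, 20)   # 92°F, not 72°F
```

Setting the unit to 72°F directly, as if it controlled on supply air, would drive the supply far colder than intended.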
Have someone record the setpoints across your cooling-unit inventory and report what they are. The further below the desired setpoint they sit, the greater the opportunity for significant energy savings. However, do not change any setpoint without first doing a computer-room cooling tune-up as outlined in the next four initiatives.
–Determine the number of cooling units running compared with the actual heat load. Our research has found that the typical computer room has, on average, three times more cooling capacity running than the actual heat load requires. Moreover, the computer rooms with the most excess cooling had the highest percentage of hot spots. Reducing the running cooling capacity (not counting redundant units) to match the heat load actually increases cooling stability and quality while saving energy. However, make sure you do a complete computer-room tune-up before turning off any units. As part of this initiative, also change the controls on each cooling unit so the blower turns off if cooling fails. Without this change, a failed cooling unit will continue to pump increasingly hot air into the space it controls; its blower must be turned off before adjacent redundant units can provide cooling. This is a basic control error in virtually every computer room.
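A quick capacity audit along these lines might look like the following sketch; the unit capacity, heat load, and redundancy level are all assumptions for illustration:

```python
import math

def units_needed(heat_load_kw, unit_capacity_kw, redundant=1):
    """Cooling units required to match the heat load, plus redundancy."""
    return math.ceil(heat_load_kw / unit_capacity_kw) + redundant

# Illustrative room: 300 kW heat load, 70 kW units, N+1 redundancy.
needed = units_needed(300, 70)   # 5 to carry the load, plus 1 redundant = 6
running = 15                     # what a walkthrough survey actually found
surplus = running - needed       # 9 candidates to turn off after a tune-up
```

A surplus of this size is consistent with the three-to-one ratio the research describes, and each surplus unit turned off is fan and compressor energy saved.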
–Validate that all cooling units are capable of delivering rated capacity. The typical cooling unit cannot deliver its rated capacity because of poor installation or maintenance. Common examples include incorrect piping where supply and return have been reversed for the last 20 years, filters that are plugged or incorrectly selected, throttling valves that are stuck, compressors that are undercharged with refrigerant, belts that are slipping, and pulley sheave sizes that were incorrectly selected. A third party independent of the current contractor should perform this investigation.
–Deliver cold air where it is most needed. Computer-room cooling typically is accomplished by the random mixing of hot and cold air. While this process can work, it is inefficient, and cooling-unit return-air setpoints are set low to compensate. In computer rooms with raised floors, more than 60% of the available cold air is wasted, the result of too many perforated tiles installed in the wrong places. In addition, cable cutouts in the raised floor must be sealed to prevent the uncontrolled escape of cold air into the hot aisle.
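The fraction of cold air being wasted can be estimated from temperatures alone with a simple mixing balance. This is a simplified model with illustrative readings, not a substitute for actual airflow measurement:

```python
def bypass_fraction(supply_f, hot_aisle_f, return_f):
    """Fraction of cold supply air that bypasses the hardware and mixes
    directly into the return stream. From an energy balance on the
    return air:  return = f*supply + (1 - f)*hot_aisle,
    solved for f = (hot_aisle - return) / (hot_aisle - supply)."""
    return (hot_aisle_f - return_f) / (hot_aisle_f - supply_f)

# Illustrative readings: 60°F supply, 90°F hot aisle, 75°F return air.
wasted = bypass_fraction(60, 90, 75)   # 0.5 -- half the cold air bypasses
```

A return temperature sitting halfway between supply and hot aisle means half the cold air never passed through the hardware, which is the kind of waste the 60% figure describes.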
For maximum cooling efficiency, the hot aisle should be uncomfortably hot. The temperature difference between the cold and hot aisles is a very simple diagnostic: have measurements made of cold-aisle and hot-aisle temperatures. A difference of at least 10°C (18°F) is desirable. A small difference indicates significant air mixing, which reduces cooling capacity and efficiency. This mixing must be corrected by sealing cable cutouts and removing perforated tiles from the hot aisle. Once this is done, under-floor static pressure should rise dramatically. If it doesn't, look for cold-air leakage through the floor or walls.
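The delta-T survey reduces to a short script. The row names and readings below are hypothetical:

```python
# Hypothetical survey: (location, cold-aisle °F, hot-aisle °F)
readings = [("row A", 68, 92), ("row B", 70, 78), ("row C", 72, 91)]

MIN_DELTA_F = 18.0   # about 10°C, the desirable minimum difference

# Rows whose small delta-T signals significant hot/cold air mixing:
suspect = [loc for loc, cold, hot in readings if hot - cold < MIN_DELTA_F]
# row B shows only an 8°F rise, well under the threshold
```

Rows flagged this way are where to look first for open cable cutouts and misplaced perforated tiles.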