Optimize the Data Center with AUDIT-BUDDY™
202 Worcester Street, Unit 5, North Grafton, MA 01536
www.purkaylabs.com | info@purkaylabs.com | 1.774.261.4444
Introducing AUDIT-BUDDY™
Executive Summary
Introduction
Data centers require tightly controlled temperature and humidity levels to ensure optimal server functionality. The air going into the servers must be monitored to guarantee that it meets the equipment manufacturer’s specifications or risk server failure, energy waste and increased operating expenses. Data center engineers can use AUDIT-BUDDY™ to capture and report real-time environmental data on inlet air quality and use this information to take corrective action to manage airflow and save on cooling costs.
AUDIT-BUDDY™ System
AUDIT-BUDDY™ is a portable temperature and humidity monitor that measures true air quality in data centers (Figure 1). Each system consists of three independent temperature and humidity (TH1) Modules and an adjustable carbon fiber rod. Engineers can place the TH1 Modules at three different heights on the rod (up to 84" / 214 cm / 48U) to measure inlet air temperature across an entire server rack. AUDIT-BUDDY™ is battery powered and weighs 5.5 lbs (2.5 kg). It can measure temperatures at multiple servers simultaneously, and can then be moved along an aisle to measure across multiple racks. The weighted triangular base allows AUDIT-BUDDY™ to be safely placed as close as 1 inch (2.54 cm) from the server rack, so that the patent-pending fan design can draw in air, adjust quickly to the thermal ambient, and report the inlet air temperature accurately.
AUDIT-BUDDY™ is a cost-effective and intuitive tool that captures both real-time and long-term scans of the environmental conditions of a data center, without the need for any infrastructure modifications or downtime. Engineers can read data on the TH1 Module screen (temperature, humidity, Hi/Lo) or transfer data via USB flash drive (or direct PC connection) to Excel on a PC or Mac®. Purkay Labs supplies a free Excel macro that automates post-processing of the data and generates reports, plots, and statistics.
Four Factors in Environmental Monitoring
The core function of environmental monitoring is to ensure that optimal temperature and humidity levels exist throughout the data hall. This involves several factors described below, some of which are often overlooked:
Server: The air going into the servers must be monitored to guarantee that it meets the equipment manufacturer’s specifications or risk server failure, energy waste, and increased operating expenses. One should ideally measure both the server intake and the exhaust.
Rack Height: Every server rack contains multiple servers, and the temperature will vary greatly across the rack height.
Aisle: Temperatures also vary from rack to rack, for example between new and old servers and partially populated racks. One must measure at different racks to account for variations across an aisle.
Time: IT loads vary throughout the day in an active data center. The IT load in a financial services firm may be much higher during the trading day than at night. The thermal characteristics can change due to server loading, cooling algorithms, virtualization, and airflow variances. One must measure over an extended period of time to account for load variations and get a true picture of the temperature and humidity variance.
These four factors make the task of comprehensive environmental monitoring a challenging prospect. Several methods exist to monitor the facility. These are discussed below.
Option 1: Comprehensive and Permanent Monitoring System
One method involves equipping every single server in every rack with protruding sensors that measure the server intake and exhaust. Such a monitoring system would cover tens of thousands of points in a single data hall and can yield many gigabytes of data every day. This requires a sophisticated central system that processes and parses the volumes of data intelligently and triggers alarms when conditions are deemed unfavorable for a particular server, so that the engineer can take corrective action.
Although this approach is comprehensive, the complexity dictates a long, involved set-up process, system downtime during installation, an extensive vetting process to prevent false alarms, and the hiring and training of operators who can monitor the control station 24x7. Moreover, these systems require significant investment, upwards of a million dollars, plus the expense of the skilled personnel who operate them. As a result, these systems are often found exclusively in the Tier 4 and higher facilities.
Option 2: Non-Permanent Monitors
It is difficult for most data centers to justify such an investment in a permanent, centralized monitoring system. Typically, these facilities have permanent sensors every couple of racks to get a “feel” for the environment. Engineers walk around the facility with a hygrometer to sample “suspected problem” areas. This method of monitoring is inadequate because it is subject to the human element and to inconsistent sampling, which compromises the accuracy of the results in the white space. Wireless sensors are also an option. However, they require a fairly complex installation to become operational, need additional gateways, and contribute to RF clutter in the white space. They also bring their own logistics issues, namely the need to keep track of them for battery replacement and to deal with interference problems.
Infrared (IR) guns have been considered as an option, but they measure only the surface temperature of the server and not the true air temperature, and their accuracy is strongly influenced by the surface color. Since minor changes in temperature can dramatically affect operating efficiency and budget, data centers need accurate measurements, and IR guns are not a viable option.
The larger problem with these non-permanent solutions is that they are only meant to sample “suspected problems” at a single location. They capture the temperature and humidity at a single server, but cannot easily account for the thermal differences across the rack height, across the aisle, across the server or how the thermal conditions change over time.
Option 3: AUDIT-BUDDY™
AUDIT-BUDDY™ is designed to allow Data Center Engineers with limited budgets to monitor the data center environment without installing a complicated monitoring system and without the associated overhead of processing gigabytes of collected data. This is achieved in three easy steps.
Step 1: Baseline the Facility using QuickScan
The first step in using the AUDIT-BUDDY™ system is to baseline the Data Center using the QuickScan mode. Each QuickScan takes 20 seconds and collects six temperature and humidity measurements at a single location. The Engineer places AUDIT-BUDDY™ in front of the first server rack in the aisle, takes a 20-second QuickScan, and moves to the next server rack (Figure 2). The process continues until measurements have been taken at every server rack in every aisle. The QuickScan data is then fed into Purkay Labs’ Contour Map Macro to generate a Thermal Contour Map of the Air Profile for each aisle. A data hall with 40 aisles will have 40 contour maps, representing the state of the Data Center at that time. The whole exercise can be completed in a couple of hours by a single operator.
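The baseline pass described above can be pictured as a small grid of readings per aisle: one row per rack, one column per TH1 height. The following is a minimal Python sketch of that idea, not the Purkay Labs Contour Map Macro. The rack names, the readings, the 18 to 27 °C limits, and the 10-second allowance for moving between racks are all assumptions made for illustration; in practice the limits come from the equipment manufacturer's specification.

```python
# Hypothetical QuickScan readings for one aisle: one entry per rack,
# with inlet temperatures (°C) at the low, middle and high TH1 positions.
aisle_scans = {
    "rack-01": (21.5, 22.0, 23.1),
    "rack-02": (21.8, 23.4, 27.9),
    "rack-03": (17.2, 21.9, 22.6),
}

# Assumed limits, for illustration only.
LOW_C, HIGH_C = 18.0, 27.0
HEIGHTS = ("low", "mid", "high")


def flag_zones(scans, low=LOW_C, high=HIGH_C):
    """Return (rack, height, temp, label) for every reading outside the limits."""
    flagged = []
    for rack, temps in scans.items():
        for height, temp in zip(HEIGHTS, temps):
            if temp > high:
                flagged.append((rack, height, temp, "hot spot"))
            elif temp < low:
                flagged.append((rack, height, temp, "overcooled"))
    return flagged


def baseline_minutes(racks, scan_seconds=20, move_seconds=10):
    """Rough time to QuickScan every rack; move_seconds is an assumption."""
    return racks * (scan_seconds + move_seconds) / 60


for row in flag_zones(aisle_scans):
    print(row)
print(baseline_minutes(racks=400), "minutes for 400 racks")
```

With these assumed numbers, 400 racks take about 200 minutes, which is of the same order as the "couple of hours" quoted above.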
The elegance of this technique is that the collection of maps represents a three-dimensional view of the air temperature profile of all the aisles of the data center. It is equivalent to having six hygrometers hanging outside each rack in every aisle of the data hall and looking at the data from all of them (Figure 3). The Contour Map is a much simpler way to do so. It points out to the data center engineer the specific zones where there might be hot spots or overcooled zones that warrant attention. The arrow in Figure 4 indicates the presence of a hot spot emanating
Step 2: Isolate and Track Problem Areas (LongScan)
The Contour Maps represent a snapshot in time and do not account for changes in temperature and humidity over time. The engineer should use the information from the QuickScans and the Contour Maps to isolate and focus on the suspect areas. This involves using AUDIT-BUDDY™ in the LongScan mode to confirm that the issue spotted is not a transient one. Simply place the AUDIT-BUDDY™ system in front of the suspect rack and collect the temperature (and humidity) over a period of time (typically a day). The Purkay Labs Base Excel Macro analyzes the collected samples and indicates through Trend Graphs whether the problem is permanent or transient in nature. Figure 5 shows an AUDIT-BUDDY™ system collecting LongScan data in front of a rack.
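One simple way to think about the permanent-versus-transient question is to look at what fraction of the LongScan samples exceed a limit. The sketch below illustrates that reasoning only; it is not the logic of the Purkay Labs Base Excel Macro, and the function name, the 27 °C limit, and the 50% cut-off are hypothetical values chosen for the example.

```python
def classify_longscan(temps_c, limit_c=27.0, persistent_fraction=0.5):
    """Label a LongScan temperature series as 'none', 'transient' or 'persistent'.

    limit_c and persistent_fraction are illustrative assumptions,
    not values taken from the AUDIT-BUDDY documentation.
    """
    if not temps_c:
        raise ValueError("no samples to classify")
    over = sum(1 for t in temps_c if t > limit_c)
    if over == 0:
        return "none"
    fraction = over / len(temps_c)
    return "persistent" if fraction >= persistent_fraction else "transient"


# Hypothetical day of readings, reduced to a few samples for brevity.
print(classify_longscan([22.0, 23.0, 28.0, 22.5]))
print(classify_longscan([28.0, 29.0, 28.5, 22.0]))
```

A series with one brief excursion is labeled transient, while one that stays above the limit for most of the day is labeled persistent and would justify moving on to Step 3.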
The combination of QuickScan/Contour Maps and LongScan allows engineers to quickly and inexpensively assess the state of the data center, taking into account the four factors mentioned above. Note that the AUDIT-BUDDY™ design allows the engineer to measure the true air temperature with an accuracy of ±0.9°F (0.5°C). The portable design (5.5 lbs / 2.5 kg) allows it to be transported throughout the data center easily, leaving no footprint after the data collection has been completed.
Step 3: Diagnose Problems (Delta-T Measurement)
Once the problem zone has been identified and confirmed to warrant corrective action, AUDIT-BUDDY™ offers a pathway to solve the issue. Most Data Center thermal issues, such as hot spots or overcooling, stem from poor airflow management. A substantial amount of cold air does not reach the server due to the presence of recirculation and/or bypass airflow. One can make the data center more energy efficient by measuring the extent of recirculation and bypass airflow present for a particular server or rack and correcting the root cause.
AUDIT-BUDDY™ server Delta-T measurement technology provides a way to measure and generate metrics that indicate the degree of recirculation and bypass airflow present, and what to do about it. This is achieved by placing one AUDIT-BUDDY™ system at the inlet and one at the outlet of the server and collecting LongScan data, typically for a day (Figure 6). The collected data is processed using the Delta-T macro. The output indicates the degree of recirculation and bypass airflow present. The macro also suggests corrective actions for the engineer to implement. The engineer can implement one or more of the suggestions and repeat the process to make sure the problem has indeed been addressed.
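At its core, a server Delta-T is the exhaust temperature minus the intake temperature at the same moment. The sketch below shows only that basic arithmetic on two paired LongScan series. It does not reproduce the recirculation and bypass metrics or the corrective suggestions of the Purkay Labs Delta-T macro, whose internals are not described here; the function names and sample readings are hypothetical.

```python
from statistics import mean


def delta_t_series(intake_c, exhaust_c):
    """Pair intake and exhaust samples taken at the same times and
    return exhaust minus intake for each pair (°C)."""
    if len(intake_c) != len(exhaust_c):
        raise ValueError("intake and exhaust series must have the same length")
    return [round(out - inn, 2) for inn, out in zip(intake_c, exhaust_c)]


def summarize(series):
    """Minimum, mean and maximum of a Delta-T series."""
    return {
        "min": min(series),
        "mean": round(mean(series), 2),
        "max": max(series),
    }


# Hypothetical readings from the two systems (°C).
intake = [22.0, 22.5, 23.0]
exhaust = [32.0, 34.5, 36.0]

deltas = delta_t_series(intake, exhaust)
print(deltas)
print(summarize(deltas))
```

Tracking how this summary changes after each corrective action gives a simple before-and-after comparison when the measurement is repeated.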
SUMMARY:
In three easy steps, the engineer can quickly pinpoint, diagnose, and fix airflow issues.
Step 1: Use AUDIT-BUDDY™’s QuickScan mode to take 20-second scans of each aisle and baseline the facility with Thermal Contour Maps to isolate problem areas.
Step 2: Use the LongScan mode to confirm or rule out the problem over time.
Step 3: Use two AUDIT-BUDDY™ systems to get targeted metrics to diagnose the amount of cold air waste and address airflow problems.
AUDIT-BUDDY™ provides a cost-effective way for Data Center Engineers with limited resources to optimize the data center on an ongoing basis. Data Centers are dynamic entities, and their configurations change. We recommend that the engineer repeat the process every quarter to ensure that no new problems have been created. If a new issue is detected, the above process can be repeated just as easily to address it.
[Figure: server intake, server exhaust, and server ΔT]
ADVANTAGES OF AUDIT-BUDDY™
Completely Standalone: AUDIT-BUDDY™ is a completely independent monitor. It is battery operated and does not require a separate computer program to run or read data. Moreover, it does not need to integrate into the facility infrastructure. AUDIT-BUDDY™ is a tool that engineers can take out when they have a problem or want to run a routine maintenance check, and then put away in storage. It is not meant to be a part of the system. It leaves no footprint and does not affect the existing infrastructure in any way.
Yields ready-to-use information: Unlike most data loggers or wireless sensors, AUDIT-BUDDY™ does not produce volumes of data. Engineers select what they want to sample, how long they want to collect data for, and how often to collect it. Setting these parameters ensures that engineers get the data they need. Contour Maps are an efficient way to baseline the data center aisles. If a colocation owner wants to demonstrate SLA compliance, they can select a LongScan that samples every 10 minutes for 24 hours and show that they are meeting their SLA metrics.
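The SLA example above comes down to simple arithmetic: a 10-minute interval over 24 hours gives 144 sampling intervals, and compliance can be expressed as the share of samples inside the agreed range. The following is a minimal sketch of that calculation; the function names, the readings, and the 18 to 27 °C range are assumptions for illustration, and the actual SLA limits depend on the contract.

```python
def sample_count(duration_hours, interval_minutes):
    """Number of sampling intervals in a LongScan of the given length."""
    return int(duration_hours * 60 // interval_minutes)


def sla_compliance(temps_c, low_c, high_c):
    """Fraction of samples that fall inside the agreed temperature range."""
    if not temps_c:
        raise ValueError("no samples")
    within = sum(1 for t in temps_c if low_c <= t <= high_c)
    return within / len(temps_c)


print(sample_count(duration_hours=24, interval_minutes=10))

# Hypothetical readings and an assumed 18 to 27 °C SLA range.
print(sla_compliance([20.0, 21.0, 28.0, 22.0], low_c=18.0, high_c=27.0))
```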
Accurate: AUDIT-BUDDY™’s unique design provides engineers with information on the air going into the server. The TH1 module does not need to reach the ambient being measured. The inlet air is measured with two stable solid-state sensors with ±0.9°F (0.5°C) temperature accuracy and ±3% RH humidity accuracy. The solid-state sensor design guarantees accuracy and long-term stability. The calibration is guaranteed for 2 years.
Tool to Optimize the Data Center: AUDIT-BUDDY™ with the Delta-T Measurement technology provides a simple, inexpensive way to improve the energy efficiency of the Data Center by eliminating cold air waste, without a big investment. It is a perfect complement to facilities with BAS or DCIM systems. It gives the engineer a way to measure air quality in areas with little to no coverage because sensors were not specified for those locations. AUDIT-BUDDY™ is also a simple first step for a facility that is planning to invest in a sophisticated monitoring system.
About Purkay Labs