Data center infrastructure monitoring is not a one-size-fits-all proposition. There are many parameters you could monitor: power, energy, temperature, humidity, pressure, etc. Monitoring can also be deployed at multiple levels of the physical and organizational hierarchy: main feeds, PDUs, cabinets, busways, servers, etc. Ideally, you want a system that supports as many parameters as possible and covers them throughout the hierarchy.
What is sometimes forgotten, however, is the dimension of time. Good monitoring also takes place on multiple time scales.
Real-time monitoring
Some monitoring is about catching things as they happen (minutes to hours). This includes sudden load changes, which may indicate a failure or an unauthorized change, rapid temperature fluctuations (see our previous post on hot spot detection) and water leaks. A system that supports real-time monitoring should make it easy to define alert conditions and should support multiple notification channels. It should also make it easy to define visual dashboards suitable for NOC applications. Finally, it should support virtual, location-specific dashboards that quickly provide the information relevant to staff on site.
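To make "alert condition" concrete, here is a minimal sketch of one possible rule for sudden load changes. The `Reading` type, the `sudden_load_changes` function and the 25% threshold are all illustrative assumptions, not part of any particular product; a real system would also route each alert to a notification channel.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    minute: int      # minutes since the start of observation (hypothetical field)
    load_kw: float   # measured load in kilowatts (hypothetical field)


def sudden_load_changes(readings, max_step_fraction=0.25):
    """Return (previous, current) pairs whose load changed by more than the allowed fraction.

    The 25% default is an arbitrary illustrative threshold.
    """
    alerts = []
    for prev, curr in zip(readings, readings[1:]):
        if prev.load_kw == 0:
            # Any load appearing on a previously idle circuit counts as a change.
            changed = curr.load_kw > 0
        else:
            changed = abs(curr.load_kw - prev.load_kw) / prev.load_kw > max_step_fraction
        if changed:
            alerts.append((prev, curr))
    return alerts


# Made-up readings: the load roughly halves between minute 1 and minute 2.
readings = [Reading(0, 4.0), Reading(1, 4.1), Reading(2, 2.0), Reading(3, 2.1)]
for prev, curr in sudden_load_changes(readings):
    print(f"minute {curr.minute}: load moved from {prev.load_kw} kW to {curr.load_kw} kW")
```

Comparing each reading with the one before it keeps the rule simple; comparing against a rolling average instead would make it less sensitive to a single noisy sample.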
Short-term monitoring
Short-term monitoring usually focuses on tasks or failure resolutions that play out over a shorter time frame (hours to weeks). It includes monitoring during system migrations, cooling system rearrangement or tuning, data center expansions and disaster recovery. A system well suited to short-term monitoring should make it easy to define new, ad hoc reports and dashboards. It should also allow new sensors to be added easily, or existing sensors to be moved if necessary.
Long-term monitoring
Long-term monitoring focuses on system optimization over time (days to months), preventative maintenance and cost allocation. Long-term analysis looks at the stability of the system and at slower, subtler trends that might otherwise go undetected. These trends may be caused by seasonal effects (weather-related or system load-related), equipment wear, or misconfigured or failing AC equipment. A good long-term monitoring system should allow you to analyze data at the appropriate time scale. With the right system you should be able to look at your reports and easily recognize patterns. Some of our customers can tell us what day of the week it was, or what the weather was like on any given day, just by looking at the energy charts.
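Analyzing data "at the appropriate time scale" usually means rolling fine-grained readings up into coarser buckets. The sketch below sums hourly energy readings into daily totals, which is one simple way to make weekday patterns visible. The `daily_energy_kwh` function and the sample data are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_energy_kwh(hourly_readings):
    """Sum (timestamp, kWh) pairs into one total per calendar date, oldest first."""
    totals = defaultdict(float)
    for timestamp, kwh in hourly_readings:
        totals[timestamp.date()] += kwh
    return dict(sorted(totals.items()))


# Made-up data: 48 hourly readings, heavier on the first day than on the second.
start = datetime(2024, 1, 5)
hourly = [(start + timedelta(hours=h), 10.0 if h < 24 else 6.0) for h in range(48)]

for day, kwh in daily_energy_kwh(hourly).items():
    # %a prints the abbreviated weekday name, which helps when looking for weekly patterns.
    print(day.strftime("%a %Y-%m-%d"), kwh)
```

The same grouping idea extends to weekly or monthly buckets by changing the key used for `totals`.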
Billing
Billing is a variant of long-term monitoring, with some special requirements that warrant explicit mention. A billing reporting system is generally used only at billing time (e.g. monthly). It must be able to handle specific temporal requirements (matching the billing calendar) and provide the exact metrics used for billing (e.g. hourly peaks of current or actual energy consumed). It may also need to apply a variable rate schedule to the billing metric (e.g. peak vs. off-peak power rates, or different rates for power draw below and above a nominal level). Finally, a billing system needs to produce a permanent record of the data used in deriving the final bill, for tracking and audit purposes.
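As a rough illustration of these billing requirements, the sketch below applies an assumed peak/off-peak rate schedule to hourly energy readings within a billing period and keeps a line-item record of how the total was derived. The peak window, the rates and the `bill_for_period` function are hypothetical; real tariffs and rounding rules will differ.

```python
from datetime import datetime, timedelta

PEAK_HOURS = range(8, 20)   # assumption: 08:00 through 19:59 is billed at the peak rate
PEAK_RATE = 0.20            # assumed price per kWh during peak hours
OFF_PEAK_RATE = 0.10        # assumed price per kWh outside peak hours


def bill_for_period(hourly_kwh, period_start, period_end):
    """Return (total charge, line items) for readings in [period_start, period_end)."""
    line_items = []
    total = 0.0
    for timestamp, kwh in hourly_kwh:
        # Only readings that fall inside the billing calendar period are charged.
        if not (period_start <= timestamp < period_end):
            continue
        rate = PEAK_RATE if timestamp.hour in PEAK_HOURS else OFF_PEAK_RATE
        charge = kwh * rate
        total += charge
        # Each line item records the inputs behind the bill, for audit purposes.
        line_items.append((timestamp, kwh, rate, charge))
    return round(total, 2), line_items


# Made-up data: one day of 2 kWh hourly readings, plus one reading from the next period.
jan_1 = datetime(2024, 1, 1)
feb_1 = datetime(2024, 2, 1)
usage = [(jan_1 + timedelta(hours=h), 2.0) for h in range(24)] + [(feb_1, 5.0)]

total, items = bill_for_period(usage, jan_1, feb_1)
print(f"total charge: {total:.2f} across {len(items)} hourly line items")
```

Returning the line items alongside the total is what makes the bill auditable: in practice they would be written to permanent storage rather than just held in memory.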
Data center infrastructure monitoring is not a one-size-fits-all proposition. Fortunately, Packet Power gives you the ability to monitor power and environmental conditions at whatever level you need in your facility and across whatever time frame is necessary. We make data center monitoring easy. You're on your own, however, for the "walking and chewing gum" bit.