Why do we always blame the battery when the lights go out?

By Paul Smethurst, Managing Director, Hillstone Products Ltd

The preservation of power is crucial to the operation of a datacentre, and every facility will have many battery back-up systems to prevent a mains failure from becoming a disaster.

These systems fall into two categories.

1) Critical Infrastructure Systems, such as:

2) Essential Facilities Systems, such as:

Clearly the battery underpins each standby power system. Yet because it is a sub-component of that system, the battery manufacturer is not directly part of the datacentre supply chain, either in the design and build phase or during operation of the facility.

With such an array of different systems using batteries, accountability for battery maintenance falls either within the equipment vendor’s Service Level Agreements (SLAs) or to the Facility Management maintenance team.

Where SLAs make provision for vendor-neutral battery supply, generic battery warranty compliance is managed by the battery maintenance experts.

Where systems are not supported by SLAs, the risk of failure becomes the direct responsibility of the Facility Management team.

The complexities of successful battery maintenance are therefore key to preventing a power failure being blamed on the battery.

Thankfully, datacentres are designed with redundant power paths, which prevents any one battery string from being a single point of failure. Nevertheless, the battery remains the primary frontline alternative source of power in a mains outage.

As shown in the diagram, a good battery is needed in every power outage. While bad fuel is a single point of failure for the datacentre, the battery will protect against the majority of power outages.

The frontline reliance on the battery to cover short-duration outages makes preventative battery maintenance and the integrity of fuel quality equally important. Together they are essential for the reliability of the datacentre.

The importance of maintenance

The battery is an indirect cost within any datacentre CAPEX budget, being a component of the emergency power system equipment. This component status prevents any direct relationship between the datacentre owner and the battery manufacturer, so protecting the battery investment (battery maintenance) relies on others in the supply chain.

The OPEX budgets are issued alongside the vendors’ SLAs for the supplied emergency power system equipment. For commercial reasons this allows the battery to be a non-brand-specific component, so the SLA will not be specific to the battery manufacturer’s warranty conditions.

This creates the need for battery experts to deliver specific preventative maintenance in line with the manufacturer’s warranty conditions. The correct interpretation of those conditions is crucial to ensure the battery is not blamed when the lights go out.

The IT server equipment in datacentres cannot operate on the raw utility power supplied by the grid. Datacentres therefore depend on UPS systems to provide the quality, filtered mains power required by IT equipment.

In addition, under the direction of Tier certifications, the redundancy and size of the UPS systems are dimensioned on the total IT load of the datahall. A Tier 3 datacentre with a 1000 kW datahall will therefore have, as a minimum, a dual-path UPS design giving, say, 15 minutes of run-time protection.
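To give a feel for the scale involved, the arithmetic behind the 1000 kW, 15-minute example can be sketched as below. This is a rough illustration only: the function name is hypothetical, each path is assumed to be sized to carry the full load on its own, and inverter losses, ageing margins and temperature effects are all ignored.

```python
def battery_energy_kwh(load_kw: float, runtime_min: float, paths: int = 2) -> dict:
    """Rough energy the batteries must deliver to the load.

    Assumptions (not from the article): every path carries the full
    load by itself, and no efficiency or ageing margin is applied.
    """
    if load_kw <= 0 or runtime_min <= 0 or paths < 1:
        raise ValueError("load, run time and path count must be positive")
    per_path = load_kw * runtime_min / 60.0  # kW x hours = kWh
    return {"per_path_kwh": per_path, "total_kwh": per_path * paths}


# The article's example: 1000 kW datahall, 15 minutes, dual path.
sizing = battery_energy_kwh(load_kw=1000, runtime_min=15, paths=2)
print(sizing)
```

A real design would add margins on top of these figures, so they are best read as a lower bound on the battery asset that the maintenance regime has to protect.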

The investment in the UPS system may be for 10 years, but the battery may be only a 5-year product because of cost, and even a 10-year product may be changed during its life. Add the scenario of a replacement battery from a different brand, and managing battery maintenance requires specialist knowledge if we are going to keep the lights on.

The battery manufacturer will define the end of life in terms of the design-life capacity value of the battery, with the recommendation to perform a 3-hour constant-current or constant-power discharge test (with a DC load bank) to determine the battery capacity. This is normal for industries such as oil & gas, petro-chem and power generation, where emergency back-up systems are predominantly batteries and chargers.
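How a discharge test result translates into a capacity figure can be sketched as follows. This is a simplified illustration, not a manufacturer's procedure: the function names are hypothetical, capacity is taken as the simple ratio of achieved to rated discharge time, no temperature correction is applied, and the 80% replacement threshold is an assumed, commonly quoted figure. The manufacturer's own data defines the real end-of-life value.

```python
def capacity_percent(actual_minutes: float, rated_minutes: float = 180.0) -> float:
    """Capacity as the ratio of achieved to rated discharge time.

    actual_minutes: time the battery held above its end voltage under
    the constant-current (or constant-power) test load.
    rated_minutes: rated duration of the test (3 hours by default).
    """
    if rated_minutes <= 0 or actual_minutes < 0:
        raise ValueError("durations must be positive")
    return 100.0 * actual_minutes / rated_minutes


def at_end_of_life(capacity_pct: float, threshold_pct: float = 80.0) -> bool:
    """True when capacity has fallen below the (assumed) threshold."""
    return capacity_pct < threshold_pct


# Hypothetical test result: the string lasted 150 minutes of a 180-minute test.
cap = capacity_percent(actual_minutes=150)
print(f"{cap:.1f}% of rated capacity, end of life: {at_end_of_life(cap)}")
```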

In the datacentre, battery performance will be verified as part of the UPS autonomy run-time test, which is usually performed annually with an AC load bank. Careful year-on-year analysis of the results can help determine the end of life of the battery.
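One simple form of year-on-year analysis is to fit a straight line through the annual run-time results and project when the run time will fall below the required autonomy. The sketch below assumes exactly that; the function names and the sample figures are hypothetical. Battery ageing is often not linear and tends to accelerate towards end of life, so a linear projection should be read as optimistic.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (x, y) points: returns slope, intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all tests share the same year")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def projected_year_below(years, runtimes_min, required_min):
    """Year at which the fitted trend crosses the required run time.

    Returns None when the trend is flat or improving.
    """
    slope, intercept = linear_fit(years, runtimes_min)
    if slope >= 0:
        return None
    return (required_min - intercept) / slope


# Hypothetical annual autonomy test results (service year, minutes achieved).
years = [1, 2, 3, 4]
runtimes = [20.0, 19.0, 18.0, 17.0]
print(projected_year_below(years, runtimes, required_min=15.0))
```

With the sample data the trend loses one minute per year, so the projection crosses the 15-minute requirement in year 6.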

However, because the run time is short, the test engineer cannot measure the individual battery blocks during an autonomy test. There is therefore also a need to deploy specialist third-party battery experts to carry out more frequent maintenance work on the battery system.

Such maintenance will include taking impedance, conductance or resistance measurements on the individual battery blocks. The type of reading taken depends on the preference of the third-party expert.

These results can also be taken with permanent battery monitoring systems and the collation of all this data helps identify problems with the batteries.
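As an illustration of how collated block readings can point to a problem, the sketch below compares each block's resistance with the median of its string and flags any block that reads well above it. The function name, the sample readings and the 30% limit are assumptions made for the example, not manufacturer or standards figures. The same comparison applies to impedance, but conductance moves the other way (a falling value indicates deterioration), so the test would need to be inverted.

```python
from statistics import median


def flag_high_resistance(readings_mohm, limit_pct=30.0):
    """Return blocks whose reading exceeds the string median by more than limit_pct.

    readings_mohm: mapping of block label -> resistance in milliohms.
    The result maps each flagged block to its percentage deviation.
    """
    if not readings_mohm:
        raise ValueError("no readings supplied")
    reference = median(readings_mohm.values())
    flagged = {}
    for block, value in readings_mohm.items():
        deviation = 100.0 * (value - reference) / reference
        if deviation > limit_pct:
            flagged[block] = round(deviation, 1)
    return flagged


# Hypothetical readings for a five-block string.
readings = {"B01": 3.1, "B02": 3.0, "B03": 3.2, "B04": 4.5, "B05": 3.1}
print(flag_high_resistance(readings))
```

Comparing against the string median is only one option. Comparing each block against its own commissioning baseline is an alternative where that history exists.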

Another important and somewhat overlooked part of maintenance is visual inspection. In North America inspections fall under NERC and FERC regulation, and they form part of good practice, making a valuable additional contribution to battery maintenance.

Conclusions

Ensuring the integrity of power to the datacentre requires both a robust preventative battery maintenance system and transparent reporting for warranty compliance and end-of-life replacement.

Commercially, battery suppliers will want closer relationships to manage replacements and warranty claims and to safeguard potential future sales. It therefore makes sense to create relationships between facility owners, vendors and battery suppliers to increase reliability of operation.

Battery failures in the datacentre originate from three things: the exclusion of the battery manufacturer during the design and build phase, the battery being encompassed within generic SLAs for emergency power system equipment, and differences in the battery maintenance recommendations of third-party experts.

Creating a holistic maintenance program that combines historical data, present measurements and future predictions for the performance of every battery system will avoid the blame being placed on the battery when the lights go out.

Paul is presently running a small team developing a vendor neutral low cost annual subscription battery maintenance software platform called BattLife.

BattLife addresses all the maintenance issues described in this article and is due for release Qtr1 2019.