As Hurricane Sandy ravaged the East Coast last year, causing more than $68 billion in damage, it also brought significant attention to the disaster readiness of data centers. Various data center facilities struggled with power problems amid widespread flooding and utility outages, immediately impacting the businesses that rely on those resources.
While several data centers that had downplayed such risks were caught completely off guard, various state-of-the-art facilities with meticulous DR planning also found it difficult to stay up and running in the face of the unprecedented scale of Sandy.
The storm exposed gaping holes in the scope of existing disaster plans and underscored the need for better monitoring measures, preemptive testing, backup power, and several other improvements.
Ahead of the upcoming hurricane season, I spoke to Brad Ratushny, Director of Infrastructure at INetU, about how the Allentown, PA-based company stayed online and ensured zero downtime for its clients during Sandy. Brad, an industry veteran with over 15 years of experience, is in charge of all data centers in the INetU global footprint.
In this interview, he talks about several proactive steps data centers can take to mitigate the effects of natural disasters, including testing all backup systems, reviewing emergency procedures, performing final generator checks, and having backup fuel vendors on standby.
– Brad Ratushny, Director of Infrastructure, INetU.
Q: Let’s begin with a brief introduction of yourself and a broad overview of INetU’s services.
A: My name is Brad Ratushny and I’m the Director of Infrastructure at INetU. I have been with the company for 15 years and in my current role specifically for about 5 years. We at INetU have been providing dedicated business hosting and cloud services for more than 15 years. We pride ourselves on being experts in engineering complex hosting solutions and on having first-hand experience with compliance-based projects in the US and throughout our global footprint.
Q: Please tell us about INetU’s data center facilities and the infrastructure and technical specs you have in place from the disaster recovery and business continuity point of view.
A: We have a total of 10 data centers in Seattle, Pennsylvania, Ashburn, Virginia, Amsterdam and Allentown, where we are headquartered. We’re a very risk-averse company and always try to ramp up whatever we do because we like to be a little bit safer. For example, while the typical run-time for generators in the industry is 24 hours, our fuel tanks have capacity for 48 hours of run-time.
In addition to N+1 UPSs and generators, we also have additional portable units to make sure that we’re always safe in case of power outages.
Also, even though we’re not in the hot zone for lightning strikes, we have lightning rods on the roofs of our facilities to guard against thunderstorm-related outages, and the lightning surge suppression in our data centers is UL-listed.
In addition to having the proper data center infrastructure in place, we also take care of proper maintenance of the building envelope, roof, and so on to keep everything up and running.
Q: What would you say are some of the key measures that data centers need to have in place to mitigate the adverse impacts of natural disasters? Also, can you share with us some examples to show how you approach data center disaster planning?
A: The biggest thing in my mind, from the Director of Infrastructure perspective, is testing, testing and testing. When we’re talking about DR and BCP, preventive maintenance is absolutely critical. DR and BCP plans aren’t something that just sits on someone’s bookshelf. They’re living, breathing documents that are often the blueprint for how people adapt to emergencies. I actually rely on emergency preparedness plans quite a bit.
Systems are absolutely critical, but the people who operate those systems are even more important. So training your team for specific situations is very important. What I mean is, when you train for Hurricane Sandy, you look at possible power disruption, cooling disruption and disruption to your various other infrastructure components. The same training applies to other potential natural disasters as well, but you need to look at what disaster you could be faced with in the near term and adjust, train and prepare for that accordingly.
Let’s look at what happened on the East Coast when Hurricane Sandy hit last year. A lot of data centers on the coast had their disaster plans ready. They had up to 5 days of fuel on site to run their generators. Now, I know of a few examples where the generators didn’t run at all because the fuel wasn’t maintained properly. They had the fuel, but it wasn’t rotated and maintained on schedule, so it started clogging up the generators, causing them to fail.
Also, what most people didn’t expect was that the fuel trucks and fuel services couldn’t get the fuel they needed on the coast, because fuel delivery up and down the coastline became a challenge in itself. So instead of getting their fuel along the coast, which is the usual practice, they started coming inland to areas like ours, where we were concerned about a fuel shortage ourselves. When we learned of this possibility, we went out and started setting up contracts with people in the Midwest and Western Pennsylvania to make sure we wouldn’t be impacted.
Fortunately, it never got to that point, but it’s a good example of how you can’t just live by your plan and need to think everything through level by level to respond to a disaster effectively. And that’s why I said that your DR and BCP plans are living, breathing documents. You need to train on them properly and make sure that you’re adaptable to emergencies as you go through.
Q: How do INetU’s Disaster Recovery capabilities ensure continuity in the event of a site-level failure?
A: Our primary focus is keeping our mission-critical websites up and running, but plenty of our clients do actually use us for disaster recovery for their primary site. Again, I’ll use Hurricane Sandy as an example. During Sandy, we were just on-boarding a DR client and working with them to get the configuration set up. Their main configuration was somewhere along the coast and, unfortunately, they were very heavily impacted at their primary facility. Even though they weren’t fully live here yet, they physically brought us their equipment. Now, colocation is not a focus for us, but when we are working with an enterprise client, we are flexible and we do whatever we need to do to help them.
So the client walks into our lobby with mud on their shoes and server in hand, and we help them get their business back up and running. Ever since, they have actually been using us for their primary site, and they use theirs as a backup site. So we are proud that we go the extra mile to help our clients, and that’s what we are here for.
Q: How does INetU ensure that their data centers remain energy efficient?
A: We are constantly striving to increase efficiency in each area of our operation. In addition to helping our clients move to the cloud, where it makes sense for them, we also monitor and implement efficiencies in our data center operations as well as in the building envelope as a whole. These efficiencies can include replacing aged, less efficient infrastructure with newer, more efficient hardware, decommissioning underutilized equipment, or increasing insulation to improve a facility’s R-value.
Q: Wrapping up, since you’ve been in the industry for a long time, what, in your view, are some of the questions organizations should ask while choosing an enterprise data center from the security point of view?
A: First, you need to make sure that the data center has all the relevant industry certifications, like PCI DSS compliance, SOC 3, SOC 2/SSAE 16 Type II and ISAE 3402. Then you need to go a level deeper than that and check the physical security of the data center, its security equipment, its processes and so on. You also need to check whether they have proper procedures to control, monitor, and record access to the facility. For example, some legacy data centers are relatively unmanaged and don’t have 24×7 security, which is fine for certain applications, but definitely not for enterprise environments.
So you need to look into all these factors, weigh them, and think about how they apply to your business specifically.