Business Continuity Planning
Aug. 13, 2008 by ravishan
The University Business Continuity Planning group has been pretty active for the past three years or so under the leadership of Cliff Ashton from Physical Plant and has developed detailed plans for various emergency responses. We have had some emergency response exercises on campus in collaboration with teams from Middletown and, I believe, the state of CT. I am sure you have all heard the loud sirens being tested from Long Lane, as well as the fire alarm systems being used to broadcast voice-based messages. The group has also developed detailed procedures for various emergencies, covering responsible staff members, escalation, etc. Finally, we have worked with the group to develop emergency communication through Connect Ed, which will be used to reach our community through regular phones and cell phones.
In parallel, I asked a group from ITS (Joanne, Jolee, Dave Warner, Henk Meij, James Taft and Steve Windsor) to look into developing a business continuity plan for ITS. The discussions and recommendations of the group can be found at https://itsdoku.wesleyan.edu/doku.php?id=&idx=business_continuity. This posting is an update on what we plan to do in the next few months.
Business Continuity Plans are extremely hard to develop in our environments. Typically, the easiest thing to do is to mirror everything you do in your data center to a site that is ideally several miles away and is situated in a location that is unlikely to be affected by the same environmental conditions or disasters. This also requires that, in the event of a disaster, we have very high speed connectivity to the mirror site. Of course, this is a very simplistic view, and several questions arise: How often do you replicate the data to the mirror site? How do you manage the addition of new hardware and services so that both locations have identical configurations? And so on. This is so expensive to configure in the first place and to manage subsequently that it is not a viable option for us.
So, we need to be practical about this. The group from ITS basically took a couple of scenarios to work on. As you may know, our data center poses a risk in that it is where almost all our hardware is. So, as a first step, we wanted to see how best to configure an alternate location that would provide the most important services within a few hours of a major disaster. Now, a disaster can come in many unpredictable ways and we cannot anticipate all of them. One obvious scenario is that something affects the data center and its immediate vicinity, but the rest of the campus is operational.
Obviously this is a situation for which we had better have a solid plan. Many classes depend on network connectivity and services such as Blackboard and WesFiles. So, as a first step, the group looked at Tier I services, the most critical services that we need to have up at the earliest possible time. You can imagine that in this day of interconnected services, this is one of the most difficult tasks. Email, Blackboard and similar services depend on Active Directory and PeopleSoft and will not run standalone… Our web servers require PeopleSoft and the Curl database, and so on.
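To make the dependency problem concrete, here is a minimal sketch of how one could work out the order in which services would have to be brought up at an alternate site. It only encodes the dependencies mentioned above; the real map is much larger, and the names DEPENDS_ON and restore_waves are made up for this illustration, not something the group actually built.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, partial dependency map: each service lists what it needs
# before it can run. Only the dependencies named in this post are included.
DEPENDS_ON = {
    "Email": {"Active Directory", "PeopleSoft"},
    "Blackboard": {"Active Directory", "PeopleSoft"},
    "Web servers": {"PeopleSoft", "Curl database"},
}


def restore_waves(depends_on):
    """Group services into waves; every service in a wave depends only on
    services from earlier waves, so a wave can be restored in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(restore_waves(DEPENDS_ON), start=1):
        print(f"Wave {number}: {', '.join(wave)}")
```

With this partial map, the foundation services (Active Directory, the Curl database and PeopleSoft) land in the first wave and everything that users actually see lands in the second, which is exactly why none of the Tier I services can be treated in isolation.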
The wiki captures a lot of these discussions… But you will notice that a lot has changed already between the time the group met and now. WesFiles did not even exist at that point.
Based on these initial discussions, we have been alerting the senior administration that this is critical and that we have to do something about it. As a first step, we proposed a satellite data center on site to protect against the one scenario in which the data center is affected but the rest of the campus is functional. The advantage of an onsite satellite center is that we can also actively use all the hardware in it. I will spare you the details, but thanks to advances in technology, we can have disk systems and blades that are physically in different locations but all work together logically. There are of course distance limitations to making this work seamlessly, so this will not work well when the separation is several miles.
We explored a couple of locations and priced out the construction cost. It turned out to be astronomical… In the current budget climate on campus, it would be next to impossible to justify this.
Believe me, we are not the only ones struggling with this problem… Everyone is worried about this issue, looking for ways to have an alternate site to host their critical services, and struggling to develop a viable and practical financial model.
In the meantime, as the Chair of the Network Infrastructure and Services Advisory Council for the Commission for Educational Technology, I was made aware of surplus funds in the Connecticut Education Network (CEN) that the commission was considering reallocating. I worked primarily with Judy Greiman from the Connecticut Conference of Independent Colleges and John Vittner from CEN to put together a proposal to reallocate some of these funds to help the higher ed members of CEN get started with business continuity planning. James Estrada from Fairfield, Mike Trimble and Saburo Usami from Sacred Heart, and Jean Madden-Hennessey from Saint Joseph College were also involved during the initial discussions.
We successfully partnered with George Loftus from NEREN to structure a plan by which a facility operated by RCN in Springfield will be used as the Safe Harbor for interested CEN member institutions, and each institution will get half a rack of space for free for three to four years (still being negotiated). This has been received favorably by most of the CEN members, and Quinnipiac, Fairfield and Sacred Heart are planning to take advantage of this as early as September. We are waiting for the contract documents to arrive.
We discussed several important issues such as the physical security of the data, how secure the racks are, how the network will have to be configured to ensure secure transmission, etc. The good news is that over 10 institutions are participating, and collective discussions on this (we had a conference call with Quinnipiac this afternoon, for example) will result in a solution that addresses all of these issues and will benefit all of us in the end. RCN has been in this business for quite some time, and the NEREN staff are an excellent resource for all of us.
Our plan is to go in there in October, when all aspects of the facility, including the UPS, are available. Our goal is to accomplish the following by July 1, 2009 (Phase I):
- Back up all databases (e.g., PeopleSoft, CURL, Millennium, Blackboard, WesFiles, Lyris, MySQL)
- Back up all additional data associated with Blackboard and WesFiles
- Have a failover strategy for Email (both WebMail/Cyrus and Exchange) and the University Web Server.
I will write more on this once we have a plan in place. James and Henk are leading this effort, and once Karen Warren joins, this will become one of the first major projects that she manages.
I am very excited about this because finally we are ready to do something about this…
