ORLANDO -- Paul Timmerman, a SQL Server DBA who manages a data center at the Treasure Island Resort Casino in Red Wing, Minn., had an uneasy feeling after leaving a disaster recovery seminar last week.
The casino, located on the banks of the Mississippi River, depends on Timmerman to maintain the databases that keep the 2,500 slot machines running.
Timmerman, who oversees an eight-clustered SQL Server environment, was not sure he was completely prepared for the unexpected. After the seminar, held at the Professional Association for SQL Server (PASS) user conference, Timmerman said he planned to update procedures and revise contingency plans for a system failure.
"There are a lot of facets that are out of my control," he said. "We have the ability to get back to a reliable situation because I'm sure our management wouldn't be too happy if our slots went down."
In the event of a flood, a power outage or even the loss of air conditioning, many companies prepare by replicating data to systems at a data center in a different location.
While most firms say they are prepared for the most catastrophic events, many lack contingency plans detailed enough for the moment a system actually goes offline, said Danette D. Riviello, a DBA who heads the IT department at Columbia, Md.-based Magellan Health Services Inc.
Preparing even the smallest contingency details in advance reduces system downtime and saves money, whether the problem is a total site outage or a server-level failure, said Riviello, who spoke to hundreds of users at the disaster recovery seminar.
Planning for a disaster begins with a critical evaluation, Riviello added. Determine how much downtime can be tolerated for each application and how much data loss the company can afford.
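The evaluation Riviello describes can be sketched as a simple check of each application's tolerances against the current backup schedule. This is a minimal illustration, not a method presented at the seminar: the application names, the minute figures and the `apps_at_risk` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AppRequirement:
    name: str
    max_downtime_minutes: int    # how long the application may be offline
    max_data_loss_minutes: int   # how much recent data the business can afford to lose


def apps_at_risk(apps, backup_interval_minutes, estimated_restore_minutes):
    """Return the names of applications whose tolerances the current setup cannot meet."""
    at_risk = []
    for app in apps:
        # Worst case, everything written since the last backup is lost.
        loses_too_much = backup_interval_minutes > app.max_data_loss_minutes
        down_too_long = estimated_restore_minutes > app.max_downtime_minutes
        if loses_too_much or down_too_long:
            at_risk.append(app.name)
    return at_risk


# Hypothetical figures, for illustration only.
apps = [
    AppRequirement("order_entry", max_downtime_minutes=15, max_data_loss_minutes=5),
    AppRequirement("reporting", max_downtime_minutes=480, max_data_loss_minutes=1440),
]
print(apps_at_risk(apps, backup_interval_minutes=120, estimated_restore_minutes=60))
```

With a two-hour backup interval and a one-hour restore, only the hypothetical order-entry system falls outside its tolerances, which is the kind of gap the evaluation is meant to surface before an outage does.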
"A lot of times you can't have a cookie cutter plan for every circumstance," Riviello said. "But you've got to plan for every contingency you can think of and not just the one that is the most convenient."
Good planning begins by talking with all the players involved. DBAs should work with system administrators and system analysts who are familiar with the company's applications and what it means when they fail, she said. Developers need to know if you are going to fail over, and managers and business owners also should be involved since they have a large financial investment at stake.
Writing down small contingency details in advance, such as the appropriate contact phone numbers, can reduce downtime and save money. Planning for the unknown also limits surprises when data loss does occur, Riviello said.
"It should be a joint decision on how to back things up," she said. "It needs to be laid out to everyone how much data is at risk and how much is backed up so no one is caught by surprise."
Ken Powers, a SQL Server DBA at Montvale, N.J.-based A&P supermarkets, said he oversees a heterogeneous environment consisting of SQL Server, Oracle Corp. and IBM DB2 DBMSes. Even though A&P replicates its data to an off-site data center every two hours, careful contingency planning can only help when a system goes down, Powers said.
"It's helped me think about whether I'm really prepared for a major problem," Powers said. "When something goes wrong and you're in a bind, it's nice to have a step-by-step plan in front of you and all the information you need to make the appropriate contacts."
In addition, DBAs should prioritize their servers and consider system dependencies, Riviello said. Establish escalation procedures and consider building generic standby servers for less crucial systems.
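One way to combine server priorities with system dependencies is a topological sort, which guarantees that a server is restored only after everything it relies on. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The server names, priority numbers and the `recovery_order` helper are hypothetical, and the approach is an illustration rather than anything Riviello prescribed.

```python
from graphlib import TopologicalSorter


def recovery_order(dependencies, priority):
    """Order servers so dependencies come first; break ties by priority (1 = most critical)."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    order = []
    while sorter.is_active():
        # Servers whose dependencies are all restored, most critical first.
        ready = sorted(sorter.get_ready(), key=lambda server: priority[server])
        for server in ready:
            order.append(server)
            sorter.done(server)
    return order


# Each key depends on the servers in its set (hypothetical names).
dependencies = {
    "domain_controller": set(),
    "db_primary": {"domain_controller"},
    "app_server": {"db_primary"},
    "reporting": {"db_primary"},
}
priority = {"domain_controller": 1, "db_primary": 1, "app_server": 2, "reporting": 3}

print(recovery_order(dependencies, priority))
```

Ties are broken only among servers that become ready at the same time, so a low-priority server can still come early if a critical one depends on it.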
Script out the physical layout of the database, along with linked servers, custom messages and logins, she added. List the system configuration and the backup devices.
Run disaster recovery scripts daily and store the output locally. The scripts should capture the information needed to rebuild the master database, replication information and the information needed to move an application to a new server, Riviello said.
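A daily job along these lines might look like the following sketch. It assumes SQL Server 2005 or later, where the catalog views named in the queries exist, and it takes a hypothetical `run_query` callable supplied by the caller (for example, a thin wrapper around whatever database driver is in use). It records an inventory only; complete rebuild scripts, such as logins with their passwords, need more than this.

```python
import json
from datetime import date
from pathlib import Path

# Inventory items from the checklist above, each mapped to a catalog-view query.
INVENTORY_QUERIES = {
    "database_files": "SELECT DB_NAME(database_id) AS database_name, name, physical_name FROM sys.master_files",
    "linked_servers": "SELECT name, product, data_source FROM sys.servers WHERE is_linked = 1",
    "custom_messages": "SELECT message_id, severity, text FROM sys.messages WHERE message_id > 50000",
    "logins": "SELECT name, type_desc FROM sys.server_principals WHERE type IN ('S', 'U', 'G')",
    "configuration": "SELECT name, value_in_use FROM sys.configurations",
    "backup_devices": "SELECT name, physical_name FROM sys.backup_devices",
}


def write_daily_inventory(run_query, output_dir, today=None):
    """Run every inventory query and save the results to a dated JSON file.

    run_query is a hypothetical callable: it takes a SQL string and returns
    rows in a JSON-friendly form, such as a list of dicts.
    """
    today = today or date.today()
    snapshot = {label: run_query(sql) for label, sql in INVENTORY_QUERIES.items()}
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / f"dr_inventory_{today.isoformat()}.json"
    # default=str keeps values such as dates or decimals from breaking the dump.
    path.write_text(json.dumps(snapshot, indent=2, default=str))
    return path
```

Dating each file keeps a history of snapshots, so a change in logins or configuration can be traced to the day it appeared.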
"In the event of an emergency, it's important to stay focused and having all this information on hand will help," Riviello said. "Contact the key people so they know and can get the resources in place to make recovery faster and smoother."