Many companies choose to have their SQL Server infrastructure hosted by a professional hosting provider, also known as a colocation, or colo, facility. This choice provides such advantages as easy infrastructure expansion, power redundancy and climate control. But while a SQL Server hosting environment has benefits, there are several things you should keep in mind when you select a SQL Server hosting provider and when you are dealing with data center management and infrastructure later on.
Here are the five best practices for SQL Server hosting:
- Pick a provider within reasonable driving distance
- Train multiple people at the colo site
- Use "remote hands" services
- Prepare for reboots and restarts
- Bring domain controllers back first
Pick a provider. One of the advantages of having your servers hosted is that you are not limited to your local area. Having a colo in a busy metropolitan area offers such advantages as redundant and fast ISP channels. All data centers hosting SQL Server provide skilled and trained staff that you can use to work on your servers. However, there is no substitute for having your own staff work on your own servers, whether in an emergency or for regular maintenance. Therefore, if possible, choose a data center that your staff can reach quickly to perform regular or emergency maintenance.
Train multiple people at the colo site. In many cases, one person within the company is the designated colo expert, but you should make sure that multiple employees are trained at, and familiar with, your SQL Server hosting facility. If your main contact cannot make a trip, there should be others who can handle the work. I've heard of a case where an employee working at a colo facility got locked in a server cage because he wasn't familiar with the system, and it took a while before a colo employee came to rescue him. The other advantage of having multiple employees familiar with the environment is consistency of the hardware setup. There are many ways to wire servers and label the wiring, and without shared conventions you can end up with a mishmash of wiring setups. If you train everyone to follow the same conventions, your servers and server cages will look consistent and professional. Speaking of "professional," you should make sure your server area within the colo is well organized. If you leave a mess behind -- empty boxes, wires on the floor -- it reflects poorly on your company.
Use "remote hands" services. Pretty much every colo facility will provide skilled and qualified staff to help you with remote tasks. The common industry term for that service is "remote hands." Their services can range from installing Windows or applying patches to all kinds of hardware work, such as disk replacement and hardware troubleshooting. You can typically use these resources on either an hourly or a per-incident basis. A lot of troubleshooting and Windows maintenance can be done using Remote Desktop Protocol, or RDP, where you log on remotely and do the work yourself. This works well on the software side, but hardware-related work requires someone physically present at the colo facility. And, while it might appear to be cheaper to send your own people, when you add up the costs of travel, hotels and meals, you could be better off hiring remote hands to help you with the tasks you trust them with. However, the quality and the level of experience of these remote hands can vary greatly. Therefore, you should get to know the colo employees -- talk to them and ask questions. After a while, you get a good sense of what they can do well. Then, when you need help, you can ask for a specific person based on their expertise.
Prepare for reboots and restarts. Line up colo employees for reboots. This advice applies mainly to companies where uptime is crucial. Occasionally you may be in a situation where SQL Server or Windows gets into a state where it stops responding, or throws errors that you know might go away after a reboot. I've seen a case where SQL Server was flaky and needed a restart, but the restart hung. We then rebooted the server, but it hung during shutdown. Since this server didn't have a remote hardware reset switch and the colo facility was three hours away, we needed to get someone from the SQL Server hosting provider to go in and reset the server manually.
No matter what colo facilities tell you in their marketing materials, their response time is usually longer than you expect. So, while you might expect them to start working within, let's say, 45 minutes, in reality it could stretch to 90 minutes. In the meantime, your server is down and you are stuck waiting. If the server is critical, you are better off lining up a colo employee to be on-site when you are restarting SQL Server and/or Windows. You may not need them in the end, but in case you do, you will be glad you spent $100 or so to have them on-site. If you don't end up using them, it's still money well spent because it gives you peace of mind during stressful periods.
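As a rough illustration of deciding when to call in the on-site person, the sketch below polls a server after a restart and reports whether it came back before a deadline. It is only a minimal sketch: the function names, the 15-minute deadline and the host name are hypothetical, and it assumes SQL Server is listening on its default TCP port, 1433. An open port shows only that something is accepting connections, not that SQL Server is healthy.

```python
import socket
import time


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_server(host, port=1433, deadline_s=900, interval_s=15,
                    check=port_is_open, sleep=time.sleep,
                    clock=time.monotonic):
    """Poll until the server answers or the deadline passes.

    Returns True if the port opened in time, or False if the deadline
    passed and it is time to escalate (for example, by asking the colo
    employee you lined up to reset the machine).

    check, sleep and clock are injectable so the loop can be exercised
    without a real server.
    """
    start = clock()
    while clock() - start < deadline_s:
        if check(host, port):
            return True
        sleep(interval_s)
    return False


if __name__ == "__main__":
    # Hypothetical host name; replace with your own server.
    if not wait_for_server("sql01.example.internal"):
        print("Server did not come back in time; call the on-site contact.")
```

Having the deadline written down before the restart starts removes the temptation to "wait just a few more minutes" while a critical server stays down.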
Bring domain controllers back first. If a SQL Server computer starts up without being able to communicate with a domain controller, it doesn't authenticate the way it should, and this affects how it runs. You are likely to see strange behavior, as well as job failures if the jobs are owned by a Windows login. If you lose power and all machines then start on their own once the power is back, you run the risk of some SQL Servers not being properly authenticated. Pretty much all colos offer redundant power. However, even redundant power fails sometimes, or the backup may not last as long as the outage. If your SQL Server hosting provider loses power, your best bet is to send someone over (your employees or remote hands) and have them unplug all servers. Once the power is back on, they should bring at least two domain controllers online first. Once those are fully functional, you can start the other servers in whatever order your environment requires.
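The "at least two domain controllers first" rule can be turned into a simple gate that whoever is on-site runs before powering on the remaining servers. The following is a minimal sketch under stated assumptions: the function names and host names are hypothetical, and reachability is approximated by a TCP connection to the LDAP port (389) on each domain controller. An open LDAP port is only a rough signal, so it does not prove that a domain controller is fully functional.

```python
import socket

LDAP_PORT = 389  # domain controllers accept LDAP connections on TCP 389


def ldap_reachable(host, timeout=3.0):
    """Return True if host accepts a TCP connection on the LDAP port."""
    try:
        with socket.create_connection((host, LDAP_PORT), timeout=timeout):
            return True
    except OSError:
        return False


def safe_to_start_members(domain_controllers, required=2,
                          check=ldap_reachable):
    """Decide whether the remaining servers may be started.

    Returns a tuple (ok, reachable): ok is True when at least `required`
    domain controllers answered, and reachable lists the ones that did.
    check is injectable so the logic can be tested without a network.
    """
    reachable = [dc for dc in domain_controllers if check(dc)]
    return len(reachable) >= required, reachable


if __name__ == "__main__":
    # Hypothetical host names; replace with your own domain controllers.
    dcs = ["dc01.example.internal", "dc02.example.internal"]
    ok, up = safe_to_start_members(dcs)
    if ok:
        print("Domain controllers up:", ", ".join(up))
        print("OK to start SQL Server machines.")
    else:
        print("Only", len(up), "domain controller(s) reachable; wait.")
```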
About the author:
Roman Rehak is principal database architect at MyWebGrocer in Burlington, Vt. He specializes in SQL Server development, database performance tuning, ActiveX Data Objects.NET and writing database tools. He is president of the Vermont SQL Server User Group.