Large SQL Server installations bring added challenges, particularly how to back up all those databases and ensure minimal data loss in the event of a failure. Here I'll detail some points to consider when setting up a backup plan for a large SQL Server environment.
Observation: Whether you have one server or 500 servers, the movement of backup files either to tape or disk storage will have some affect on your network bandwidth. One best practice is to write your backup files to disk and then archive to tape. When writing to disk, it is wise not to write to the same physical drives or array because a server or disk failure could wipe out both your database and your backup copy. For a large installation of SQL Servers, you would also probably have some type of centralized tape backup method. Both of these scenarios require the movement of backup files across your network.
Recommendation: To help handle the volume of data that needs to be written to disk and tape, you should have a separate management network (VLAN) to isolate this type of network traffic so it does not affect application users. Database files probably account for a good majority of your overall disk storage needs, so eliminating the movement of these files across your user network will ensure you are not impacting users and provide maximum file throughput to meet your backup needs.
Observation: Another hurdle with having a lot of servers is how to monitor them to confirm all jobs are successful. In addition to making sure backups run smoothly, you also need to monitor available disk space. It is practically impossible to connect to every server every day to check that all processes are running without issue, so you should implement some tools or processes to automate monitoring.
Recommendation: The simplest and cheapest option is to set up SQL Server alerts and operators. You can use specific event codes to monitor backups and send messages to notify DBAs of errors. You may also use one of the many third-party monitoring tools available from companies including Idera, Quest Software, NetIQ Corp. and Microsoft. Some of these tools have been written specifically for SQL Server, while others are more generic in nature and cover all aspects of your server; some also offer auto correction for problems encountered.
Testing database recovery
Observation: You must test database recovery so you know if you can recover from a real failure. As with any measure you put in place, the more time taken for testing the better the end result will be. It is probably next to impossible to test every restore of every database, but you should take a cross section of different types of servers and run periodic recovery tests to make sure your plan will work.
Recommendation: Create categories for your databases and organize them accordingly. Then do periodic spot testing for different databases in each category. Although this approach won't guarantee a 100% successful database restore, it will give you some peace of mind as to what needs to be done when you must do a restore.
Observation: Using compression software for SQL Server backups is essential for large databases as well as a very large installed base. These tools allow you to complete backups faster and save disk and tape space due to the built-in compression.
Recommendations: Evaluate one or more SQL Server backup tools that offer file compression, available from companies including Idera, Quest and Red Gate Software. These tools allow you to back up and compress files at the same time. Backup time can be cut in half and backup files may be reduced to as much as 10% of the native backup file. This offers big advantages for the large SQL Server shop.
Scheduling backup jobs
Observation: SQL Server Agent is a great tool for scheduling backup jobs, but it doesn't offer any benefits for scheduling a lot of jobs when it comes to identifying overlaps and handling job dependencies. It would be nice to have all your backups run and finish at the same time, but that is probably unlikely, especially in a large environment. Also managing so many jobs -- making changes and monitoring -- is a job in itself.
Recommendation: Look at implementing master/target servers for your SQL Server Agent. This allows you to have one centralized place for job management and push jobs out to other servers. This can help ease job administration, but does not offer the flexibility of looking for overlaps and job dependencies.
Also consider sqlSentry and Idera's SQLschedule. These tools provide you a graphical view of all your jobs so you can see when the jobs run and how long they take. They also give you the ability to handle job dependencies. So if job A has to finish before job B can start, you can handle the issue using one of these tools. Furthermore, they allow you to look at all of your servers together and you can narrow down your view by looking for similar jobs across servers. You have complete control over changing the schedule with a simple calendar interface like Microsoft Outlook.
Observation: As with any large project, keeping processes consistent is key. When administering a large number of servers, it is extremely helpful if you regularly roll out new changes, troubleshoot and recover databases when needed.
Recommendation: Use scripts to deploy jobs for new servers and changes across the board for existing servers. Naming conventions for your backup files is also important. The backup name should include some type of identifier for server, database, date/time and type of backup (full, differential or log). The type of recovery model and backup schedule should also be similar for your servers. Your database may fit into multiple classes, but you should have a limited number of differences. This will become critical when recovering a database; if you know what class the database is, you will quickly know what can be recovered and the potential for data loss. Also make sure you are monitoring servers consistently. With so many servers to monitor you need a simple way to make sure they are all running normally.
These are just a few things to take into consideration when implementing a backup plan for a large SQL Server installation. Even though these are high level in nature, they should give you a good starting point for a new plan or things to check for your existing backup implementation.
About the author: Greg Robidoux is the president and founder of Edgewood Solutions LLC, a technology services company delivering professional services and product solutions for Microsoft SQL Server. He has authored numerous articles and has delivered presentations at regional SQL Server users' groups and national SQL Server events. Robidoux, who also serves as the SearchSQLServer.com Backup and Recovery expert, welcomes your questions.
More information from SearchSQLServer.com
- Step-by-Step Guide: How to properly back up a SQL Server
- Tip: How to store database backup media
- Checklist: How to maintain an effective SQL Server DR strategy