This specific query is taking hours to complete. We are experiencing temporary time-outs in our database. The backup
is so big that it fails. We are constantly running out of space when the database Transaction Log grows. Our tempdb database is huge. All of these are calls and tickets we regularly receive that could indicate issues related to SQL Server database size.
When I started my IT career more than 25 years ago, we were in the application development baby boom. We were developing database applications like crazy and collecting a lot of information through our systems. Some of those legacy systems still exist today and continue to accumulate data.
The problem is that many of these old applications were developed for small data sets, without any vision toward future data archiving or purging.
I can't count how many times I've heard a client say, "We will take care of the archival or purging process once the application is in production." Yet of all the times I've heard this phrase, few clients have ever followed through in a fashionable manner.
Many times development teams have no idea how long they should keep their data. Then, even if the application is migrated to a newer format, the old accumulated data is also migrated to the new system and stays there.
Implications of database size
So let's try to summarize the overall implications of SQL Server database size and having a large database. But first, let's define what constitutes "large."
With evolving technologies and faster hardware, hundreds of gigabytes are now common and might not generate performance issues. So the term "large database" really depends on many factors, such as the hardware, the volume of data updates, the characteristics of the applications, the system's service-level agreement and more.
For example, if your system must generate millions of new rows per minute and also read data in real time at the same instant, your SQL Server database size will matter more than if you have few data changes per day and generate few reports overnight.
For that reason, I usually focus on the following implications when assessing if a database is too large. Be aware that some of the problems below could be caused by other issues, so they must be analyzed before assuming the cause.
- Backup size and time to complete the backup versus available space for the backup and performance hit while backup is running
- Tables with dozens of millions of rows that are the source for performance hits, blocks, locks or deadlocks
- Fast-growing database files
- Quickly expanding transaction logs
- Fast-growing or highly used tempdb databases
- High CPU or memory contention on relatively strong hardware
Integration of processes
One of the biggest lessons I've learned as a system analyst is that your applications must be developed in a way so that the archival or purging processes are integrated parts of the systems.
But even if they are not already integrated, there are ways to patch your systems afterward. Some of those solutions might include partitioning data or tables, using views to include current and archived data and modifying code and stored procedures to include date range filters.
Presently, when I help clients design their systems and processes, I always focus on the size of the data in the long term. Even if the client does not know what the volumes of data will be, I always assume and plan for large stores. I always instruct the development teams to build their database applications as if the database already houses a lot of information.
Changing index structure
Another very important fact that many of our clients neglect is that as your data changes or grows, you might need to change your index structure and tune it or rewrite queries accordingly. It's strongly recommended to be proactive by monitoring database applications and tuning code as needed before it bottlenecks.
We have clients that add the periodical tuning tasks to the project plan. Here are a few guidelines for quickly growing environments:
- Keep your online transaction processing (OLTP) database as small as possible.
- Archive or purge data periodically.
- When archiving, ensure your OLTP environment is not affected by the data purging process (i.e., use partitioning).
- Create a different backup strategy for the OLTP environment and for the archived environment based on their characteristics.
- Monitor and tune your database periodically. As data changes or grows, queries might need to be tuned and indexes might need to be dropped or added accordingly.
To answer my initial question, does SQL Server database size matter? Yes. Obviously, the smaller your database is, the better performance you will receive and the easier it will be to maintain.
About the author
Michelle Gutzait is a principal SQL Server database consultant at Pythian, a global data consulting and managed services company. She has been involved in IT for over 25 years as a developer, business analyst and database consultant.