The following was
based upon a real-life experience in a Tier 1 bank. We used Sybase tools
like ASE and Replication Server but this case is very applicable to other
engines such as Oracle with Data Guard, etc.
When I was working last year for a large bank, we had a disastrous experience that was saved by replication. Now I'm convinced that any hardware-based solution such as clustering is nowhere near as
good as a software-based solution such as replication. Why
bother spending large sums of money on hardware clustering when you can
achieve the same results with a bit of ingenuity and thinking? The notes
below will make my points a bit clearer.
I have always been fascinated by Peter
Thawley's article entitled "High Availability in the Internet Age." This
article was published in the International Sybase User Group (ISUG) journal in Q3 1999. The author
brought up the downtime that eBay suffered as a
result of a hardware fault that rendered their Oracle database corrupt.
It took 24 hours for the Oracle database to be loaded back and in the
meantime, the business lost a lot of income. Regardless of what caused the downtime, one
would have expected the architect and DBAs to account for such failures.
Although failures due to hardware problems are relatively rare due to
redundancy and mirroring, they can and do happen.
The bank I worked for had a trading system that had
been running for two years. It had relatively large
databases on the order of 70GB, using Solaris 2.7 and Sybase 11.9.2 (the Stone Age by today's standards). They had been struggling to
create a proper business contingency (BC) site for this database. The
criteria were that a trade had to hit the database and the trader screen
displayed an acknowledgement message within 3 seconds of trade entry. The
system had to be capable of handling 60,000 trades input per hour with
replication on, and the trade delivery from the production to BC database
had to be designed in such a way th
To continue reading for free, register below or login
To read more you must become a member of SearchSQLServer.com
');
// -->

at it minimized or eliminated any trade
losses. The system had to be capable of supporting 300 traders online.
Within two weeks of being on site we managed to create a BC system using
Sybase replication, capable of handling the above criteria. By mid-November, we put the system into production and the client did
some initial tests over a week-end that established that the replication
was working in a timely manner and the trades were getting to the BC site. What
is interesting is the rest of the story.
In the afternoon on a fateful business day, disaster struck in the form of a
hardware problem on the data server. The Sybase error log started sending the
following out:
ASE was running on an E4500 server with two Sun A5200 arrays for data
storage. The problem was that a full array was taken out of action as a
result of a single disk failure! You may argue the reasons for this.
However, it appears that this is an inherent design fault of this type of
array. (On three specific disk slots on A5200 you need to have disks. These
disks are used to re-issue the "fcal signal" and if one of these fails and
fcal signal does not get around, the array can fail.) As we had two
arrays, one would have expected that the mirror array would have saved us.
Unfortunately, that didn't happen in this case.
The problem was that although one of the storage arrays (out
of two) was taken out of action, a Sybase device was residing on a volume
that had both the underlying damaged disk and its mirror on the same array.
This in turn was traced to Veritas that a few months before had a hot swap
disk in the same array as the originally affected disk. As a result of human
error, this problem was not detected or it was just ignored. So when the single
array failed in the middle of writing data, a hole appeared in the trading
database.
The point that I am trying to make is about the speed with which we managed to recover. We checked the BC database and the last trade entry there was recorded at
two minutes before the crash. The counters showed that particular
trade entry was indeed the last trade gone into the trading system. In this
scenario all we did was to swap the interfaces files and asked the users to
start their applications again. In effect we carried out a full
test of the BC database and it tuned out to be fine, so that was the sign off!
The fundamental question is whether or not hardware redundancy in the form of
hardware-level disk mirroring can ever provide complete protection for
databases. Many organizations tend to use a form of local area (traditional
disk mirroring) or wide area (Storage Area Networks) hardware mirroring to
create redundancy and resiliency. For example, EMC's SRDF is an example
of an extended-distance replication solution at the hardware level. As an
alternative, software replication can be deployed. I tend to argue in favor of
software solutions for the reasons explained below.
As a practitioner, I have always believed that software high availability (HA) solutions in the
form of data replication (though somehow asynchronous) do have an edge on
hardware-based disk-to-disk mirroring. My arguments here are based on the
axioms that although databases are built on file systems or raw partitions,
they fundamentally behave in a different way compared to pure file systems.
Indeed it is a simplistic view to assume that disk-to-disk, bit-by-bit
mirroring will be sufficient protection for your data.
HA disk-to-disk mirroring in its most common form is designed to do a
bit-by-bit copy of data from the primary disk to the mirror. As long as the
distance between two disk locations is not great and a fast connection such
as Gigabit exists between these two sites, the writes will be almost
simultaneous. At a first glance such solutions may look ideal; they can be
almost off-the-shelf. Apply the solution using local or remote clustering
and we have high availability of data.
In reality, databases are more complicated than a simple matter of bit-by-bit
copying. A database engine performs a lot of tasks through the background
processes in a process-based RDBMS like Oracle or through internal processes
in a thread-based model like Sybase. These processes routinely perform tasks
such as checkpointing and flushing of transaction logs and data that are
necessary for maintenance of the database but in reality have no bearing to
the availability of data. In other words, most of this form of data need not
be copied or replicated. In comparison, in HA designs based on data
replication, this redundant data is never replicated because it is not
deemed necessary for the BC database. This brings us to the point of how
software replication is different from hardware mirroring.
Replication in its most common form moves transactions (inserts, updates and
deletes) at the table level from a source database to one or more
destination databases. Compared to hardware mirroring, which is basically all
or nothing, replication can
This points out another benefit of using replication technology for
availability. Often customers use disk mirroring and disk snapshot
technologies to maintain a hot database, often at a relatively high cost.
These technologies do a good job with physical problems like a bad block on
a disk. However, if there is data corruption caused by a software error in the I/O device driver or DBMS layer, that corruption will simply be propagated to the remote device! Since Sybase Replication Server
operates at the logical level by turning transactions back into SQL
operations against the replicate server(s), this
risk is entirely avoided.