W H I T E P A P E R
www.persistent.com
© 2017 Persistent Systems Ltd. All rights reserved.
correspond to monthly downtime in the 45-minute range. With some work to configure redundant
resources, you can get higher availability when installing a DBMS on top of VMs. For disaster recovery,
the picture is the same: at the IaaS level, administrators must understand the tradeoffs between the
existing options offered by the DBMS vendor working on IaaS hardware, and configure and manage
them.
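As a rough check of the downtime figure mentioned above, the sketch below converts an availability percentage into allowed downtime per month. The 30-day month and the availability levels shown are illustrative assumptions, not figures taken from any vendor's SLA; under these assumptions, 99.9% availability works out to about 43 minutes per month, which is in line with the 45-minute range.

```python
def monthly_downtime_minutes(availability_pct: float, days_in_month: int = 30) -> float:
    """Return the minutes of downtime per month allowed by an availability percentage.

    The 30-day default is an assumption made for illustration.
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    minutes_in_month = days_in_month * 24 * 60
    return minutes_in_month * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # Illustrative availability levels only; check the actual SLA of your provider.
    for pct in (99.9, 99.95, 99.99):
        print(f"{pct}% availability -> {monthly_downtime_minutes(pct):.1f} min/month")
```

Each additional "nine" of availability cuts the allowed downtime by roughly a factor of ten, which is why redundant resources are needed to move beyond the basic level.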
On the other hand, at the PaaS level, you have database as a service (with several technology variants
including data warehouse and Hadoop as a service, described below); what follows immediately applies
to relational DBMSs and warehouses, and we refer to it as DBaaS.
Your database will share infrastructure resources with other tenants. DBaaS forces you to relinquish
control over the infrastructure: your visibility of the hardware and the virtual machines it runs on disappears,
so it is a departure from the traditional way of managing your database. The underlying database is, for
the most part, managed: you do not have to spend time managing upgrades, high availability, disaster
recovery, or backups. Your developers use the service directly to provision data, optimize queries and
user workload, and develop an application. As we will see below, computational and storage resources
can be scaled out automatically, and they have state-of-the-art fault tolerance features.
With DBaaS, you get much higher availability SLAs out of the box from the cloud platform vendor. For
disaster recovery, the picture is similar: DBaaS offerings come with built-in disaster recovery. For instance, Azure
offers out-of-the-box geo-replicated full and differential backups. Finally, rapid failover [5] to another site
in planned events (e.g., maintenance) or in failure events with minimal data loss can also be configured
much more easily with DBaaS than with a DBMS on IaaS.
b. Knowledge and types of queries – Legacy database and data warehouse environments (where
queries pre-exist, so they are completely known) are best migrated to the cloud by layering the DBMS on
top of IaaS services. Legacy code may be difficult to port to a database-as-a-service offering. For new
developments, the choice between DBaaS and DBMS on IaaS depends on the requirements discussed
in this section.
c. Performance and scalability, depending on data volumes, user scales and cost – Per our volume
characterization, large volumes start at 20 TB. Below this level, traditional SMP databases are now able
to handle loads of relatively complex queries for a medium-sized number of concurrent users with average
to fast response times. SMP DBMSs running on IaaS services can be a fit, especially if there are
no drastic variations in volume or user scale, i.e., when the lack of elasticity is not really a problem.
DBMS on IaaS gives you the advantage of predictable performance, as you will be the only tenant, but
at a price point above DBaaS, as discussed below in more detail.
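The rule of thumb in this paragraph can be written down as a small check. This is only a sketch: the 20 TB threshold comes from the text, while the function name and the single boolean used to summarize "drastic variations in volume or user scale" are hypothetical simplifications.

```python
LARGE_VOLUME_TB = 20  # "large volumes start at 20 TB", per the text


def smp_on_iaas_is_candidate(volume_tb: float, needs_elasticity: bool) -> bool:
    """Return True when an SMP DBMS on IaaS looks like a reasonable fit.

    Hypothetical helper: it only encodes the two conditions named in the
    paragraph (volume below the large-volume threshold, no need for elasticity).
    It ignores cost, which the text notes is higher than for DBaaS.
    """
    if volume_tb < 0:
        raise ValueError("volume_tb must be non-negative")
    return volume_tb < LARGE_VOLUME_TB and not needs_elasticity


if __name__ == "__main__":
    print(smp_on_iaas_is_candidate(5, needs_elasticity=False))
    print(smp_on_iaas_is_candidate(5, needs_elasticity=True))
    print(smp_on_iaas_is_candidate(50, needs_elasticity=False))
```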
At larger volume scales, or larger user scales, with queries still needing average to fast response
times, it is worth considering MPP data warehouses as a service, which have a parallel distributed
capability that can take advantage of scale-out architectures [6]. This type of database, very expensive
when purchased on-premises, is much more affordable on the cloud, especially when licensed as a
service, and it outperforms SMP databases by far (several times faster, up to 10 times faster on query-intensive
workloads, depending on volume; the larger the volume, the bigger the difference). Database services on
MPP differ in terms of their architecture: some, such as Azure SQL DW, separate compute nodes and storage
nodes and scale them independently (and elastically); others, like Amazon Redshift, do not make this
separation and scale compute and storage nodes jointly, so they may be costlier in some cases. Volumes in
the price list go up to the low to mid hundreds of TB of compressed data which, uncompressed, is about 1 PB;
DWaaS offerings do not go past petabyte size yet.
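To illustrate why scaling compute and storage jointly may be costlier in some cases, the sketch below compares how many nodes a workload needs under each architecture. All node sizes and workload figures are hypothetical and chosen only for illustration; they do not reflect the actual node types or pricing of Azure SQL DW or Amazon Redshift.

```python
import math


def coupled_nodes(storage_tb: float, compute_units: float,
                  tb_per_node: float, units_per_node: float) -> int:
    """Nodes needed when every node bundles a fixed amount of storage and compute.

    The cluster must be sized for whichever resource demands more nodes.
    """
    return max(math.ceil(storage_tb / tb_per_node),
               math.ceil(compute_units / units_per_node))


def decoupled_resources(storage_tb: float, compute_units: float,
                        units_per_node: float) -> tuple:
    """Compute nodes and storage (in TB) when the two are sized independently."""
    return math.ceil(compute_units / units_per_node), storage_tb


if __name__ == "__main__":
    # Hypothetical storage-heavy workload: 100 TB of data, modest compute needs.
    storage_tb, compute_units = 100, 4
    tb_per_node, units_per_node = 2, 1  # assumed node sizes

    print("coupled nodes:", coupled_nodes(storage_tb, compute_units,
                                          tb_per_node, units_per_node))
    print("decoupled (compute nodes, storage TB):",
          decoupled_resources(storage_tb, compute_units, units_per_node))
```

With these assumed numbers, the coupled design needs 50 nodes just to hold the data even though 4 nodes' worth of compute would suffice, whereas the decoupled design pays for 4 compute nodes plus 100 TB of storage. For workloads where storage and compute needs are balanced, the difference shrinks.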
[5] This in fact applies more to transactional databases on the cloud than to cloud analytics databases.
[6] Most cloud platform providers also offer (SMP) databases as a service. There are also in-memory, columnar databases on scale-up servers, which are a
solution for a large category of analytics workloads.