W H I T E P A P E R
www.persistent.com
© 2017 Persistent Systems Ltd. All rights reserved.
6.1.3 Cloud Function
In this category, only AWS Lambda is ready for production; Azure Functions is in preview and Google Cloud
Functions is in closed alpha. Beyond maturity, the services also differ in supported languages,
event sources, architecture, and so on. For a detailed comparison,
click here.
6.1.4 Cloud Data Warehouses
6.1.4.1 Scaling
In Amazon Redshift, storage and compute units are grouped together in a node definition, so resizing means
provisioning a new cluster and copying the data to it. While the new cluster is being provisioned, the current
cluster is available only in read mode. Provisioning the cluster and copying the data can take from a few hours
to days. In contrast, Azure SQL Data Warehouse can scale in minutes, because compute and storage units
scale out independently. Both Google's BigQuery and IBM's DashDB also allow customers
to scale and pay for compute and storage independently.
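The effect of grouping storage and compute in one node definition can be illustrated with a small sizing calculation. The sketch below is only an illustration: the per-node capacities and the workload figures are hypothetical and are not taken from any vendor's specification.

```python
import math


def coupled_nodes(storage_tb, compute_units, tb_per_node, units_per_node):
    """Nodes needed when each node bundles a fixed amount of storage and compute.

    The cluster must be sized for whichever resource runs out first.
    """
    for_storage = math.ceil(storage_tb / tb_per_node)
    for_compute = math.ceil(compute_units / units_per_node)
    return max(for_storage, for_compute)


def decoupled_units(storage_tb, compute_units):
    """Storage and compute provisioned independently: each is sized on its own."""
    return {"storage_tb": storage_tb, "compute_units": compute_units}


# Hypothetical workload: a lot of data, modest query load.
# Hypothetical node definition: 2 TB of storage and 4 compute units per node.
nodes = coupled_nodes(storage_tb=40, compute_units=8, tb_per_node=2, units_per_node=4)
paid_compute = nodes * 4  # compute bundled with the storage-driven node count
independent = decoupled_units(storage_tb=40, compute_units=8)

print(nodes)         # 20 nodes, driven by the storage requirement
print(paid_compute)  # 80 compute units provisioned, although only 8 are needed
print(independent)   # each resource sized to what the workload actually needs
```

Under these assumed figures, the coupled model provisions ten times the compute the workload needs, which is the cost that independent scaling avoids.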
6.1.4.2 Data sources
Both AWS Redshift and Azure SQL Data Warehouse can integrate with their platform's blob storage,
Hadoop service, and NoSQL data sources. To integrate an on-premises database, its data must first be exported to a
file and then imported into the respective storage mechanism.
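For Redshift, the export-then-import path described above might look like the following minimal sketch. The table name, bucket, and IAM role ARN are hypothetical placeholders, and the step that uploads the file to Amazon S3 is omitted; only the CSV export and the Redshift COPY statement are shown.

```python
import csv
import io


def export_rows_to_csv(rows, header):
    """Serialize rows read from an on-premises database as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def redshift_copy_statement(table, s3_uri, iam_role_arn):
    """Build a Redshift COPY statement that loads a CSV file from Amazon S3.

    IGNOREHEADER 1 skips the header row written by export_rows_to_csv.
    The values are interpolated directly, so pass trusted input only.
    """
    return (
        f"COPY {table} FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV IGNOREHEADER 1;"
    )


# Hypothetical rows, as they might be fetched from an on-premises table.
rows = [(1, "alice"), (2, "bob")]
csv_text = export_rows_to_csv(rows, ["id", "name"])

# Hypothetical table, bucket, and role; the file must be uploaded to S3 first.
statement = redshift_copy_statement(
    "staging.customers",
    "s3://example-bucket/exports/customers.csv",
    "arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
)

print(csv_text)
print(statement)
```

The resulting statement would be run against the cluster through any SQL client once the exported file is in the bucket.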
6.1.4.3 Client BI Tools
Redshift integrates with many popular BI tools, such as Tableau, and also allows connections through JDBC
and ODBC drivers. Azure SQL Data Warehouse likewise supports integration with popular BI tools such as Tableau
and Power BI.
Both Redshift and Azure SQL Data Warehouse look promising. Azure SQL Data Warehouse leads in some areas,
such as scalability and the decoupling of storage from compute. Redshift, on the other hand, leads in security
because it can be hosted in a VPC.
Enterprises that already use Microsoft SQL Server widely will naturally move their data warehouse to Azure
SQL Data Warehouse, as it is an extension of the SQL Server family of products and the required developer
knowledge and skills are readily available in house.
Google's BigQuery is a fully managed data warehouse: it requires minimal input from the customer and manages
the rest itself, including the number of nodes, indexes, and periodic maintenance. Performance on BigQuery is
generally better because it brings as many resources as needed to run a query, whereas other platforms are
limited by the number of CPUs the customer is paying for.
In terms of overall adoption, Amazon Redshift still leads the market, as it integrates well with other AWS
services, including DynamoDB, Amazon S3, Amazon Kinesis, AWS Data Pipeline, and AWS Lambda. More
vendors have certified Amazon Redshift with their offerings, which lets
customers continue to use the tools they use today.
6.1.5 Data Visualization
Azure Power BI is a more mature visualization tool than AWS's recently released QuickSight, and Google Data
Studio 360 is also coming out of beta. In this space, therefore, Power BI is well established. Power BI integrates
with many business systems and applications, such as Microsoft Dynamics, Salesforce, Google Analytics, and
Microsoft Excel. Most of the service providers also integrate with reporting tools like Tableau and
QlikView, as these tools can connect to on-premises data sources as well.