
WHITEPAPER

© 2017 Persistent Systems Ltd. All rights reserved.

www.persistent.com

a. Our comments on the topic in any of the phases of the lifecycle of an analytics project, which are either not covered in the reference publication or which we want to specifically highlight, given our own experience as practitioners on the topic.

b. Our comments on the impact on the topic of the three recent tendencies not covered in the Kimball book, namely cloud deployments, big data, and self-service tools and agility.

Driven by the pragmatic need to release a first version of this document sooner, we will initially emphasize the construction of the target environment (data warehouse/data mart), at the expense of its use through BI applications [1]. This means we will initially summarize each topic and provide our added value from the angle of building the target environment, ignoring the BI track.

3. Relevance of the new tendencies for our scope

Data management for analytics is undergoing radical change. The system landscape, the architectural patterns used to perform data integration and improve data quality, and the underlying software technology used to develop them are all changing. This cannot be attributed only to the pervasiveness of data-driven analysis. We believe it is rather the result of the convergence of three of the hottest topics in today's information technology industry:

(i) The usage of cloud computing for analytics projects,

(ii) The explosion of available data for analysis, and

(iii) The recent rebirth of Artificial Intelligence, in particular machine learning.

We will dedicate a section below to each of the first two topics. As for machine learning, we dedicate a specific subsection to it within the big data quality section (see 5.3.3.3 below). The subject is not strictly specific to big data or to data quality, but this technology is increasingly being applied at the intersection of the two domains. We will also illustrate that it is helping to make data integration more accessible to less technical users, so it has a definite impact on our third new tendency, namely self-service tools and agility.

We hope that after reading chapter 3 it will become immediately clear that the system landscape and the available underlying technology are changing dramatically in projects that must prepare and manage data for analytics. As for the architectural patterns for integrating data, section 4.4.1 below discusses the data integration processing styles, ETL and ELT, which are very relevant to the presentations of cloud computing (section 3.1) and big data for analytics (section 3.2).

3.1 Cloud Computing for Analytics

Since cloud computing emerged, its effects on IT infrastructure, network services and applications have been enormous. Data warehousing and analytics are no exception and are even a key use case, as the cloud provides generous processor and storage resources to assure processing speed and volume scalability.

At PSL we are witnessing this change in most of the new analytics projects being launched with new customers. Most customers that engage with PSL do so because we are seen as innovators who can help cut the time to market of their solutions. In that respect, our current experience in cloud deployments is key, and best practices need to be shared as widely as possible.

[1] By these we mean query and reporting tools, data mining tools, dashboards/scorecards, and analytic applications embedding query and reporting technology, as well as domain expertise on a particular business process.