

W H I T E P A P E R
© 2017 Persistent Systems Ltd. All rights reserved. 8
www.persistent.com
a. Our comments on the topic in any of the phases of the lifecycle of an analytics project, which are either not
covered in the reference publication or which we want to specifically highlight, given our own experience
as practitioners on the topic.
b. Our comments on the impact on the topic of the three recent tendencies not covered in the Kimball book,
namely cloud deployments, big data, and self-service tools and agility.
Driven by the pragmatic need to release a first version of this document sooner, we will initially emphasize the
construction of the target environment (data warehouse/data mart), at the expense of its use through BI applications¹.
This means we will initially summarize each topic and provide our added value from the angle of building the target
environment, ignoring the BI track.
3. Relevance of the new tendencies for our scope
Data management for analytics is undergoing radical change. The system landscape, the architectural patterns used to
perform data integration and improve data quality, and the underlying software technology used to implement them
are all changing. This cannot be attributed only to the pervasiveness of data-driven analysis: we believe that it is
in fact driven by the convergence of three of the hottest topics in today's information technology industry:
(i) The usage of cloud computing for analytics projects,
(ii) The explosion of available data for analysis, and
(iii) The recent rebirth of Artificial Intelligence, in particular machine learning.
We will dedicate a section below to each of the first two topics. As for machine learning, we dedicate a specific
subsection to it in the big data quality section (see 5.3.3.3 below). The subject is not strictly specific to big data
or to data quality, but this technology is increasingly being applied at the intersection of the two domains. We will
also illustrate that it is helping to make data integration more accessible to less technical users, so it has a
definite impact on our third new tendency, namely self-service tools and agility.
We hope that after reading chapter 3 it will become immediately clear that the system landscape and the available
underlying technology are changing dramatically in projects that must prepare and manage data for analytics. As for the
architectural patterns for integrating data, section 4.4.1 below discusses Data Integration processing styles, ETL and
ELT, which are very relevant to the presentation about cloud computing (section 3.1) and big data for analytics
(section 3.2).
3.1 Cloud Computing for Analytics
Since cloud computing emerged, its effects on IT infrastructure, network services and applications have been
enormous. Data warehousing and analytics are no exception and are even a key use case, as the cloud provides
generous processor and storage resources to ensure processing speed and volume scalability.
At PSL we are witnessing this change in most of the new analytics projects being launched with new customers. Most
customers that engage with PSL do so because we are seen as innovators who can help cut the time to market of their
solutions. In that respect, our current experience in cloud deployments is key, and best practices need to be shared
as widely as possible.
¹ By these we mean query and reporting tools, data mining tools, dashboards/scorecards, and analytic applications embedding query and
reporting technology, as well as domain expertise on a particular business process.