

W H I T E P A P E R
© 2017 Persistent Systems Ltd. All rights reserved.
www.persistent.com
Conclusion
In this document, we have attempted to assemble a collection of best practices for acquiring, integrating, modeling and governing
data from a growing number of sources for analytics opportunities. We picked the best reference publication available, Kimball’s
data warehousing lifecycle toolkit [1], and summarized it along what we believe are the four most important transversal topics:
dimensional data modeling, data quality, performance and security, as well as technical architecture. Each of these topics is briefly
presented in its own chapter, followed by a summary of its treatment through the requirements gathering, design,
development, and deployment phases of Kimball’s lifecycle, serving as a high-level view of the reference document from the
vantage point of the topic in question.
We have gone beyond Kimball’s traditional data warehousing architecture by recognizing that the mission of analytics has
expanded into three very important areas not covered by Kimball’s reference publication: Cloud analytics deployments, Big Data,
and the use of self-service tools together with agile development techniques, which we believe will be key drivers for any Data
Management endeavor today. Even though some of these trends are still evolving, we have attempted to extend the list of
traditional data warehousing best practices with our own recent experience in these new areas, in addition to our experience in
traditional warehousing projects.
Our key recommendations are:
—
The true measure of success in a DW/BI project is business user acceptance of the deliverables to improve their
decision-making. Thus, strive to collaborate with your business users at every step of the lifecycle.
—
Leverage conformed dimensions as the basis for integration; this holds regardless of the deployment model and the
database technology. Understand and embrace dimensional design as the organizing theme for the data within the
DW/BI environment.
—
Think about performance, scalability, security, usability and other non-functional aspects right from requirements and
design: these aspects cannot be treated as an afterthought. Make sure you understand the tradeoffs in the new cloud
and big data environments.
—
Pay special attention to data quality. Many data projects fail for lack of attention to this topic. The list of available
technologies and services has expanded, so don’t reinvent the wheel. But also, make sure you understand what is meant
by high quality big data in such an environment.
—
Embrace agile methodologies, but beware of extracting a limited set of source data or focusing on a narrowly defined set of
business requirements in a vacuum. The deliverable may be built quickly to claim success, but it should also be usable by
other groups and integrate with other analytics.
—
Take testing very seriously: invest in test automation early and take advantage of agile methodologies to test early
and frequently. Be willing to experiment with self-service systems together with your business users.
—
Finally, remember that a DW/BI system evolves and grows after the first deliverable. Metadata management should help
you down the line with data lineage and impact analysis functionality.
We hope that access to Kimball’s proven reference publication, together with this document’s collection of best practices
for traditional on-premises data warehousing as well as cloud analytics and big data environments, will help you in designing,
building, maintaining and extending a successful analytics solution.