WHITEPAPER

© 2017 Persistent Systems Ltd. All rights reserved.

www.persistent.com

Conclusion

In this document, we have attempted to compile a collection of best practices for acquiring, integrating, modeling and governing data from a growing number of sources for analytics opportunities. We picked the best reference publication available, Kimball’s data warehousing lifecycle toolkit [1], and summarized it along what we believe are the four most important transversal topics: dimensional data modeling, data quality, performance and security, and technical architecture. Each of these topics is briefly presented in its own chapter, followed by a summary of its treatment through the requirements gathering, design, development, and deployment phases of Kimball’s lifecycle. Each summary serves as a high-level view of the reference document from the vantage point of the topic in question.

We have gone beyond Kimball’s traditional data warehousing architecture by recognizing that the mission of analytics has expanded into three important areas not covered by Kimball’s reference publication: cloud analytics deployments, big data, and the use of self-service tools together with agile development techniques. We believe these will be key drivers for any data management endeavor today. Even though some of these trends are still evolving, we have attempted to extend the list of traditional data warehousing best practices with our own recent experience in these new areas, in addition to our experience in traditional warehousing projects.

Our key recommendations are:

The true measure of success in a DW/BI project is business user acceptance of the deliverables to improve their decision-making. Thus, strive to collaborate with your business users at every step of the lifecycle.

Leverage conformed dimensions as the basis for integration; this remains true regardless of the deployment model and the database technology. Understand and embrace dimensional design as the organizing theme for the data within the DW/BI environment.

Think about performance, scalability, security, usability and other non-functional aspects right from requirements and design: these aspects cannot be treated as an afterthought. Make sure you understand the tradeoffs in the new cloud and big data environments.

Pay special attention to data quality. Many data projects fail for lack of attention to this topic. The list of available technologies and services has expanded, so don’t reinvent the wheel. Also make sure you understand what high-quality data means in a big data environment.

Embrace agile methodologies, but beware of extracting a limited set of source data or focusing on a narrowly defined set of business requirements in a vacuum. A deliverable may be built quickly to claim success, but it should also be usable by other groups and integrated with other analytics.

Take testing very seriously: invest in test automation early, and take advantage of agile methodologies to test early and frequently. Be open to experimenting with self-service systems together with your business users.

Finally, remember that a DW/BI system evolves and grows after the first deliverable. Metadata management should help you down the line with data lineage and impact analysis functionality.

We hope that having access to Kimball’s proven reference publication, together with this document’s collection of best practices for traditional on-premises data warehousing as well as cloud analytics and big data environments, will help you in designing, building, maintaining and extending a successful analytics solution.