
W H I T E P A P E R

© 2017 Persistent Systems Ltd. All rights reserved.

www.persistent.com

2.1 Reference publication: Kimball's Data Warehousing Lifecycle

The reference publication we selected is “The Data Warehouse Lifecycle Toolkit”, 2nd edition, by Ralph Kimball, Margy Ross, Warren Thornthwaite, Joy Mundy and Bob Becker [1]. This second edition is from 2008 and significantly updates and reorganizes the first edition, published 9 years before. It is a very practical field guide for designers, managers and owners of a data warehousing / business intelligence (DW/BI) system. It is well known to PSL architects from the Analytics practice.

The book describes a coherent framework, the “Kimball Lifecycle”, that runs from the original scoping and planning of an overall enterprise DW/BI system, through the detailed steps of development and deployment, to the final steps of planning the next phases. The Lifecycle diagram below depicts the sequence of high-level tasks required for effective DW/BI requirements gathering, design, development, and deployment.

Figure 1: The Kimball Lifecycle diagram

After a planning and requirements gathering phase, the design and development activities are carried out on three parallel tracks: a technology track, a data track and a BI track. The data track shows the journey of data from the sources to the target environment: it is extracted from the sources, transformed, and loaded into the target environment so that it complies with the data model exposed to the BI applications. For reasons explained in chapter 5 below, the preferred model for analysis follows a dimensional design, and it is also managed at the physical level to meet performance and other non-functional requirements. These tracks converge at deployment, followed by maintenance activities and planning for growth.
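To make the data track concrete, the following is a minimal sketch of an extract, transform and load step that populates a small dimensional model. It is not taken from the reference publication: the source rows, the table names (dim_customer, dim_product, fact_sales) and the helper functions are hypothetical, and an in-memory SQLite database stands in for the target environment.

```python
import sqlite3

# Hypothetical extract from an operational source system.
SOURCE_ORDERS = [
    {"order_id": 1, "customer": " Acme", "product": "Widget", "amount": 120.0},
    {"order_id": 2, "customer": "Acme", "product": "Gadget", "amount": 80.0},
    {"order_id": 3, "customer": "Globex", "product": "Widget", "amount": 200.0},
]


def build_star_schema(conn):
    """Create two dimension tables and one fact table (illustrative names)."""
    conn.executescript("""
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE fact_sales (
            order_id     INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            product_key  INTEGER REFERENCES dim_product(product_key),
            amount       REAL
        );
    """)


def surrogate_key(conn, table, key_col, name):
    """Insert the dimension member if it is new, then return its surrogate key."""
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(
        f"SELECT {key_col} FROM {table} WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def load(conn, rows):
    """Transform each source row (trim names, resolve keys) and load the fact."""
    for r in rows:
        ck = surrogate_key(conn, "dim_customer", "customer_key", r["customer"].strip())
        pk = surrogate_key(conn, "dim_product", "product_key", r["product"].strip())
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (r["order_id"], ck, pk, r["amount"]),
        )
    conn.commit()


def sales_by_customer(conn):
    """A typical BI query: aggregate the fact table by a dimension attribute."""
    return conn.execute("""
        SELECT c.name, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_customer c ON c.customer_key = f.customer_key
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()
```

Calling build_star_schema and load on a connection opened with sqlite3.connect(":memory:"), and then sales_by_customer, returns one total per customer. The BI side only ever queries the fact and dimension tables, which is the sense in which the data model is "exposed to the BI applications".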

It is important to note that the notion of data marts is also covered in this publication, where it is now referred to as a “business process dimensional model” (see Glossary). The reason for this change is that the authors consider that the term “data mart”, in their own words, “has been marginalized by others to mean summarized departmental, independent non-architected datasets”.