

W H I T E P A P E R
© 2017 Persistent Systems Ltd. All rights reserved.
www.persistent.com
2.1 Reference publication: Kimball's Data Warehousing Lifecycle
The reference publication we selected is “The Data Warehouse Lifecycle Toolkit”, 2nd edition, by Ralph Kimball, Margy
Ross, Warren Thornthwaite, Joy Mundy and Bob Becker [1]. This second edition is from 2008 and significantly
updates and reorganizes the first edition, published 9 years before. It is a very practical field guide for designers,
managers and owners of a data warehousing / business intelligence (DW/BI) system. It is well known to PSL
architects from the Analytics practice.
The book describes a coherent framework, the “Kimball Lifecycle”, that runs from the original scoping and
planning of an overall enterprise DW/BI system, through the detailed steps of developing and deploying it, to the final
steps of planning the next phases. The Lifecycle diagram below depicts the sequence of high-level tasks required for
effective DW/BI requirements gathering, design, development, and deployment.
Figure 1: The Kimball Lifecycle diagram
After a planning and requirements gathering phase, the design and development activities are carried out on three
parallel tracks: a technology track, a data track and a BI track. The data track shows the journey of data from the
sources to the target environment: it is extracted, transformed and loaded from the sources into the target
environment so that it complies with a data model exposed to the BI applications. For reasons explained in chapter
5 below, the preferred model for analysis follows a dimensional model design, and it is also managed at the physical
level for optimal performance and other non-functional requirements. These tracks converge at deployment, followed
by maintenance activities and planning for growth.
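To make the data track more concrete, the following is a minimal sketch of an extract, transform and load step that populates a small dimensional model. It is an illustration under stated assumptions, not a method prescribed by the book: the source rows, the table names (dim_customer, dim_product, fact_sales) and the helper functions are all hypothetical, and an in-memory SQLite database stands in for the target environment.

```python
import sqlite3

# Hypothetical source rows standing in for an operational system extract.
SOURCE_ORDERS = [
    {"order_id": 1, "customer": " Acme Corp ", "product": "widget", "amount": "120.50"},
    {"order_id": 2, "customer": "Globex", "product": "gadget", "amount": "75.00"},
    {"order_id": 3, "customer": "acme corp", "product": "widget", "amount": "30.25"},
]


def extract():
    """Extract: read raw rows from the (hypothetical) source."""
    return list(SOURCE_ORDERS)


def transform(rows):
    """Transform: normalize names and convert amounts to numbers."""
    return [
        {
            "order_id": r["order_id"],
            "customer": r["customer"].strip().title(),
            "product": r["product"].strip().lower(),
            "amount": float(r["amount"]),
        }
        for r in rows
    ]


def surrogate_key(conn, table, name):
    """Return the dimension row id for name, inserting the row if needed."""
    # table is always one of the fixed names used in load(), never user input.
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    return row[0]


def load(conn, rows):
    """Load: write dimension rows first, then facts that reference them."""
    conn.executescript(
        """
        CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_product (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE fact_sales (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES dim_customer(id),
            product_id INTEGER REFERENCES dim_product(id),
            amount REAL
        );
        """
    )
    for r in rows:
        customer_id = surrogate_key(conn, "dim_customer", r["customer"])
        product_id = surrogate_key(conn, "dim_product", r["product"])
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (r["order_id"], customer_id, product_id, r["amount"]),
        )
    conn.commit()


def sales_by_customer(conn):
    """A BI-style query: total sales per customer, via the dimension join."""
    return conn.execute(
        """
        SELECT c.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.id = f.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load(connection, transform(extract()))
    print(sales_by_customer(connection))
```

In this sketch the transform step is what makes the two differently spelled customer names resolve to a single dimension row, so the BI query aggregates them together. A production ETL system would add concerns this example leaves out, such as incremental loads and change tracking in dimensions.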
It is important to note that the notion of data marts is also covered in this publication: it is now referred to as a “business
process dimensional model” (see Glossary). The reason for this change is that the authors consider that the term
“data mart”, in their own words, “has been marginalized by others to mean summarized departmental, independent
non-architected datasets”.