
W H I T E P A P E R

© 2017 Persistent Systems Ltd. All rights reserved.

www.persistent.com

The way the paragraph is written can mislead the reader into believing that 75% of the time is spent after development, and that a DW/BI project must be executed in the waterfall style depicted. Modern agile methodologies include test-driven development, which stresses writing unit tests early and running them often through automated processes; while this happens, QA developers can focus on developing test datasets and end-to-end system tests early enough as well. At each sprint, a tested and, if possible, end-to-end increment of reduced scope should be delivered.

But we believe nevertheless that Kimball has a point. Data management for analytics is all about complex data and systems integration, cleansing bad data, loading and managing large data volumes, all of which are hard to get right; and besides, the true measure of success in a DW/BI project is business user acceptance of the deliverables to improve their decision-making. All of this makes testing and validation challenges much harder than in a traditional application. In our experience, the effort to acquire, integrate and provide high quality data has always been more demanding than anticipated.

To tackle this challenge, the recommendation from a technical architecture point of view is to seriously invest in testing automation. This means:

• Use testing tools

• Automate the test environment to run both unit tests and system test suites

• Log results, compare against known correct results, and publish reports on these results

• Invest in performance and load testing tools

• Use a bug tracking system and use BI reports and metrics to track deployment readiness as well as user-reported bugs
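As an illustration of the "log results, compare against known correct results, and publish reports" recommendation, the following is a minimal Python sketch, not a prescribed tool. The names `Check`, `Outcome`, `run_checks` and `build_report`, as well as the check names and values, are hypothetical; in a real suite each `fetch_actual` callable would presumably run a query against the loaded warehouse.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, List

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dw_regression")


@dataclass
class Check:
    """One automated check: a computed value versus a known correct value."""
    name: str
    fetch_actual: Callable[[], Any]  # hypothetical: would run SQL against the warehouse
    expected: Any                    # known correct result, e.g. kept under version control


@dataclass
class Outcome:
    name: str
    passed: bool
    actual: Any
    expected: Any


def run_checks(checks: List[Check]) -> List[Outcome]:
    """Run every check, log each result, and return the outcomes."""
    outcomes = []
    for check in checks:
        actual = check.fetch_actual()
        passed = actual == check.expected
        log.info("%s: %s", check.name, "PASS" if passed else "FAIL")
        outcomes.append(Outcome(check.name, passed, actual, check.expected))
    return outcomes


def build_report(outcomes: List[Outcome]) -> str:
    """Summarize the run as plain text suitable for publishing."""
    failed = [o for o in outcomes if not o.passed]
    lines = [f"{len(outcomes) - len(failed)}/{len(outcomes)} checks passed"]
    for o in failed:
        lines.append(f"FAIL {o.name}: expected {o.expected!r}, got {o.actual!r}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-in values (assumptions); a real suite would query the loaded warehouse.
    suite = [
        Check("fact_sales_row_count", lambda: 3, 3),
        Check("fact_sales_total_amount", lambda: 99.5, 100.0),
    ]
    print(build_report(run_checks(suite)))
```

Keeping the expected values alongside the checks means the same suite can be rerun unchanged after every load, and the text report can be attached to whatever deployment-readiness tracking is in use.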

Details on the testing that needs to be carried out for alpha and beta testing of the DW/BI solution are summarized in section 5.3.6. And, as in the previous section, the deployment activities centered on the dimensional model, data cleansing, and performance and security topics are to be found in the deployment sections of the corresponding chapters.

The remaining aspects not developed in this document are related to lifecycle processes: backing up, recovering, physical database maintenance and automated system operation. These aspects are covered in [1], chapter 13, pages 567-573.

4.4 Enhancements to the reference publication

4.4.1 Technical Architecture in our own experience

The following are the key decisions that must be made while crafting the ETL architecture and the presentation server for a dimensional data warehouse. These decisions have significant impacts on the upfront and ongoing cost and complexity of the ETL solution and, ultimately, on the success of the overall BI/DW solution. General advice and best practices involved in making them are highlighted in boldface. Needless to say, in many respects we agree with Kimball, but we have striven to color our presentation with our own experience.