Table of Contents Table of Contents
Previous Page  35 / 96 Next Page
Information
Show Menu
Previous Page 35 / 96 Next Page
Page Background

W H I T E P A P E R

© 2017 Persistent Systems Ltd. All rights reserved. 35

www.persistent.com

5.3 Best Practices

This section lists some best practices from the reference book, from a dimensional data modeling perspective.

They appear in the sequel in

boldface.

5.3.1 Dimension DataModeling - Project definition stage

When the project is being launched and the core team is lined up, in addition to the classic roles of project manager,

data architect, ETL developer, BI developer, etc., a best practice is to

make sure there is a Data Steward

(

, p.

[1]

35).Acommon curse is departmental data silos, where every department has their own definition and interpretation

of data entities, and transactional events, no one’s data ties to anyone else, and the result is anarchy. Data

stewards should be empowered to establish, publish and enforce common definitions, rules and permissible

values for data for all information that is of cross-organizational interest in the enterprise. It is important to make

sure data stewards work with both business users and IT team members, and are well supported, as this is a very

politically challenging role. Down the line, together with the QA team, they will also identify data quality errors and

drive them to resolution (whichmay imply revisiting the definitions and the rules).

Make sure you

drive priorities based on business goals, and that they are balanced with delivery feasibility

assurance from IT architects. Set up realistic expectations with all stake-holders.

5.3.2 Dimension DataModeling - Requirements Interview

Document the high-level Enterprise Data Warehouse in the preliminary EDWbus matrix

(see section

)

as

4.1

an outcome of the interview process, as this practice helps come up with common dimensions that apply to several

business processes.

The best practice of designing conformed dimensions based on the bus matrix

is of

immense importance, as it serves as a data architecture blueprint to ensure that the DW/BI data can be integrated

and extended across the organization over time.

5.3.3 Dimension DataModeling - Logical

These are the logical dimensional datamodeling best practices from the reference book

. [1]

Derive dimensional models from workshops with business users and IT representatives

rather than

from an isolated designer sitting in an ivory tower. Engaging subject matter experts from the business is critical

to designing appropriate dimensional models.

Creation of durable Surrogate/Supernatural Key

for item keys that are less likely to change but,

surprisingly, do change (ISBN 13 is an example). Surrogate keys are created for primary keys in dimensions.

This is a useful concept, even more so in the world of rapid data growth/collection, and mergers and

acquisitions. See also the discussion on subsystems 10 and 14 in the section

below.

5.3.5

In the case of dimension hierarchies,

when the conditions below are met, a snowflake model may be

evaluated as a possible fit. Otherwise, the recommendation is to prefer de-normalized dimensions,

i.e., a star model.

The discussion in

,

pages 268 and 339 about dimension hierarchies and outriggers is

[1]

excellent for the practitioner and very applicable. We often encounter fixed hierarchies in the form of time-date,

and geography (city, county, district, province, state, or region). Snowflake models show each hierarchical

level separated into its own table, and this is acceptable when either