

W H I T E P A P E R
© 2017 Persistent Systems Ltd. All rights reserved. 35
www.persistent.com
5.3 Best Practices
This section lists some best practices from the reference book, from a dimensional data modeling perspective.
They appear in the sequel in
boldface.
5.3.1 Dimension DataModeling - Project definition stage
When the project is being launched and the core team is lined up, in addition to the classic roles of project manager,
data architect, ETL developer, BI developer, etc., a best practice is to
make sure there is a Data Steward
(, p.
[1]35).Acommon curse is departmental data silos, where every department has their own definition and interpretation
of data entities, and transactional events, no one’s data ties to anyone else, and the result is anarchy. Data
stewards should be empowered to establish, publish and enforce common definitions, rules and permissible
values for data for all information that is of cross-organizational interest in the enterprise. It is important to make
sure data stewards work with both business users and IT team members, and are well supported, as this is a very
politically challenging role. Down the line, together with the QA team, they will also identify data quality errors and
drive them to resolution (whichmay imply revisiting the definitions and the rules).
Make sure you
drive priorities based on business goals, and that they are balanced with delivery feasibility
assurance from IT architects. Set up realistic expectations with all stake-holders.
5.3.2 Dimension DataModeling - Requirements Interview
Document the high-level Enterprise Data Warehouse in the preliminary EDWbus matrix
(see section
)as
4.1an outcome of the interview process, as this practice helps come up with common dimensions that apply to several
business processes.
The best practice of designing conformed dimensions based on the bus matrix
is of
immense importance, as it serves as a data architecture blueprint to ensure that the DW/BI data can be integrated
and extended across the organization over time.
5.3.3 Dimension DataModeling - Logical
These are the logical dimensional datamodeling best practices from the reference book
. [1]—
Derive dimensional models from workshops with business users and IT representatives
rather than
from an isolated designer sitting in an ivory tower. Engaging subject matter experts from the business is critical
to designing appropriate dimensional models.
—
Creation of durable Surrogate/Supernatural Key
for item keys that are less likely to change but,
surprisingly, do change (ISBN 13 is an example). Surrogate keys are created for primary keys in dimensions.
This is a useful concept, even more so in the world of rapid data growth/collection, and mergers and
acquisitions. See also the discussion on subsystems 10 and 14 in the section
below.
5.3.5—
In the case of dimension hierarchies,
when the conditions below are met, a snowflake model may be
evaluated as a possible fit. Otherwise, the recommendation is to prefer de-normalized dimensions,
i.e., a star model.
The discussion in
,pages 268 and 339 about dimension hierarchies and outriggers is
[1]excellent for the practitioner and very applicable. We often encounter fixed hierarchies in the form of time-date,
and geography (city, county, district, province, state, or region). Snowflake models show each hierarchical
level separated into its own table, and this is acceptable when either