

W H I T E P A P E R
16
For instance, a hire date on employees, or a product introduction date, and the business needs to slice and dice by non-calendar date
attributes (in this case, a virtual outrigger is to be set up on a date dimension table, as it plays a different role).
© 2017 Persistent Systems Ltd. All rights reserved. 36
www.persistent.com
—
Dimensions are large, as theremay be space savings,
—
Slicing-dicing by all possible hierarchy attributes is needed (in addition, some BI tools work better with a
snowflaked design), and/or
—
There is a need for other fact tables at a grain corresponding to higher hierarchy levels (as they can hook
into the appropriate level of the snowflake structure; besides, surrogate keys at higher hierarchy level may
be used for aggregating fact data with summary tables or with OLAP cubes–these keys enjoy the same
advantages related to stability under changes just discussed in the previous point).
—
The hierarchy is ragged, i.e., where the number of levels is variable and of the same type of data (as in
organization structures or in a parts hierarchy). In this case, a bridge table (a form of snowflaking) works
best, if you need to both restrict facts based on dimension data and summarize them based on the
hierarchical recursive relationship (see
,page 269).
[1]—
Another exception is what is called an “outrigger dimension”, which is a dimension table joined to another
16dimension tabl
e, i.e., another form of snowflaking. A discussion when it makes sense to use outriggers
can be read in
,page 267.
[1]—
When a fact measurement is associated with multiple values of the same dimension (even at the lowest level
of granularity), it is necessary to have
a bridge table to link the fact and the dimension
. Examples of open-
ended multiple values, such as a variable number of diagnoses for the same hospital bill line item can’t be
resolved directly on the fact table.
—
To satisfy lookups and joins, it is recommended to
have the dimension columns of the Fact table declared
as NOT NULL. This also means that each dimension should have a Null value as a row
. Generally, a row
with key -1, and value as “undefined” or “unknown” is inserted in the Dimension table, and the -1 is the
corresponding value in the related Fact table record, to make sure the lookup/join from fact to dimension is
correctly satisfied. NULLs are also not very intuitive for SQLsummary and ranking operations.
—
Historization/Slowly Changing Dimension: Kimball
,page 257 also has a great and educative discussion on
[1]SCD dimensions. There are multiple SCDs,
the most common of them is the Type 2
, which provides both
the newer and older entry, with an indicator for which is the most current record, and the begin/end validity
dates. See also the discussion on subsystem9 in the sectio
nbelow, as well as the discussion about mini-
5.3.5dimensions in subsystem12.
—
Special Types of Fact tables: These two kinds of facts have always been constructed where needed, in
addition to the basic transaction fact, in response to a report/query requirement.
—
Periodic Snapshot Fact Tables:
At the periodicity of a day, week or month, a snapshot is captured for
the fact table, with regular repeating measurement or set of measurements
. This is common in the
financial industry for monthly balances, in ERP and retail systems for inventory balances, and in the travel
industry for bookings. Both
en-masse
, end of period loads (adding rows per period) and continuous, rolling
loads (updating the current period rows) may be implemented.
—
Accumulating Snapshot Fact Tables: For processes with many states with a well-defined beginning and
end (e.g., added to cart, paid, shipped, delivered, returned), the lowest grain transaction fact would have
multiple records with different dates for the same order.
The accumulating snapshot fact has the
discrete states listed with dates
to enable queries like average interval between delivery dates and
return dates.