

W H I T E P A P E R
© 2017 Persistent Systems Ltd. All rights reserved. 72
www.persistent.com
7.2.3 Performance at Dimensional DWModel design stage
—
Design the DW model in such a way that it provides a balance between requirements of slice/dice and filters
across dimensions and fact data, and the need for preserving history of changes
(page 262, 270, 332, 338,
[1]339). We review the three patterns introduced in section
froma performance point of view hereunder:
5.2—
Star Schemas allows for history to be tracked individually on dimension tables, as well as slicing/dicing by
dimensions. De-normalization of dimensions is done to avoid capturing too many minuscule relationships.
This is generally themost efficient froma performance angle.
—
Snow-flake model (normalized or 3NF) works best if you want to slice-dice by all possible relationships.
Other condition where this may apply is in the case of ragged dimension hierarchies and outrigger
dimensions, as explained in section
.However, due to too many joins, the database view complexity
5.3.3goes up and performance is slower. If this is really a requirement, then assess the performance impact
before designing, as the space saving may not be substantial (between 1 to 3% as compared to de-
normalization –
page 267).
[1]—
Fully de-normalized fact table with dimension attributes and no dimension tables (degenerate
dimensions) – this may be the most simplistic design approach but may be difficult to maintain in the long
run. If any attribute value changes (SCD Type-I or SCD Type-II), it may warrant updating the whole fact
table which is costly from performance point. Also, if there is a need to show a filter in reports to choose
values from, then it would require creating additional views on the fact table with DISTINCT clause which is
inefficient. If slicing/dicing by dimensions is not a key requirement, then this pattern is best from
performance as there are no or minimal joins or indexes lookups. Degenerate dimensions are normal and
helpful too.
—
Design aggregate tables to speed-up dashboards and score-cards which only show summary data. The
aggregations are used to delivery fast query responses. Listed below are 3 aggregation patterns deriving from
[1] .For purpose of discussion, we shall use a dimension model with Attendance as the main fact, and the
Dimensions as Date, Promotion, Showing, Theatre, andMovie.
—
Lost dimensions/ Summaries of Fact tables are created by only using required dimensions. Data from
other dimensions is “lost”. For instance, a daily sum of earnings by theatre is a case of lost dimensions as
the dimensions Promotions, Showing, andMovie do not matter.
—
Rolled up dimension: Aggregations is often used with dimensions that are hierarchical in the business
world. Some common aggregation dimensions would be date and geographies. However, some business
specific aggregations follow dimensions of fixed hierarchies. These dimensions are also called a
Shrunken Rollup dimension.
—
Collapsed dimension (de-normalized): Columns that are part of dimensions usually get reproduced as is in
the de-normalized structure that has the facts and the dimension attributes.