

W H I T E P A P E R
© 2017 Persistent Systems Ltd. All rights reserved.
www.persistent.com
As a practice, we continue to recommend the use of modeling tools. However, when a tool's limits are
encountered, it may be better to stay away from physical modeling. Typically, either the tool version cannot
model newer physical constructs in the database (such as newer data types), or it is not equipped to deal with
newer databases (such as HPE Vertica).
—
Shared Data Architect/Modeler and ETL Architect roles.
In smaller projects or departmental data marts, the ETL Architect and Data Architect roles are often played by
the same person. We recommend that this person explicitly recognize the dual role. In an Agile project, the Data
Architect/Modeler is usually 1-2 sprints ahead of the ETL Architect, and the task plan should reflect this.
Because the “producer” and “consumer” of the model are then the same person, we also recommend a second set of
eyes to review the work. The BI Architect (responsible for the work in the BI tools) is a good candidate: this
person usually consumes the work of both the Data Architect and the ETL Architect, with a focus on the
performance of the front-end visualizations and reports.
—
Design the presentation layer database structures to suit the report refresh intervals, taking into
consideration different reporting patterns.
In our experience, we have seen reports of the following kinds:
—
Existence: Did an audience of a certain demographic attend an event?
—
Comparison: How does the revenue for a particular demographic compare to another demographic?
—
Ratios/rankings/clusters: How does the ratio for a particular month rank against the annual figure?
—
Statistical: Top N earning events
—
Point-in-time Target/Order fill chart: For a particular date in the future, what is the order fill chart as of today?
—
Rate/Velocity: What is the rate of booking compared to the rate of booking last year?
—
Time Trend (MoM, YoY): What are the Year on Year audience numbers?
—
Correlation: Increase/decrease over a (non-date) dimension, usually a numeric or scaled dimension (for example, age or distance).
—
CrossTab: A plot of the values of one dimension over the discrete values of another dimension.
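To make two of the patterns above concrete, the following is a minimal sketch of a "Statistical" (Top N) report and a "Time Trend" (YoY) report. It assumes a hypothetical bookings table with event, year, audience, and revenue columns; the table, its columns, the sample rows, and the function names are illustrative assumptions and do not come from any actual project schema. SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical sample data: one row per event per year.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (event TEXT, year INTEGER, audience INTEGER, revenue REAL)"
)
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?, ?)",
    [
        ("concert", 2016, 900, 45000.0),
        ("concert", 2017, 1100, 60500.0),
        ("expo", 2016, 400, 12000.0),
        ("expo", 2017, 380, 13300.0),
        ("gala", 2017, 150, 30000.0),
    ],
)


def top_n_events(conn, year, n):
    """Statistical pattern: the top N earning events for one year."""
    return conn.execute(
        "SELECT event, revenue FROM bookings WHERE year = ? "
        "ORDER BY revenue DESC LIMIT ?",
        (year, n),
    ).fetchall()


def audience_by_year(conn):
    """Time trend pattern: total audience per year plus the YoY change."""
    totals = conn.execute(
        "SELECT year, SUM(audience) FROM bookings GROUP BY year ORDER BY year"
    ).fetchall()
    trend = []
    previous = None
    for year, total in totals:
        # The first year has no prior year to compare against.
        change = None if previous is None else total - previous
        trend.append((year, total, change))
        previous = total
    return trend
```

Both queries read across many rows to produce each output row, which is one reason the shape of the presentation layer tables matters for such reports.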
Some of these complex reports require a presentation layer that has to be refreshed fully, not just
incrementally. A full refresh renders a dashboard or report unusable for the duration of the refresh. To
remedy this, it is sometimes necessary to maintain two sets of tables: one that serves reporting while the
other is being refreshed. This consideration can also be the key to deciding whether an OLAP layer is needed or
whether the presentation layer tables can be created through ETL.