

W H I T E P A P E R
c. Performance of reports and dashboards (under 3 seconds for transactional reports, about twice that for
analytic reports) was also critical due to the nature of the business, and all data had to be reflected in near
real-time (with at most a 15-minute delay).
d. Data volume was close to 15 TB per 1,000 tenants (schools); the public cloud provider was Microsoft
Azure.
—
Solution
a. The customer had initially decided to build an Operational Data Store (ODS) type of data warehouse.
Initial response times for transactional reports substantially exceeded the requested performance targets,
due to the large number of joins between dimension tables in the normalized schema.
b. PSL proposed not to build a separate ODS but to follow a dimensional, de-normalized model that serves
both transactional reports as of a particular effective date and analytic reports.
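A minimal sketch of this idea, with sqlite3 standing in for the warehouse engine (the table and column names below are illustrative, not the customer's actual schema): descriptive attributes are flattened into a single de-normalized fact table, so a dated transactional report and an analytic rollup are both answered from the same table with no dimension joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_enrollment (
        tenant_id      INTEGER,
        effective_date TEXT,     -- ISO date of the snapshot
        school_name    TEXT,     -- de-normalized dimension attribute
        grade_level    TEXT,     -- de-normalized dimension attribute
        enrolled_count INTEGER
    )
""")
conn.executemany("INSERT INTO fact_enrollment VALUES (?,?,?,?,?)", [
    (1, "2017-03-01", "Lincoln High", "G9",  120),
    (1, "2017-03-01", "Lincoln High", "G10", 110),
    (1, "2017-03-02", "Lincoln High", "G9",  122),
])

# Transactional report: the state of affairs on one effective date,
# answered without any joins.
transactional = conn.execute("""
    SELECT grade_level, enrolled_count
    FROM fact_enrollment
    WHERE tenant_id = 1 AND effective_date = '2017-03-01'
    ORDER BY grade_level
""").fetchall()

# Analytic report: a trend across effective dates, from the same table.
analytic = conn.execute("""
    SELECT effective_date, SUM(enrolled_count)
    FROM fact_enrollment
    WHERE tenant_id = 1
    GROUP BY effective_date
    ORDER BY effective_date
""").fetchall()
print(transactional)  # [('G10', 110), ('G9', 120)]
print(analytic)       # [('2017-03-01', 230), ('2017-03-02', 122)]
```

The trade-off is the usual one for de-normalization: wider rows and some redundancy in exchange for join-free reads at report time.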
c. PSL proposed to build aggregate tables and to use the aggregate-aware feature in reporting tools such
as SAP BusinessObjects or IBM Cognos.
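The mechanism can be sketched as follows (illustrative names; sqlite3 stands in for the warehouse, and a small routing function mimics what an aggregate-aware BI tool does automatically): an aggregate table is precomputed at a coarser grain, and queries whose grain it satisfies are answered from it instead of the detail fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_attendance (
        tenant_id INTEGER, school TEXT, day TEXT, present INTEGER
    );
    INSERT INTO fact_attendance VALUES
        (1, 'Lincoln High', '2017-03-01', 400),
        (1, 'Lincoln High', '2017-03-02', 410),
        (1, 'Adams Middle', '2017-03-01', 300);
    -- Aggregate table at the (tenant, school) grain, refreshed in batch.
    CREATE TABLE agg_attendance_by_school AS
        SELECT tenant_id, school, SUM(present) AS present_total
        FROM fact_attendance
        GROUP BY tenant_id, school;
""")

def attendance_by_school(tenant_id, use_aggregate=True):
    """Route the query to the aggregate table when its grain suffices;
    an aggregate-aware BI tool performs this rewrite transparently."""
    if use_aggregate:
        sql = ("SELECT school, present_total FROM agg_attendance_by_school "
               "WHERE tenant_id = ? ORDER BY school")
    else:
        sql = ("SELECT school, SUM(present) FROM fact_attendance "
               "WHERE tenant_id = ? GROUP BY school ORDER BY school")
    return conn.execute(sql, (tenant_id,)).fetchall()

# Both paths return the same answer; the aggregate path scans far fewer rows.
print(attendance_by_school(1, use_aggregate=True))
print(attendance_by_school(1, use_aggregate=False))
# [('Adams Middle', 300), ('Lincoln High', 810)]
```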
d. PSL conducted a workshop to help the customer select adequate database platforms and analytic tools
designed for high performance, such as MPP DBMSs, and also investigated in-memory databases,
columnar databases, and DW appliances. Given the customer's future plans to handle unstructured
and semi-structured data, PSL also investigated big data stores such as Hadoop with MapReduce and Spark,
as well as NoSQL databases.
e. PSL recommended using an MPP database engine with a shared-schema approach, partitioning by
tenant-id and a suitable indexing scheme to make reports perform, keeping the following requirements in mind:
i. Analytic application with a terabyte-size data mart
ii. Intense queries not following the dimensional pattern
iii. Linear scalability
The customer has not made the final choice yet.
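The shared-schema idea can be sketched as below (table and index names are illustrative; SQLite has no MPP partitioning, so a composite index led by tenant_id stands in for partition pruning): all tenants share one table, and leading every index with tenant_id keeps each tenant's report confined to its own slice of the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_grades (
        tenant_id INTEGER, student_id INTEGER,
        effective_date TEXT, score REAL
    )
""")
# Three tenants' rows interleaved in the one shared table.
conn.executemany(
    "INSERT INTO fact_grades VALUES (?,?,?,?)",
    [(t, s, "2017-03-01", 70.0 + s) for t in (1, 2, 3) for s in range(5)],
)
# Composite index keyed by tenant first, then the report's filter column.
conn.execute("CREATE INDEX ix_grades_tenant_date "
             "ON fact_grades (tenant_id, effective_date)")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT AVG(score) FROM fact_grades
    WHERE tenant_id = 2 AND effective_date = '2017-03-01'
""").fetchall()
print(plan)  # a SEARCH via ix_grades_tenant_date, not a full-table scan

avg = conn.execute("""
    SELECT AVG(score) FROM fact_grades
    WHERE tenant_id = 2 AND effective_date = '2017-03-01'
""").fetchone()[0]
print(avg)  # 72.0 -- only tenant 2's rows (scores 70..74) are averaged
```

In a real MPP engine the same tenant_id key would also drive data distribution across nodes, which is what makes the approach scale linearly with tenant count.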
7.3.2 Security best practices based on our own experience
—
Access control on DW data contained in dimensional model entities
—
This approach is the most widely used and is common when the security provided by the DB engine is
sufficient to control access (which might not always be the case, as explained below). It is preferred to the
pre-defined security frameworks provided by BI tools, since those are proprietary in nature and can only be
used for the tool in question.
—
The security approach that controls access to all dimension, fact, and summary tables via views is the
simplest and the one we recommend most, because of its flexibility in assigning user roles (provided
by BI tools or DB roles) to different sub-systems (or subject areas, or data marts) of the DW. This process
involves identifying all data objects that will be exposed to end-users, and classifying data tables
and fields according to user roles and access levels.
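A minimal sketch of the view-based approach, assuming illustrative role and column names (sqlite3 again stands in for the warehouse engine): the base table is never exposed, and each role receives a view restricted to the rows and columns its access level allows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_student (
        tenant_id INTEGER, student_id INTEGER,
        name TEXT, grade TEXT, ssn TEXT  -- ssn is a sensitive field
    );
    INSERT INTO dim_student VALUES
        (1, 10, 'Ada',  'G9',  '111-11-1111'),
        (1, 11, 'Bob',  'G10', '222-22-2222'),
        (2, 20, 'Cleo', 'G9',  '333-33-3333');

    -- Teacher role: own tenant only, sensitive columns omitted.
    CREATE VIEW v_student_teacher_t1 AS
        SELECT student_id, name, grade
        FROM dim_student WHERE tenant_id = 1;

    -- Admin role: all columns, still tenant-scoped.
    CREATE VIEW v_student_admin_t1 AS
        SELECT * FROM dim_student WHERE tenant_id = 1;
""")

teacher_rows = conn.execute(
    "SELECT * FROM v_student_teacher_t1 ORDER BY student_id").fetchall()
print(teacher_rows)   # [(10, 'Ada', 'G9'), (11, 'Bob', 'G10')]

# The sensitive column simply does not exist in the teacher's view:
cols = [d[0] for d in conn.execute(
    "SELECT * FROM v_student_teacher_t1").description]
print(cols)           # ['student_id', 'name', 'grade']
```

In a production DW, grants on these views would be tied to the DB roles (or BI-tool roles) identified during the classification step, so the same base tables safely serve every subject area.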