WHITEPAPER
www.persistent.com
© 2017 Persistent Systems Ltd. All rights reserved.
Analytic workload
• Operational reports to understand what is going on at this moment,
based on a granular level of detail.
• Analytical queries (aggregations, slice/dice) for reports and dashboards
that identify critical trends, problems, and outliers, and help find root causes.
• [future release] Big Data: consolidate data from sources such as
curriculum, health, and social
• [future release] Predictive analytics: predict outcomes and provide
recommendations
• Operational reports
Canned report: Get daily attendance of pupils with SEN status from my class
Ad-hoc report: Give me a list of all the pupils in my User Defined Field
‘school choir’ and their contact details
• Analytical report: Get average attendance of pupils with SEN status for a
given year at a monthly grain
• Comparable number of reports of each category
• Most users execute operational reports
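The analytical report example above (average attendance of pupils with SEN status for a given year at a monthly grain) can be sketched as an aggregation query. This is only an illustration: the `pupil` and `attendance` tables, their columns, and the use of SQLite are assumptions made for the sketch and are not taken from the product's data model.

```python
import sqlite3


def monthly_sen_attendance(conn, year):
    """Return {month: average attendance rate} for SEN pupils in `year`.

    Assumes hypothetical tables:
      pupil(pupil_id, has_sen)            -- has_sen is 1 for pupils with SEN status
      attendance(pupil_id, day, present)  -- day is 'YYYY-MM-DD', present is 0 or 1
    """
    rows = conn.execute(
        """
        SELECT strftime('%m', a.day) AS month,
               AVG(a.present)        AS attendance_rate
        FROM attendance AS a
        JOIN pupil AS p ON p.pupil_id = a.pupil_id
        WHERE p.has_sen = 1
          AND strftime('%Y', a.day) = ?
        GROUP BY month
        ORDER BY month
        """,
        (str(year),),
    ).fetchall()
    return {month: rate for month, rate in rows}


def build_demo_db():
    """Create a tiny in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE pupil (pupil_id INTEGER PRIMARY KEY, has_sen INTEGER);
        CREATE TABLE attendance (pupil_id INTEGER, day TEXT, present INTEGER);
        """
    )
    conn.executemany("INSERT INTO pupil VALUES (?, ?)", [(1, 1), (2, 1), (3, 0)])
    conn.executemany(
        "INSERT INTO attendance VALUES (?, ?, ?)",
        [
            (1, "2017-01-09", 1),
            (2, "2017-01-09", 0),
            (3, "2017-01-09", 1),  # not SEN, excluded by the filter
            (1, "2017-02-06", 1),
            (2, "2017-02-06", 1),
            (1, "2016-02-06", 0),  # other year, excluded by the filter
        ],
    )
    return conn


if __name__ == "__main__":
    print(monthly_sen_attendance(build_demo_db(), 2017))
```

The canned and ad-hoc operational reports differ mainly in that they return granular rows (no `GROUP BY`), which is why they carry tighter response-time targets than the analytical report.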
Response times
Depend on the type of query. Simple canned operational reports: < 1 second;
medium: < 2 seconds; complex: < 3 seconds. Dashboards are treated as
multiple parallel simple reports. Analytic reports: under 6 seconds.
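For automated performance checks, the response-time targets above could be encoded as a simple lookup. The following is a minimal sketch: the names are hypothetical, and holding every dashboard panel to the simple-report target is an assumed reading of "treated as multiple parallel simple reports".

```python
# Response-time targets in seconds, taken from the requirements above.
RESPONSE_TARGETS_S = {
    "simple": 1.0,
    "medium": 2.0,
    "complex": 3.0,
    "analytic": 6.0,
}


def meets_target(report_type, elapsed_s):
    """True if a single report finished within its target."""
    return elapsed_s < RESPONSE_TARGETS_S[report_type]


def dashboard_meets_target(panel_times_s):
    """Assumption: panels run in parallel, so the slowest panel decides,
    and each panel is held to the simple-report target."""
    return max(panel_times_s) < RESPONSE_TARGETS_S["simple"]
```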
User scales
10000 active users (10 per school)
1000 connected users (10% of active users)
100 concurrent users for simple reports
10 concurrent users for medium, 5 for complex
1000 schools per deployment unit (see below), each with 6 years of history
Data volumes
Large (not huge): 15 TB per 1000 tenants (schools)
Data velocity
Small: ~830 records/sec (see comments column), which corresponds to
250,000 records every 5 min. This leaves 2/3 of the capacity (10 min) for
activity peaks.
Synchronization with the OLTP product every 15 min; average number of
records per tenant per 15 min = 250
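The data velocity figure can be reproduced from the other numbers in this table. The sketch below assumes 1000 tenants per deployment unit (from the user scales row) and a 5-minute load window inside each 15-minute synchronization cycle; the variable names are illustrative only.

```python
TENANTS = 1000                     # schools per deployment unit
RECORDS_PER_TENANT_PER_SYNC = 250  # average records per tenant per 15 min
SYNC_INTERVAL_MIN = 15             # synchronization period with the OLTP product
LOAD_WINDOW_MIN = 5                # time budget to load one synchronization batch

# Records arriving in one synchronization cycle.
records_per_sync = TENANTS * RECORDS_PER_TENANT_PER_SYNC

# Sustained rate needed to load the batch within the load window.
required_rate_per_sec = records_per_sync / (LOAD_WINDOW_MIN * 60)

# Time left in each cycle to absorb activity peaks.
headroom_min = SYNC_INTERVAL_MIN - LOAD_WINDOW_MIN
headroom_fraction = headroom_min / SYNC_INTERVAL_MIN

if __name__ == "__main__":
    print(records_per_sync, round(required_rate_per_sec), headroom_min, headroom_fraction)
```

The required rate works out to roughly 833 records/sec, which the table rounds to ~830.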
Data variety
Near-term versions: data is well known and comes from the OLTP product
model. Beyond: a small number of external sources (curriculum, health,
social, etc.).
Type of data
Structured for now; after the next version of the product, there may be
some unstructured data (social sources)
Data integration / quality needs
Data transforms, slowly changing dimensions, change data capture, batch pull
Many entities with 1:m relationships that need to be tracked for each
change (SCD)
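Tracking every change to an entity is commonly handled with a Type 2 slowly changing dimension, where each change closes the current row and opens a new one. The source does not say which SCD type is intended, so the sketch below is an assumption, and the `DimRow` fields (a pupil's class membership) are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DimRow:
    """One version of a pupil's class membership (hypothetical dimension)."""
    pupil_id: int
    class_name: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_change(history: List[DimRow], pupil_id: int,
                 class_name: str, effective: date) -> None:
    """Type 2 update: close the current row and append a new version."""
    current = next(
        (r for r in history if r.pupil_id == pupil_id and r.valid_to is None),
        None,
    )
    if current is not None:
        if current.class_name == class_name:
            return  # attribute unchanged, keep the existing version
        current.valid_to = effective  # close the old version
    history.append(DimRow(pupil_id, class_name, effective))


if __name__ == "__main__":
    history: List[DimRow] = []
    apply_change(history, 1, "7A", date(2016, 9, 1))
    apply_change(history, 1, "7B", date(2017, 1, 9))
    for row in history:
        print(row)
```

Keeping closed rows is what allows a report over the 6 years of history to show the value that was valid at the time of each fact, rather than only the latest value.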