
- To define calculations, filters, projections and/or aggregations, in order to implement a logical model suitable for an analytic application, and
- To be able to do this on a cloud or an on-premises deployment.
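As a rough illustration of what such a logical model definition might look like in practice, the sketch below uses pandas; the sales dataset, column names and business rules are hypothetical and simply stand in for whatever sources and rules a given analytic application needs.

```python
import pandas as pd

# Hypothetical raw sales extract; in a real tool this would come from a
# discovered dataset rather than a hand-built frame.
raw = pd.DataFrame({
    "region":     ["EMEA", "EMEA", "APAC", "AMER"],
    "status":     ["closed", "open", "closed", "closed"],
    "quantity":   [10, 5, 8, 12],
    "unit_price": [99.0, 99.0, 120.0, 80.0],
})

# Calculation: derive a revenue measure from existing columns.
raw["revenue"] = raw["quantity"] * raw["unit_price"]

# Filter: keep only closed deals (an assumed business rule).
closed = raw[raw["status"] == "closed"]

# Projection: retain only the columns the analytic application needs.
projected = closed[["region", "revenue"]]

# Aggregation: roll revenue up to the grain of the logical model.
logical_model = projected.groupby("region", as_index=False)["revenue"].sum()

print(logical_model)
```

The same sequence of steps could be defined against a cloud data store or an on-premises database; only the execution environment changes, not the logical definition.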

In order for a business user to do this effectively, these environments must support a variety of other services:

- Discovery of datasets, at least of data behind the firewall in on-premises deployments and in the underlying repository in cloud deployments – these systems will also handle (if they don't do that today) discovery of datasets on the cloud,
- Data sampling,
- Tagging and cataloging datasets, as well as searching cataloged datasets through tags,
- Collaboration workspaces where users can share transformations, datasets and queries; and
- Last but not least, a means for IT administrators and data architects
  - To monitor and govern the overall environment, and
  - To operationalize business-driven data preparation workflows in the corporate data environment.

Typically, data preparation is run on small sets or on a sample; operationalization takes care of making this work on larger volumes. It also deals with applying the preparation incrementally to new data from the sources, modeling the end result, integrating it into the data warehouse and securing it at the corporate level.
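To make the sample-versus-operationalization distinction concrete, the following sketch (again in pandas, with hypothetical file names and columns) defines the preparation once as a function, develops it against a small sample, and then reapplies it unchanged to each new incremental batch arriving from the sources.

```python
import pandas as pd

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Business-defined preparation: clean, filter and aggregate one batch."""
    df = df.dropna(subset=["customer_id"])   # basic cleansing
    df = df[df["amount"] > 0]                # filter out refunds/noise
    df["order_month"] = df["order_date"].dt.to_period("M")
    return (df.groupby(["customer_id", "order_month"], as_index=False)
              ["amount"].sum())

# 1. Author and validate the preparation on a small sample.
sample = pd.DataFrame({
    "customer_id": [1, 2, None, 1],
    "order_date": pd.to_datetime(["2017-01-03", "2017-01-05",
                                  "2017-01-06", "2017-02-01"]),
    "amount": [100.0, 250.0, 80.0, -20.0],
})
print(prepare(sample))

# 2. Operationalize: apply the same function to each incoming batch and
#    append the result to the warehouse-facing table (paths and the
#    warehouse connection below are hypothetical).
# for path in ["orders_2017_03.csv", "orders_2017_04.csv"]:
#     batch = pd.read_csv(path, parse_dates=["order_date"])
#     prepare(batch).to_sql("monthly_revenue", warehouse_conn,
#                           if_exists="append", index=False)
```

The point of the sketch is that the business-authored logic stays the same; operationalization only changes where and how often it runs, and what happens to its output.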

A new breed of self-service data preparation tools is now gaining significant traction and mindshare in the market. Gartner recognized this trend about 18 months ago as a new market segment which disrupts the analytics market [8] by empowering the business with an agile means of helping users find, assess, combine, and prepare their data for consumption. Tools from this market segment include Trifacta, Paxata, Informatica Rev, Talend Data Preparation, Datameer, SAP Agile Data Preparation and Tamr.

4. Technical Architecture

4.1 Technical Architecture Overview

Architecture, in the data warehousing world, answers the question “how will we fulfill the requirements”. It describes the plan for what you want the DW/BI system to look like when it can be launched in production, and the way to develop that plan. Kimball [1] describes BI/DW architecture in chapters 4, 5, and 6, and calls out three distinct pieces.

1. Data Architecture. At the core of the data architecture there is the firm belief that “dimensional modeling is the most viable technique for delivering data for business intelligence because it addresses the twin non-negotiable goals of business understandability and query performance” ([1], page 233). The data architecture is built with the following blocks:

- Enterprise DW Bus Matrix, a conceptual tool where the rows correspond to business processes and the columns correspond to natural groupings of standardized descriptive reference data, or conformed dimensions in Kimball parlance. The matrix delivers the overall big picture of the data architecture for the DW/BI system, while ensuring consistency and integrating the various enterprise dimensional models.
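The bus matrix itself is just a cross-reference of business processes against conformed dimensions. A minimal sketch follows; the specific processes and dimensions are hypothetical examples, not taken from this paper.

```python
# A hypothetical Enterprise DW Bus Matrix: rows are business processes,
# columns are conformed dimensions, and an "X" marks that the process's
# fact table uses that dimension.
conformed_dimensions = ["Date", "Product", "Store", "Customer"]

bus_matrix = {
    "Retail Sales":    {"Date", "Product", "Store", "Customer"},
    "Store Inventory": {"Date", "Product", "Store"},
    "Purchase Orders": {"Date", "Product"},
}

# Render the matrix as a simple text table.
print("{:<18}".format("Business Process") +
      "".join("{:>10}".format(d) for d in conformed_dimensions))
for process, dims in bus_matrix.items():
    print("{:<18}".format(process) +
          "".join("{:>10}".format("X" if d in dims else "")
                  for d in conformed_dimensions))
```

Because every process that uses, say, Product refers to the same conformed Product dimension, the matrix makes the integration points between the enterprise's dimensional models visible at a glance.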