
W H I T E P A P E R

© 2017 Persistent Systems Ltd. All rights reserved. 93

www.persistent.com

Data Lake.

This term refers to an emerging approach to extracting big data for analytics and placing it, along with corporate data, in a Hadoop cluster on which several components are layered so that data scientists and business analysts can extract analytical value from the data. One of the primary motivations of a data lake is not to lose any data that may be relevant for analysis, now or at some point in the future. Data lakes thus store all the data deemed relevant for analysis and ingest it in raw form, without conforming it to any overarching data model.

Data Mart.

In the Kimball reference book, this term now means “business process dimensional model”. In the first version, the term “data mart” was intended to refer to “a logical subset of the complete data warehouse”, generally restricted to a single business process or to a group of related business processes targeted toward a business group. The reason for the change, in their own words, is that “the term has been marginalized by others to mean summarized departmental, independent non-architected datasets”.

DW/BI System.

The complete end-to-end data warehouse and business intelligence system.

Data Warehouse, DW or EDW.

This is the queryable data in the DW/BI system, the largest possible union of presentation server data. A data warehouse is made up of the union of all its business process dimensional models.

ETL System.

Extraction, transformation and loading system consisting of a set of processes by which the operational source data is prepared for a data warehouse.
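
As an illustration of the three stages named in this definition, here is a minimal sketch that uses only the Python standard library. The source data (`SOURCE_CSV`), the cleaning rules, and the table name `fact_orders` are all hypothetical and chosen only for the example; a production ETL system would involve far more than this.

```python
import csv
import io
import sqlite3

# Hypothetical operational source: a CSV export with inconsistent formatting.
SOURCE_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,3.25
3,alice,
"""


def extract(text):
    """Extraction: read raw rows from the operational source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: trim and lowercase names, cast types, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # assumed rule: rows without an amount are rejected
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(amount))
        )
    return cleaned


def load(rows, conn):
    """Loading: write the conformed rows into a (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text=SOURCE_CSV):
    """Run the three stages against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn
```

Calling `run_etl()` returns a connection whose `fact_orders` table holds the two cleaned orders; the third source row is dropped by the assumed rejection rule.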

Master Data Management.

Master data are defined as the basic characteristics of key business entities of an organization, such as customers, products, employees, and suppliers. Master data management, or MDM, refers to systems designed to create, manage and hold master copies of these key business entities. These systems have been built in response to the proliferation of tables managed by different transactional systems, which often represent the same business entity multiple times. MDM systems support these transactional systems and have a way to reconcile different sources of data for attributes of the same real-world business entity. MDM systems typically consume data quality software through specific APIs to make sure the data they hold are clean. There are two main scenarios supported by MDM systems: (i) as a central hub to create master data, which then replicates to applications, and (ii) as a master data consolidation application. In both scenarios, MDM systems help enormously in removing data silos. MDM systems provide a bridge to move master data in silos to a single source of truth, managed by the MDM system.
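
The consolidation scenario (ii) can be sketched as follows. This is only an illustration under stated assumptions: the two source lists (`CRM`, `BILLING`), the choice of a normalized email address as the match key, and the "most recent non-empty value wins" survivorship rule are all hypothetical and not taken from any particular MDM product.

```python
from collections import defaultdict

# Hypothetical records for the same customer held by two transactional systems.
CRM = [
    {"email": "Ann@Example.com", "name": "Ann Lee", "phone": "", "updated": "2017-01-10"},
]
BILLING = [
    {"email": "ann@example.com ", "name": "A. Lee", "phone": "555-0100", "updated": "2017-03-02"},
]


def match_key(record):
    """Assumed matching rule: records with the same normalized email are one entity."""
    return record["email"].strip().lower()


def consolidate(*sources):
    """Merge records from all sources into one master record per entity."""
    groups = defaultdict(list)
    for source in sources:
        for record in source:
            groups[match_key(record)].append(record)

    golden = {}
    for key, records in groups.items():
        # Oldest first; ISO-8601 date strings sort correctly as plain text.
        records.sort(key=lambda r: r["updated"])
        master = {"email": key, "name": "", "phone": ""}
        for record in records:
            for field in ("name", "phone"):
                if record[field]:
                    # Assumed survivorship rule: a later non-empty value overrides.
                    master[field] = record[field]
        golden[key] = master
    return golden
```

With the sample data, `consolidate(CRM, BILLING)` yields a single master record for `ann@example.com` that takes its name and phone number from the more recently updated billing record.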