Table of Contents
WP_Data Management Best Practices

Front Page
Contents
1 Introduction and Scope of this document
2 Document Plan
2.1 Reference publication: Kimball’s Data Warehousing Lifecycle
2.2 Enhancements to Reference Publication
3 Relevance of the new tendencies for our scope
3.1 Cloud Computing for Analytics
3.1.1 Cloud data warehouses
3.1.2 Cloud data integration
3.1.3 Cloud Application Integration
3.2 Big Data
3.2.1 Hadoop for ETL workloads
3.2.2 Hadoop for Analytics, or Data Lakes
3.2.3 Hadoop as a service
3.3 Self-service tools and agility
4 Technical Architecture
4.1 Technical Architecture Overview
4.2 DW/BI system architecture model
4.3 Best Practices
4.3.1 Technical Architecture at Planning Stage
4.3.2 Technical Architecture at Design Stage
4.3.3 Technical Architecture at Development Stage
4.3.4 Technical Architecture at Deployment Stage
4.4 Enhancements to the reference publication
4.4.1 Technical Architecture in our own experience
4.4.2 Cloud Deployments
4.4.3 Big Data
4.4.4 Self-service and agility
5 Dimensional data modeling
5.1 Dimension Data Modeling Overview
5.2 Brief Summary
5.3 Best Practices
5.3.1 Dimension Data Modeling - Project definition stage
5.3.2 Dimension Data Modeling - Requirements Interview
5.3.3 Dimension Data Modeling – Logical
5.3.4 Dimension Data Modeling – Physical
5.3.5 Dimension Data Modeling – ETL Considerations
5.3.6 Dimension Data Modeling – Deployment Considerations
5.4 Enhancements to the reference publication
5.4.1 Dimension Modeling in our own experience
5.4.2 Cloud Deployments
5.4.3 Big Data
5.4.4 NoSQL
5.4.5 Self-Service and Agility
6 Data Quality
6.1 Brief Summary
6.2 Best Practices
6.2.1 Data quality at project definition stage
6.2.2 Data Quality at Requirements Definition Stage
6.2.3 Data Quality at Design Stage
6.2.4 Data Quality at Development Stage
6.2.5 Data Quality at Deployment Stage
6.3 Enhancements to the reference publication
6.3.1 Our own experience
6.3.2 Cloud Deployments
6.3.3 Big Data
6.3.4 Self-Service Tools
7 Non-functional aspects
7.1 Brief Summary
7.2 Best Practices
7.2.1 Performance at Requirements and planning stage
7.2.2 Performance at Design and Technical architecture stage
7.2.3 Performance at Dimensional DW Model design stage
7.2.4 Performance at Implementation, physical design stage
7.2.5 Performance at Deployment and support stage
7.2.6 Security at Requirements and planning stage
7.2.7 Security at Design and Technical architecture stage
7.2.8 Security at Implementation, physical design stage
7.2.9 Security at Deployment and support stage
7.3 Enhancements to the reference publication
7.3.1 Performance optimization based on our own experience
7.3.2 Security best practices based on our own experience
7.3.3 Cloud deployments
7.3.4 Big data
8 Conclusion
9 Glossary
10 References