Packt+ | Advance your knowledge in tech

You're reading from Practical Business Intelligence Optimize Business Intelligence for Efficient Data Analysis

Product type Paperback

Published in Dec 2016

Publisher Packt

ISBN-13 9781785885433

Length 352 pages

Edition 1st Edition

Languages

Python

Concepts

Business Intelligence

Author (1):

Ahmed Sherif

View More author details

Table of Contents (16) Chapters

Practical Business Intelligence

Credits

About the Author

About the Reviewer

www.PacktPub.com

Customer Feedback

Preface

1. Introduction to Practical Business Intelligence FREE CHAPTER

2. Web Scraping

3. Analysis with Excel and Creating Interactive Maps and Charts with Power BI

4. Creating Bar Charts with D3.js

5. Forecasting with R

6. Creating Histograms and Normal Distribution Plots with Python

7. Creating a Sales Dashboard with Tableau

8. Creating an Inventory Dashboard with QlikSense

9. Data Analysis with Microsoft SQL Server

Understanding the Kimball method

As we discuss the data warehouse where our data is being housed, we will be remised not to bring up Ralph Kimball, one of the original architects of the data warehouse. Kimball's methodology incorporated dimensional modeling, which has become the standard for modeling a data warehouse for business intelligence purposes. Dimensional modeling incorporates joining tables that have detail data and tables that have lookup data.

A detail table is known as a fact table in dimensional modeling. An example of a fact table would be a table holding thousands of rows of transactional sales from a retail store. The table will house several IDs affiliated with the product, the sales person, the purchase date, and the purchaser, just to name a few. Additionally, the fact table will store numeric data for each individual transaction, such as the sales quantity for sales amount. These numeric values are generally referred to as measures.

While there is usually one fact table, there will also be several lookup or dimensional tables that will have one table for each ID that is used in a fact table. So, for example, there would be one dimensional table for the product name affiliated with a product ID. There would be one dimensional table for the month, week, day, and year of the ID affiliated with the date. These dimensional tables are also referred to as lookup tables, because they basically look up what the name of a dimension ID is affiliated with. Usually, you would find as many dimensional tables as there are IDs in the fact table. The dimensional tables will all be joined to one fact table creating something of a "star" look. Hence, the name for this table arrangement is star schema, as seen in the following screenshot:

It is likely that the fact table will be the largest table in a data warehouse, while the lookup tables will most likely have fewer rows, some just one row. The tables are joined by keys, also known as foreign keys and primary keys. Foreign keys are referenced in fact tables to a unique identifier in a separate lookup table as primary keys. Foreign keys allow the most efficient join between a fact table and a dimensional table as they are usually a numeric data type. The purpose of a foreign key is to locate a single row in a lookup table to join to and establish a relationship. This rule is referred to as the referential integrity constraint and it exists to ensure that a key in a detail or fact table has a unique description to a lookup or dimensional table. As more and more rows are added to a lookup table, that new dimension is just given the next number of the identifier in line, usually starting with something like 1. Query performance between table joins suffers when we introduce non-numeric characters into the join, or worse, symbols (although most databases will not allow that).

You're reading from Practical Business Intelligence Optimize Business Intelligence for Efficient Data Analysis

Table of Contents (16) Chapters

Understanding the Kimball method

Personalised recommendations for you