Dimension tables- Fundamental of Datawarehouse

Dimension table contain descriptive information.

Types of dimensions

There are three types of dimensions

Junk dimension
Conformed dimension
SCD or CDC dimension(Slowly changing dimension or Changed data capture)

Junk Dimension:

Data’s which are not useful for clients is called junk dimension.

Ex: Flags, Row inserted by user etc

Conformed dimension:

If dimension is connected with multiple facts then it is called as conformed dimension.

Ex: Time

SCD or CDC:

Example: If say a person is changed from BTM to Silk board the changed values is inserted in the table.

If anything is changed in the source that values, we are loading in to the target this is known as SCD or CDC.

CRITICAL COLUMN: If values are changing according to the time such columns is called critical column.

Types of SCD:

SCD1 or TYPE1: Here only current value will be present in the dimension table. Here in SCD1 is going to overwrite on previous values.

SCD1 contains only current information, if anything changes in the source the changed value overwrites on the target so that SCD1 contains only current information.

We can analyze only current data.

SCD2: (SLOWLY GROWING TARGET)

If anything is changed in source then SCD contains both previous and current information.

If anything is changed in source then new row is inserted into a target, which contains both previous and current information.
Using this we can analyze complete data

Example: Say A person is moving from BTM to SLK BOARD in 2011 and then again he is moving to YELHANKA in 2012, This changes has to be entered even at destination.
Using SCD 2, a new row is inserted to maintain the records at target as shown in figure below:

DIMENSION TABLE:

CUSTOMER ID	CUSTOMER NAME	CUSTOMER ADDRESS	YEAR
100	RAM	BTM	2010
100	RAM	SLK BOARD	2011
100	RAM`	YELHANKA	2012

The target table size slowly increases so that it is also called as SLOWLY GROWING TARGET.

TYPES OF SCD2:

It has got 3 types

SCD2 WITH VERSION NUMBER

SCD2 WITH FLAG

SCD2 WITH DATE RANGE

SCD2 WITH VERSION NUMBER: -

CUSTOMER ID	CUSTOMER NAME	CUSTOMER ADDRESS	VERSION
100	RAM	BTM	1.0
100	RAM	SLK BOARD	1.1
100	RAM`	YELHANKA	1.2

SCD2 WITH FLAG: -

CUSTOMER ID	CUSTOMER NAME	CUSTOMER ADDRESS	FLAG
100	RAM	BTM	0
100	RAM	SLK BOARD	0
100	RAM`	YELHANKA	1

Here ‘0’ represents previous values and ‘1’ represents current value.

SCD2 with flag is used 99% of the time as loading 0 and 1 is easy.

SCD2 WITH DATE RANGE: -

CUSTOMER ID	CUSTOMER NAME	CUSTOMER ADDRESS	DATE
100	RAM	BTM	12^th March 2010
100	RAM	SLK BOARD	8^th Feb 2011
100	RAM`	YELHANKA	2^nd Dec 2012

MSBI

Search This Blog

Dimension tables- Fundamental of Datawarehouse

Labels

Comments

Post a Comment

Popular posts from this blog

BIG Data, Hadoop – Chapter 2 - Data Life Cycle

How to Copy or Move Multiple Files from One Folder to Another Folder using Talend