Thursday, 3 July 2014

Difference between normalization and denormalization


Denormalization is the process of optimizing the read performance of a database by adding redundant data or by grouping data. This comes at a cost:
  • More disk space is used, as the same data is duplicated in more than one table.
  • DML operations are more expensive, as the same data must be maintained in more than one table.
  • You run the risk that the duplicated copies get "out of sync", as sketched below.
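A minimal sketch of that last risk, using Python's built-in sqlite3 module (the schema and names here are hypothetical): an orders table deliberately duplicates the customer's name, a rename updates only one of the two copies, and the copies drift apart.

import sqlite3

# Hypothetical denormalized schema: customer_name is copied into orders
# so that order listings need no join.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, customer_name TEXT)")
con.execute("INSERT INTO customers VALUES (1, 'Acme Ltd')")
con.execute("INSERT INTO orders VALUES (100, 1, 'Acme Ltd')")

# A rename that updates only one of the two copies...
con.execute("UPDATE customers SET name = 'Acme Plc' WHERE id = 1")

# ...leaves the other stale, which this consistency check exposes.
mismatch = con.execute("""
    SELECT c.name, o.customer_name
    FROM customers c JOIN orders o ON o.customer_id = c.id
    WHERE c.name <> o.customer_name
""").fetchone()
print(mismatch)  # ('Acme Plc', 'Acme Ltd') -- the copies are out of sync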
Denormalized databases fare well under heavy read load, when the application is read-intensive, for the following reasons:
  • The data is present in a single table, so selects need no joins and are very fast.
  • A single table with all the required data also allows much more efficient index usage: if the columns are indexed properly, one composite index can both filter and sort the results, as the sketch after this list illustrates. In a normalized schema the same data is spread across several tables, so no single index can cover the whole query.
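One way to see the index point in practice, again with sqlite3 (the table and index names are my own): a composite index on (companycode, records) lets the engine filter on the company and hand back the rows already in sorted order, with no separate sort step.

import sqlite3

# Hypothetical denormalized table with a composite index that can satisfy
# both the WHERE filter and the ORDER BY.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (companycode TEXT, division TEXT, customer TEXT, records INTEGER)")
con.execute("CREATE INDEX idx_company_records ON sales (companycode, records)")

plan = con.execute("""
    EXPLAIN QUERY PLAN
    SELECT customer, records FROM sales
    WHERE companycode = '1001'
    ORDER BY records DESC
""").fetchall()
for step in plan:
    print(step)  # SQLite reports a SEARCH ... USING INDEX idx_company_records
                 # step, and no 'USE TEMP B-TREE FOR ORDER BY' sort step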
Although selects can be very fast on denormalized tables for the reasons above, the duplicated data makes updates and inserts complex and costly.
Example (a denormalized table, with the company/division pairing repeated on every customer row):

COMPANYCODE   DIVISION   CUSTOMER   RECORDS
1001          LFA        115        5
1001          POC        121        9
1002          USA        006        2
1002          IN         568        7
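The trade-off in this example can be sketched like this (sqlite3 again; the table name is mine): reading everything for a company is a single-table query, but one logical change, renaming a division, has to rewrite every customer row that repeats it.

import sqlite3

# The denormalized example table, loaded as-is.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE company_customers (companycode TEXT, division TEXT, customer TEXT, records INTEGER)")
con.executemany(
    "INSERT INTO company_customers VALUES (?, ?, ?, ?)",
    [("1001", "LFA", "115", 5), ("1001", "POC", "121", 9),
     ("1002", "USA", "006", 2), ("1002", "IN", "568", 7)],
)

# The fast path: one table holds everything, so no joins are needed.
for row in con.execute("SELECT * FROM company_customers WHERE companycode = '1001'"):
    print(row)

# The costly path: renaming a division must rewrite every row repeating it
# (one row here, but thousands in a table with many customers per division).
cur = con.execute(
    "UPDATE company_customers SET division = 'LF-A' "
    "WHERE companycode = '1001' AND division = 'LFA'"
)
print(cur.rowcount, "row(s) rewritten for one logical change")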
Normalized databases fare very well when the application is write-intensive and the write load exceeds the read load, for the following reasons:
  • Normalized tables are usually smaller and have a smaller footprint because the data is divided vertically among many tables. This allows them to perform better, as they are small enough to fit into the buffer.
  • Updates are very fast because the data being updated lives in a single place and there are no duplicates.
  • Similarly, inserts are very fast because the data has to be inserted in a single place only and does not have to be duplicated.
  • Selects are fast when the data has to be fetched from a single table, because normalized tables are normally small enough to fit into the buffer.
  • Because the data is not duplicated, there is less need for heavy-duty GROUP BY or DISTINCT queries.


Normalization is the process of decomposing a relation with anomalies into smaller, well-structured relations.
Example (the same data split into two normalized tables):

COMPANYCODE   CUSTOMER   RECORDS
1001          115        5
1001          121        9
1002          006        2
1002          568        7

COMPANYCODE   DIVISION
1001          LFA
1001          POC
1002          USA
1002          IN
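For contrast, a sketch of this normalized version (the table names are my own): a division rename now touches exactly one row, while reads that need columns from both tables pay for a join instead.

import sqlite3

# The same data split into two normalized tables: each fact lives in one place.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer_records (companycode TEXT, customer TEXT, records INTEGER)")
con.execute("CREATE TABLE company_divisions (companycode TEXT, division TEXT)")
con.executemany("INSERT INTO customer_records VALUES (?, ?, ?)",
                [("1001", "115", 5), ("1001", "121", 9),
                 ("1002", "006", 2), ("1002", "568", 7)])
con.executemany("INSERT INTO company_divisions VALUES (?, ?)",
                [("1001", "LFA"), ("1001", "POC"),
                 ("1002", "USA"), ("1002", "IN")])

# Writes are cheap: a rename is one row, with no duplicate copies to sync.
con.execute("UPDATE company_divisions SET division = 'LF-A' WHERE division = 'LFA'")

# Reads spanning both tables now need a join: here, the customers of the
# company that has a division named 'POC'.
for row in con.execute("""
    SELECT r.customer, r.records
    FROM customer_records r
    JOIN company_divisions d ON d.companycode = r.companycode
    WHERE d.division = 'POC'
"""):
    print(row)  # ('115', 5) and ('121', 9)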
