Tuesday, April 6, 2010

Types of Denormalization of database

We all know what is normalization of the database and many of you know what is denormalization as well. What this post is meant for is to define the methods of the denormalization.

There is very little theoretical info available on the web on denormalization despite it being widely used. In fact every programmer would have used denormalization in one way or other in many projects. So since I could not find any theoretical explaination of it I would try to define the types and their prescribed useage myself.

So children, let's start the "Denormalization Class".

There are three types of denormalizations (Please forgive my poor vocabulary if you find the names of the types not much convincing. Suggestions are always welcomed!).

1). Reference Centralization Method

I bet every programmer would have used this method in each database driven project that they have worked on. In this method you simply avoid unnecessary normalization on your database. An example would explain this better.

In a User table you would want to have references to the address and phone numbers. Normally you would simply add address and phone fields in the user table. Believe it or not this is "Reference Centralization Method". If you wanted to normalize it you would need to have address table reference ID in the user table, then phone number ID in the Addresses table to reference the Phones table and so on. So in short, when you decide to not to have an over normalized table - and have some fixed fields instead - you are using Reference Centralization Method.

One can argue that it cannot be termed "Denormalization" just because it's not normalization. But hey, I am here to classify the types of denormalization and I can clearly see this as one of the types. So what if it's so commonly used? I cannot ignore it.

See this link to know when to use it.

2). CSV Makeshift Denormalization

Sometimes you prefer to not to create an association table to link two tables, you put CSV values in one of the master tables just to keep it simple. This method is CSV Makeshift Denormalization.

See this link to know when to use it.

3). Scattered denormalization

Hmm. This one is interesting. This is completely opposite to the ideology of normalization. In this you store a single set of information in multiple places. For example instead of having just User_ID in the Blog_Description table you would store User_Name, User_Rating along with User_ID in parallel to the User table. Thus you have to update the Blog_Description table each time you update the User table. It may look weird but it's actually useful in some cases.

See this link to know when to use it.

Anymore type you can think of? Please suggest.

To keep the post smaller I will write another post to prescribe when to and how to use different types of Denormalizations.

No comments:

Post a Comment