Home
Username:
Password:
Database Optimization Tutorials

Database Normalization / Denormalization




Visitors to VTC.com will be able to view all introductory videos for each training course.
Free Trial Members will gain access to first three chapters for each training course.
Full Access Members have full access to VTC.com’s entire library of video tutorials.


Learn More

Subtitles of the Movie

In our last videos, we've been talking all about Normalization. We've emphasized how important it is to remove the redundancy from our data. Now, we're going to talk about Denormalizing the database or putting redundancy right back into it again. When we talk about Denormalization, we first have to keep in mind that the database is going to be Normalized. That means, we will have actually removed all the redundancy from the database first. Denormalization is indeed just the opposite. We're gonna add the redundant data right back in again. However, we must have a Normalized database first so that we can ensure we are not adding too much redundancy to the database itself. The whole reason that we are going to Denormalize is about speed. We're going to use it to actually provide rapid access to the data itself. We're gonna use it for balance, balance between duplication and performance. Now, you want to keep in mind that if you have an older computer that might have a slow input, output subsystem, Normalization can slow it down. Denormalization can actually enhance the performance. The key here is how fast can we process the data? If I have a great deal of users all accessing the same table at the same time for different reasons, if I setup a system of redundancy, I'll actually gain speed because the users will not have to go to exactly the same location at exactly the same time and that is what Denormalization is about. Now, it's very important to remember that you have to keep the redundant values correct. If I have two separate locations for the data, I'm setting myself up for inaccuracies so you have to be able to identify some type of pattern for how the redundancy was actually agreed to. Now, in Denormalization, we have Normalized the database first and that's very important. Denormalization does not take place until after the database has been Normalized. So let's very quickly review the steps. Our first Normal form told us to eliminate the redundancy. The second Normal form told us to separate the key components and then relate the tables back to each other again. Our third Normal form was all about our attributes. We had to be absolutely sure that all of the columns define the primary key and stayed on target with just one subject per table itself. Now once we have Normalized that database, we can continue to Denormalization. The key here is a fully Normalized database won't have many redundancies. As a result, INSERT, UPDATE, and DELETE, they are much, much easier to code. So we have to Normalize first so that we really speed up the coding process of the database. I'm also going to remember that for any application code, it's always going to be easier to write and maintain if I'm only having to worry about one location for the data instead of multiple locations. Now, also, I want you to keep in mind that a Denormalized database is not the same as a non-Normalized database. If I skipped Normalization altogether and follow poor design techniques, the database is non-Normalized. In other words, nobody ever did anything to it. Denormalized databases have always been meticulously Normalized first. We have to eliminate as many redundancies as possible then we are gonna carefully analyze the database and only put back the ones we need. Remember, when we're working with Denormalization, we're really trying to enhance performance. Different times that we can do this is with repeating groups of information. If you have a lot of individual information that you're always pulling up as a group, redundancy can help the speed of the database. This is also the cases we discussed earlier with usability. If different users access tables in many different ways, it's much easier to actually add the redundancy to the database itself and also look at how frequently do you run queries on your data. If you have a query that is actually going to be pulling the data 60 percent of the time or more, consider Denormalizing. The concept here is that you have one portion of your database that is repeatedly used. As a result, I can set that off to the side and really gain some performance by minimizing how much is pulled from each of the tables each time the query is run. Also, when you have calculations that are applied to columns before queries are successfully answered in the coding itself, Denormalization can help. If you have large or clumsy primary keys, they can become very, very time consuming and take up a lot of the resources of your computer. Again Denormalization can help. Other aspects that come in mind are the actual type of database you are using. If for instance you are using a decision support database. Denormalization is probably the way to go. Now, keep in mind that in these types of databases, we're not really adding data as frequently as we're just using it. We don't have to worry about redundancy because our data is static. You have a historical database for instance. The odds are going back into last year's data to change it are very, very minimal. As a result, redundancy really isn't an issue as it applies to accuracy. Now, keep that in mind as opposed to a transaction processing database. In those particular cases, you have high volumes of activity and you really do not want to Denormalize the database itself. Finally, if you have complex relational joins within the database, this is another area where the redundancy may be better. Joins are slowing down our system anyway. In these particular instances, setting up the duplication just might be a better way to go.

Tutorial Information

Course: Database Optimization
Author: Lauri Sowa-Matson
SKU: 34078
ISBN: 1-935320-94-7
Release Date: 2010-01-06
Duration: 8 hrs / 111 lessons
Work Files: Yes
Captions: Available on CD and Online University
Compatibility: Vista/XP/2000, OS X, Linux
QuickTime 7, Flash 8

VTC Sign up & Benefits

  • Unlimited Access
  • 98,729 Video Tutorials (23,265 free)
  • Video Available as Flash or QuickTime
  • Over 1026 Courses
  • $30 for One Month Access
  • Multi-User Discounts Available