Please login

Prime

Prepinsta Prime

Video courses for company/skill based Preparation

(Check all courses)
Get Prime Video
Prime

Prepinsta Prime

Purchase mock tests for company/skill building

(Check all mocks)
Get Prime mock

Normalization in DBMS

What is Normalization in DBMS?

 

In this article, we will learn about Normalization in DBMS.

We define Normalization as a  step-by-step process via which we reduce the complexity of a database. The database may have a lot redundancies (unnecessary information that has become outdated or no longer useful) like attendance record for former students in a university.

A lot of anomalies may be caused in a database because of various cause insertion, deletion and updation operation or concurrency errors.

There are 6 different normalisation benchmarks used which are –

  • 1 NF (normal form)
  • 2 NF
  • 3 NF
  • BCNF (Boyce-Codd Normal Form
  • 4NF
  • 5NF (not used anymore)
  • 6NF (not used anymore)

1NF

Name – 1 Normal Form

Prerequisite –

The relation should not have any multivalued attribute at all in it. For example, for Student table in the database (relation). In the phone number column there must only be one phone number (value).

How to solve –

Decompose the table into single values attributes. Either the second phone number must be deleted or it can be decomposed into primary and secondary number.

Not 1 NF

ID Fname PhoneNumber
1 Bran 915988589, 989458602
2 Sansa 790614709
3 Jon 700293958

Is 1 NF

ID Fname Primary_No Secondary_No
1 Bran 915988589 989458602
2 Sansa 790614709
3 Jon 700293958

Is 1 NF

IDFnamePhoneNumber
1Bran915988589
2Sansa790614709
3Jon700293958

2NF

To understand the conditions for 2NF, one must know the following –

  • What is non prime attribute
  • What is an candidate Key

Candidate Key – A candidate key is a column, or set of columns, in a table that can uniquely identify any database record without referring to any other data. The best set of columns which identity it uniquely is chosen as candidate key.

Non Prime Attribute – Is the row that doesn’t belong to the candidate key for the table.

Example –

In the table below, if you want to identify any row of the table uniquely, we need both {StudentID, CourseID} thus, it becomes candidate key.

And since CourseName is not part of candidate key. It becomes non-prime attribute.

Enrolled Courses Table

StudentIDCourseIDCourseName
1CSE101C
1CSE102C++
2CSE101C
2CSE102C++
3CSE103DBMS

What is 2NF Finally?

To be 2NF in DBMS. The relation should be –

  • Already 1NF
  • No Non Prime attribute should be dependent on any candidate key element, which is called not having partial dependency. This maybe is difficult to understand from definitions but easier from example.

In table, the candidate key is {StudentID, CourseID}, the name of the course is obviously dependent on the CourseID right? CourseID, is non prime also. Thus there is partial dependency in it. Thus, the relation is not 2NF.

To make it 2NF we can break the table as follows –

Enrolled Courses Table

StudentIDCourseID
1CSE101
1CSE102
2CSE101
2CSE102
3CSE103

Courses Table

CourseIDName
CSE101C
CSE102C++
CSE103DBMS

3NF

For a relation to be in 3NF form the following must hold true –

  • Should be 2NF already
  • There should not be any transitive dependency for non prime attributes

Don’t worry if you don’t know what transitive dependency is, we will understand this with an example.

Student Table

StudentIDNameAgeStateCountry
1Bran16UPIndia
2Jon21UPIndia
3Arya17UPIndia
4Jamie41GujaratIndia
5Tyrion41GujaratIndia

Now, for the table above. The primary key obviously is StudentID.

However, the following dependencies are there –

  1. StudentID -> Name
  2. StudentID -> Age
  3. StudentID -> State
  4. State -> Country

Now, when we look at point 3 and 4 they form transitive dependency as they are of format A -> B and B -> C. Thus, this forms transitive dependency is not in 3NF.

BCNF

The BCNF is the short for Boyce-Codd Normal Form the following are the conditions for a relation to be in BCNF form –

  • Should be 3NF
  • If every non-trivial functional dependency A –> B, A is a super key, i.e. LHS should always be a super key.

Let’s have a look onto an example to understand this better. Let’s say a student enrolling a course can also register for courses from other branches and the course registration table for student is stored in the following way –

Course Registered Table

StudentIDNameCourseNameBranchIDCourseID
1BranCCSE101
1BranFluid MechanicsMECH101
1BranC++CSE102
2JamieCCSE101
2JamieBasic ElectronicsEEE101

Candidate Key – {StudentID, CourseName}

  • To identify each row uniquely we need {StudentID, CourseName}
  • Thus, its the candidate key

Functional dependencies are –

  1. StudentID -> Name
  2. CourseName -> {BranchID, CourseID}

Now, neither the 2nd is a dependency of type X -> Y, nor CourseName is a super key.

Note – 

Set of attributes that can help us uniquely identify a record or tuple in the database is called super key.

Student Table

StudentIDName
1Bran
1Bran
1Bran
2Jamie
2Jamie

Course Table

CourseNameBranchIDCourseID
CCSE101
Fluid MechanicsMECH101
C++CSE102
CCSE101
Basic ElectronicsEEE101

4NF

To be in 4NF form the follwing must be true

  1. Should be in BCNF form
  2. Should not have any Multi-Valued Dependency.

Example –

A -> BC