Each table typically should represent a single subject. For example, in looking at our database in PostgreSQL, the customer table only contains the data about the customer, the employee table only contains data about the employee, and so forth. In addition, each intersection of a row and column should only contain one value and not multiple values.
EXAMPLE
If you look at the example from the movie rating database, see that actor and genre have multiple values:You also want to ensure that there is no data item that is unnecessarily stored in more than one table, to help minimize redundancy. This way, if data needs to be added, updated, or deleted, it only needs to be done in one place. All of the attributes in a table should be dependent on the primary key so that the data can be uniquely identifiable by the primary key values. Each of the tables should have no insertion, update, or deletion anomalies so that you can ensure that the data is consistent.
To help reach these objectives, database designers go through the normalization process one stage at a time. For the first normal form (1NF), you ensure that each column in our tables is single-valued, meaning it does not contain multiple values. By doing so, you eliminate any repeating groups of columns. You also ensure that each of the attributes or columns in a table has a unique name. The first normal form (1NF) is also when you identify the primary key of each table, with the attributes being dependent on the primary key.
For the second normal form (2NF), our tables should be in the first normal form (1NF) to begin, but you also ensure there is no partial dependency. It can help to look at a table like the employee table to understand dependency:
The primary key in this table is the employee_id. This allows us to uniquely identify each row, even if you happen to have two employees with the same name. The employee_id is unique for every single row, and using it, you can reference a specific record.
EXAMPLE
For example, if you need to find out the address of a specific employee, you can use the employee_id to reference it. That is also the case for every other column in this table. This means that every column depends on the primary key, and its respective data can be fetched by using it.Now that you have an understanding of dependency, you can now define what partial dependency is. Partial dependency can occur when an attribute in a table depends on only a part of the primary key, and not the whole key. This can be the case even if the primary key is uniquely generated as you have to consider candidate keys. If this occurs, you need to remove the attribute that is causing partial dependency and move it to another table where it is fully dependent on the primary key.
For the third normal form (3NF), the tables should be in the second normal form (2NF), and additionally have no transitive dependencies. A transitive dependency exists when there are functional dependencies.
EXAMPLE
For example, you may have a column X dependent on the primary key of a table, but then you have column Y being dependent on column X. With this, you have a transitive dependency, with column Y being dependent on the primary key but indirectly. The solution ends up being quite simple: remove the columns that are creating a transitive dependency into another table.You will learn about higher normal forms in a later tutorial, such as 4NF, in which there are no independent multivalued dependencies. But remember that the third normal form (3NF) is typically where the normalization process ends.
Source: Authored by Vincent Tran