- You should be familiar with 1NF
The Second Normal Form (2NF) is a database normalization form that builds on the First Normal Form. The primary goal of 2NF is to eliminate partial dependencies.
Key Concepts
- A
Candidate keyis a minimal set of attributes that can uniquely identify each row in a table. There can be multiple candidate keys in a table. - A
Non-primeattribute is a column that isn’t part of any candidate key. - A
Functional dependencyis a relationship between two sets of attributes in a table, where the value of one set uniquely determines the value of another. - A
partial dependencyhappens when a non-prime attribute depends on only a part of a candidate key, rather than on the entire candidate key.
The 2NF Rule
To achieve 2NF, table should already be in 1NF and must satisfy the following condition:
Table does not have any non-prime attribute that is functionally dependent on any proper subset of any candidate key of the relation. In simpler terms, all non-prime attributes must depend on the entire candidate key, not just a part of it.
Example
We have a table enrollments with the following schema:
import { integer, pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const enrollments = pgTable("enrollments", {
studentId: integer("student_id").notNull(),
courseId: integer("course_id").notNull(),
courseName: varchar("course_name", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.studentId, t.courseId] }),
]);Functional Dependencies
student_id, course_id -> course_name.course_id -> course_name.
Candidate keys
Only the pair of attributes student_id and course_id can uniquely identify each row in the table. Therefore, the candidate key is student_id, course_id.
1NF Analysis
The table is in 1NF because all attributes contain atomic values, and there are no repeating groups or arrays.
2NF Analysis
course_name is a non-prime attribute because it is not part of the candidate key. Moreover, it is partially dependent on course_id, which is a part of the candidate key student_id, course_id. This violates the 2NF rule.
This leads to redundancy and potential update anomalies. For example, if the course name for course_id - 101 changes, we would need to update multiple rows.
2NF Decomposition
To eliminate the partial dependency, we can decompose the table into two separate tables:
import { integer, pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const courses = pgTable("courses", {
id: integer("id").notNull().primaryKey(),
courseName: varchar("course_name", { length: 255 }).notNull(),
});
export const enrollments = pgTable("enrollments", {
studentId: integer("student_id").notNull(),
courseId: integer("course_id").notNull().references(() => courses.id, { onDelete: "cascade", onUpdate: "cascade" }),
}, (t) => [
primaryKey({ columns: [t.studentId, t.courseId] }),
]);With this decomposition course_name is no longer repeated in the enrollments table and each table is free from partial dependencies. We have achieved 2NF.