- You should be familiar with 2NF
The Third Normal Form (3NF) is a database normalization form that builds on the Second Normal Form. The primary goal of 3NF is to eliminate transitive dependencies.
Key Concepts
- A
Transitive dependency
occurs when one attribute in a database indirectly relies on another through a third attribute, causing redundancy. For example, if A depends on B (A -> B
) and B depends on C (B -> C
), then A is transitively dependent on C (A -> C
). - A
Functional dependency
(X -> Y
) in a relationR
is considered trivial if the set of attributesY
is a subset of (or equal to) the set of attributesX
. - A
Super key
is a candidate key or a superset of a candidate key.
The 3NF Rule
A relation is in 3NF if it is in 2NF and at least one of the following conditions holds in every non-trivial function dependency X -> Y
:
X
is a super key.Y
is a prime attribute.
Example
We have a table course_instructors
with the following schema:
import { pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const courseInstructors = pgTable("course_instructors", {
course: varchar("course", { length: 255 }).notNull(),
semester: varchar("semester", { length: 255 }).notNull(),
instructor: varchar("instructor", { length: 255 }).notNull(),
instructorEmail: varchar("instructor_email", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.course, t.semester] })
]);
Functional Dependencies
course, semester -> instructor
.instructor -> instructor_email
.course, semester -> instructor_email
.
Candidate keys
Only the pair of attributes course
and semester
can uniquely identify each row in the table. Therefore, the candidate key is course, semester
.
2NF Analysis
The table is in 2NF because all non-key attributes are fully functionally dependent on the entire candidate key (course, semester
).
3NF Analysis
The table is not in 3NF because of the transitive dependency instructor -> instructor_email
. The attribute instructor_email
is dependent on instructor
, which is not a super key.
This leads to:
- Redundancy: as the same instructor’s email can appear multiple times for different courses and semesters.
- Update anomalies: as changing an instructor’s email would require updating multiple rows.
3NF Decomposition
To eliminate the transitive dependency, we can decompose the table into two separate tables:
import { pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const instructors = pgTable("instructors", {
instructor: varchar("instructor", { length: 255 }).notNull().primaryKey(),
instructorEmail: varchar("instructor_email", { length: 255 }).notNull(),
});
export const courseInstructors = pgTable("course_instructors", {
course: varchar("course", { length: 255 }).notNull(),
semester: varchar("semester", { length: 255 }).notNull(),
instructor: varchar("instructor", { length: 255 }).notNull().references(() => instructors.instructor, { onDelete: "cascade", onUpdate: "cascade" }),
}, (t) => [
primaryKey({ columns: [t.course, t.semester] })
]);
With this decomposition, we have eliminated the transitive dependency. The instructor_email
is now stored in a separate table, and we have achieved 3NF.