Drizzle | Second Normal Form (2NF)
This guide assumes familiarity with:
  • You should be familiar with 1NF

The Second Normal Form (2NF) is a database normalization form that builds on the First Normal Form. The primary goal of 2NF is to eliminate partial dependencies.

Key Concepts

  1. A Candidate key is a minimal set of attributes that can uniquely identify each row in a table. There can be multiple candidate keys in a table.
  2. A Non-prime attribute is a column that isn’t part of any candidate key.
  3. A Functional dependency is a relationship between two sets of attributes in a table, where the value of one set uniquely determines the value of another.
  4. A partial dependency happens when a non-prime attribute depends on only a part of a candidate key, rather than on the entire candidate key.

The 2NF Rule

To achieve 2NF, table should already be in 1NF and must satisfy the following condition:

Table does not have any non-prime attribute that is functionally dependent on any proper subset of any candidate key of the relation. In simpler terms, all non-prime attributes must depend on the entire candidate key, not just a part of it.

Example

We have a table enrollments with the following schema:

schema.ts
schema.sql
schema.txt
import { integer, pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";

export const enrollments = pgTable("enrollments", {
  studentId: integer("student_id").notNull(),
  courseId: integer("course_id").notNull(),
  courseName: varchar("course_name", { length: 255 }).notNull(),
}, (t) => [
  primaryKey({ columns: [t.studentId, t.courseId] }),
]);

Functional Dependencies

  1. student_id, course_id -> course_name.
  2. course_id -> course_name.

Candidate keys

Only the pair of attributes student_id and course_id can uniquely identify each row in the table. Therefore, the candidate key is student_id, course_id.

1NF Analysis

The table is in 1NF because all attributes contain atomic values, and there are no repeating groups or arrays.

2NF Analysis

course_name is a non-prime attribute because it is not part of the candidate key. Moreover, it is partially dependent on course_id, which is a part of the candidate key student_id, course_id. This violates the 2NF rule.

This leads to redundancy and potential update anomalies. For example, if the course name for course_id - 101 changes, we would need to update multiple rows.

2NF Decomposition

To eliminate the partial dependency, we can decompose the table into two separate tables:

schema.ts
schema.sql
schema.txt
import { integer, pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";

export const courses = pgTable("courses", {
  id: integer("id").notNull().primaryKey(),
  courseName: varchar("course_name", { length: 255 }).notNull(),
});

export const enrollments = pgTable("enrollments", {
  studentId: integer("student_id").notNull(),
  courseId: integer("course_id").notNull().references(() => courses.id, { onDelete: "cascade", onUpdate: "cascade" }),
}, (t) => [
  primaryKey({ columns: [t.studentId, t.courseId] }),
]);

With this decomposition course_name is no longer repeated in the enrollments table and each table is free from partial dependencies. We have achieved 2NF.