Drizzle | Boyce-Codd Normal Form (BCNF)
This guide assumes familiarity with:
  • You should be familiar with 3NF
  • You should be familiar with EKNF

Boyce-Codd Normal Form (BCNF) is a stricter form of normalization than Third Normal Form. While 3NF focuses on the relationship between non-prime attributes and candidate keys, BCNF goes a step further by addressing potential redundancies even when 3NF is achieved. The goal is to eliminate all redundancy that can be detected through functional dependencies.

In practice, you barely see tables in 3NF that violate BCNF. However, it’s important to understand the concept of BCNF.

The BCNF Rule

To achieve BCNF, table should already be in 3NF and must satisfy the following condition:

For every non-trivial functional dependency X -> Y, the determinant (X) must be a super key. In simpler terms, the left-hand side of any functional dependency must uniquely determine the entire row in the table.

Example

We have a table room_reservations with the following schema:

schema.ts
schema.sql
schema.txt
import { integer, pgTable, primaryKey, unique, varchar } from "drizzle-orm/pg-core";

export const roomReservations = pgTable("room_reservations", {
  room: integer("room").notNull(),
  startTime: varchar("start_time", { length: 5 }).notNull(),
  endTime: varchar("end_time", { length: 5 }).notNull(),
  rateType: varchar("rate_type", { length: 20 }).notNull(),
}, (t) => [
  primaryKey({ columns: [t.room, t.startTime] }),
  unique().on(t.room, t.endTime),
  unique().on(t.rateType, t.startTime),
  unique().on(t.rateType, t.endTime),
]);

Business rules

  1. Each row in this table represents a hotel room reservation. For simplicity, assume the hotel has two rooms: Room 101 (a standard room) and Room 201 (a deluxe suite).
  2. A booking is defined by the room and the time period (start_time to end_time) for which that room is reserved. No room can have overlapping bookings (only one reservation per room at a given time).
  3. Additionally, each booking has a rate_type. There are four distinct rate types in this example, each implying a specific guest status (membership level):

Functional Dependencies

  1. room, start_time -> end_time, rate_type
  2. room, end_time -> start_time, rate_type
  3. rate_type, start_time -> room, end_time
  4. rate_type, end_time -> room, start_time
  5. rate_type -> room. Each rate type is associated with exactly one room. From the business rules, we see that a given rate code implies a specific room (e.g. any SAVER rate booking is always for Room 101, and any DELUXE-B booking is always for Room 201). Formally, this means rate type functionally determines room.

Candidate keys

The candidate keys for this table are:

  1. room, start_time
  2. room, end_time
  3. rate_type, start_time
  4. rate_type, end_time

Even though in the sample data each start_time and end_time happens to be unique across all bookings, we cannot treat start_time or end_time alone as a key because on another day two different rooms could have bookings that begin or end at the same time. That is why a combination of attributes is needed to uniquely identify a booking.

3NF Analysis

Since every attribute in this table is part of at least one candidate key, there are no non-prime attributes. This means no transitive dependencies exist, so the table is in 3NF.

BCNF Analysis

The relation is not in BCNF because of this functional dependency:

rate_type -> room

The left-hand side (rate_type) is not a super key. Both rate_type and room are prime attributes because they are part of candidate keys. This dependency violates the BCNF rule.

This leads to:

  1. Information duplication.
  2. Update anomalies. If the hotel changes which room a given rate code applies to, multiple rows would need updating.
  3. Deletion anomalies. If a DELUXE-A booking is deleted, the information about which room that rate code applies to would be lost.

BCNF Decomposition

We need to split the table to eliminate the problematic dependency. We create two tables:

schema.ts
schema.sql
schema.txt
import { boolean, foreignKey, integer, pgTable, primaryKey, unique, varchar } from "drizzle-orm/pg-core";

export const rateTypes = pgTable("rate_types", {
  rateType: varchar("rate_type", { length: 20 }).notNull().primaryKey(),
  room: integer("room").notNull(),
  memberFlag: boolean("member_flag").notNull(),
}, (t) => [
  unique().on(t.room, t.memberFlag),
]);

export const reservations = pgTable("reservations", {
  room: integer("room").notNull(),
  startTime: varchar("start_time", { length: 5 }).notNull(),
  endTime: varchar("end_time", { length: 5 }).notNull(),
  memberFlag: boolean("member_flag").notNull(),
}, (t) => [
  primaryKey({ columns: [t.room, t.startTime] }),
  foreignKey({
    columns: [t.room, t.memberFlag],
    foreignColumns: [rateTypes.room, rateTypes.memberFlag],
  }).onDelete("cascade").onUpdate("cascade"),
  unique().on(t.room, t.endTime),
]);

The dependency rate_type -> room is fully enforced in this table (and is no longer a problem, because rate_type is a candidate key here). The table also has another candidate key room, member_flag, since each combination of room and membership status determines a unique rate type. rate_type is no longer stored in reservations table, so the redundancy is gone. Instead, the combination of room and member_flag for a reservation can be used to lookup the rate_type from the rate_types table when needed.

Both tables are now in BCNF. For every functional dependency in each table, the left-hand side is a super key.