You should also add foreign key constraints on the relevant columns in the agent_item_types
, agent_suppliers
, and supplier_item_types
tables. These constraints should reference the primary key columns in the respective tables (or equivalent tables defining agents, suppliers, and item types).
- You should be familiar with 4NF
The Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJ/NF) is a level of database normalization where a table is already in 4NF, and every non-trivial join dependency in that table is implied by the candidate keys. The goal is to eliminate redundancy caused by join dependencies and to ensure that relations are decomposed into smaller components without any loss of data.
In practice, 5NF is rarely used in database design.
Key concepts
- A
Join Dependency
(JD) is a constraint that specifies that tableR
can be split into several smaller tablesR1, R2,..., Rk
and by performing a natural join on these tables, the original tableR
can be reconstructed without any loss of information and no false information (spurious rows) is created during the process. - A
Lossless/Non-Loss Decomposition
is a decomposition when all the sub-relations do the natural join and the obtained table is equal to the original table. - A
Natural Join
is a join operation that is used to combine two relations based on all common attributes. - A JD denoted as
*(R1, R2, ..., Rk)
on a relationR
is considered trivial if at least one of the componentsRi
(wherei
is between1
andk
) is equal to the set of all attributes of the original relationR
. In simple terms: A join dependency is trivial if one of the parts you’re joining already is the whole original table.
Example
We have a table agent_inventory
with the following schema:
import { pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const agentInventory = pgTable("agent_inventory", {
agent: varchar("agent", { length: 255 }).notNull(),
supplier: varchar("supplier", { length: 255 }).notNull(),
itemType: varchar("item_type", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.agent, t.supplier, t.itemType] }),
]);
- An
Agent
can be authorized to source products from multipleSuppliers
. - An
Agent
can be authorized to offer multipleItem types
. - A
Supplier
can manufacture multipleItem types
. - The core business constraint: An
Agent
must offer a specificItem type
from a specificSupplier
if and only if the following three conditions are all true:- The
Agent
is authorized to source products from thatSupplier
. - The
Agent
is authorized to offer thatItem type
. - The
Supplier
is known to manufacture thatItem type
.
- The
Illustrative Consequence: An Agent
has certain Suppliers
and certain Item types
in their repertoire. If supplier S1
and supplier S2
are in their repertoire, and item type I
is in their repertoire, then (assuming supplier S1
and supplier S2
both manufacture item type I
), the agent must offer items of item type I
those manufactured by supplier S1
and those manufactured by supplier S2
.
You cannot have all the component relationships true without the corresponding combined record existing in the agent_inventory
table. It’s this enforced combination based on the component parts that leads to the Join Dependency.
Candidate Keys
agent, supplier, item_type
is the only candidate key. All three attributes are required to uniquely identify a specific assignment row.
Join Dependency
*( {agent, supplier}, {agent, item_type}, {supplier, item_type} )
is a non-trivial join dependency. This means that the table can be decomposed into smaller tables without losing any information.
4NF Analysis
The table is in 4NF because there are no non-trivial multivalued dependencies. For example, Helen Fox
sources from GearUp
and ProServe
, and she offers Webcam
, Scanners
and Docks
. However, she doesn’t offer GearUp
Docks
, nor does she offer ProServe
Scanners
. The allowed combinations are specific and don’t show the independence required for MVDs based on single attributes.
5NF Analysis
The table is not in 5NF because it contains non-trivial join dependency. This join dependency is not implied by the candidate key agent, supplier, item_type
because none of the components in the JD ({agent, supplier}
, {agent, item_type}
, {supplier, item_type}
) are superkeys of the original agent_inventory
table.
This leads to:
- Redundancy.
- Deletion Anomalies: Changing a single underlying fact (e.g., a supplier stops making an item type) may require updating multiple rows in the
agent_inventory
table. For instance, ifProServe
stopped makingDocks
, rows for bothChris Day
andHelen Fox
would need deletion. - Insertion Anomalies: If underlying facts change such that the rule dictates a new row should exist (e.g., if
Chris Day
starts handlingGearUp
suppliers, andGearUp
makesScanners
, whichChris Day
already handles, the rule implies (Chris Day
,GearUp
,Scanner
) must be inserted), the single-table structure doesn’t automatically enforce this insertion based on the component facts.
5NF Decomposition
We need to split the table into three tables: agent_suppliers
, agent_item_types
, and supplier_item_types
:
import { pgTable, primaryKey, varchar } from "drizzle-orm/pg-core";
export const agentSuppliers = pgTable("agent_suppliers", {
agent: varchar("agent", { length: 255 }).notNull(),
supplier: varchar("supplier", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.agent, t.supplier] }),
]);
export const agentItemTypes = pgTable("agent_item_types", {
agent: varchar("agent", { length: 255 }).notNull(),
itemType: varchar("item_type", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.agent, t.itemType] }),
]);
export const supplierItemTypes = pgTable("supplier_item_types", {
supplier: varchar("supplier", { length: 255 }).notNull(),
itemType: varchar("item_type", { length: 255 }).notNull(),
}, (t) => [
primaryKey({ columns: [t.supplier, t.itemType] }),
]);
To retrieve the data in a format equivalent to the original agent_inventory
table we can execute this query:
const db = drizzle(...);
const results = await db
.select({
agent: agentSuppliers.agent,
supplier: agentSuppliers.supplier,
itemType: agentItemTypes.itemType,
})
.from(agentSuppliers)
.innerJoin(
agentItemTypes,
eq(agentSuppliers.agent, agentItemTypes.agent)
)
.innerJoin(
supplierItemTypes,
and(
eq(agentSuppliers.supplier, supplierItemTypes.supplier),
eq(agentItemTypes.itemType, supplierItemTypes.itemType)
)
);
SELECT
"agent_suppliers"."agent",
"agent_suppliers"."supplier",
"agent_item_types"."item_type"
FROM
"agent_suppliers"
INNER JOIN
"agent_item_types"
ON "agent_suppliers"."agent" = "agent_item_types"."agent"
INNER JOIN
"supplier_item_types"
ON "agent_suppliers"."supplier" = "supplier_item_types"."supplier"
AND "agent_item_types"."item_type" = "supplier_item_types"."item_type";
Decomposing the table into agent_suppliers
, agent_item_types
, and supplier_item_types
achieves 5NF. Each table now represents a single fundamental relationship from the core business rule. This eliminates the specific join dependency that caused redundancy and update anomalies in the original table.