Progress
Database Design
Guide


The First Normal Form

The columns of a table in first normal form have these characteristics:

The first rule of normalization is that you must remove duplicate columns or columns that contain more than one value to a new table.

First, examine an unnormalized Customer table, as shown in Table 2–1.

Table 2–1: Unnormalized Customer Table with Several Values in a Column 
Cust Num
Name
Street
Order Number
101
Jones, Sue
2 Mill Ave.
M31, M98, M129
102
Hand, Jim
12 Dudley St.
M56
103
Lee, Sandy
45 School St.
M37, M40
104
Tan, Steve
67 Main St.
M41

Here, the Order Number column has more than one entry. This makes it very difficult to perform even the simplest tasks—for example, delete all order numbers M56 and above, find the total number of orders for a customer, or print orders in sorted order. To perform any of those tasks you need a complex algorithm to examine each value in the Order Number column for each row. It is, therefore, crucial that each column in a table consists of exactly one value.

Table 2–2 shows the same Customer table in a different unnormalized format.

Table 2–2: Unnormalized Table with Multiple Duplicate Columns 
Cust
Num

Name

Street
Order
Number1
Order
Number2
Order
Number3
101
Jones, Sue
2 Mill Ave.
M31
M98
M129
102
Hand, Jim
12 Dudley St.
M56
Null
Null
103
Lee, Sandy
45 School St.
M37
M140
Null
104
Tan, Steve
67 Main St.
M41
Null
Null

Here, instead of a single Order Number column, there are three separate but duplicate columns for Order Number (Order Number1, Order Number2, and Order Number3). This format is also not efficient. What happens if a customer has more than three orders? You must either add a new column or clear an existing column value to make a new entry. It is difficult to estimate a reasonable maximum number of orders for a customer. If your business is brisk, you might have to create 200 Order Number columns for a row. Also if a customer has only 10 orders, the database will contain 190 null values for this customer.

Furthermore, it is difficult and time consuming to retrieve data with repeating columns. For example, to determine which customer has Order Number M98, you must look at each Order Number column individually (all 200 or more of them) in every row to find a match.

To reduce the Customer table to the first normal form, split it into two smaller tables, one table to store only Customer information (see Table 2–3) and another to store only Order information (see Table 2–4).

Table 2–3: Customer Table 
Cust Num
(Primary Key)
Name
Street
101
Jones, Sue
2 Mill Ave.
102
Hand, Jim
12 Dudley St.
103
Lee, Sandy
45 School St.
104
Tan, Steve
67 Main St.

Table 2–4: Order Table 
Order Number
(Primary Key)
Cust Num
(Foreign Key)
M31
101
M98
101
M129
101
M56
102
M37
103
M140
103
M41
104

Note that there is only one instance of a column in the Customer and Order tables and each column contains exactly one value. The Cust Num column in the Order table relates to the Cust Num column in the Customer table.

A table that is normalized to the first normal form has these advantages:


Copyright © 2004 Progress Software Corporation
www.progress.com
Voice: (781) 280-4000
Fax: (781) 280-4095