Progress
Database Design
Guide


Calculating Index Size

You can estimate the approximate maximum amount of disk space occupied by an index by using this formula:

Number of rows * (6 + number of columns in index + index column storage) * 2

For example, if you have an index on a character column with an average of 21 characters for column index storage (see Table 4–2) and there are 500 rows in the table, the index size is:

500 * (6 + 1 + 21) * 2 = 29,000 bytes

The size of an index is dependent on four things:

However, you will never reach this maximum because Progress uses a data compression algorithm to reduce the amount of disk space an index uses. In fact, an index uses on average about 20% to 60% less disk space than the maximum amount you calculated using the previously described formula.

The amount of data compressed depends on the data itself. Progress compresses identical leading data as well as collapses trailing entries into one entry. Typically non-unique indexes get better compression than unique indexes.

NOTE: All key values are compressed in the index, eliminating as many redundant bytes as possible.

Table 4–2 lists the column storage values for different data types.

Table 4–2: Column Storage
Data Type
Value
Column Storage in Bytes
Character
1 + number of characters, excluding trailing blanks. If the 
number of characters is greater than 240, add 3 to the  number 
of characters instead of 1. 
Date
3 
Decimal
zero
1 
nonzero
2 + (number of significant digits + 1)/2 
Integer
zero
1 
1
2 
128
3 
32768
4 
8 million
4 
2,147,483,647
5 
Logical
false
0 
true
1 

Figure 4–2 shows how Progress compresses data.

Figure 4–2: Data Compression

The City index is stored by city and by ROWID in ascending order. There is no compression for the very first entry “Bolonia”. For subsequent entries, Progress eliminates any characters that are identical to the leading characters of Bolonia. Therefore, for the second entry, “Bolton”, there is no need to save the first three characters “Bol” since they are identical to leading characters of Bolonia. Instead, Bolton compresses to “ton”. Subsequently, Progress does not save redundant occurrences Bolton. Similarly, the first two characters of “Bonn” and “Boston” (“Bo”) are not saved.

For ROWIDs, Progress eliminates identical leading digits. It saves the last digit of the ROWID separately and combines ROWIDs that differ only by the last digit into one entry. For example, Progress saves the leading three digits of the first ROWID 333 under ROWID, and saves the last digit under nth byte. Go down the list and notice that the first occurrence of Boston has a ROWID of 1111, the second has a ROWID of 1118. Since the leading three digits (111) of the second ROWID are identical to the first one, they are not saved; only the last digit (8) appears in the index.

Because of the compression feature, Progress can substantially decrease the amount of space indexes normally use. In the above example, 65 bytes are used to store the index that previously took up 141 bytes. That’s a saving of approximately 54%. As you can see, the amount of disk space saved depends on the data itself. You can save the most space on the non-unique indexes.


Copyright © 2004 Progress Software Corporation
www.progress.com
Voice: (781) 280-4000
Fax: (781) 280-4095