Progress Database Design Guide


Defining Your Own Delimiters

To define attributes for your own delimiters, you must create a word-break rules file. Within this file, you define a data structure named word_attr. The following is an example of a word-break rules file:

#define    AT_SIGN    64
#define    PERIOD     46
#define    COMMA      44

word_attr  =
{
PERIOD,      BEFORE_DIGIT,
COMMA,       BEFORE_DIGIT,
0x2D,        BEFORE_DIGIT,      /* hyphen */
39,          IGNORE,            /* single quote */
'$',         USE_IT,
'%',         USE_IT,
'#',         USE_IT,
AT_SIGN,     USE_IT,
'_',         USE_IT
}; 

Note that you can use the #define syntax to define constants. Within the word_attr definition, you can reference characters by enclosing them in single quotes (' '), by their decimal ASCII values, or by their hexadecimal values (0x2D, for example). Each entry pairs a character with an attribute, and each item in the table except the last must be followed by a comma.
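The three ways of referencing a character all resolve to the same kind of value, a numeric character code. The following Python sketch illustrates that equivalence. It is not part of Progress; the function name char_code and the defines dictionary are hypothetical, and the parsing rules are a simplified reading of the paragraph above.

```python
# Hypothetical helper (not a Progress API): resolve one character
# reference from a word-break rules file to its numeric code.
def char_code(token, defines=None):
    defines = defines or {}
    token = token.strip()
    if token in defines:                       # constant from a #define line
        return defines[token]
    if len(token) == 3 and token[0] == "'" and token[2] == "'":
        return ord(token[1])                   # quoted character, such as '$'
    if token.lower().startswith("0x"):
        return int(token, 16)                  # hexadecimal, such as 0x2D
    return int(token, 10)                      # decimal, such as 39

# Constants taken from the example rules file above.
defines = {"AT_SIGN": 64, "PERIOD": 46, "COMMA": 44}
codes = [char_code(t, defines) for t in ("PERIOD", "0x2D", "39", "'$'", "AT_SIGN")]
print(codes)  # [46, 45, 39, 36, 64]
```

Note that a misspelled hexadecimal prefix (the letter O in place of the digit 0) would fail in this sketch with a ValueError, which is the kind of typo worth checking for in a rules file before compiling it.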

Your word-break table does not have to include every character. By default, all letters in the language are assigned the LETTER attribute, the characters 0 through 9 are assigned the DIGIT attribute, and all other characters are assigned the TERMINATOR attribute. You only have to specify the characters whose attributes you want to change.
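The default assignment and the effect of overrides can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not Progress code: the function names are hypothetical, and str.isalpha() is an assumed stand-in for "all letters in the language".

```python
# Attribute names (LETTER, DIGIT, TERMINATOR, USE_IT, IGNORE) come from
# the text; the functions themselves are a hypothetical model.
def default_attribute(ch):
    if ch in "0123456789":
        return "DIGIT"
    if ch.isalpha():        # assumption: approximates "letters in the language"
        return "LETTER"
    return "TERMINATOR"

def attribute(ch, overrides):
    # overrides maps character codes to attributes, like word_attr entries;
    # any character not listed falls back to its default attribute.
    return overrides.get(ord(ch), default_attribute(ch))

overrides = {ord("$"): "USE_IT", 39: "IGNORE"}   # two entries from the example
for ch in "a7$'*":
    print(repr(ch), attribute(ch, overrides))
```

Only the dollar sign and the single quote are listed in overrides, so the letter, the digit, and the asterisk keep their defaults.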

NOTE: The asterisk (*), vertical line (|), exclamation point (!), caret (^), and opening and closing parentheses all have special meaning within a CONTAINS clause. You must give these characters the TERMINATOR attribute or their special meaning is lost. You can, however, choose to keep the special meaning of only one of the OR operators and use the others as letters.

After you’ve defined your word-break table, you must compile it with the PROUTIL command:

Operating System    Syntax
UNIX, Windows       proutil database -C wbreak-compiler src-file rule-numb

In the syntax, database is the name of your database, src-file is the name of your word-break rules file, and rule-numb is a number between 1 and 255 that uniquely identifies this set of rules on your system.
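If you script this step, it can help to validate rule-numb before invoking PROUTIL. The Python sketch below only assembles the argument list from the syntax shown above; it does not run the command. The function name, the database name mydb, and the file name myrules.txt are hypothetical, and the range check assumes that "between 1 and 255" is inclusive.

```python
# Hypothetical helper: build the PROUTIL argument list for compiling
# a word-break rules file. It does not execute anything.
def wbreak_compile_command(database, src_file, rule_numb):
    # Assumption: the documented range 1 to 255 includes both endpoints.
    if not 1 <= rule_numb <= 255:
        raise ValueError("rule-numb must be between 1 and 255")
    return ["proutil", database, "-C", "wbreak-compiler", src_file, str(rule_numb)]

print(" ".join(wbreak_compile_command("mydb", "myrules.txt", 34)))
# proutil mydb -C wbreak-compiler myrules.txt 34
```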

The PROUTIL command produces a binary file named prowrd.rule-numb. For example, if rule-numb is 34, the file is named prowrd.34. To reference this file, you must either move it to the $DLC directory or set the environment variable PROWDrule-numb to reference it. For example, the variable PROWD34 specifies the location of the prowrd.34 file. (Note that the PROWD environment variable does not contain a period.)
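The naming convention is easy to get wrong because the file name contains a period and the variable name does not. The Python sketch below derives both names from rule-numb and lists the locations mentioned above. The function names and the example paths are hypothetical, and the sketch makes no claim about which location Progress checks first.

```python
import os

# Hypothetical helpers that mirror the naming convention described above.
def rule_file_name(rule_numb):
    return f"prowrd.{rule_numb}"       # binary file, with a period

def rule_env_var(rule_numb):
    return f"PROWD{rule_numb}"         # environment variable, no period

def candidate_locations(rule_numb, environ):
    # Collect the places named in the text. The order in which Progress
    # consults them is not stated there, so none is implied here.
    found = {}
    var = rule_env_var(rule_numb)
    if var in environ:
        found[var] = environ[var]
    if "DLC" in environ:
        found["DLC"] = os.path.join(environ["DLC"], rule_file_name(rule_numb))
    return found

env = {"DLC": "/usr/dlc", "PROWD34": "/home/me/rules/prowrd.34"}  # example values
print(rule_file_name(34), rule_env_var(34))
print(candidate_locations(34, env))
```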

To apply word-break rules to a database, use the word-rules qualifier of the PROUTIL command:

Operating System    Syntax
UNIX, Windows       proutil database -C word-rules rule-numb

The value of rule-numb is the same value you specified when compiling the rules. To switch back to the default rules, specify 0 for rule-numb.

If you change the word-break rules when word indexes are active, the indexes might not work properly because the rules used to create the index differ from those used when searching the index. Therefore, when you change the break rules for a database, Progress warns you if any word indexes are active. To make these indexes consistent with the new rules, rebuild them with the PROUTIL idxbuild qualifier. For more information, see the PROUTIL idxbuild qualifier entry in the Progress Database Administration Guide and Reference.
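The order of operations matters: apply the new rules first, then rebuild the word indexes. The Python sketch below lists the two PROUTIL invocations in that order without running them. The function name and the database name are hypothetical, and the idxbuild command is shown without options because those are documented in the Administration Guide rather than here.

```python
# Hypothetical helper: return the PROUTIL invocations, in order, for
# switching a database to another rule set and rebuilding its indexes.
def switch_rules_commands(database, rule_numb):
    # 0 selects the default rules, so it is accepted in addition to 1-255.
    if not 0 <= rule_numb <= 255:
        raise ValueError("rule-numb must be between 0 and 255")
    return [
        ["proutil", database, "-C", "word-rules", str(rule_numb)],
        # idxbuild options are omitted; see the Administration Guide.
        ["proutil", database, "-C", "idxbuild"],
    ]

for cmd in switch_rules_commands("mydb", 34):
    print(" ".join(cmd))
```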

Progress maintains a cyclic redundancy check (CRC) to ensure that the word-break rules file does not change between sessions. If the file has changed, Progress displays a message when you attempt to connect to the database, and the connection attempt fails. You can fix this by restoring the original file or by resetting the break rules to the default. Note that resetting to the default break rules may invalidate your word indexes.
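The idea behind the check can be illustrated with a generic CRC comparison. The Python sketch below uses zlib.crc32 as a stand-in; the CRC algorithm Progress actually uses, and where it records the value, are not stated here, so the function names and sample data are assumptions for illustration only.

```python
import zlib

# Generic illustration of CRC-based change detection. zlib.crc32 is a
# stand-in; it is not claimed to be the algorithm Progress uses.
def rules_crc(data: bytes) -> int:
    return zlib.crc32(data) & 0xFFFFFFFF

def rules_unchanged(data: bytes, recorded_crc: int) -> bool:
    # True when the file contents still match the value recorded earlier.
    return rules_crc(data) == recorded_crc

original = b"compiled rules, version A"      # placeholder file contents
recorded = rules_crc(original)               # value saved in an earlier session

print(rules_unchanged(original, recorded))                       # True
print(rules_unchanged(b"compiled rules, version B", recorded))   # False
```

Restoring the original file makes the comparison succeed again, which corresponds to the first of the two fixes described above.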


Copyright © 2004 Progress Software Corporation
www.progress.com
Voice: (781) 280-4000
Fax: (781) 280-4095