Field Justification, Secondary Indexes, and Sort Ordering
Published By | Date | Version | Knowledge Level | Keywords |
---|---|---|---|---|
Revelation Technologies | 28 FEB 1990 | 2.X | INTERMEDIATE | FIELD, JUSTIFICATION, INDEXING, JUSTIFICATION, SECONDARY, INDEXES, SORTING |
Advanced Revelation uses the justification specified for a field in the dictionary to determine more than display issues for columnar or window display. Sorting routines use justification to determine the algorithm to use when sorting by that field. Fields that are right-justified are assumed to contain numeric information while fields with all other justification types are assumed to contain primarily non-numeric information.
If sorting and some reduction (WITH, WHERE, HAVING) clauses are to produce correct results, fields must have the correct justification. Data with a numeric internal representation should normally be right-justified. Fields that should be right-justified are:
- Integer data fields
- Decimal data fields
- Money data fields
- Time, date, and date/time fields
Though some of these have non-numeric external representations, each is represented internally as numeric data. If these fields are left-justified in the dictionary, incorrect results are returned when the field is sorted, or if it is involved in queries that use the operators >, >=, <, ⇐ and BETWEEN.
You can still have left-justified aliases or pseudonyms for these types of fields as long as the aliases are used only for display. Specifying Left justification for one of these field types when painting an entry window will not affect querying or sorting as long as that justification is not 'bound' into the dictionary when the change is made.
Secondary Indexing and Field Justification
Field justification has an important effect on secondary indexing. Indexes for right-justified fields will not include entries for any non-numeric data.
Right-justified fields are assumed by indexing to contain only numeric information, so all non-numeric data is excluded from the index. Corruption of the index structure would result if non-numeric information were allowed to pass. This restriction may be lifted in later releases, but is in force through Version 2.0. The restriction applies only to the field that is indexed and not to the key field or fields to which the value is related. If the field being indexed is a key field, the restriction still applies.
Hybrid Fields and Customizing Sort Behavior
Some fields contain both numeric and non-numeric information in a structured manner. Which justification is correct for such a field? If the field is to be indexed, Left justification must be chosen as mentioned above. An alpha-sort algorithm, however, will produce results that are probably not desirable. Refer to Figure 1 as an example file.
Figure 2 shows the results of sorting the file by Item_Code with Left and Right justification on that field.
Under most circumstances, users would probably consider the right-justified results to be correct. As mentioned before, however, indexing the right-justified field with alphanumeric data in it would produce incorrect results: only the values 1, 2, 3, 20, and 30 would show up in the index. What to do?
Through judicious use of symbolic fields, the sort ordering and query behavior that you desire can be achieved, with or without indexing. The principle is to write a symbolic that separates and correctly justifies the alpha and numeric parts of the data. The symbolic should be specified as left-justified. This symbolic field then can be indexed, sorted by, and queried on with consistent results.
If the Item_Code field above has a definition of 'from 1 to 5 numeric characters followed by zero or one upper case alpha character' then appropriate symbolic code would be:
TMP = {ITEM_CODE} IF NUM(TMP[-1,1]) THEN TMP := ' ' END RETURN FMT(TMP, 'R#6')
This guarantees that a place holder is always included for the alpha character 'sub field', and the numeric data is justified right. If the field definition is more complicated, such as allowing from 0 to 3 alpha characters, then more complex logic is required to properly separate and justify the numeric and alpha sub-fields:
TMP = {ITEM_CODE} CONVERT @UPPER.CASE TO STR('*', 26) IN TMP NUM = TMP[1, '*'] ALPHA = {ITEM_CODE}[COL2(), LEN(TMP)] RETURN FMT(NUM, 'R#5') : FMT(ALPHA, 'L#3')
Conclusion
Developers and users must understand the effects of justification on query results and indexing. Even greater attention is required where structured alphanumeric data fields are involved. Application of symbolic fields can be used to allow the developer and user to achieve well-behaved and desirable results. In general, separating alphanumeric data items into distinct fields is preferable. Querying and sorting by parts of the logical field would then be more straightforward, and symbolics could more efficiently be written to combine the fields for display.
Examples
Figure 1
Recno Item_Code ----- --------- 1 20 2 1C 3 30T 4 2 5 3B 6 3 7 100B 8 30 9 1 10 2E 11 20X 12 1A