Setting Linear Hash Parameters in R/BASIC
Published By | Date | Version | Knowledge Level | Keywords |
---|---|---|---|---|
Revelation Technologies | 05 OCT 1988 | 1.X | EXPERT | THRESHOLD, LHASH, SIZELOCK, LHSET, RBASIC |
Introduction
Linear hash files maintain a series of parameters that are required by the filing system. These parameters are:
- Framesize
- Record Count
- Threshold
- Sizelock
The first, framesize, is the size allocated to each frame in a group (the default value is 1K). The framesize is established at the time the file is created. Once the file is created, this framesize cannot be altered.
The second parameter, record count, is new for linear hash files created under version 1.1 or later of Advanced Revelation. The value of this parameter reflects the actual number of records in the file. The record count is established and maintained by the filing system as it reads and writes records to a file.
The latter two parameters are established with default values at the time the file is created, but can be changed at any time. This article discusses the use of these parameters, and a method to change them from TCL.
Threshold
A key feature of the linear hash filing system is its ability to maintain the optimal number of groups for the quantity of data in the file. When a file is first created, a single group is allocated for data storage. As data is added to the file, additional groups are allocated, and data is redistributed into the new groups. If data is deleted, groups are consolidated and the total number of groups is reduced.
The value of the threshold parameter in a linear hash file controls the decision to allocate (or consolidate) groups. The threshold is a 'percentage full' parameter. If the average quantity of data in the primary frames of all groups exceeds the percentage represented by the threshold, the linear hash filing system will allocate new groups as it reads and writes data. Conversely, if the average quantity of data is 10% or more below the threshold percentage, groups will be consolidated.
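The rule can be restated as a short R/BASIC sketch. This only illustrates the decision as described above and is not the filing system's actual code. The variable AVG.FULL is hypothetical, and the sketch assumes that "10% or more below" means ten percentage points below the threshold.

```
* Sketch only: restates the allocate/consolidate rule from the text.
* AVG.FULL (hypothetical) is the average percentage full of the
* primary frames of all groups.
THRESHOLD = 80
AVG.FULL = 85
BEGIN CASE
   CASE AVG.FULL GT THRESHOLD
      ACTION = 'ALLOCATE'
   CASE AVG.FULL LE THRESHOLD - 10
      * assumption: "10% below" is read as ten percentage points
      ACTION = 'CONSOLIDATE'
   CASE 1
      ACTION = 'NONE'
END CASE
PRINT ACTION
```

With these sample values the first case applies, because 85 is greater than the threshold of 80.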
Threshold is thus an indirect method of establishing file efficiency. A high threshold percentage makes best use of the primary frames by writing as much data to the frame as possible before allocating a new group. This saves disk space.
A low threshold percentage keeps groups relatively empty, allocating new groups more frequently. This can result in better access time, since a smaller group can be scanned more quickly for a particular record.
Sizelock
Because of the overhead involved in allocating or consolidating groups, a file with a highly variable quantity of data is subject to loss of efficiency. For example, a batch or import process loading a new file loses time in continuously allocating new groups.
The sizelock parameter is used to prevent a file from allocating or consolidating groups. A typical use is to create a file with preallocated groups (by using the no.records and avg.size parameters in MAKEFILE). The file is then sizelocked to prevent the consolidation of largely empty groups while data is loaded into the file.
Sizelock has three basic values. These are:
Sizelock Value | Meaning |
---|---|
0 | OK to allocate/consolidate groups (default) |
1 | Locked for consolidation (OK to allocate new groups) until next SELECT; thereafter unlocked |
>=2 | Locked; must be explicitly unlocked |
Changing LH parameters
The only facility provided within Advanced Revelation to change the threshold percentage or sizelock is the DUMP process. This entails loading DUMP with the filename and executing the appropriate keystrokes to reset the threshold percentage or sizelock (press [F1] while in DUMP for guidance on the proper keystrokes).
It is possible, however, to create an R/BASIC program that changes either of these parameters via a TCL-level command.
File Header Information
The primary frame of the first group in the file (group 0) is used by the file for file header information. In addition to the data hashed to this group and the usual frame header information, the first bytes of the primary frame in group 0 are used to store the parameters that apply to the file as a whole. Included among these are the threshold percentage and the sizelock flag.
This file header frame appears physically at the beginning of the DOS file corresponding to the Revelation file. Because of this, an R/BASIC program can use DOS file handling statements (OSBREAD, OSBWRITE) to manipulate the various flags stored there.
Threshold
The threshold percentage value is stored as byte 20 of the file header information. By updating the byte at this location, the threshold percentage for the file can be altered dynamically.
Possible values for threshold are integers between 10 and 99. The percentage is not stored directly. It is scaled to the 256 possible values of a single byte. For example, to set a threshold of 50%, the character with ASCII value 128 (50% of 256) should be written to byte 20 of the header information.
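The scaling can be worked through in a few lines of R/BASIC. This is a minimal sketch of the arithmetic only. The forward conversion follows the description above, while the reverse conversion (rounding to the nearest whole percent) is an assumption added for illustration.

```
* Sketch: convert a threshold percentage to the stored byte and back.
PERCENT = 50
STORED = INT((PERCENT/100)*256)   ;* 128 for 50 percent
HEADER.BYTE = CHAR(STORED)        ;* the character written to byte 20
* Reverse direction (an assumption): round to the nearest percent
SHOWN = INT((SEQ(HEADER.BYTE)*100)/256 + 0.5)
PRINT PERCENT:' percent is stored as ':STORED:', read back as ':SHOWN
```

By the same formula, 10% is stored as 25, 80% as 204, and 99% as 253, because INT discards the fractional part.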
Sizelock
The sizelock flag is stored as byte 21 of the header information. Since this is also a single byte, the sizelock value can be set between 0 and 255 by writing the appropriate ASCII character to that location. Note that DUMP does not permit setting the sizelock higher than 10, but no similar restriction applies in the example program presented below.
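As an illustration, the two header bytes can be inspected without changing them. The following R/BASIC fragment is a sketch under stated assumptions: DOS.NAME is a hypothetical path and must be replaced with the .LK DOS file that belongs to the Revelation file in question. The fragment only reads the header and writes nothing back.

```
* Sketch only. DOS.NAME is a hypothetical path to the .LK file.
DOS.NAME = 'C:\DATA\REV12345.LK'
OSOPEN DOS.NAME TO FILE.DOS THEN
   * the file header occupies the start of the DOS file (offset 0)
   OSBREAD HEADER FROM FILE.DOS AT 0 LENGTH 1024
   T.BYTE = SEQ(HEADER[20,1])   ;* threshold, scaled to 0-255
   S.BYTE = SEQ(HEADER[21,1])   ;* sizelock, 0-255
   PRINT 'Threshold byte: ':T.BYTE
   PRINT 'Sizelock value: ':S.BYTE
   OSCLOSE FILE.DOS
END ELSE
   PRINT DOS.NAME:' cannot be opened (status ':STATUS():')'
END
```

Note that the bracket notation counts characters from 1, so HEADER[20,1] is the twentieth character of the string read from offset 0.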
Group 0 Lock
Whenever a change is being made to the file header information, group 0 should be locked. In a network environment, this will prevent more than one process from attempting to make changes to the file parameters at once. Since the linear hash filing system itself updates group 0 frequently, it is never safe to assume that group 0 need not be locked.
Because a lock on group 0 will be transient - never in place for more than a few moments - it is safe to simply place the lock logic into a timed loop.
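One possible shape for such a loop is sketched below in R/BASIC. It uses the same low-level LOCK and UNLOCK form as the example program in this article, but gives up after a fixed number of attempts. The retry limit MAX.TRIES and the file name CUSTOMERS are hypothetical and appear only for illustration.

```
* Sketch only: a group 0 lock loop that gives up after MAX.TRIES tries.
* MAX.TRIES and the file name CUSTOMERS are hypothetical.
EQU TRUE$ TO 1
EQU FALSE$ TO 0
OPEN 'CUSTOMERS' TO SOURCE.FILE ELSE STOP
GROUP.NO = 0
MAX.TRIES = 10
TRIES = 0
LOCKED = FALSE$
LOOP
   LOCK SOURCE.FILE, GROUP.NO, 04 THEN
      LOCKED = TRUE$
   END ELSE
      TRIES += 1
      * pause until the clock moves on to the next second
      T0 = TIME()
      LOOP
      UNTIL TIME() NE T0
      REPEAT
   END
UNTIL LOCKED OR TRIES GE MAX.TRIES
REPEAT
IF LOCKED THEN
   * ... update the file header here ...
   UNLOCK SOURCE.FILE, GROUP.NO, 04
END ELSE
   PRINT 'Group 0 is still locked by another process'
END
```

Waiting for TIME() to change, rather than for it to reach a computed target, keeps the pause short even if the clock wraps at midnight.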
An Example Program
The program listed in figure 1 can be used to manipulate the threshold and sizelock parameters. With this program, the parameters can be altered from TCL.
The program supports an interactive mode, in which all options are prompted for using popups. Alternatively, all relevant arguments can be passed in the TCL command line.
The syntax for using the program is this:
LHSET filename function value option
where:
- filename : is the name of the file to change
- function : is the literal SIZELOCK or THRESHOLD
- value : is the changed value for function
- option : is (S) to suppress all messages and popups, both while the program executes and when an error condition is detected.
For SIZELOCK, value can be
(+/-)n
where n alone represents an absolute sizelock value from 0 to 255, and +n or -n represents a change relative to the current sizelock value.
For THRESHOLD, value can be
{.}m
where m represents a percentage (10-99) for the threshold value. The decimal in front of the value is optional.
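A few sample command lines may help. They assume the program has been cataloged under the name LHSET and that a linear hash file named CUSTOMERS exists. Both names are hypothetical.

```
LHSET CUSTOMERS THRESHOLD 70
LHSET CUSTOMERS THRESHOLD .7
LHSET CUSTOMERS SIZELOCK 2
LHSET CUSTOMERS SIZELOCK +1
LHSET CUSTOMERS SIZELOCK -1 (S)
LHSET
```

The first two lines both request a 70% threshold. The third sets the sizelock to the absolute value 2, the fourth raises the current sizelock by one, and the fifth lowers it by one with messages suppressed. The last line supplies no arguments, so the program prompts for them.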
Figure 1

    * Routine to set parameters in LH files
    *
    * This program enables users to set the threshold percentage and
    * sizelock from TCL.
    *----------------------------------------------------------------
    EQU TRUE$ TO 1
    EQU FALSE$ TO 0
    EQU NULL$ TO ''
    EQU SPACE$ TO ' '
    DECLARE FUNCTION POP.UP
    DECLARE SUBROUTINE CATALYST, MSG
    *----------------------
    * read command line to derive filename and options
    COMMAND = TRIM(@SENTENCE)
    CONVERT SPACE$ TO @FM IN COMMAND
    * get name of program as cataloged. Used in title of popup.
    PROG = COMMAND<1>
    * get options (see if anything is in parentheses)
    OPTIONS = COMMAND[1,'F(']
    OPTIONS = COMMAND[COL2()+1,'F)']
    * s.opt = suppress error messages
    IF OPTIONS EQ 'S' THEN S.OPT = TRUE$ ELSE S.OPT = FALSE$
    * derive file name. If none (and if no 's' option), produce popup of files
    FILENAME = COMMAND<2>
    IF FILENAME EQ NULL$ THEN
      IF S.OPT THEN STOP ELSE
        CATALYST('P','@FILE')
        FILENAME = @ANS
        IF FILENAME EQ NULL$ THEN STOP
      END
    END
    * prepare file. If successful, get DOS file name information.
    * On each error the message is skipped when s.opt is set, but the
    * STOP is on its own line so that the program always halts.
    OPEN FILENAME TO SOURCE.FILE THEN
      OPEN 'FILES' TO FILE.FILES THEN GOSUB GET.DOS.INFO ELSE
        * 201 = "not attached" message
        IF S.OPT ELSE MSG('201','','','FILES')
        STOP
      END
    END ELSE
      IF S.OPT ELSE MSG('201','','',FILENAME)
      STOP
    END
    *----------------------
    * check desired function. If none in command line, produce popup
    ALL.FUNCTIONS = 'SIZELOCK':@FM:'THRESHOLD'
    DEFAULTS = 0:@FM:80
    FUNCTION = COMMAND<3>
    LOOP
      ERROR = FALSE$; DONE = FALSE$
      LOCATE FUNCTION IN ALL.FUNCTIONS USING @FM SETTING POS ELSE
        IF S.OPT THEN STOP ELSE
          ERROR = TRUE$
          FUNCTION = POP.UP(0,0,'',ALL.FUNCTIONS,18,'R','',PROG,'','','','')
          IF FUNCTION EQ NULL$ THEN DONE = TRUE$
        END
      END
    WHILE ERROR AND NOT(DONE) REPEAT
    IF DONE THEN STOP
    * derive parameter for appropriate function. If none, display default
    DEFAULT = DEFAULTS<POS>
    VALUE = COMMAND<4>
    IF VALUE EQ NULL$ THEN
      VALUE = DEFAULT
      IF S.OPT THEN STOP ELSE MSG('Enter value for ':FUNCTION,'R',VALUE,'')
    END
    * branch to appropriate logic for function
    ON POS GOSUB SLOCK, THOLD
    STOP
    ******************************SUBROUTINES*****************************
    * get DOS file name and path information for the file being changed
    GET.DOS.INFO:
    READ DOS.FILE.NAME FROM FILE.FILES, FILENAME THEN
      DOS.FILE.NAME = DOS.FILE.NAME<5,COUNT(DOS.FILE.NAME,@VM)+1>
      * ensure that the file is a linear hash file
      IF DOS.FILE.NAME[-2,2] NE 'LK' THEN
        * W640 = 'not Linear Hash file' message
        IF S.OPT ELSE MSG('W640','','',FILENAME)
        STOP
      END
      DOS.FILE.NAME = DOS.FILE.NAME[14,99]
      PATH = DOS.FILE.NAME[1,LEN(DOS.FILE.NAME)-11]
      DOS.FILE.NAME = DOS.FILE.NAME[-11,11]
      OSOPEN PATH:DOS.FILE.NAME TO FILE.DOS ELSE
        IF S.OPT ELSE
          MSG(PATH:DOS.FILE.NAME:" can't be opened!|(Error = ":STATUS():")",'','','')
        END
        STOP
      END
    END ELSE
      IF S.OPT ELSE MSG('201','','',FILENAME)
      STOP
    END
    RETURN
    *-----------------
    * this routine sets the sizelock
    SLOCK:
    CHANGE = FALSE$
    IF VALUE[1,1] EQ '+' OR VALUE[1,1] EQ '-' THEN
      * relative change: keep the sign, then build a signed number
      CHANGE = VALUE[1,1]
      VALUE = VALUE[2,99]
      IF NOT(NUM(VALUE)) THEN VALUE = 0
      IF VALUE EQ NULL$ THEN VALUE = 1 ELSE VALUE = ((CHANGE:VALUE)*1)
    END
    GOSUB LOCK.GROUP
    * read header information from file, change, and write back
    OSBREAD HEADER FROM FILE.DOS AT 0 LENGTH 1024
    OLD.VALUE = SEQ(HEADER[21,1]) ;* sizelock at byte 21
    IF CHANGE THEN
      OLD.VALUE += VALUE
      VALUE = OLD.VALUE
    END
    * keep the result within the range of a single byte
    IF VALUE LT 0 THEN VALUE = 0 ELSE
      IF VALUE GT 255 THEN VALUE = 255
    END
    HEADER[21,1] = CHAR(VALUE)
    OSBWRITE HEADER ON FILE.DOS AT 0
    UNLOCK SOURCE.FILE, GROUP.NO, 04
    RETURN
    *---------------------
    * this routine sets the threshold percentage
    THOLD:
    BEGIN CASE
      CASE NOT(NUM(VALUE)) ; VALUE = 80 ;* default value for threshold
      CASE VALUE LT 1 AND VALUE GT 0
        * value passed as a decimal: convert to integer
        VALUE = VALUE * 100
        IF VALUE LT 10 THEN VALUE = 10
      CASE VALUE LT 10 ; VALUE = 10
      CASE VALUE GT 99 ; VALUE = 99
    END CASE
    * scale the percentage to the 256 values of one byte
    VALUE = CHAR(INT((VALUE/100)*256))
    GOSUB LOCK.GROUP
    OSBREAD HEADER FROM FILE.DOS AT 0 LENGTH 1024
    HEADER[20,1] = VALUE ;* threshold at byte 20
    OSBWRITE HEADER ON FILE.DOS AT 0
    UNLOCK SOURCE.FILE, GROUP.NO, 04
    RETURN
    *--------------
    * lock group 0, retrying once a second until the lock is obtained
    LOCK.GROUP:
    GROUP.NO = 0
    LOCKED = FALSE$
    LOOP
    UNTIL LOCKED
      LOCK SOURCE.FILE, GROUP.NO, 04 THEN
        LOCKED = TRUE$
      END ELSE
        E.TIME = TIME() + 1
        LOOP UNTIL TIME() = E.TIME; REPEAT
      END
    REPEAT
    RETURN