Speeding up R/BASIC Writes to an Indexed File

Published By: Revelation Technologies
Date: 11 APR 1989
Version: 1.1X
Knowledge Level: INTERMEDIATE
Keywords: BATCH.INDEXING, !INDEXING

An R/BASIC program that creates or modifies a non-indexed data file can execute very quickly. Creating or modifying an indexed file, on the other hand, carries overhead that significantly affects performance. This bulletin gives an example of how to minimize file I/O overhead when writing to an indexed file.

Before proceeding, it is useful to examine why there is additional overhead in writing to indexed files. Because of the way Advanced Revelation's indexing system is designed, for every data record that the system writes to an indexed file, the system must build a transaction record consisting of all the indexed fields. This transaction record is then written to the !INDEXING file.

In addition, there is the normal overhead for linear hash files of keeping the files sized properly. Since the data file and the !INDEXING file are both likely to be linear hash files, each will undergo periodic resizing as data is written to it. This overhead can be partially alleviated by putting a sizelock on the data file.

To reduce the overhead involved in writing index transactions to the !INDEXING file, there is a system subroutine called BATCH.INDEXING. This subroutine tells the system to buffer the current transaction record in memory. The system writes the buffered transaction record to disk under either of two conditions: when the record has been buffered for 5 seconds, or when it grows larger than 1000 bytes. Use of the BATCH.INDEXING subroutine can cut file I/O overhead for indexed files in half.

BATCH.INDEXING requires two parameters. The first parameter is a mode flag, either true (1) or false (0). The true flag (1) instructs the system to start buffering the transaction records. The false flag (0) causes the system to stop buffering and write out the transaction records.

The second parameter is the file variable to which the data file was opened.

BATCH.INDEXING should be called twice from a program: once at the beginning, before any records are written, and once at the end. The code segment in Figure 1 shows both calls.

Note: If the system goes down for any reason (power failure, programming glitch, etc.) while a process is building or maintaining indexes, the integrity of the indexes is lost. To restore it, the indexes must be rebuilt.

Figure 1

DECLARE SUBROUTINE BATCH.INDEXING, MSG
OPEN 'TEST' TO TEST.FILE THEN
  DONE = 0
  * initialize buffered index transactions
  BATCH.INDEXING(1,TEST.FILE)
  LOOP
    * the following two subroutines are only examples
    GOSUB BUILD.RECORD:
    GOSUB WRITE.RECORD:
  UNTIL DONE REPEAT
  * end buffered transactions, flush to file
  BATCH.INDEXING(0,TEST.FILE)
END ELSE
  MSG("Can't open file TEST","","","")
END
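
Figure 1 leaves the record-building and writing subroutines as placeholders. The sketch below fills them in with a simple loop and times the run, as one way of checking how much BATCH.INDEXING helps on a particular file. The record count, the record layout, and the use of the loop counter as the record key are all assumptions made for illustration; only the BATCH.INDEXING calls and the TEST file name come from the bulletin.

```basic
DECLARE SUBROUTINE BATCH.INDEXING, MSG
* hypothetical record count; adjust to suit
REC.COUNT = 500
OPEN 'TEST' TO TEST.FILE THEN
  * TIME() returns seconds since midnight, so avoid running across midnight
  START.TIME = TIME()
  * begin buffering index transactions
  BATCH.INDEXING(1, TEST.FILE)
  FOR I = 1 TO REC.COUNT
    * hypothetical record layout: field 1 is a name, field 2 is a number
    REC = ''
    REC<1> = 'CUSTOMER ' : I
    REC<2> = I
    * the loop counter is used as the record key
    WRITE REC ON TEST.FILE, I
  NEXT I
  * stop buffering and flush any remaining transactions
  BATCH.INDEXING(0, TEST.FILE)
  ELAPSED = TIME() - START.TIME
  PRINT REC.COUNT : ' records written in ' : ELAPSED : ' seconds'
END ELSE
  MSG("Can't open file TEST", "", "", "")
END
```

Running the same program with the two BATCH.INDEXING calls commented out should give a baseline for comparison. Use a scratch copy of the file for such a test, since the loop overwrites any records that share its keys.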