Indexing in OpenInsight Part 3 - How transactions get updated

Published 13 MAR 2010 at 10:43:00AM by Sprezz

Now that we’ve covered how index transactions get put into the bang table, all that is left is to discuss how they move from being transactions in the bang table to being balanced index nodes in the bang table.

In the first article we explained that transactions were introduced to allow slow hardware to distribute the transaction processing. In addition to this, the engineers at Cosmos had to come up with a way of allowing individual workstations to use spare processing power (when the PC was left unused) to move the transactions from the bang file into the index itself, in a way that was easily interruptible if the user wanted to take control of their PC again. This being the case, they opted not to move transactions straight from the bang file into the indexes, as this could be an intensive operation.
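The idea of work that is "easily interruptible" can be pictured with a minimal Python sketch. This is a conceptual model only, not the OpenInsight implementation: the function name `process_while_idle` and the `user_is_active` callback are hypothetical, and the real idle detection is not described in this article.

```python
from collections import deque

def process_while_idle(pending, handle, user_is_active):
    """Handle queued items one at a time, stopping as soon as the user returns.

    Returns the number of items processed; unprocessed items stay queued.
    """
    done = 0
    while pending and not user_is_active():
        handle(pending.popleft())  # one small unit of work per check
        done += 1
    return done

pending = deque(["txn1", "txn2", "txn3"])
handled = []
# Simulate the user coming back after two units of work.
checks = iter([False, False, True])
count = process_while_idle(pending, handled.append, lambda: next(checks))
```

Because each unit of work is small and the check happens between units, whatever is left in the queue is simply picked up the next time the machine is idle.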

Before pressing on with an explanation of this, let’s briefly review what the transactions actually contain. At this stage we’re not going to explain the precise structure of the index transaction rows, just the concepts behind them. As Part 2 explained, transaction records are made up of the changes to the indexed columns: specifically the row id of the row that has changed, the column that has changed, and the old and new values of the indexed column. If time were no constraint, each transaction row could be picked up and all of the indexes referenced therein updated before returning control to the user. However, it is unlikely that a user would be prepared to wait this long, so the process has been subdivided into tasks.
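To make the concept concrete, here is a minimal Python sketch of the information a single change carries. The class and field names are hypothetical, and the sketch says nothing about the actual layout of the transaction rows in the bang table, which is deliberately left unexplained here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexTransaction:
    """Conceptual model only; not the real bang-table row layout."""
    row_id: str      # key of the data row that changed
    column: str      # indexed column that changed
    old_value: str   # value to remove from the index ("" if none)
    new_value: str   # value to insert into the index ("" if none)

# Example: row 1001 had its NAME column changed from SMITH to SMYTHE.
txn = IndexTransaction("1001", "NAME", "SMITH", "SMYTHE")
```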

There are essentially two tasks - first move the transactions from generic transactions to indexed-column-specific transactions, and then move those column-specific transactions into the index, removing the old value if appropriate and inserting the new one. This is achieved using three routines :-

    REV_BGND_UPDATE
    F.DISTRIBUTOR
    F.INDEXER

REV_BGND_UPDATE - This is the routine that runs when the system is idle. It works through the indexed files in the system, seemingly using the system variable @INDEX.TIME, which has three fields: field one contains an @VM-delimited list of indexed tables, field two contains the number of the table to start on, and field three contains the pointer to the indexed column to work on. It is responsible for calling F.DISTRIBUTOR and F.INDEXER as required.
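As an illustration of the three-field layout described above, the following Python sketch splits such a value into its parts. It assumes the three fields are separated by field marks (@FM, character 254) and that the table list uses value marks (@VM, character 253). The helper name `parse_index_time`, the sample table names, and the defaulting of empty pointers to 1 are all assumptions made for the example; the layout itself is only inferred from observation.

```python
FM = chr(254)  # field mark (@FM)
VM = chr(253)  # value mark (@VM)

def parse_index_time(raw):
    """Split an @INDEX.TIME-style value into (tables, table_no, column_ptr).

    Hypothetical helper: assumes FM-delimited fields and numeric pointers.
    """
    fields = raw.split(FM)
    fields += [""] * (3 - len(fields))          # pad any missing fields
    tables = fields[0].split(VM) if fields[0] else []
    table_no = int(fields[1]) if fields[1] else 1      # assumed default
    column_ptr = int(fields[2]) if fields[2] else 1    # assumed default
    return tables, table_no, column_ptr

# Made-up sample: two indexed tables, resume at table 2, column pointer 1.
sample = VM.join(["CUSTOMERS", "ORDERS"]) + FM + "2" + FM + "1"
tables, table_no, column_ptr = parse_index_time(sample)
```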

F.DISTRIBUTOR - This is the routine that moves the generic transactions (0, 1, 2, etc.) into column-specific transactions (e.g. NAME, NAME*1, NAME*2, etc.).

F.INDEXER - This is the routine that takes the column-specific transactions (be they BTree, Relational or Computational) and updates the appropriate index row.
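The division of labour between the two stages can be sketched in Python as follows. This is a conceptual model under stated assumptions, not the real routines: the function names `distribute` and `apply_to_index` are hypothetical stand-ins, the queues are plain in-memory lists rather than bang-table rows, and the index is modelled as a dictionary of value to sorted row ids rather than balanced index nodes.

```python
from collections import defaultdict

def distribute(generic, per_column):
    """Stage 1: move generic transactions into per-column queues."""
    while generic:
        row_id, column, old, new = generic.pop(0)
        per_column[column].append((row_id, old, new))

def apply_to_index(column, per_column, index):
    """Stage 2: apply one column's queued transactions to its index."""
    queue = per_column[column]
    while queue:
        row_id, old, new = queue.pop(0)
        # Remove the old value's reference to this row, if there is one.
        if old and row_id in index.get(old, []):
            index[old].remove(row_id)
            if not index[old]:
                del index[old]
        # Insert the row under its new value.
        if new:
            ids = index.setdefault(new, [])
            if row_id not in ids:
                ids.append(row_id)
                ids.sort()

# Made-up data: two rows created, then row 1001 renamed.
generic = [("1001", "NAME", "", "SMITH"),
           ("1002", "NAME", "", "JONES"),
           ("1001", "NAME", "SMITH", "SMYTHE")]
per_column = defaultdict(list)
name_index = {}

distribute(generic, per_column)
apply_to_index("NAME", per_column, name_index)
```

Splitting the work this way means each stage can stop after any single transaction and resume later, which fits the requirement that background processing be easy to interrupt.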
