Index retrieves truncated keys (OpenInsight 32-bit Specific)
At 04 SEP 2003 01:43:04PM Gerald Lovel wrote:
In my AREV app, I have known for some time that BTREE.EXTRACT returned invalid results when record keys got longer than maybe 18 characters. The keys would be split in 2 parts, so that the first 18 characters is placed in the key list, followed by the remaining characters as a separate entry.
Well, my OI test file for querying has the same problem. I do a select which should return 115 keys, and a list of 127 scrambled ids comes back instead.
I believe that the truncation and the scrambling are caused when REDUCE calls BTREE.EXTRACT which calls V119, as the key fragments are returned sorted instead of adjacent. An article about index improvements in AREV 2.1 talked about sampling key lengths. Is this the source of the problem? Will there be a fix?
Gerald
At 04 SEP 2003 04:53PM Ira Krakow wrote:
Gerald,
I did some experimentation along those lines. I used a file with 110 keys, with an average key length of 12 (11.9 actually, but close enough). BTREE.EXTRACT returned all 110 keys. REDUCE returned only those keys with a length of 15 or less. All keys that had a length of 16 or more were not returned. It sounds like REDUCE assumes that no key length is greater than 25% of the average key length, and it discards keys with a length greater than that.
I believe I'm confirming your observations. I have a small program and a file that can reproduce the problem.
Hope this helps.
Ira
At 04 SEP 2003 06:13PM Gerald Lovel wrote:
Ira,
I did a similar test, using your sample data file.
My AREV command, INDEXLIST, returns keys from btree indexes. It selected 110 rows, which could be reported with LIST.
The command, "SELECT ADDRESSES WITH TYPE 'SF'", returned 127 hits.
The command, "LIST ADDRESSES" then showed 93 records.
So for the 110 keys, 17 were too long and were truncated by RList's SELECT; 110 + 17=127. LIST showed 93 keys; 110 - 17=93.
Then I attached the file in OI and tried REDUCE, which returned 119 hits. Listing the results showed 26 missing rows; 119 - 26=93. The key math in OI is not as clear as in AREV, but the problem is just as evident.
I like your analysis of statistical key length. REDUCE assumes that no keys are more than 25% longer than average? Wow!
Gerald
At 07 SEP 2003 08:48AM [url=http://www.sprezzatura.com]The Sprezzatura Group[/url] wrote:
It would seem more likely that this is a symptom of having keys that are longer than the length defined in the dictionary. The overage would probably be there to catch the odd miscreant key.
World Leaders in all things RevSoft
At 08 SEP 2003 09:18AM Gerald Lovel wrote:
Sprezz;
I can't tell if you are being polite, or being facetious. Must be that British sense of humor. In AREV, the default is not to type the data columns in a dictionary. The length specification is purely for RList reporting, as the data is truly variable length.
However, I took your suggestion and typed the key field in the test file (which was previously untyped). The AREV default is VARCHAR(65534), while the OI default is VARCHAR(255). Sure enough, REDUCE returned correct key lists through RLIST and QUERY afterwards.
Whether the length of variable length data is declared or not should not affect the results of data selection, as data selection is a SET operation. More specifically, data selection separates the set of all elements in a table {A} into the subsets {I} and {O}, where {I} contains all elements matching the select criteria, {O} contains elements not matching, {I} union {O}={A}, {I} intersection {O}=null. If REDUCE does something other than this, I do not see why.
Gerald
At 08 SEP 2003 09:33AM The Sprezzatura Group wrote:
Overly polite - very jetlagged having just returned to the UK so erred on the side of over politeness to ensure that the advice was simply that.
The key length thing is a very long standing gotcha that has been documented in a number of fora over the years. Here's a link from 1996… http://www.revelation.com/__8525652b0066bfaf.nsf/0/E7474332637BE6E4852563B80063755D?OpenDocument
The Sprezzatura Group
World Leaders in all things RevSoft