At 27 APR 1998 10:14:11AM Oystein Reigem wrote:

GFE that cannot be LH Verified

If you recognize the description below it might be because I've posted much of this before. I didn't get any real help then but hope I do now. The situation's becoming critical. So is there anybody out there who recognizes these symptoms? Or can somebody give me useful hints on how to proceed with this problem?

I have a table where there seems to be a GFE, but LH Verify has problems verifying.

Why do I believe there is a GFE? Because a stored procedure of mine does an RList select on an unindexed field, and checks afterwards with Get_Status. Get_Status reports

  "FS1003" : @VM :
  "REGIMUS\BRUKDATA\REV43145.LK" : @VM :
  15649 : @VM

What do I mean by LH Verify having problems verifying? When I run LH Verify on the data table it doesn't report success or failure. It just doesn't report anything. The box where it usually reports errors (or "No Group Error") stays empty. It certainly *does* something - there's hard disk activity and I get an hourglass cursor for a reasonable amount of time. But for some reason it doesn't report back.

Also if I afterwards try to LH Verify any other table it tells me there's a GFE.

The problematic table is from a customer. At the time I got it neither I nor the customer suspected there was something wrong with it. It's big, and I got it so I could test how various queries performed on large data sets. In the meantime my customer's entered more data. If she now tries to do a LH Verify on her version of the table LH Verify doesn't finish at all. It just grinds on.

- Oystein -


At 27 APR 1998 12:53PM John Duquette wrote:

Oystein,

Can you provide some information on this file (size, modulo etc?) that can help me try to create a similar problem here.

Thanks

John Revelation


At 27 APR 1998 02:37PM Steve C. wrote:

Oystein,

I had a similar problem. See my thread above. I had a GFE on the index file itself… the !Filename. Try running LH Verify on the ! File.

Steve C.


At 27 APR 1998 10:47PM Cameron Revelation wrote:

Oystein,

First, get everyone out of the system and back up the data, dict, and bang files before doing anything. (That way you can always start over.)

Second, your file has an index MFS, which as Steve pointed out, can cause the GFE verifier to get confused. If possible, remove the indexes.

Third, check the dict and bang portions for GFE's.

Fourth, check the data portion. For now, I'll assume that is where the error is and that the check still fails; if so, and if a copy of Arev is handy, use the verify/fix utilities in Arev … often problems that OI's verifier doesn't find are handled by the verifier in Arev.

Fifth, fix the file if possible. Try it in OI … try it in Arev. Determine how many records were lost. If previous backups exist, see if those records can be extracted from those backups.

Don't let GFE's go unattended! They don't usually heal themselves!
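A minimal sketch of the first step above (backing up before any repair), written in Python rather than anything OpenInsight-specific. It assumes each table portion is stored as a `<stem>.LK` / `<stem>.OV` file pair, as the `REV43145.LK` name in the original post suggests. The function name `backup_lh_files` is hypothetical, and the stems of the dict and bang portions are not given in this thread and would have to be looked up.

```python
import shutil
from pathlib import Path


def backup_lh_files(src_dir, dest_dir, stems, extensions=(".LK", ".OV")):
    """Copy each <stem>.LK / <stem>.OV pair from src_dir to dest_dir.

    Returns the list of copied destination paths. Raises FileNotFoundError
    if any expected file is missing, so that an incomplete backup is not
    mistaken for a complete one.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)

    # Check that everything exists before copying anything.
    wanted = [src / f"{stem}{ext}" for stem in stems for ext in extensions]
    missing = [p for p in wanted if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing files: {[str(p) for p in missing]}")

    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for p in wanted:
        target = dest / p.name
        shutil.copy2(p, target)  # copy2 also preserves timestamps
        # Comparing sizes is only a crude check that the copy is complete.
        if target.stat().st_size != p.stat().st_size:
            raise OSError(f"size mismatch after copying {p.name}")
        copied.append(target)
    return copied
```

For example, `backup_lh_files(r"REGIMUS\BRUKDATA", r"D:\GFE_BACKUP", ["REV43145"])` would copy the data portion named in the original post; `D:\GFE_BACKUP` is a made-up destination. It should only be run once everyone is out of the system, as advised above.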

Cameron Purdy

Revelation Software


At 28 APR 1998 04:59AM Oystein Reigem wrote:

John,

On the version that I have (where LH Verify on the data file finishes, but without reporting anything back), the data file has the following parameters:

Modulo: 16131
Record count: 70659
Frame Size: 1024
Threshold: 80
Sizelock: 0
LK and OV file sizes: 16131 and 11391 KB

The index file has the following parameters:

Modulo: 19067
Record count: 39171
Frame Size: 1024
Threshold: 80
Sizelock: 0
LK and OV file sizes: 19067 and 31064 KB

- Oystein -
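The reported sizes can be cross-checked with a little arithmetic. The sketch below assumes that an LK file holds one primary frame per group (modulo × frame size) and ignores any header overhead. Under that assumption the LK figures in the post above match exactly when read as kilobytes. The helper name is hypothetical.

```python
KIB = 1024


def primary_file_bytes(modulo, frame_size):
    """Estimated LK size in bytes.

    Assumption: one primary frame per group, no extra header frames.
    """
    return modulo * frame_size


# Modulo and frame size as reported in the post above.
data_lk = primary_file_bytes(modulo=16131, frame_size=1024)
index_lk = primary_file_bytes(modulo=19067, frame_size=1024)

for name, size in (("data", data_lk), ("index", index_lk)):
    print(f"{name} LK: {size} bytes = {size // KIB} KB = {size / KIB**2:.2f} MiB")
```

Both tables are therefore in the tens of megabytes, which is large for the time but not unusual.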


At 28 APR 1998 11:16AM Oystein Reigem wrote:

John,

(While posting this I see that Cameron has responded too. I haven't had time yet to try what he suggests. And I'd very much like some comments on the following anyway.)

I ran LH Verify on the problematic (data) table again today. This time it (Verify) finally reported back - with Group Format Errors. I pressed the Detail button. The Debugger came up - with the message "SYS1500: Primary row locked: Table FORMAT, key 1". I aborted the Debugger. Database Manager now was inactive or hid behind other windows, and was not in the task list (I tried to Alt-tab). But suddenly it bounced back with the following details:

15649   Primary frame header type is not correct
15650   Primary frame header type is not correct
... ... ...
15688   Primary frame header type is not correct
15689   Primary frame header type is not correct
15690
15691
... ... ...
16130
16131

This certainly looks like one messed up table - so messed up that I (or the customer) should have discovered it earlier, or what?

I first thought that perhaps the error list was bogus - that the error was something else but something that got LH Verify confused. And certainly LH Verify is confused, but with some effort (e.g., by restarting from time to time), and mostly by taking one group at a time, I managed to LH Fix some errors. (I checked with LH Verify from time to time.) But finally my luck ran out when LH Fix came up with the message "SSP271: The header of the table … is corrupted. FIX_LH cannot continue". If I can take that at face value it seems LH Fix finally destroyed my table rather than fixing it.

But assume that LH Verify Detail's list of errors was genuine. What could cause the last 500 groups or so of a table to be wrong?
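For scale, a small calculation of how much of the table the listing covers. It assumes the listing runs without gaps from group 15649 to group 16131, that the modulo is the 16131 reported for this table earlier in the thread, and that each group has one 1024-byte primary frame.

```python
first_bad, last_bad = 15649, 16131  # first and last group in the listing
modulo = 16131                      # reported modulo of the data table
frame_size = 1024                   # reported frame size in bytes

bad_groups = last_bad - first_bad + 1   # inclusive range
share = bad_groups / modulo             # fraction of all groups
bad_bytes = bad_groups * frame_size     # primary frames only (assumption)

print(f"{bad_groups} groups, {share:.1%} of the table, "
      f"{bad_bytes} bytes of primary frames")
```

So "the last 500 groups or so" is 483 groups, roughly 3% of the table, all contiguous and all at the end of the primary file.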

I assume the Debugger "SYS1500" message must have been caused by LH Verify Detail having more errors than it could handle or something. And perhaps the problems I had earlier with LH Verify not reporting anything had the same cause.

Here are more details on what happened when I tried to LH Fix:

First I tried LH Fix All Groups. The Debugger came up with the message: "ENG0703: FIX_LH, Line 1. Variable exceeds maximum length."

So I tried just the first erroneous group. I think that went wrong, but I don't remember the details. I logged out and restarted Windows to get a clean slate. Then I tried just one group again, but group 1 this time. (I thought it best to give LH Fix something easy to chew on first, and what could be easier than a group without errors, and the first one at that?) "Fixing" group 1 took some time, though. I'd expect it to go real fast. Is there something special with group 1? (I know group 0 is special.) Oh, perhaps it's the root node, or at least a non-leaf node, and perhaps it takes a longer time to check those?

Then I tried the first erroneous group. That went fast, with no messages. I took the next one too. Then I checked with LH Verify. It was the same problem with the "SYS1500: Primary row locked" message, but when the details arrived it was consistent with what I had done. The first two erroneous groups were no longer in the list.

Then I took all the other groups that had the "Primary frame header type is not correct" message. (I.e., I took all the groups that had that message the first time I verified - up to and including group 15689. I didn't check if the error listing had changed in the meantime.) When I tried to LH Verify Details after that I still got the "SYS1500: Primary row locked" message, but now the error list had shrunk to an ominous

So I restarted everything (Win, OI) to give LH Verify a second chance and this time it came forward with a more agreeable list

15652   Primary frame header type is not correct
... ... ...
15692   Primary frame header type is not correct

but as you see LH Fix didn't manage to fix a range of groups. And that seemed to be the case later too. If I tried a range I think it did the first one, or the first and last ones, or perhaps none. I didn't pay enough attention.

At this point the system (my computer) became sluggish, so I restarted. I LH Fix'ed all the errors above, but when I LH Verified nothing much had happened:

15653   Primary frame header type is not correct
... ... ...
15691   Primary frame header type is not correct

I soldiered on one group at a time, until I hit the wall with the "SSP271: The header of the table … is corrupted" message. I restarted Win/OI in case resources were low, but to no avail.

- Oystein -


At 28 APR 1998 11:18AM Oystein Reigem wrote:

Steve,

You suggested I verify the index table, so I did. But it seemed to be fine. I think I might have had cases earlier, though, where the data table seemed to have a GFE but it was really the index table.

- Oystein -


At 28 APR 1998 11:23AM Oystein Reigem wrote:

Cameron,

Funny message you had there. It knows about its thread but is not listed in the thread.

By the way - do you remember once I said one of your postings got me a 500 html error? Now I know when that happens. It's if you try to follow the thread links in a response page. Like if you're editing a response and suddenly need to check one of the other postings in the thread. (You go tell Jennifer. I think she got fed up with me some time ago when I had a lot of similar comments.)

- Oystein -


At 28 APR 1998 01:44PM Oystein Reigem wrote:

John,

I followed Cameron's advice and tried Arev LH Verify/Fix too. Arev LH Verify said:

Group Code Error description.................................
15649    3 Primary frame header type  is incorrect.
15650    3 Primary frame header type 4294967171 is incorrect.
15651    3 Primary frame header type 69 is incorrect.
15652    3 Primary frame header type  is incorrect.
15653    3 Primary frame header type  is incorrect.
15654    3 Primary frame header type  is incorrect.
15655    3 Primary frame header type  is incorrect.
15656    3 Primary frame header type  is incorrect.
15657    3 Primary frame header type  is incorrect.
15658    3 Primary frame header type 6 is incorrect.
15659    3 Primary frame header type 21 is incorrect.
15660    3 Primary frame header type 69 is incorrect.
15661    3 Primary frame header type 4294967284 is incorrect.
15662    3 Primary frame header type  is incorrect.
15663    3 Primary frame header type  is incorrect.
15664    3 Primary frame header type 4294967170 is incorrect.
15665    3 Primary frame header type 68 is incorrect.
15666    3 Primary frame header type  is incorrect.
15667    3 Primary frame header type  is incorrect.
15668    3 Primary frame header type  is incorrect.
15669    3 Primary frame header type  is incorrect.
15670    3 Primary frame header type 20 is incorrect.
15671    3 Primary frame header type  is incorrect.
15672    3 Primary frame header type  is incorrect.
15673    3 Primary frame header type  is incorrect.
15674    3 Primary frame header type 111 is incorrect.
15675    3 Primary frame header type 32 is incorrect.
15676    3 Primary frame header type 3 is incorrect.
15677    3 Primary frame header type 4294967177 is incorrect.
15678    3 Primary frame header type 4294967268 is incorrect.
15679    3 Primary frame header type  is incorrect.
15680    3 Primary frame header type 4294967177 is incorrect.
15681    3 Primary frame header type 100 is incorrect.
15682    3 Primary frame header type 37 is incorrect.
15683    3 Primary frame header type 107 is incorrect.
15684    3 Primary frame header type  is incorrect.
15685    3 Primary frame header type 38 is incorrect.
15686    3 Primary frame header type 26 is incorrect.
15687    3 Primary frame header type 7 is incorrect.
15688    3 Primary frame header type 80 is incorrect.
15689    3 Primary frame header type 4294967284 is incorrect.
15690    3 Primary frame header type  is incorrect.
15691    3 Primary frame header type  is incorrect.
15692    3 Primary frame header type  is incorrect.

I still hope you (or somebody else) can deduce something from this.
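One thing that can be read out of the listing, offered as an interpretation rather than as documented Arev behaviour: the very large values are what a single byte with its high bit set looks like when it is sign-extended and then printed as an unsigned 32-bit number. The sketch below checks that reading against every non-blank value in the listing; the helper name is hypothetical.

```python
# Non-blank "header type" values from the Arev LH Verify listing above,
# keyed by group number.
reported = {
    15650: 4294967171, 15651: 69, 15658: 6, 15659: 21, 15660: 69,
    15661: 4294967284, 15664: 4294967170, 15665: 68, 15670: 20,
    15674: 111, 15675: 32, 15676: 3, 15677: 4294967177,
    15678: 4294967268, 15680: 4294967177, 15681: 100, 15682: 37,
    15683: 107, 15685: 38, 15686: 26, 15687: 7, 15688: 80,
    15689: 4294967284,
}


def as_signed_byte(value):
    """Interpret the low 8 bits of value as a signed byte (-128..127)."""
    low = value & 0xFF
    return low - 256 if low >= 128 else low


for group, value in sorted(reported.items()):
    low = value & 0xFF
    signed = as_signed_byte(value)
    # If the printed number is a sign-extended byte shown as unsigned
    # 32-bit, converting the signed byte back must reproduce it exactly.
    round_trips = (signed % 2**32) == value
    shown = chr(low) if 32 <= low < 127 else "."
    print(f"{group}: {value:>10} -> byte 0x{low:02X} ({shown!r}), "
          f"round-trips: {round_trips}")
```

Every value round-trips, so each one fits in a single byte, and the bytes are a mix of printable characters, control codes and high-bit values with no obvious pattern. That would be consistent with the frames holding arbitrary data rather than a damaged but recognizable header, though whether this byte is exactly what Arev reads as the header type is an assumption.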

I asked Arev LH Fix to fix all groups. All seemed to go well until group 15686, where Arev LH Fix gave up with the message

'FIXLH_SUB' Line 1. B16 Non-numeric data when numeric required. Zero used.

Line 1 'FIXLH_SUB' broke because a run time error was encountered.

I logged out and in again and fixed the rest of the groups one by one, and got told there was 1 bad row in each. (Don't you get any messages when you fix all? You just have to check SYSTEMP?) But group 15686 couldn't be fixed.

Well - that was that, I thought. But then I tried OI LH Verify/Fix again. OI LH Verify Detail still brought up the Debugger with the "SYS1500: Primary row locked: Table FORMAT, key 1" message, but the list of errors was in agreement with Arev - with just one erroneous group - 15686. OI LH Fix then seemed to have the same problem as Arev LH Fix ("Line 1 'FIXLH_SUB' broke because a run time error was encountered"). I told the Debugger to Go. A rather messy message appeared and my hope for a solution faded even more. But after this failed OI LH Fix I tried a second one which seemed to succeed. And when I did an OI LH Verify the table was OK!!! (By the way - could I also have got Arev LH Fix to proceed after it crashed?)

But - what I have managed to fix is my outdated copy of the table. What about the customer's version? I couldn't get hold of that today. I might not succeed on that. Perhaps I should have waited for the customer's version before I did anything, but I hoped to learn something of general value. So please enlighten me!

- Oystein -


At 28 APR 1998 01:52PM Oystein Reigem wrote:

John,

I checked Arev's SYSTEMP to see which rows Arev LH Fix had deleted. It was just rubbish, like random snippets of memory or files. So could it have been a pointer (or pointers) to a non-existing group(s)?

- Oystein -


At 28 APR 1998 03:26PM Cameron Revelation wrote:

Oystein,

"Funny message you had there. It knows about its thread but is not listed in the thread."

It is because I answered it off-line and there was a replication conflict. Our discussion is based on a Notes database which I replicate to my notebook so I can answer questions at my leisure, for example, while jogging, sailing, horse-back riding, or just relaxing in my hot-tub.

Cameron Purdy

info@revelation.com


At 29 APR 1998 08:13AM Jennifer Revelation wrote:

"By the way - do you remember once I said one of your postings got me a 500 html error? Now I know when that happens. It's if you try to follow the thread links in a response page. Like if you're editing a response and suddenly need to check one of the other postings in the thread."

You are correct: the thread links do not work when you are creating a response, only when you are browsing the document.


At 29 APR 1998 04:51PM Aaron Kaplan wrote:

Almost everything it deleted will be garbage. If it was readable and recognizable, then there probably wouldn't have been a GFE….

apk@sprezzatura.com

Sprezzatura, Inc.

www.sprezzatura.com


At 30 APR 1998 04:01AM Oystein Reigem wrote:

Aaron,

I thought GFEs often were about the "packaging" and linking of the frames being faulty, while the content might be healthy enough? I thought my erroneous frames weren't really frames at all but just more or less random 1K chunks pointed to by erroneous links.

Oh, well. But can anything else be deduced from what I wrote about the errors? I mean with the approximately 500 frames at the very end of the file being faulty? The GFEs I've seen until now have been more like a sniper's shots than this massacree. (Except of course I hope it's not a massacree but something that hit an unpopulated region.)

- Oystein -


At 30 APR 1998 05:34AM Oystein Reigem wrote:

Aaron et al,

I thought my table had been fixed. If I run OI LH Verify it reports "No Group Errors".

But if I then press Detail the Debugger still comes up with the "SYS1500: Primary row locked: Table FORMAT, key 1" message! And when I count the rows there are less than 2000 left of the original 60-70,000!

(I should of course have counted the rows already yesterday but I really didn't expect such carnage. Arev LH Fix seemed to delete less than 100 rows, and since I didn't know the exact number of original rows, I didn't expect the new count to tell me much anyway.)

I guess there's not much more to do with my version of this table. I will look at my customer's version next week - that will be the first chance for both of us. I might then be back with more questions… Hold your breath…

So what lesson is there to be learned from this? To run LH Verify regularly on all tables? Because GFEs may lurk long before they show themselves?

- Oystein -


At 30 APR 1998 10:55AM Aaron Kaplan wrote:

If a pointer goes out and sets up links to a frame in USER.EXE or something then you will get gibberish. If there is a link to an offset, and that offset is incorrect, you'll get whatever happens to be there.

apk@sprezzatura.com

Sprezzatura, Inc.

www.sprezzatura.com


At 04 MAY 1998 07:11AM Oystein Reigem wrote:

Aaron,

Thanks.

So what's your advice so I (my clients) can avoid such problems in the future? Making backups regularly doesn't help much if there is an error that goes undetected for a long time. Using LH Verify to check regularly is no good if it cannot find the errors (as you might remember in my case it didn't report any errors (nothing at all, actually)). Are there other utilities I can use to check on my tables' condition?

- Oystein -
