
At 29 NOV 2001 12:31:30AM Scott, LMS wrote:

Hi All

Summary

1. Is it ok to have users accessing tables that you are verifying or fixing?

2. Can "Temporary" GFEs be caused by network problems and disappear on their own when the network recovers?

3. How do you use DUMPLH to fix corruption manually and recover broken records?

Detail

For this exercise, assume AREV 3.12, Win95 clients, a Novell 4.x server, and the NLM. The system is supposed to be available 24/7, and I don't know this system at all well.

The network gets very slow on occasion, and NLM Stats only works sometimes.

The other day the user complained that she had a "read error", so she did a verify (on all the tables in the system) and got several errors in one table.

7 x "row is in wrong group" (against 4 different groups)

1 x "table could not be read" (against 2 different groups)

Sometime later, when the user asked for support, I dialed in. I did a VERIFYLH on the specific table and got no GFEs, so I said I didn't want to fix GFEs that weren't there anymore; it would only cause problems, because GFE fixing usually involves losing records.

So the user decided to run the verify of everything again (another 3 hours). When that completed with no GFEs, she wanted to know why. So do I. The system seems to have continued running fine.

Can GFEs disappear by themselves?

Can dodgy network connections cause "temporary" GFEs?

Can VERIFYLH cease to identify GFEs?

Is it OK to run VERIFYLH while people are accessing the system? (If we had to kick them all off, it is likely the users would stop doing verifies at all, and some verifies take more than 7 hours, which is way too long to have the system down.)

Is it OK to fix GFEs using DUMPLH and Ctrl-F while people might be using that file? To me this seems like a bad idea, but it has been "standard" practice on this system.

Is there some detailed doco on how to use DUMPLH to identify GFEs and fix them? All I see is pretty hex stuff, and I have no idea how to tell what the broken bits look like, let alone fix them. Where do the "bad rows" go? I looked for DUMPLH.FIX.GARBAGE but I suspect it is called something else in AREV 3.12. What is the DUMPFIXNNNN_XXXXXXXXXXXX table for?

Scott, LMS

Treading water in a strong current…


At 29 NOV 2001 07:29AM Dan Reese wrote:

We have found it to be more reliable to verify the tables with everyone logged out of AREV.

The problems you describe are most likely caused by a caching issue. Windows will cache data, the Novell client will cache data, and the Novell server will cache data.

The most likely cause of your problem is workstation-level caching. You need to disable all workstation-level caching (both Windows and the Novell client).

You do not want to (and probably can't) turn off NetWare's server cache, but there is a problem with NetWare's Turbo FAT (a cached version of the server's file allocation table) that affects large (over 6MB) randomly accessed files. You can disable the Turbo FAT by installing a patch to NetWare (NetWare 3 uses fatfix, NetWare 4 uses turbodis, NetWare 5 uses something else…nw5turbo I think).


At 29 NOV 2001 10:30AM Victor Engel wrote:

In addition to Dan's idea that there could be a cache problem, you should also ensure that everyone is accessing the data the same way; in your example, that would be via the NLM.

Now, when you use DUMPLH or VERIFYLH, you are doing frame- and group-level file accesses/modifications and are not using the services of the NLM.

For future GFE reports, you may want to ask the user to detach and reattach the file (logging off and back on is simpler for most users). If the problem is cache-related, they will not get the GFE the second time.

I believe there are other threads on this topic here. Use the search term "phantom GFE".


At 30 NOV 2001 12:19AM Scott, LMS wrote:

Hi All

The search on "phantom GFE" was quite helpful, and I don't know how I could have worked on OI for so long without meeting any before.

Does anyone know specifically what causes a "table could not be read"? Hint: would running a backup at the same time as the verify cause this?

Does anyone know where I can find proper instructions for DUMPLH?

Thanks

Scott, LMS


At 30 NOV 2001 04:22AM [url=http://www.sprezzatura.com]The Sprezzatura Group[/url] wrote:

Traditionally one does not run backups at the same time as an AREV/OI session is running. This is historically due to the way that backup software locks files, making OV files unavailable and leading to data loss and corruption.

ArcServe was traditionally the worst for this.

The Sprezzatura Group

World Leaders in all things RevSoft


At 30 NOV 2001 07:45AM Joe Doscher wrote:

Hi Scott;

Using DUMP

DUMP allows examination of the internal filing structure of a Linear Hash file. To access the DUMP utility, type the following from TCL: DUMP filename where filename is the name of the Advanced Revelation file in which you have a GFE.

To move from group to group, use the up and down arrow keys. To move from frame to frame within a group (between primary and overflow) use the left and right arrow keys. Move to the group with the GFE (the group number is displayed in the original error message) and copy each frame within that group to your printer using Shift-PrtSc. You will need this information for comparison later.

Note: Keep in mind that you are directly in the filing structure. Any changes made while in the edit mode in DUMP will make permanent changes to the file, even if Esc is pressed.

While in DUMP, type Ctrl-F to fix the GFE. When prompted for the group or range to fix, enter the group number reported in the error message and press Enter. The system will attempt to fix the error by recalculating the corrupted header information. If this is not possible, then any bad records will be deleted from the file. The deleted records are normally stored in a file called DUMP.FIX.GARBAGE, depending on the type of corruption that has occurred. Each record that is deleted should become a record in DUMP.FIX.GARBAGE. DUMP creates a record key that provides the group number that the record belonged to, as well as a number that indicates where in the group of deleted records this record falls.

For example, imagine that DUMP has been used to fix a file called MY.FILE. When MY.FILE was dumped, three records were created in DUMP.FIX.GARBAGE. The keys are GROUP.3.1, GROUP.3.2, and GROUP.3.3. The record keys indicate that 3 records were deleted from GROUP 3 in MY.FILE. All records in DUMP.FIX.GARBAGE will need to be re-entered. The structure of each field of these records is as follows:

Field 1: RECORDKEY (the record key may be preceded by ASCII characters)

Field 2, Field 3, etc.: the original data fields

Depending on the type of corruption and the fix performed by DUMP, these records may not show up in DUMP.FIX.GARBAGE. This will need to be determined by examining the printed copies of the frames taken prior to dumping the file and comparing them to what currently exists in a file called DUMP.FIX.TEMP. All good records in a group are copied to DUMP.FIX.TEMP, then copied back into the re-initialized groups in the file.

Compressing Overflow Frames

While fixing the file, the following message will appear: "Overflow free list has been cleared. You can issue a compress to recover and reorder overflow frames. ..Press any key to continue.."

Compressing the frames will remove any unused frames and reorder the overflow frames so that they are contiguous in the overflow (.OV) file. To compress the frames, enter Ctrl-C from within DUMP after the fix process has completed.

Note: this process can be quite lengthy for large files.

I cut and pasted this from a KB article for you (Advanced Search on DUMP).
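If it saves some retyping, the re-entry of the salvaged rows can also be scripted. This is just a rough R/BASIC sketch of mine, not from the KB article: MY.FILE is a placeholder, and it assumes (as the layout above indicates) that field 1 of each salvaged row holds the original key, possibly with stray pointer bytes in front of it, while the remaining fields hold the data. Inspect and clean the keys by hand before trusting the writes.

    * Rough sketch only: re-enter rows salvaged by the DUMP fix into MY.FILE.
    * MY.FILE is a placeholder; check the field layout against your own data
    * before running anything like this.
    OPEN 'DUMP.FIX.GARBAGE' TO F.GARBAGE ELSE STOP
    OPEN 'MY.FILE' TO F.TARGET ELSE STOP
    SELECT F.GARBAGE
    DONE = 0
    LOOP
       READNEXT ID ELSE DONE = 1
    UNTIL DONE DO
       READ REC FROM F.GARBAGE, ID THEN
          * field 1 = original row key; it may carry stray pointer bytes in
          * front of it, so inspect and clean it before trusting the write
          KEY  = REC<1>
          BODY = DELETE(REC, 1, 0, 0)   ;* fields 2 onward become the row body
          IF LEN(KEY) THEN
             WRITE BODY ON F.TARGET, KEY
          END
       END
    REPEAT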

I hope it helps,

JoeD

Rev. Tech


At 30 NOV 2001 08:26PM Ted Archibald wrote:

1 - Phantom GFEs caused by faulty server hardware

One of my clients had consistent problems with phantom GFEs.

Long-running reports would crash and have to be restarted. The client was getting really angry and was threatening to look at a Linux reporting system with MySQL, etc.

LH_VERIFY would not find the GFE that caused the crash, but might find another 1 or 2 GFEs, seemingly at random.

When these records were examined via DUMP they would not show the GFE status message and thus were phantom. In addition, a second LH_VERIFY would not see them either.

Once, two reports crashed on the same GFE at the same time. I assumed from this that the data in the Novell server memory was damaged while the data in the file on disk was OK.

My explanation for this problem is that the disk controller or the actual memory storage occasionally went bad and the data was damaged on the way into server memory. This rules out network problems. If the block aged out of server memory with no update, then the problem would be erased. If the record, or some other record in the I/O block, were updated, then the whole block was written back to the hard drive, turning a phantom GFE into a permanent GFE.

2 - GFE not seen by DUMP

There seems to be a class of GFEs that are solid and are seen repeatably by LH_VERIFY, but which DUMP does not recognize by displaying the GFE error message at the bottom of the display. I fix these the usual way and the next LH_VERIFY does not see them.

3 - SERVER UPGRADE

The client has upgraded from a PII 500MHz server with 800MB of memory running Novell 3.2 to an IBM P3 1GHz server with 1GB of memory running Novell 4.1. The phantom and other GFEs have disappeared. The best part is that reports that took 6 hours now run in 15 minutes. WOW. OI displays that require 100 or more reads now take 2 seconds where they previously took 20.

Cheers

Ted in Tsawwassen BC


At 01 DEC 2001 08:48AM Dan Reese wrote:

Further clarification…

We have learned to make a distinction between Phantom GFEs and Temporary GFEs.

Phantom GFEs do not go away until you down the server and bring it back up. If you are using the Revelation NLM on a Novell network, Phantom GFEs are almost always caused by corruption in Novell's Turbo FAT, which can be corrected only by installing the patches I mentioned earlier. It is certainly possible for other cached devices, such as disk controller cards or NIC cards, to contribute to this problem. But in practice, we have not found a Phantom GFE problem that was not cured by the Novell patches for the Turbo FAT.

Temporary GFEs are those GFEs that fix themselves if you wait long enough. This is the type of GFE that started this discussion thread.

Phantom GFEs: First and foremost, if you have a Novell network, make sure you install the appropriate patch for your version of NetWare. If you do not have this patch installed you are wasting your time doing anything else, because this problem will bite you sooner or later.

Temporary GFEs: Second, if problems continue after installing the Novell patch, focus on workstation-level caching and disable caching in the Novell client.

If you are using Windows 95/98/ME you can also disable Windows' workstation-level caching in the hardware/troubleshooting section of Control Panel/System. It seems that Windows 95/98/ME caches every drive it can see, including Novell server drives. This leads to a cache of a cache, and things get out of sync. You do not experience a performance hit on Novell by disabling these workstation-level caches, because Novell's server-level caching outperforms Microsoft's workstation-level caching.


At 03 DEC 2001 01:40AM Scott, LMS wrote:

Hi Joe

I think the article you pasted in applies to an older version of AREV, i.e. before version 3.12. We are using 3.12, although I see from reading the forums a bit that people are still on 2.x AREV as well.

So with version 3.12, the dots in the file names (and, as much as possible, in the variable names) are phased out, so DUMP.FIX.GARBAGE is something else now.

Like maybe SYSTEMP

see

http://www.revelation.com/WEBSITE/DISCUSS.NSF/7a591c01171830eb8525652b0083c06b/0C1B5744016DBD5A85256B140059EB61?OpenDocument

My LISTTABLES shows a bunch of files whose names are variations on the following theme:

DUMPFIX3819_0000E82AE091

I.e. how do I tell one DUMP file from another when looking for my "lost records"? What are the DUMPFIXnnnn_xxxxxxxxxxxx files? Is it Groundhog Day?

DUMP is now DUMPLH (or are these two different things?)

Ctrl-F still works. I haven't tried Ctrl-C.

What I am looking for is some more detail on how to manually edit the records. I read stuff where people say things like "I edited the frame". Presumably there is a way of visually identifying a broken bit of file in DUMPLH and editing it, like maybe putting an @RM back in or moving a record to a new frame? I guess it would help if I knew more about the algorithm used to calculate which record goes in what frame, i.e. then I'd know where to put it even if I couldn't figure out how to move it there. I'd prefer to know how to move it where it is supposed to go.

I know this all sounds like playing with fire but when I'm already looking at doing file restores, a bit of editing the frames seems ok. I'd only try it on a file that was broken already.

E.g. in here:

http://www.revelation.com/WEBSITE/DISCUSS.NSF/7a591c01171830eb8525652b0083c06b/a42bc8065cd437cb852565a9006e892f?OpenDocument

Victor tells us that in DUMP (DUMPLH), when you are looking at a group, the keys are highlighted. That's helpful. He then says "reset the sizelock". Huh? How do I do that? Hmm, found that in the appendix on hash tables. Apparently all you have to do is list the table: so long as it had a sizelock of 1, it will be reset to 0. Otherwise, from DUMPLH tablename, we press the - (minus) key to make it go down, ideally to zero. I suspect most of our tables have sizelock=0 so they can expand or contract as required.

The doco for version 3.0 in the appendix has some info on fixing GFEs using DUMPLH, but happily refers to a Step 5 when there isn't one. It also suggests "re-entering" data, but not how you work the DUMPLH editor, or what it means when it says "the row key may be preceded by ASCII characters". I think, given such vague instructions, that it may well be easier to enter the stuff using my own system or EDIT and saving with the record ID.

Steve Smith talks about editing the file with a hex editor. Fine, but how do I know what I should change, and to what?

http://www.revelation.com/WEBSITE/DISCUSS.NSF/7a591c01171830eb8525652b0083c06b/D78E739A07C922558525687A0036188F?OpenDocument

Victor Engel talks about fixing a file manually. But not how.

http://www.revelation.com/WEBSITE/DISCUSS.NSF/7a591c01171830eb8525652b0083c06b/C6077CC3F618C611852565010062C867?OpenDocument

Larry Wilson talks about it but avoids getting detailed.

http://www.revelation.com/WEBSITE/DISCUSS.NSF/7a591c01171830eb8525652b0083c06b/ac35dfd501b0f216852566d70017a35b?OpenDocument

Is there perhaps a SENL that details what I need to know? Or has this information been passed on by some sort of osmosis of the Group Unconscious?

Scott, LMS


At 03 DEC 2001 01:46AM Scott, LMS wrote:

Hi Sprezz

Ah, this is as I thought, but probably out of my power to fix.

At one site we did set up a system where everyone gets chucked off between midnight and 12:30am and a bat file on the server copies all the files across to a "backup" dir. The bat file has to be run on the server, so it takes fifteen minutes instead of 3 hours. The backup dir gets backed up onto tape at a time that suits the network schedule, leaving the live data alone. This works reasonably well.

What arrangement do you prefer for systems that are "supposed" to be up 24 hours a day, every day?

Also, how do you manage when the verifies take hours to run? Is there a way of speeding these up? For example, all the files take all day, and one file in particular can take 3 hours to verify.

Scott, LMS.


At 03 DEC 2001 10:37AM Joe Doscher wrote:

Hi Scott,

Please send me your fax number. I have some material from "The 1st Annual Revelation Technologies User Conference". The title is "Advanced Revelation Filing Systems" by Kurt Baker, Tuesday, April 23, 1991, 11:00 a.m. - 12:00 m., Developer A Track. It shows the Linear Hash header information. If you are interested I'll fax it to you; I cannot seem to find it in electronic form. I am looking for more info along the lines you want.

JoeD

Still lookin.


At 03 DEC 2001 11:12AM [url=http://www.sprezzatura.com]The Sprezzatura Group[/url] wrote:

Joe

Check out TB29 (included in our SysKnowledge download courtesy of your good selves) or this article here which goes into a little more depth. http://www.sprezzatura.com/revmedia/v3i1a1.htm

The Sprezzatura Group

World Leaders in all things RevSoft


At 03 DEC 2001 11:21AM Victor Engel wrote:

] DUMP is now DUMPLH (or are these two different things?)

This is the same thing.

] Ctrl F still works. I haven't tried Ctrl C.

Note that performing a compress on a file can sometimes result in loss of data, particularly if the file has a GFE. I wouldn't use Ctrl-C to fix a file, but only to remove unused space after fixing a file. You can accomplish the same thing by recreating the file.

] What I am looking for is some more detail on how to manually edit the records. I read stuff where people say things like "I edited the frame". Presumably there is a way of visually identifying a broken bit of file in DUMPLH, and editing it, like maybe putting a @RM back in or moving a record to a new frame? I guess it would help if I knew more about the algorithm used to calculate which record goes in what frame. Ie then I'd know where to put it even if I couldn't figure out how to move it there. I'd prefer to know how to move it where it is supposed to go.

First, try downloading the knowledge base, available at http://www.revelation.com/WEBSITE/knowledge.nsf/89e60900cf7ebbc8852566f500654ecb/cbc7524470f2204f8525666600775041?OpenDocument or by downloading the SysKnowledge application available from http://www.sprezzatura.com. Find the technical bulletin "Group Format Errors in Advanced Revelation" (R28). That bulletin explains various causes for GFEs and also explains how to use DUMP. You can also press F1 in DUMP to get a list of commands available in DUMP. Just remember that changes in DUMP happen immediately, so the only way to correct an error is to re-edit the data.

You will not find any information on where the record should go. But this doesn't matter. All you need to do is to correct the group so that the group is sound, whether records are in the incorrect group or not. To move records to the correct group, all you need to do is to access the record. Perhaps the easiest way to do this (after GFEs have been fixed) is to simply access the file. If you want, select the whole file and read, then write, each record in turn without changing it.
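In case it saves someone typing, a minimal R/BASIC sketch of that read-then-write pass might look like this (MY.FILE is just a placeholder, and it assumes the GFEs themselves have already been fixed):

    * Minimal sketch: rewrite every row unchanged so the filing system
    * re-hashes it into the group it should live in.
    OPEN 'MY.FILE' TO F.TABLE ELSE STOP
    SELECT F.TABLE
    DONE = 0
    LOOP
       READNEXT ID ELSE DONE = 1
    UNTIL DONE DO
       READ REC FROM F.TABLE, ID THEN
          WRITE REC ON F.TABLE, ID   ;* unchanged write; relocates the row if needed
       END
    REPEAT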


]What I am looking for is some more detail on how to manually edit the records. I read stuff where people say things like "I edited the frame". Presumably there is a way of visually identifying a broken bit of file in DUMPLH, and editing it, like maybe putting a @RM back in or moving a record to a new frame?

]I guess it would help if I knew more about the algorithm used to calculate which record goes in what frame. Ie then I'd know where to put it even if I couldn't figure out how to move it there. I'd prefer to know how to move it where it is supposed to go.

With DUMP you can't move anything. All you can do is edit data one character at a time. You do this using CTRL-E. You will be entering the hex value of the data that belongs there. If you want to correct the data this way, you don't need to know about where records are supposed to be saved, but you should know how the filing system formats a group. Basically, in each group you have a header then a block of data for each record. The data contains a pointer to the next record and to the next frame. The keys to the records should be highlighted. If they are not highlighted, it means that these pointers do not match the records. Each record is followed by a record mark. But the filing system does not really use these as far as I can tell. It's handy to have them, though, because you can use them to determine where the end of the record is if you need to recalculate the pointers.
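To make that last point concrete, here is a tiny sketch of mine, assuming you have already captured the raw bytes of one frame into a variable somehow (RAW.FRAME.BYTES below is only a placeholder, not a real system variable). All it does is report where the record marks fall, which is the information you would work from when recalculating the pointers by hand; it knows nothing about the real frame header layout.

    * Tiny sketch: list where each record mark (@RM) falls in one frame.
    * RAW.FRAME.BYTES is a placeholder for the captured frame contents.
    FRAME = RAW.FRAME.BYTES
    N = 1
    LOOP
       RM.POS = INDEX(FRAME, @RM, N)
    WHILE RM.POS DO
       PRINT 'Record mark ' : N : ' at byte ' : RM.POS
       N = N + 1
    REPEAT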

]The doco for version 3.0 in the appendix has some info on fixing GFEs using DUMPLH, but happily refers to Step 5 when there isn't one. It also suggests "Re-entering" data. But not how you work the DUMPLH editor. Or what it means when it says "the row key may be preceded by ascii characters". I think, given such vague instructions, that it may well be easier to enter the stuff using my own system or EDIT and saving with the record id.

The ASCII characters refer to the pointers I just mentioned. For details on the calculations involved, see the technical bulletin "The Linear Hash Filing System" (R29).
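If it helps to have the general flavour before reading R29: textbook linear hashing chooses a group roughly as in the sketch below. This is the generic scheme only, not necessarily AREV's exact hash function or header fields, so treat it as orientation and rely on the bulletin for the real details.

    * Generic linear-hashing group choice (orientation only).
    * H = a numeric hash of the row key, BASE.MODULO = number of groups
    * before the current round of splitting, SPLIT.PTR = next group to split.
    GROUP = MOD(H, BASE.MODULO)
    IF GROUP < SPLIT.PTR THEN
       GROUP = MOD(H, BASE.MODULO * 2)   ;* already-split groups use the doubled modulo
    END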


At 03 DEC 2001 02:22PM Joe Doscher wrote:

The Sprezzatura Group,

Your doc # V3I1A1, I think, is exactly what Scott is looking for: being able to interpret the DUMP screen material. Then press F1 for what keys to use to do what.

As far as our TB #29 article goes - I cannot find it. I can find TB #28 on GFEs. If you could, please email me a copy of TB #29.

JoeD

Still Trying


At 03 DEC 2001 02:55PM [url=http://www.sprezzatura.com]The Sprezzatura Group[/url] wrote:

Joe

Just download SysKnowledge from our Web site and put it onto a copy of OI - you'll get all the TBs AND the REVMEDIA articles we mentioned.

The Sprezzatura Group

World Leaders in all things RevSoft


At 03 DEC 2001 07:15PM Scott, LMS wrote:

Hi All

Thanks for that, there's enough homework in there to keep me quiet on the DUMP subject for weeks.

Joe, thanks for the fax offer. Historically, faxing or phoning across oceans has been horrendously expensive for us Australians and I have it in my head that it still is. I'm quite happy to spend your time but very reluctant to spend your (boss's) money. I know that's not very rational.

I think the Sprezz info will do for now.

BTW, I know what @ANS etc. are, but not @ATTACK. Is @ATTACK just a cute name for the interpretation series? Same for VERBatim, i.e. VERB interpretations? I thought Verbatim made diskettes.

Scott, LMS


At 03 DEC 2001 07:23PM [url=http://www.sprezzatura.com]The Sprezzatura Group[/url] wrote:

Sorry - just cute… @Attack, VERBatim and RTPSeries…. the old REVMEDIA.

The Sprezzatura Group

World Leaders in all things RevSoft
