Compiling 64K on a Shoestring by Blaise Wrenn (LexStat Systems Ltd)

Published By	Date	Version	Knowledge Level	Keywords
Sprezzatura Ltd	01 DEC 1992	3.0+	EXPERT	COMPILE, 64K, SOURCE, CODE

Those of you who have been programming with R/BASIC for some time will no doubt have encountered the annoying message: "Warning! Source code approaching #@$!%& 34K maximum." (Expletives added), and will ultimately have had a compilation fail due to too much source code in one record. I'm sure I'm not alone in wondering why the object code has to be accumulated in the same 64K memory space as that occupied by the source, but there is a way to work around the limitation.

The strategy set forth in this article was developed in desperation after a newly added feature pushed a program's source past the 34K limit, and there was no easy way to strip out a reasonably sized section of the program to an external subroutine. I had already written and used a pre-compile utility which squeezed out most extraneous spaces and removed all comments.

Looking at a cross-reference listing of the program I realised that several long internal subroutine names and several long variable names were used many times. If I could shorten them, the resulting code might just squeak by the 34K barrier. Checking the manual for rules on variable/identifier names, I noted that an identifier had to start with a letter, but could then contain letters, numbers, periods and dollar signs. Several iterations of reducing meaningfully-named variables such as LINE.NUMBER to L$ and SAVE.NUMBER to N$ ensued. (I chose the $ sign as a termination character because I never ordinarily use it, and it would indicate a substituted name.)
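To get a feel for which names are worth shortening, the saving from each one is roughly its number of occurrences times the characters removed per occurrence. The following is a minimal sketch in Python (not R/BASIC, and not part of the original utilities) that makes that estimate. The identifier pattern, the function name estimate_savings and the sample source are assumptions made for illustration, and the sketch makes no attempt to exclude R/BASIC keywords.

```python
import re

# Assumed identifier shape, following the rule described in the article:
# a letter, then any mix of letters, digits, periods and dollar signs.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9.$]*")


def estimate_savings(source, short_len=2):
    """Return (name, occurrences, bytes_saved) tuples, biggest saving first.

    bytes_saved assumes every occurrence is replaced by a name that is
    short_len characters long (for example "L$").
    """
    counts = {}
    for name in IDENTIFIER.findall(source):
        counts[name] = counts.get(name, 0) + 1
    rows = [
        (name, n, n * (len(name) - short_len))
        for name, n in counts.items()
        if len(name) > short_len
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical three-line source fragment.
sample = (
    "LINE.NUMBER = 1\n"
    "SAVE.NUMBER = LINE.NUMBER\n"
    "LINE.NUMBER = LINE.NUMBER + 1\n"
)

for name, count, saved in estimate_savings(sample):
    print(name, count, saved)
```

In this fragment LINE.NUMBER occurs four times, so shortening it to a two-character name saves 36 characters, against 9 for the single use of SAVE.NUMBER.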

The scheme worked beautifully. Then another feature was added. Bang! Over the limit again. Another round of swaps. Another feature, another round of swaps, and so it went. I finally reached the point where I had used just about every letter from A$ to Z$. As it would happen, at this point I had to chase a bug – something to do with LINE.NUMBER. Damn, was it L$ or N$? Well, I changed the wrong one, and the program completely fell over (remember, I am used to writing in a high level language). It was clearly time to maintain a table of changes, and come to think of it, why not automate the whole process? So, here's the drill:

As this is a simple-minded solution, use meaningful variable names in your original source, and avoid variable names that are R/BASIC reserved words or @variables, or portions thereof. Be sure to have a space after every variable name (i.e., don't use statements such as LINE.NUMBER=1; use LINE.NUMBER = 1 instead).
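The reason for the naming rule is that a plain text substitution cannot tell a variable from a keyword or from part of a longer name. A rough pre-flight check might look like the following Python sketch (an illustration only, not R/BASIC). The function name find_collisions and the tiny RESERVED list are hypothetical; a real check would need the full list of R/BASIC keywords and @variables.

```python
def find_collisions(names, reserved):
    """Return (name, container) pairs where a plain text swap of `name`
    would also alter `container`."""
    problems = []
    for name in names:
        # A name that is part of another, longer name.
        for other in names:
            if other != name and name in other:
                problems.append((name, other))
        # A name that is, or is part of, a reserved word.
        for word in reserved:
            if name in word:
                problems.append((name, word))
    return problems


# Tiny hypothetical sample, NOT the real R/BASIC keyword list.
RESERVED = ["RETURN", "GOSUB", "@SENTENCE", "READ", "WRITE"]

names = ["LINE.NUMBER", "NUMBER", "TURN", "ANSWER"]
print(find_collisions(names, RESERVED))
# [('NUMBER', 'LINE.NUMBER'), ('TURN', 'RETURN')]
```

Here NUMBER would corrupt LINE.NUMBER, and TURN would corrupt every RETURN statement, so both should be renamed in the original source before any swaps are applied.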

Construct all internal subroutine names in such a way that they can be distinguished from variables. I use the form VERB.ADJECTIVE.OBJECT (e.g. OPEN.DATA.FILES, PARSE.USER.INPUT, GENERATE.REPORT.TOTALS, etc.). This has an obvious beneficial side-effect. Generate a cross-reference listing of your source. (A utility to do this is supplied on the REVMEDIA Volume 4 utility diskette and is called XREF_PROG.) Number your subroutines S0$ through S99$ (or even higher). Assign the most often used and longest-named variables two-character identifiers A$ through Z$. Assign other variables three-character identifiers AA$ through ZZ$ (here's a chance to be a bit more descriptive!).
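The numbering scheme just described can be generated mechanically from the cross-reference counts. The following Python sketch (an illustration, not R/BASIC and not one of the REVMEDIA utilities) does so. The function names and the ranking rule (occurrences times name length) are assumptions; the article only says that the most often used and longest-named variables should get the shortest identifiers.

```python
from itertools import product
from string import ascii_uppercase


def short_names():
    """Yield A$ .. Z$, then AA$ .. ZZ$ (702 names in all)."""
    for letter in ascii_uppercase:
        yield letter + "$"
    for first, second in product(ascii_uppercase, repeat=2):
        yield first + second + "$"


def build_swap_table(subroutines, variable_counts):
    """Return (short, long) pairs: subroutines first, then variables.

    variable_counts maps each variable name to its number of occurrences.
    Variables are ranked by occurrences * length (an assumed heuristic).
    Any variables beyond the 702 available short names are left out.
    """
    table = [("S%d$" % i, name) for i, name in enumerate(subroutines)]
    ranked = sorted(
        variable_counts,
        key=lambda name: variable_counts[name] * len(name),
        reverse=True,
    )
    table.extend(zip(short_names(), ranked))
    return table


# Hypothetical cross-reference results.
table = build_swap_table(
    ["OPEN.DATA.FILES", "PARSE.USER.INPUT"],
    {"ANSWER": 2, "LINE.NUMBER": 10},
)
for short, long_name in table:
    print(short.ljust(5) + long_name)
```

The printed lines have the short name first, then spaces, then the long name. The sketch does not check whether a generated short name is already in use in the source, so the result still needs a glance before it is used.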

  * SWAP - Blaise Wrenn, LexStat Systems
  *-------------------------------------------------
  EQU BLANK        TO ' '
  EQU NUL          TO ''
  DECLARE SUBROUTINE MSG, FSMSG

  SENTENCE = TRIM( @SENTENCE )
  CONVERT BLANK TO @FM IN SENTENCE
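  * When started with RUN, the command line carries two extra leading
  * words, so the two parameters are found at positions 4 and 5
  IF SENTENCE<1> = 'RUN' THEN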
    FILENAME   = SENTENCE<4>
    RECORDNAME = SENTENCE<5>
  END ELSE
    FILENAME   = SENTENCE<2>
    RECORDNAME = SENTENCE<3>
  END

  GOSUB OPEN.DATA.FILES
  READ RECORD FROM F.BIG.SOURCE, RECORDNAME THEN
    READ SWAPS FROM F.SWAPS, RECORDNAME THEN
     COUNT.SWAPS = COUNT(SWAPS, @FM) + (SWAPS # NUL)
     FOR INDEX.SWAPS = 1 TO COUNT.SWAPS
      LINE = TRIM( SWAPS< INDEX.SWAPS > )
      * Check that swap data actually exists
      IF LEN( LINE ) THEN
       * Check for comment line
       IF LINE[1,1] # '*' THEN
         NEW  = FIELD( LINE, BLANK, 1 )
         OLD  = FIELD( LINE, BLANK, 2 )
         MSG(INDEX.SWAPS : "|" : OLD : "|" : NEW, 'UB', IMAGE, '')
         SWAP OLD WITH NEW IN RECORD
         MSG('', 'DB', IMAGE, '')
       END
      END
     NEXT INDEX.SWAPS
     WRITE RECORD TO F.OUT, RECORDNAME
    END ELSE
     MSG('Unable to read swap information!','','','')
    END
  END ELSE
    MSG('Unable to read source file!', '', '', '')
  END
  STOP

  OPEN.DATA.FILES:
  OPEN 'BIG.SOURCE' TO F.BIG.SOURCE ELSE
    FSMSG()
    STOP
  END
  OPEN 'SWAPS' TO F.SWAPS ELSE
    FSMSG()
    STOP
  END
  OPEN FILENAME TO F.OUT ELSE
    FSMSG()
    STOP
  END
  RETURN

Create two files, one called SWAPS and one called BIG.SOURCE. Place your program source in the BIG.SOURCE file, and create a record in the SWAPS file with the same name as the source record. In the SWAPS record assign all subroutine swaps first, followed by the variable swaps, one swap per line, with the short name first, followed by at least one space, and then the name of the subroutine or variable identifier. For example:

     S1$  GENERATE.REPORT.TOTALS
     S2$  OPEN.DATA.FILES
     S3$  PARSE.USER.INPUT
     A$   ANSWER
     B$   BIN.NUMBER
     C$   CHARACTER.COUNT

Invoke the SWAP utility (listed above and also on the REVMEDIA Volume 4 utility diskette) in the form:

     SWAP destination.source.filename program.name

It will apply the substitutions assigned in the program.name record in the SWAPS file against the source contained in the program.name record in the BIG.SOURCE file, and store the resulting smaller source code in the file identified by the destination.source.filename parameter.
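For readers who want to see the substitution pass in isolation, here is a minimal Python sketch of the same idea (an analogue for illustration, not the R/BASIC utility itself). The function names parse_swaps and apply_swaps are hypothetical. It follows the conventions described above: one swap per line, short name first, lines beginning with * treated as comments, and swaps applied in table order as plain text replacements.

```python
def parse_swaps(text):
    """Turn swap-table text into a list of (short, long) pairs."""
    swaps = []
    for raw in text.splitlines():
        # Collapse runs of whitespace and strip both ends.
        line = " ".join(raw.split())
        if not line or line.startswith("*"):
            continue  # skip blank lines and comment lines
        fields = line.split(" ")
        if len(fields) < 2:
            continue  # ignore malformed lines with no long name
        swaps.append((fields[0], fields[1]))
    return swaps


def apply_swaps(source, swaps):
    """Replace every long name with its short name, in table order."""
    for short, long_name in swaps:
        source = source.replace(long_name, short)
    return source


swap_text = """\
* subroutines first, then variables
S1$  GENERATE.REPORT.TOTALS
S2$  OPEN.DATA.FILES
S3$  PARSE.USER.INPUT
A$   ANSWER
B$   BIN.NUMBER
C$   CHARACTER.COUNT
"""

source = "GOSUB OPEN.DATA.FILES\nANSWER = BIN.NUMBER + CHARACTER.COUNT\n"
result = apply_swaps(source, parse_swaps(swap_text))
print(result)
```

With this table the two sample lines shrink to "GOSUB S2$" and "A$ = B$ + C$". Because the replacement is plain text, table order matters whenever one long name contains another.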

At this point (if you are a diehard RevG user [there are at least two of us left]) you might want to run the squeeze program against the source to compress it further before compiling. (That utility is also on the REVMEDIA Volume 4 utility diskette and is called SQUEEZE.) By using this strategy I have been able to write source code of up to nearly 64K. However, if your usual choice of variable names tends towards terse cryptic codes anyway, then u won't c s drmtc a cmprsn s I did (nor do u dsrv 2).

(Volume 4, Issue 7, Pages 4-6)