by Brian Moriarty
ABC by MONARCH DATA SYSTEMS
P.O. Box 207
Cochituate, Massachusetts 01778
40K Disk $69.95
THE BASIC COMPILER by DATASOFT
9421 Winnetka Avenue
Chatsworth, California 91311
48K Disk $99.95
BASM by COMPUTER ALLIANCE
21115 Devonshire Street
Chatsworth, California 91311
32K Disk $99.95
The world is full of ATARI BASIC programmers lusting for speed. They squirm with envy as the disciples of FORTH and C expound the virtues of those fast and exotic languages, and gaze with wonder upon machine-code hackers who wield their mysterious powers at 1.7 MHz.
Why this insatiable craving for faster programs? The answer is simple: GAMES. Every serious ATARI user has a secret desire to write the Ultimate Computer Game, a dazzling tour-de-force that would make Tempest look like Pong. Unfortunately, many would-be Chris Crawfords don’t have time to master more than one programming language - and guess which excruciatingly slow language that one usually is.
If you’ve ever been frustrated by the speed of ATARI BASIC, then a BASIC compiler may be just what you need. The recent release of THREE new compilers for the ATARI offers programmers a long-overdue alternative to BASIC that TRS-80 and Apple users have been enjoying for years.
A compiler is a utility program that reads a program written in BASIC and translates it into a lower-level code that executes faster than the original. A compiled BASIC program is completely self-contained; it is treated exactly like a binary DOS file and does not need the BASIC cartridge or any other special software to run.
Monarch Data Systems’ ABC, Datasoft’s BASIC Compiler and BASM by Computer Alliance are significantly different in terms of features, performance and cost. Since ABC reached the market soonest, we’ll examine it first.
ABC is a single-pass integer compiler. “Single-pass” means that your BASIC program is scanned only once as it is being compiled. “Integer” means that numbers are stored in straight 3-byte binary instead of the 6-byte floating-point format used by the BASIC cartridge. The elimination of floating-point math is one of the main reasons for the speed of ABC.
The best way to understand ABC is to review what happens when you compose a BASIC program. Each time you hit RETURN over a line of BASIC, the instructions are “tokenized” into a special internal code that can be understood by the cartridge.
ABC takes this process a step further. It reads the tokens produced by the BASIC cartridge and translates them into an even more compact form called pseudo-code or “P-code.” The P-code is then linked to a small machine-language program called a run-time interpreter, which reads and executes each P-coded instruction.
The big difference between tokenized BASIC and ABC P-code is conciseness. By using only whole-number integer arithmetic and a more efficient memory-management scheme, ABC simplifies the execution of each command in the ATARI BASIC repertoire. The result is a significant increase in the speed of the compiled program. According to Monarch, the speed improvement factor can range between four and twelve times, with seven times being a reasonable average.
It should be noted that the P-code produced by ABC is not 6502 machine language. It’s essentially a series of pointers into the run-time interpreter, much like a FORTH program. You can’t LIST, disassemble or make any sense at all out of the P-code without a detailed understanding of the ABC interpreter. This is an important feature if you’re thinking about distributing your compiled software, because the code will be protected from all but the most determined pirates.
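The relationship between P-code and a run-time interpreter can be sketched with a toy model in Python. This is only an illustration of the general idea: the opcode names, the stack machine and the program layout are all invented here and say nothing about ABC's actual internal format.

```python
# Toy model of a P-code run-time interpreter. The opcodes and program
# layout are invented for illustration; they are NOT ABC's real format.

def run(pcode):
    """Execute a list of (opcode, operand) pairs on a small stack machine."""
    stack, output = [], []

    def push(n):                 # push an integer constant
        stack.append(n)

    def add(_):                  # pop two integers, push their sum
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)

    def mul(_):                  # pop two integers, push their product
        b, a = stack.pop(), stack.pop()
        stack.append(a * b)

    def emit(_):                 # pop the top of the stack and record it
        output.append(stack.pop())

    # Each P-code entry simply selects one routine inside the interpreter.
    handlers = {"PUSH": push, "ADD": add, "MUL": mul, "PRINT": emit}

    for opcode, operand in pcode:
        handlers[opcode](operand)
    return output


# Rough equivalent of the BASIC statement PRINT (2+3)*4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("PRINT", None)]
result = run(program)
```

Without the table of handlers, the program list is just a string of meaningless entries, which is roughly why P-code is hard to read without knowledge of the interpreter.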
Experienced BASIC programmers should have no trouble using ABC. First, you SAVE your completed BASIC program on a disk. Then you pull out the BASIC cartridge and boot the ABC disk. ABC asks for the name of your BASIC file and the name of a destination file, which will eventually become the compiled version of your program.
ABC next writes a copy of the run-time interpreter out to the destination file. It then scans your BASIC program and translates it into P-code, one line at a time. Finally, the P-code is appended to the interpreter, and you’re left with a binary-format disk file that can be loaded and executed using DOS option “L.” The original BASIC program is completely unaffected.
A couple of different run-time interpreters are included on the ABC disk. These provide a choice of loading addresses to match different memory configurations and DOS requirements. There is also a clever little program called MKRELO that makes your compiled software completely relocatable - a handy feature for commercial development because it assures that your software will run in virtually any machine with enough memory to fit it.
Datasoft’s BASIC Compiler is a 4-pass utility that converts BASIC programs directly into 6502 machine code. Because machine language doesn’t need to be interpreted, the execution speed of the compiled code can be very impressive. Datasoft claims a speed improvement of 5 to 20 times over the original BASIC version.
As with ABC, a run-time support package must be linked to the code before it can run. Datasoft gives you a choice of two run-time packages: a high-speed integer version or a slower version that accepts floating-point functions.
The compilation procedure for the Datasoft compiler is fairly involved. After specifying the source and destination filenames, the program asks you to select either integer or floating-point math; the appropriate run-time package is linked to your code. The compiler then studies your BASIC code and converts it into one or more mnemonic assembler files which are written out to the disk.
Next, the Datasoft system loads a 3-pass assembler which reads the intermediate files created by the compiler, and produces a machine-language output file which is the final, executable version of your BASIC program. All assembler files remain intact on the disk, and may be accessed by Datasoft’s DATASM Editor/Assembler (sold separately) for later tweaking by hardcore hackers.
Datasoft’s product is tricky to use if you have only one disk drive. Because the assembler and output files must be written onto the same disk as your BASIC code, you have to be sure to leave enough space for them. According to Datasoft, this limits the maximum size of your BASIC program to about 100 sectors (12.5K). Users with more than one drive can lessen the limitation by putting their BASIC source on a separate disk.
An interesting feature of the Datasoft compiler is the Line Reference Map. This function displays each line number of your original BASIC program along with the exact address where its machine-language counterpart can be found. The map can be sent to the screen, a printer or a disk file. Line references are very useful if you want to modify or debug the compiled version of a program.
The error handling of the Datasoft system is also helpful. Problems that occur during the execution of a compiled program produce a standard ATARI error number along with the address of the instruction that caused the foul-up. If you prepared a reference map of the program, you can determine which line of the original BASIC code produced the error. The Datasoft system also allows you to re-enter a crashed program at any point by specifying a new run address.
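Working back from an error address to a BASIC line is a simple table lookup. The Python sketch below shows one way to do it by hand; the map entries and addresses are made up for illustration, and the function name is hypothetical rather than anything supplied by Datasoft.

```python
import bisect

# Hypothetical line reference map: (BASIC line number, start address).
# These addresses are invented; a real map comes from the compiler.
LINE_MAP = [(10, 0x3000), (20, 0x3012), (30, 0x3040), (40, 0x305A)]


def line_for_address(line_map, address):
    """Return the BASIC line whose compiled code contains `address`."""
    starts = [addr for _, addr in line_map]   # must be in ascending order
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        raise ValueError("address lies below the first compiled line")
    return line_map[index][0]


# An error reported at $3047 falls inside the code for line 30.
culprit = line_for_address(LINE_MAP, 0x3047)
```

The lookup picks the last line whose start address is not greater than the error address, which is what you would do by eye with a printed map.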
It would be wonderful if you could take any old BASIC program, send it through one of these compilers and get a nice, speedy output file. Unfortunately, things aren’t that simple. Both the Monarch and Datasoft products impose restrictions on the type of BASIC code that can be successfully compiled.
Listings 1 and 2 show the documented programming restrictions of the ABC and Datasoft BASIC compilers, respectively. Notice that the program access statements like LOAD, SAVE, ENTER and LIST are not supported by either system. This makes sense because of the self-standing nature of a compiled program. Also note that the floating-point math functions (SIN, COS, etc.) cannot be used by ABC, or by the integer version of the Datasoft compiler.
The documentation provided with ABC suggests a number of sneaky ways to get around its lack of floating-point arithmetic. It gives examples of how to simulate fractions, trigonometry and the RND() function without producing a compilation error. ABC’s 24-bit integer math package allows a usable variable range of ±8 million, so it’s possible to scale almost any value to a convenient whole number.
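Scaling is easy to demonstrate. The Python sketch below is not taken from the ABC manual; it simply shows the general trick of carrying two implied decimal places in whole numbers. The scale factor of 100 and the helper names are arbitrary choices, and the 24-bit limit assumes a signed two's-complement integer.

```python
SCALE = 100               # two implied decimal places (an arbitrary choice)
INT24_MAX = 2**23 - 1     # 8,388,607, assuming signed 24-bit integers


def to_fixed(whole, hundredths=0):
    """Pack a non-negative value such as 19.95 into the integer 1995."""
    return whole * SCALE + hundredths


def fixed_mul(a, b):
    """Multiply two scaled values, then divide once to drop the extra scale."""
    return (a * b) // SCALE


def fixed_str(value):
    """Format a non-negative scaled integer with its decimal point restored."""
    return "%d.%02d" % (value // SCALE, value % SCALE)


price = to_fixed(19, 95)                  # 19.95 is held as 1995
total = fixed_mul(price, to_fixed(3))     # 19.95 * 3.00
text = fixed_str(total)                   # "59.85"
```

The price of the trick is range: the intermediate product (1995 x 300 = 598,500 here) must also fit in 24 bits, so larger scale factors leave less headroom.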
Both the integer and floating-point versions of the Datasoft compiler offer a nice implementation of the RND() function. Datasoft also allows you to use RUN statements as long as you don’t include a filespec, as in RUN “D1:PROGRAM”.
Datasoft won’t let you use variables as line references (GOTO X or GOSUB 100+Y, for example). Also, you can’t imbed DATA statements in your BASIC code. You have to place them all at the very end of the program, preceded by an END, STOP or GOTO statement.
I like to keep DATA statements close to their corresponding READs because it makes programs easier to debug. I also like to use variables as line references because it makes my code self-documenting: statements like GOSUB NEXTLINE are inherently more meaningful than GOSUB 2011. Hopefully a later version of the Datasoft compiler will deal with this common stylistic approach more realistically.
Speed is one of the main reasons for using an ATARI BASIC compiler. To compare the speed performance of the Monarch and Datasoft products, I wrote a short benchmark program that uses nested FOR/NEXT loops to fill a GRAPHICS 24 screen with direct POKEs (see Listing 3). The hardware timers at locations 19 and 20 keep track of the execution speed in 60ths of a second or “jiffies.”
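The two timer bytes printed at the end of the benchmark combine into a single count, with location 20 as the low byte and location 19 counting units of 256 jiffies. A short Python sketch of the arithmetic (the function names are mine, used only for illustration):

```python
JIFFIES_PER_SECOND = 60    # one jiffy is 1/60 of a second


def elapsed_jiffies(loc19, loc20):
    """Combine the timer bytes: location 19 counts 256s, location 20 counts 1s."""
    return loc19 * 256 + loc20


def to_seconds(jiffies):
    """Convert a jiffy count to seconds."""
    return jiffies / JIFFIES_PER_SECOND


# The ATARI BASIC result in Listing 4 (4160 jiffies) would be displayed as
# PEEK(19) = 16 and PEEK(20) = 64, because 16 * 256 + 64 = 4160.
jiffies = elapsed_jiffies(16, 64)
seconds = round(to_seconds(jiffies), 1)
```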
The benchmark was compiled and executed on a standard 48K system, using ATARI BASIC, ABC and both versions of the Datasoft compiler. Just for the fun of it, I also tried the program on ATARI’s disk-based Microsoft BASIC and Optimized Systems Software’s BASIC A+ 3.05. The program was run 3 times under each system, and the results were averaged to produce the data in Listing 4.
The 5-to-20 times speed improvement claimed for Datasoft’s integer compiler is clearly justified: the benchmark ran about 19 times faster than it did under the BASIC cartridge. ABC’s increase is about 7.4 times, also right in line with Monarch’s advertising. The floating-point version of Datasoft’s compiler isn’t very impressive in this example - it’s only about 1.7 times faster than the cartridge, and not all that much faster than BASIC A+.
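The improvement factors follow directly from the jiffy counts in Listing 4. For readers who want to check the arithmetic, here is a small Python sketch that divides the cartridge's time by each of the others:

```python
# Jiffy counts copied from Listing 4.
results = {
    "ATARI BASIC Cartridge": 4160,
    "ATARI Microsoft BASIC (disk)": 3348,
    "OSS BASIC A+ 3.05 (disk)": 2717,
    "Monarch ABC Compiler": 565,
    "Datasoft Compiler (Integer Mode)": 218,
    "Datasoft Compiler (FP Mode)": 2435,
}

baseline = results["ATARI BASIC Cartridge"]

# Speed improvement relative to the BASIC cartridge, to one decimal place.
speedup = {name: round(baseline / jiffies, 1)
           for name, jiffies in results.items()}

for name, factor in speedup.items():
    print("%-34s %5.1f x" % (name, factor))
```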
Prospective users should know that graphics commands like PLOT, DRAWTO and FILL will not be significantly speeded up by using one of these compilers. The ROM routines that perform these functions are the same ones used by the BASIC cartridge. It would be nice to see a super-compiler with its own set of speedy graphics routines, similar to those offered by the valFORTH language system.
The amount of memory required by a compiled BASIC program depends on three things: the size and type of program being compiled, the efficiency of the compilation, and the size of the run-time package required to support the code.
ABC’s run-time interpreter takes up 36 disk sectors or about 4.5K of RAM. The floating-point package for Datasoft’s compiler requires 32 sectors (4K), while the integer package needs 29 sectors (3.6K). These figures represent the minimum RAM overhead required by any compiled program, regardless of its size or function.
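The sector-to-kilobyte conversions used throughout this review assume 128 bytes per disk sector, which is the figure that makes 36 sectors come out to 4.5K. A quick Python check of the arithmetic, under that assumption:

```python
BYTES_PER_SECTOR = 128    # assumption: consistent with 36 sectors = 4.5K


def sectors_to_k(sectors):
    """Convert a count of disk sectors to kilobytes (1K = 1024 bytes)."""
    return sectors * BYTES_PER_SECTOR / 1024


abc_runtime = sectors_to_k(36)          # ABC run-time interpreter
datasoft_fp = sectors_to_k(32)          # Datasoft floating-point package
datasoft_integer = sectors_to_k(29)     # Datasoft integer package
```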
We looked far and wide for a large BASIC program that could be used as the basis for a size comparison between the Datasoft compiler and ABC. Most of the trouble was caused by the Datasoft product, which would not accept the imbedded DATA statements found in virtually every off-the-shelf BASIC program we tried. In desperation, I decided to write this issue’s feature game (Adventure in the 5th Dimension) without using variable GOTOs or GOSUBs, “misplaced” DATA lines or anything else that would violate the restrictions documented by either product.
After thoroughly de-bugging the adventure, I SAVEd it onto a disk and checked its size. The BASIC version required 99 sectors, just below the maximum recommended by Datasoft for a single-drive system. So far, so good. Then I tried compiling the program with ABC using my single-drive 48K system. I experienced no problems until the very end of the compilation, when the program informed me of an Error # 166 (Point data length). Puzzled, I called Monarch and spoke to the author of the program. He tracked down the problem (too many GOSUBs in line 66), suggested an easy fix and promised to eliminate the limitation in future releases. My second compilation was flawless; the P-code produced by ABC is 129 sectors in length, about 30% larger than the original. And the adventure plays perfectly.
Next I tried compiling The 5th Dimension with Datasoft, again using a single drive and 48K. I followed the instructions in the user’s manual and transferred the system equate file SYSEQU.ABC onto the same disk as the BASIC program. Then I ran the compiler. Before the end of Pass 1, the compiler reported an Error # 162 (Disk Full). I looked at the disk with DOS and found that the assembler files had completely filled the disk, leaving no room for the assembly itself!
I borrowed another drive and re-compiled, using a second disk containing copies of the assembler, system equate and run-time library files. Again I was greeted with an Error # 162. Not to be deterred, I put the assembler file ASM.OBJ onto the same disk as the adventure and tried one more time. Success! The compiler just barely found enough room to write the assembler files, and I made it through Pass 1.
My disk space difficulty was caused by the fact that Datasoft always writes the assembler files onto Drive # 1. The reference manual estimates that these files require about five times as much space as the BASIC source file. That places the maximum possible source file size at somewhere around 144 sectors (18K), regardless of the number of disk drives you can borrow.
Now the compiler started on Passes 2 and 3. In Pass 3, the compiler stopped to tell me I had some unresolved line numbers. It didn’t say which lines were causing the problem, so I checked carefully through the BASIC program for GOSUBs or GOTOs that used variable line references. Nothing.
The RESTORE statements in lines 73 and 79 do use variable line references. But Datasoft’s documentation doesn’t say anything about RESTOREs. I wrote a little BASIC test program to see if the compiler would accept RESTOREs with a variable in the line reference. Sure enough, the test failed.
I consider this “undocumented restriction” (read BUG) to be very serious. Data line addressing is one of the most powerful features of ATARI BASIC. I used it extensively in the adventure program because it made object handling so much easier. Rewriting the adventure was out of the question; so I compiled the program one more time and ordered the assembler to ignore the “unresolved line numbers.” The remainder of the compilation proceeded without error. Final program size was 214 sectors, more than twice the size of the original. Due to the presence of known errors, I did not try to run the compiled version.
Other bugs in the Datasoft BASIC Compiler have been discovered by users of the first release. I have personally verified difficulties with TRAP and VAL, along with some confusing problems with strings and numeric arrays. Datasoft is reportedly aware of these bugs and will hopefully offer updated disks to purchasers of the early release.
The choice between Monarch’s ABC and Datasoft’s BASIC compiler is not an easy one. Each product has a unique personality that makes it suitable for specific applications and programming styles.
If ultra-high speed is very important to you, then the machine code produced by the Datasoft integer compiler is tough to beat. Datasoft’s product is also the better choice if you want to play around with the compiled versions of your software. And if you absolutely have to use transcendental math, the Datasoft floating-point package offers a slow but effective way to get it.
On the debit side, Datasoft’s product is very greedy with disk space and RAM. You need at least two disk drives to compile anything except small programs; and you have to put up with an alarming range of BASIC programming restrictions. Before you buy the Datasoft compiler, I suggest that you check with your dealer to make sure you’re getting a bug-free version.
ABC isn’t as picky about your source code as the Datasoft compiler. It will compile just about anything that doesn’t use fractions - and its wide usable number range gives it a decided advantage when it comes to simulating floating-point operations at high speed. The P-code produced by ABC offers a degree of software protection you can’t get with straight 6502 machine code. Last but not least, Monarch’s ABC costs $30 less than the Datasoft product.
You may be wondering why I haven’t yet mentioned BASM, the third “BASIC compiler” listed at the beginning of this article. The reason is simple: BASM isn’t really a BASIC compiler at all. It’s a BASIC assembler - an entirely new programming environment for the ATARI that looks like BASIC but acts like assembly language.
Take a look at Listing 5. This is the BASM equivalent of the speed benchmark used to test the ABC and Datasoft compilers. Notice that some of the lines look like ordinary BASIC, while others look like 6502 mnemonics. REM statements are included in those places where the BASM code differs significantly from the original BASIC.
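Lines 0230 and 0240 of Listing 5 add 40 to a two-byte address one byte at a time, letting the carry from the low byte ripple into the high byte. The Python sketch below models that sequence; the function name and the sample screen address are hypothetical, chosen only so that a carry occurs.

```python
def add16_bytewise(lo, hi, addend):
    """Add an 8-bit constant to a 16-bit value stored as two bytes."""
    total = lo + addend             # CLC : LDA SCREEN : ADC #40
    new_lo = total & 0xFF           # STA SCREEN
    carry = total >> 8              # carry is 1 if the low byte overflowed
    new_hi = (hi + carry) & 0xFF    # LDA SCREEN+1 : ADC #0 : STA SCREEN+1
    return new_lo, new_hi


# Hypothetical screen pointer of $8FF0: low byte $F0, high byte $8F.
lo, hi = add16_bytewise(0xF0, 0x8F, 40)
address = hi * 256 + lo             # $9018, i.e. $8FF0 + 40
```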
BASM programs are composed using a text editor supplied with the software. Then the source file is saved onto a disk and assembled into machine language. A very small run-time library is linked to the code, and your application is ready to run.
The BASM system understands a very usable subset of ATARI BASIC, along with a number of statements and conditionals not found in the cartridge (see Listing 6). “Primitive” commands like PEEK and POKE must be replaced with their assembly-language equivalents, LDA (Load Accumulator) and STA (Store Accumulator). READ/DATA structures are implemented by using the 6502 X- and Y-registers as indexes.
BASM allows you to mix BASIC and assembly statements freely, even on the same logical line. This arrangement combines the simplicity of BASIC with the power of machine language in a most ingenious manner.
Because BASM programs have an assembly-like syntax, the efficiency of compilation is much greater than either ABC or Datasoft. Only the pure BASIC statements are actually “compiled” - the assembly-language sections are incorporated into the program as in-line machine code. This means that the speed of a BASM program can approach the limits of the hardware. I compiled and executed the BASM program in Listing 5 and obtained an execution time of 18 jiffies or less than 1/3 of a second. This is 231 times faster than the ATARI BASIC equivalent! Computer Alliance claims a more conservative speed improvement of up to 130 times.
BASM is not as straightforward to use as the ABC or the Datasoft compilers. You’ll have a hard time following the 72-page reference manual unless you know something about 6502 architecture and assembly-language programming. It took me a while to grasp the syntax required for certain types of BASIC variables and addressing modes. More complete documentation is definitely called for - even if it means raising the price a bit.
I also ran across a bug in the disk interface. My review copy of BASM bombed out whenever I tried to load and RUN a compiled program more than once. This made it impossible to repeat my benchmark demo without a complete system re-boot. When Computer Alliance fixes this problem, they will have a fascinating and very powerful “BASIC compiler” on their hands.
A stigma against BASIC programming has arisen in the ATARI software market over the past few years. The prejudice is based on the absurd idea that the quality of a program has something to do with the language it was written in.
The compilers reviewed in this article will help make BASIC programming respectable again. For this reason, I think they are the most important pieces of ATARI software to come down the pike since valFORTH. They may actually be more significant, because they offer much of the performance of FORTH without the need to learn a new programming language. That means BASIC hackers can spend less time puzzling over stacks, disk screens and other unfamiliar concepts, and more time improving the quality of their BASIC.
I’m happy to report that not one of the compilers mentioned in this article requires a licensing fee. You can sell your compiled software royalty-free as long as you include a credit in your documentation.
BASIC compilers are about to open the world of professional software development to a whole new range of talented authors. Let’s hope the code they produce will be as sophisticated and valuable as these three products.
Listing 1. ABC Programming Restrictions.
Unsupported Functions:
ATN CLOG COS EXP LOG RND SIN SQR
Unsupported Arithmetic Operators:
Unsupported Statements:
BYE CLOAD CONT CSAVE DEG DOS
ENTER LIST LOAD LPRINT NEW RAD
Cannot use fractional (non-integer) values.
Cannot use constants larger than 65,535 (variable range is ±8 million).
Listing 2. Datasoft Programming Restrictions.
Unsupported Functions (integer mode only):
ATN CLOG COS EXP LOG SIN SQR
Unsupported Arithmetic Operators:
Unsupported Statements:
BYE CLOAD CONT CSAVE DOS ENTER
LIST LOAD NEW RUN - “FILESPEC” SAVE
Integer mode values limited to ±32,767 (except address constants).
DATA statements must be at end of program and cannot be “executed” (see text).
DIM statements cannot use variables for size allocation (e.g., DIM X(A)).
GOTO and GOSUB cannot use variables for line references (e.g., GOTO X).
Listing 3. Benchmark Program.
10 REM ****************************
15 REM * BENCHMARK TEST FOR BASIC *
20 REM * COMPILERS *
25 REM ****************************
30 POKE 19,0:POKE 20,0
35 GRAPHICS 24
40 SETCOLOR 1,0,14:SETCOLOR 2,0,0
45 SCREEN=PEEK(88)+256*PEEK(89)
50 FOR I=0 TO 191:FOR J=0 TO 39
55 POKE SCREEN+J,255
60 NEXT J:SCREEN=SCREEN+40:NEXT I
65 GRAPHICS 0
70 PRINT PEEK(20);" jiffies"
75 PRINT PEEK(19);" jiffies x 256"
Listing 4. Speed Test Results.
| System                           | Jiffies | Seconds |
| ATARI BASIC Cartridge            |    4160 |    69.3 |
| ATARI Microsoft BASIC (disk)     |    3348 |    55.8 |
| OSS BASIC A+ 3.05 (disk)         |    2717 |    45.3 |
| Monarch ABC Compiler             |     565 |     9.4 |
| Datasoft Compiler (Integer Mode) |     218 |     3.6 |
| Datasoft Compiler (FP Mode)      |    2435 |    40.6 |
Listing 5. BASM Version of the Benchmark.
0100 REM * PROGRAM EQUATES
0140 REM * POKE 19,0:POKE 20,0
0150 LET TIMER256 = 0 : LET TIMER = 0
0160 GRAPHICS 24
0170 SETCOLOR 5 , 0 , 14 : SETCOLOR 6 , 0 , 0
0180 FOR I = 0 TO 191 : FOR J = 0 TO 39
0190 REM * POKE SCREEN+J,255
0200 LDA #255 : LDY J : STA (SCREEN),Y : NEXT J
0210 REM * SCREEN=SCREEN+40
0220 REM * THIS IS A 16-BIT BINARY ADDITION
0230 CLC : LDA SCREEN : ADC #40 : STA SCREEN
0240 LDA SCREEN+1 : ADC #0 : STA SCREEN+1
0250 NEXT I
0300 REM * GRAPHICS 0
0310 FILE 0
0320 BPRINT TIMER : PRINT " jiffies"
0330 BPRINT TIMER256 : PRINT " jiffies x 256"
0340 RETURN : REM * BACK TO BASM
0350 REM * LINE 360 INITIALIZES THE VARIABLES I & J
0360 DIM I , J
Listing 6. BASM keywords.