Atari BASIC: A High-Level Language Translator








The programming language which has become the de facto 

standard for the Atari Home Computer is the Atari 8K BASIC 

Cartridge, known simply as Atari BASIC. It was designed to 

serve the programming needs of both the computer novice and 

the experienced programmer who is interested in developing 

sophisticated applications programs. In order to meet such a 

wide range of programming needs, Atari BASIC was designed 

with some unique features.

    In this chapter we will introduce the concepts of high level 

language translators and examine the design features of Atari 

BASIC that allow it to satisfy such a wide variety of needs.



Language Translators

Atari BASIC is what is known as a high level language translator. 

    A language, as we ordinarily think of it, is a system for

communication. Most languages are constructed around a set 

of symbols and a set of rules for combining those symbols.

The English language is a good example. The symbols are 

the words you see on this page. The rules that dictate how to 

combine these words are the patterns of English grammar. 

Without these patterns, communication would be very 

difficult, if not impossible: Out sentence this believe, of make 

don't this trying if sense you to! If we don't use the proper 

symbols, the results are also disastrous: @twu2 yeggopt 

gjsiem, keorw?

    In order to use a computer, we must somehow 

communicate with it. The only language that our machine 

really understands is that strange but logical sequence of ones 

and zeros known as machine language. In the case of the Atari, 

this is known as 6502 machine language.

    When the 6502 central processing unit (CPU) "sees" the 

sequence 01001000 in just the right place according to its rules 

of syntax, it knows that it should push the current contents of



Chapter One




the accumulator onto the CPU stack. (If you don't know what 

an "accumulator" or a "CPU stack" is, don't worry about it. 

For the discussion which follows, it is sufficient that you be 

aware of their existence.)

    Language translators are created to make it simpler for 

humans to communicate with computers. There are very few 

6502 programmers, even among the most expert of them, who 

would recognize 01001000 as the push-the-accumulator 

instruction. There are more 6502 programmers, but still not 

very many, who would recognize the hexadecimal form of 

01001000, $48, as the push-the-accumulator instruction. 

However, most, if not all, 6502 programmers will recognize 

the symbol PHA as the instruction which will cause the 6502 

to push the accumulator.

    PHA, $48, and even 01001000, to some extent, are 

translations from the machine's language into a language that 

humans can understand more easily. We would like to be able 

to communicate to the computer in symbols like PHA; but if 

the machine is to understand us, we need a language translator 

to translate these symbols into machine language.
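
As a rough illustration of the three notations just discussed, the following Python sketch shows that 01001000, $48, and PHA all name the same byte. The table and function names are hypothetical, and only the one instruction mentioned in the text is included; a real translator would of course know the whole instruction set.

```python
# Hypothetical one-entry table: the text's PHA example, nothing more.
MNEMONIC_TO_OPCODE = {"PHA": 0x48}
OPCODE_TO_MNEMONIC = {code: name for name, code in MNEMONIC_TO_OPCODE.items()}

def to_machine(mnemonic):
    """Translate a mnemonic into the byte the 6502 actually sees."""
    return MNEMONIC_TO_OPCODE[mnemonic]

def describe(opcode):
    """Show one byte in binary, in hexadecimal, and as a mnemonic."""
    binary = format(opcode, "08b")
    hexadecimal = "$" + format(opcode, "02X")
    return binary, hexadecimal, OPCODE_TO_MNEMONIC[opcode]

print(describe(to_machine("PHA")))   # ('01001000', '$48', 'PHA')
```

Translating in both directions from one small table is roughly the job the text ascribes to the debugger, here reduced to a single instruction.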

    The Debug Mode of Atari's Editor/Assembler cartridge, for 

example, can be used to translate the symbols $48 and PHA to 

the ones and zeros that the machine understands. The 

debugger can also translate the machine's ones and zeros to 

$48 and PHA. The assembler part of the Editor/Assembler 

cartridge can be used to translate entire groups of symbols like 

PHA to machine code.



Assemblers

An assembler - for example, the one contained in the 

Assembler/Editor cartridge - is a program which is used to 

translate symbols that a human can easily understand into the 

ones and zeros that the machine can understand. In order for 

the assembler to know what we want it to do, we must 

communicate with it by using a set of symbols arranged 

according to a set of rules. The assembler is a translator, and 

the language it understands is 6502 assembly language.

    The purpose of 6502 assembly language is to aid program 

authors in writing machine language code. The designers of 

the 6502 assembly language created a set of symbols and rules 

that matches 6502 machine language as closely as possible.

    This means that the assembler retains some of the

disadvantages of machine language. For instance, the process 

of adding two large numbers takes dozens of instructions in 

6502 machine language. If human programmers had to code 

those dozens of instructions in the ones and zeros of machine 

language, there would be very few human programmers.

    But the process of adding two large numbers in 6502 

assembly language also takes dozens of instructions. The 

assembly language instructions are easier for a programmer to 

read and remember, but they still have a one-to-one 

correspondence with the dozens of machine language 

instructions. The programming is easier, but the process 

remains the same.
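
The one-to-one correspondence can be sketched with a deliberately small case: adding two 16-bit numbers, which takes seven instructions rather than dozens. The Python below is only an illustration. The zero-page addresses and the names SOURCE, assemble, and add16 are assumptions made for this sketch; the four opcodes are the zero-page forms of CLC, LDA, ADC, and STA.

```python
# Opcodes for the zero-page forms of the instructions used below.
OPCODES = {"CLC": 0x18, "LDA": 0xA5, "ADC": 0x65, "STA": 0x85}

# Hypothetical zero-page addresses chosen for this sketch only.
A_LO, A_HI, B_LO, B_HI, SUM_LO, SUM_HI = 0xE0, 0xE1, 0xE2, 0xE3, 0xE4, 0xE5

SOURCE = [
    ("CLC", None),     # clear the carry before the low-byte add
    ("LDA", A_LO),
    ("ADC", B_LO),     # low bytes; may set the carry
    ("STA", SUM_LO),
    ("LDA", A_HI),
    ("ADC", B_HI),     # high bytes plus the carry from the low add
    ("STA", SUM_HI),
]

def assemble(source):
    """Emit one opcode byte (plus an operand byte, if any) per source line."""
    code = bytearray()
    for mnemonic, operand in source:
        code.append(OPCODES[mnemonic])
        if operand is not None:
            code.append(operand)
    return bytes(code)

def add16(a, b):
    """Mimic the listing: add byte by byte, passing the carry along."""
    low = (a & 0xFF) + (b & 0xFF)
    carry = low >> 8
    high = (a >> 8) + (b >> 8) + carry
    return ((high & 0xFF) << 8) | (low & 0xFF)

machine_code = assemble(SOURCE)
print(" ".join(f"{byte:02X}" for byte in machine_code))
print(hex(add16(0x12FF, 0x0001)))   # 0x1300
```

Seven source lines produce seven opcodes. The assembler saves the programmer from memorizing the numbers, but not from writing every step.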



High Level Languages

High level languages, like Atari BASIC, Atari PILOT, and Atari 

Pascal, are simpler for people to use because they more closely 

approximate human speech and thought patterns. However, 

the computer still understands only machine language. So the 

high level languages, while seeming simple to their users, are 

really much more complex in their internal operations than 

assembly language.

    Each high level language is designed to meet the specific 

need of some group of people. Atari Pascal is designed to 

implement the concept of structured programming. Atari 

PILOT is designed as a teaching tool. Atari BASIC is designed 

to serve both the needs of the novice who is just learning to 

program a computer and the needs of the expert programmer 

who is writing a sophisticated application program, but wants 

the program to be accessible to a large number of users.

    Each of these languages uses a different set of symbols and 

symbol-combining rules. But all these language translators 

were themselves written in assembly language.





Language Translation Methods

There are two different methods of performing language 

translation - compilation and interpretation. Languages which 

translate via interpretation are called interpreters. Languages 

which translate via compilation are called compilers.

    Interpreters examine the program source text and simulate 

the operations desired. Compilers translate the program source 

text into machine language for direct machine execution.
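
The difference between the two methods can be sketched in Python with a made-up language of one operation per line. Everything here is hypothetical, including the language itself and the names parse, interpret, and compile_program. The "compiled" form is a list of Python operations standing in for machine language, which a real compiler would emit instead.

```python
import operator

# A made-up two-word language for this sketch: "<OP> <number>" per line.
OPERATIONS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}

def parse(line):
    """Check one source line against the rules and split it into parts."""
    parts = line.split()
    if len(parts) != 2 or parts[0] not in OPERATIONS or not parts[1].isdigit():
        raise SyntaxError(f"cannot understand: {line!r}")
    return OPERATIONS[parts[0]], int(parts[1])

def interpret(source, start=0):
    """Interpreter: examine the source text and simulate it, on every run."""
    value = start
    for line in source:
        operation, operand = parse(line)     # re-verified on each run
        value = operation(value, operand)
    return value

def compile_program(source):
    """Compiler: verify and translate everything once, before any run."""
    steps = [parse(line) for line in source]  # one bad line stops it all
    def run(start=0):
        value = start
        for operation, operand in steps:      # no source text involved
            value = operation(value, operand)
        return value
    return run

program = ["ADD 5", "MUL 3", "SUB 1"]
compiled = compile_program(program)
print(interpret(program), compiled())   # 14 14
```

Both routes give the same answer. What differs is when the source text is examined: on every run for the interpreter, once and in advance for the compiler.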

    The compilation method tends to produce faster, more 

efficient programs than does the interpretation method. 

However, the interpretation method can make programming 

easier.



Problems with the Compiler Method

The compiler user first creates a program source file on a disk, 

using a text editing program. Then the compiler carefully 

examines the source program text and generates the machine 

language as required. Finally, the machine language code is 

loaded and executed. While this three-step process sounds 

fairly simple, it has several serious "gotchas."

    Language translators are very particular about their 

symbols and symbol-combining rules. If a symbol is 

misspelled, if the wrong symbol is used, or if the symbol is not 

in exactly the right place, the language translator will reject it. 

Since a compiler examines the entire program in one gulp, one 

misplaced symbol can prevent the compiler from 

understanding any of the rest of the program - even though 

the rest of the program does not violate any rules! The result is 

that the user often has to make several trips between the text 

editor and the compiler before the compiler successfully 

generates a machine language program.

    But this does not guarantee that the program will work. If 

the programmer is very good or very lucky, the program will 

execute perfectly the very first time. Usually, however, the user 

must debug the program.

    This nearly always involves changing the source program, 

usually many times. Each change in the source program sends 

the user back to step one: after the text editor changes the 

program, the compiler still has to agree that the changes are 

valid, and then the machine code version must be tested again. 

This process can be repeated dozens of times if the program is 

very complex.

Faster Programming or Faster Programs?

The interpretation method of language translation avoids many 

of these problems. Instead of translating the source code into 

machine language during a separate compiling step, the 

interpreter does all the translation while the program is running. 

This means that whenever you want to test the program you're 

writing, you merely have to tell the interpreter to run it. If 

things don't work right, stop the program, make a few 

changes, and run the program again at once.

    You must pay a few penalties for the convenience of using 

the interpreter's interactive process, but you can generally 

develop a complex program much more quickly than the

compiler user can. 

    However, an interpreter is similar to a compiler in that the

source code fed to the interpreter must conform to the rules of

the language. The difference between a compiler and an

interpreter is that a compiler has to verify the symbols and 

symbol-combining rules only once - when the program is 

compiled. No evaluation goes on when the program is 

running. The interpreter, however, must verify the symbols 

and symbol-combining rules every time it attempts to run the 

program. If two identical programs are written, one for a 

compiler and one for an interpreter, the compiled program will 

generally execute at least ten to twenty times faster than the 

interpreted program.



Pre-compiling Interpreter

Atari BASIC has been incorrectly called an interpreter. It does 

have many of the advantages and features of an interpretive 

language translator, but it also has some of the useful features 

of a compiler. A more accurate term for Atari's BASIC 

Language Translator is pre-compiling interpreter.

    Atari BASIC, like an interpreter, has a text editor built into

it. When the user enters a source line, though, the line is not 

stored in text form, but is translated into an intermediate code, 

a set of symbols called tokens. The program is stored by the 

editor in token form as each program line is entered. Syntax

and symbol errors are weeded out at that time.

    Then, when you run the program, these tokens are examined 

and their functions simulated; but because much of 

the evaluation has already been done, the execution of an Atari 

BASIC program is faster than that of a pure interpreter. Yet 

Atari BASIC's program-building process is much simpler than 

that of a compiler.

    Atari BASIC has advantages over compilers 

and interpreters alike. With Atari BASIC, every time you enter a 

line it is verified for language correctness. You don't have to 

wait until compilation; you don't even have to wait until a test 

run. When you type RUN you already know there are no 

syntax errors in your program.
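
The idea of checking each line as it is entered can be sketched in Python. This is a toy under stated assumptions: the class ToyBasic, the table STATEMENT_TOKENS, and the token values are all hypothetical, and the only rule checked is that the line has a number and a known statement name. Atari BASIC's own pre-compiler verifies a great deal more.

```python
# Hypothetical token values; these are NOT Atari BASIC's real tokens.
STATEMENT_TOKENS = {"LET": 0x01, "PRINT": 0x02, "END": 0x03}

class ToyBasic:
    """A toy line store that checks each line when it is entered."""

    def __init__(self):
        self.program = {}   # line number -> tokenized line (bytes)

    def enter(self, text):
        """Tokenize one numbered line, or reject it immediately."""
        number, _, rest = text.strip().partition(" ")
        keyword, _, argument = rest.strip().partition(" ")
        if not number.isdigit() or keyword not in STATEMENT_TOKENS:
            raise SyntaxError(f"syntax error in: {text}")
        token = STATEMENT_TOKENS[keyword]
        # Store the statement name as one byte, the rest as plain text.
        self.program[int(number)] = bytes([token]) + argument.encode("ascii")

basic = ToyBasic()
basic.enter("10 LET A=5")
basic.enter("20 PRINT A")
try:
    basic.enter("30 PRONT A")   # misspelled: caught at entry, not at RUN
except SyntaxError as error:
    print(error)
print(sorted(basic.program))    # [10, 20]
```

Because the bad line never reaches the stored program, anything that later runs the program can assume every stored line already passed the check.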




Internal Design Overview




Atari BASIC is divided into two major functional areas: the 

Program Constructor and the Program Executor. The Program 

Constructor is used when you enter and edit a BASIC program. 

The source line pre-compiler, also part of the Program 

Constructor, translates your BASIC program source text lines 

into tokenized lines. The Program Executor is used to execute 

the tokenized program - when you type RUN, the Program 

Executor takes over.

    Both the Program Constructor and the Program Executor 

are designed to use data tables. Some of these tables are 

already contained in BASIC's ROM (read-only memory). 

Others are constructed by BASIC in the user RAM (random-

access memory). Understanding these various tables is an 

important key to understanding the design of Atari BASIC.



Tokens

In Atari BASIC, tokens are the intermediate code into which 

the source text is translated. They represent source-language 

symbols that come in various lengths - some as long as 100 

characters (a long variable name) and others as short as one 

character ("+" or "-"). Every token, however, is exactly one 

eight-bit byte in length.

    Since most BASIC Language Symbols are more than one 

character long, the representation of a multi-character BASIC 

Language Symbol with a single-byte token can mean a 

considerable saving of program storage space.
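
The saving is simple arithmetic, sketched here in Python for three statement names. The token values and the names TOKENS, text_size, and token_size are hypothetical and chosen only for the illustration.

```python
# Hypothetical token values; these are NOT Atari BASIC's real tokens.
TOKENS = {"PRINT": 0x01, "GOTO": 0x02, "RETURN": 0x03}

def text_size(words):
    """Bytes needed to store the symbols as text, one byte per character."""
    return sum(len(word) for word in words)

def token_size(words):
    """Bytes needed to store the same symbols as one-byte tokens."""
    return len(bytes(TOKENS[word] for word in words))

symbols = ["PRINT", "GOTO", "RETURN"]
print(text_size(symbols), token_size(symbols))   # 15 3
```

Fifteen bytes of text shrink to three bytes of tokens, and the saving repeats every time one of these symbols appears in a program.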

    A single-byte token symbol is also easier for the Program 

Executor to recognize than a multi-character symbol, since it 

can be evaluated by machine language routines much more 

quickly. The SEARCH routine - 76 bytes long - located at 

$A462 is a good example of how much assembly language it 

takes to recognize a multi-character symbol. On the other 

hand, the two instructions located at $AB42 are enough to