Page 6 - Issue 9 - Understanding Strings

Understanding Strings

by Bob Anthony

Issue 9

May/Jun 84

An introduction to Atari strings and file handling

Understanding strings on Atari computers can be difficult but, like most other problems, the difficulties can be overcome. One of the more common uses for a string is to hold an individual piece of information - a record - or several pieces of information, referred to as a file. Let's start with an explanation of how to keep individual records.

THE RECORD

Records (and files) are generally held as strings of characters, for example A$="A RECORD". The $ sign means 'string'. To allow A$ to hold the characters we must first reserve enough memory in the computer to hold the information we want to store (our characters). There are two ways of doing this on Atari machines:

10 COM A$(10) or DIM A$(10)

The most common version is DIM, which is an instruction to DIMension the string, and we will stick to this from now on. In Atari BASIC, COM merely duplicates DIM and serves no other purpose.

Next we load our DIMensioned string with the desired characters

20 A$="A RECORD"

Now let's add a third line

30 PRINT A$

We could also use 30 ? A$ as ? is an abbreviation for PRINT. Now let's RUN it. Type RUN and your screen should show

A RECORD
READY

which has proved to us that our machine has now stored the words A RECORD in A$.

THE FILE

Now that we have A RECORD stored in A$, we can go about creating a FILE. For a file we need more than one record, so type NEW on your computer and type in Program 1. Now type RUN and look at the result. What have we here? A$ and B$ have been added together (concatenated) to make a new string - C$ - which contains 'A RECORDB RECORD'.

There are two key lines in the little program you just typed. First, line 30 C$=A$ which means quite simply make C$ equal to A$ so that C$ now contains 'A RECORD'. The second key line is line 40 which says C$(LEN(C$)+1)=B$. This is slightly trickier, but what it says is this. Find the length of C$ - LEN(C$) - add 1 to it and then at that position along C$ tag on the contents of the string variable B$ (or in other words at C$(9) add 'B RECORD').

On many other computers we could have simply said 30 C$=A$+B$ or even C$=A$: C$=C$+B$, but unfortunately Atari BASIC does not support concatenating strings in this way. No matter, we can program round it. We use LEN to find the length of the string we wish to add characters to - LEN(C$), which equals 8 - then add one so that we don't overwrite the last character - LEN(C$)+1, which equals 9 - and then make that part of the string equal to our next record. So in our case we can imagine C$ to look like this

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16

 A     R  E  C  O  R  D  B     R  E  C  O  R  D
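The listing for Program 1 is not reproduced here, so the following is only a sketch of what it might look like, rebuilt from the description above. Lines 30 and 40 are as described in the text; the DIM sizes and the contents of lines 10, 20 and 50 are assumptions.

```basic
10 DIM A$(8),B$(8),C$(16):REM SIZES ARE ASSUMED
20 A$="A RECORD":B$="B RECORD"
30 C$=A$:REM C$ NOW HOLDS 'A RECORD'
40 C$(LEN(C$)+1)=B$:REM TAG B$ ON AT POSITION 9
50 PRINT C$
```

RUN should print A RECORDB RECORD, provided C$ is DIMensioned to at least 16 characters.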

From here you can see that so long as we keep within the limits of our DIM statement we can keep adding records to C$ until we've completely filled it. We must be careful though. Trying to add a record at a position beyond the DIMensioned length of C$ will give us an error message, but there will be no error if the starting position is within C$ and the record runs past the end: we will simply lose the end of our last record. To prove it try the following in direct mode

CLR: DIM C$(3),A$(8): A$="A RECORD": C$=A$: ? C$

Your computer should have printed out 'A R'. There will be no error report and the only indication that anything is wrong is when C$ is printed out. However, provided that you know about this, it can be turned to good use in certain circumstances.
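One way to avoid losing the end of a record without noticing is to check that there is room before adding it. The sketch below is not from the original listings. The variable MAXLEN is an assumption: Atari BASIC has no function that reports the DIMensioned size of a string, so the program has to keep its own copy of the figure used in the DIM statement.

```basic
10 MAXLEN=12:REM MUST MATCH THE DIM BELOW
20 DIM C$(12),B$(8)
30 C$="A RECORD":B$="B RECORD"
40 REM REFUSE TO ADD IF B$ WILL NOT FIT
50 IF LEN(C$)+LEN(B$)>MAXLEN THEN PRINT "FILE FULL":END
60 C$(LEN(C$)+1)=B$
70 PRINT C$
```

With C$ DIMensioned to 12 there is no room for the second record, so this should print FILE FULL. Change both figures to 16 and it should print the complete file instead.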

THE STRING

Let's recap. We now have a RECORD called A$ - A$="A RECORD" - and a FILE called C$ - C$="A RECORDB RECORD". The two are related in that both are strings, and as such we can manipulate them in identical ways. The only real difference between a record and a file is the length. We could go one step further and break up our record into smaller segments which are called FIELDS. A RECORD can be made up of several FIELDS which are concatenated together to give us a RECORD, which in itself can be added to a FILE.

Now that we have loaded up our file, how can we get the records back out again for practical use? We have to start thinking a little now because unless we know where everything is, we've got a problem.

Let's go back to 'A RECORDB RECORD'. We know that 'A RECORD' is 8 characters long so we can use that information to get the second record back from the file. Type in and RUN Program 2. You should see on your screen 'A$ now = B RECORD'. If you didn't, you must have typed something wrong, so try again.

What we have done here is we have taken the second record out of the file by making A$ equal to the 9th position plus the rest of C$. Although we have 'taken out' this information and put it in A$, it is still held in C$ and cannot therefore be lost. To further explain what is happening, type in and RUN Program 3.
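The listing for Program 2 is not reproduced here. The sketch below is a guess at it, based only on the description above, and the line numbers and DIM sizes are assumptions. Lines 50 to 70 go a little further than the text: they show how, with fixed length records of 8 characters, record number N can be found by working out its start and end positions.

```basic
10 DIM A$(8),C$(16):REM SIZES ARE ASSUMED
20 C$="A RECORDB RECORD"
30 A$=C$(9):REM POSITION 9 TO THE END OF C$
40 PRINT "A$ now = ";A$
50 REM RECORD N RUNS FROM (N-1)*8+1 TO N*8
60 N=1:A$=C$((N-1)*8+1,N*8)
70 PRINT "RECORD ";N;" = ";A$
```

Line 40 should print A$ now = B RECORD. Changing N in line 60 to 2 should pick out the second record in the same way as line 30.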

What happened? We have already discussed up to line 30 but at line 40 something new takes place. We have taken the fifth position along B$ and put that at position 2 along A$ so we get

1 5 6 7 8
A C O R D

A$ has become 'ACORD'. We then did the same thing again, but moved the starting position along A$ up one to position 3, which is now the O in ACORD, to get

1 2 5 6 7 8
A C C O R D

As you may have worked out B$(5)='CORD'. So you see if we state a number in brackets after the name of the string variable, we have access to everything in the string, starting at that position.
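The listing for Program 3 is not reproduced here, so here is a sketch rebuilt from the description. Only the two assignments in lines 40 and 50 come from the text; the rest, including the line numbers, is assumed.

```basic
10 DIM A$(8),B$(8)
20 A$="A RECORD":B$="B RECORD"
30 PRINT A$
40 A$(2)=B$(5):PRINT A$:REM B$(5) IS 'CORD'
50 A$(3)=B$(5):PRINT A$
```

Following the text, this should print A RECORD, then ACORD, then ACCORD. Notice that when only a starting position is given on the left of the equals sign, A$ ends where the new characters end.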

If we state two numbers in brackets after the name of the string variable we can do even more interesting things.

BEING SPECIFIC

Having mastered moving the ends of strings around we can now move on to the somewhat more interesting idea of moving blocks of, or even individual, characters around. In the previous examples we've only indicated a starting address within a string but we can also indicate a finishing address. Type in Program 4 and RUN it and you should get 'B$=REC'. We can do the same thing backwards, and in a different place if we wish. Enter the following without a line number

A$(6,8)=B$:?A$

This means put B$ in the 6th, 7th and 8th positions of A$. If B$ were longer than 3 characters, then the rest would be ignored. The computer will respond with A RECREC. In effect we have copied the characters REC to a new position, whilst also keeping them in their original position, and overwritten the characters ORD.
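The listing for Program 4 is not reproduced here. As a sketch, assuming A$ still holds 'A RECORD' and that B$ is DIMensioned to 3 characters, it might look something like lines 10 to 40 below. Line 50 repeats the direct mode command from the text so that the whole thing can be RUN in one go.

```basic
10 DIM A$(8),B$(3):REM SIZES ARE ASSUMED
20 A$="A RECORD"
30 B$=A$(3,5):REM CHARACTERS 3 TO 5 OF A$
40 PRINT "B$=";B$
50 A$(6,8)=B$:PRINT A$
```

This should print B$=REC followed by A RECREC.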

Now let's go one step further. Enter these commands without line numbers.

A$="A RECORD": B$=A$(6,8)
A$(6,8)=A$(3,5): ? A$: ? B$

If you turned off your computer you must type DIM A$(8),B$(3) first. The second part of the first line puts the 6th, 7th and 8th characters from A$ into B$, and the first part of the second line puts the 3rd, 4th and 5th characters from A$ into the 6th, 7th and 8th positions in A$!

We have deleted ORD from A$(6,8) by putting REC in its place but we saved ORD in B$ first so we can type A$(3,5)=B$: ? A$ and the computer will give us 'A ORDREC'.

What we have done effectively is a sort and if you consider that you can do the above on a string of any length, you are well on the way to understanding string handling and file manipulation.
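The same trick works on whole records within a file. The sketch below is not one of the original listings. It swaps the two 8 character records in C$ using a spare string, here called T$ (an assumed name), to hold the first record while it is overwritten.

```basic
10 DIM C$(16),T$(8)
20 C$="A RECORDB RECORD"
30 T$=C$(1,8):REM SAVE THE FIRST RECORD
40 C$(1,8)=C$(9,16):REM COPY SECOND OVER FIRST
50 C$(9,16)=T$:REM PUT SAVED RECORD IN SECOND PLACE
60 PRINT C$
```

This should print B RECORDA RECORD. A sort routine for fixed length records could be built around this kind of swap, comparing two records and exchanging them when they are in the wrong order.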

THE ADVANCED STUFF

We now know all we need to know about strings to be able to create a file made up of many records, provided they are of fixed length. Most records, however, are of variable length, such as a name or a telephone number. What would make our file more flexible is the ability to hold variable length records, and perhaps fixed length records too, in such a way that they can be searched and sorted later.

How can this be done? Program 5 illustrates a way of using characters to keep track of individual records of any length within a file. We first DIMension the strings and then in line 20 enter our records. Note that pressing RETURN without entering a record will cause us to jump to line 100. In line 30 we count the length of the record and add 1 to this in line 40. Next we convert RECLEN into a character by using the CHR$ command and put this character into FILE$ before we add our record in line 60. The character now acts as a pointer to the beginning of the next record.

To pull the records out of the file, we can set up a loop to go through FILE$, finding each pointer in turn and using ASC to convert this back into a number to point at the next record in the file. This all happens in lines 100 to 140. Try entering some records of your own and then in direct mode type PRINT FILE$ and count through the string to follow the program lines.
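The listing for Program 5 is not reproduced here. What follows is only a sketch of the technique, rebuilt from the description above, and should not be taken as the original program. The names FILE$ and RECLEN come from the text; REC$, P and N, the DIM sizes and the exact layout of the lines are assumptions. Because the pointer is stored as a single character, each record is limited to 254 characters, and the sketch does not check that FILE$ has room for another record.

```basic
10 DIM FILE$(500),REC$(40):REM SIZES ARE ASSUMED
20 PRINT "RECORD";:INPUT REC$:IF LEN(REC$)=0 THEN 100
30 RECLEN=LEN(REC$)
40 RECLEN=RECLEN+1:REM ALLOW FOR THE POINTER ITSELF
50 FILE$(LEN(FILE$)+1)=CHR$(RECLEN)
60 FILE$(LEN(FILE$)+1)=REC$
70 GOTO 20
100 P=1:N=0:REM P IS THE POSITION OF A POINTER
110 IF P>LEN(FILE$) THEN PRINT N;" RECORDS":END
120 RECLEN=ASC(FILE$(P,P))
130 N=N+1:PRINT N;" ";FILE$(P+1,P+RECLEN-1)
140 P=P+RECLEN:GOTO 110
```

Each pointer character holds the length of its record plus one, so adding it to P in line 140 lands on the pointer for the next record. Entering FRED and then JOHNSON, for example, would store a pointer of 5 before FRED and a pointer of 8 before JOHNSON.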

Program 5 is a simple demonstration of extracting records in order and counting them, but we can put much more complicated routines within the loop. In a later issue I will present a couple of programs to demonstrate how to create, sort and search complex files. In the meantime, I hope that this article has brought you a little enlightenment on string handling in general and that you can begin to write your own record keeping programs.
