lib6502 standard library

Version 0.3 (08 July 1997)

This document consists of three parts: the Introduction, the library interface definition itself, and the Implementation Notes. There is also an Index of the function names used, a section with ideas for further development, and acknowledgements.

Introduction

This file has been written to define a certain level of compatibility between different operating systems for 6502 computers. It defines an interface for system services on a more abstract level that can be fitted to different real OSes. The goal of this definition is to fit into a variety of different environments, like

Computers like the C64 or the Gecko computer, with a single 6502, without advanced memory management.
Computers like the C128, with more advanced memory management (two banks of 64 kByte each, relocatable zeropage and stack).
The CS/A65 computer, with a 6502 and a fully virtualized address management.
Computers with the 65816 CPU, for programs running in 8 bit emulation mode.

The possible operating systems also differ in style and features:

Single-tasking OSes, like the ones provided with the C64, or other 6502 computers (if interprocess communication and process management calls are not used, lib6502 programs can even run on such a machine).
'Monolithic' multitasking systems that have all system services in the kernel.
Microkernel based OSes, that communicate with system servers via kernel communication paths, providing this functionality to the program via the lib6502.

This interface offers a common level for all these scenarios. Programs written for this library can run on any of these platforms; a simple recompile with the platform's own 6502 assembler/compiler will be enough. If people can agree on a standard file format, not even recompiling might be necessary.

In this library there still is a certain degree of freedom in the implementation. System process IDs can still be 8 bit, as long as the library offers a 16 bit interface to the application. Memory allocation can still be pagewise (in 256 byte blocks), as long as an application does not rely on that; on other OSes it may well be possible to allocate smaller chunks. The purpose of this interface is to hide the underlying OS. The application, on the other hand, should never assume any implementation specifics that are not documented in this definition.

Most of the calls are modeled on the standard C library (libc), but there are also some calls from the Unix world.

This file does not make any assumptions about the implementation of the calls, although the behaviour is (i.e. should be :-) noted as exactly as possible.

File interface

The file interface uses file numbers. These file numbers are valid in the local environment, and need not be globally valid. But the lib6502 always has to accept these numbers, and then transforms them internally to whatever is appropriate for the given OS.

file-nrs in lib6502 are always treated as uni-directional FIFOs, i.e. an application can either read from or write to a given file-nr, but not both at a time. The OS, however, can provide either bi-directional file-descriptors (with read and write operations possible on the same file-descriptor in one task) or uni-directional ones (where a write to one end can be read from the other end, even in the same task). In the former case, open on a R/W file would return two identical file-nrs, while pipe must return two different ones. In the latter case it must be the other way round: open returns two different file-nrs, while pipe returns a single file-nr for reading and writing. This leaves flexibility for the implementation.

When an application is started, three file-nrs already exist and are open: STDIN, STDOUT, and STDERR. STDOUT gives the file-nr for writing task output, STDERR is for error output, while STDIN is for reading program input.

All open files are closed when the process terminates.

   fopen:       
		<- a/y = address of struct '.byte mode, "filename" '
                        mode :  0 = read-only
                                1 = write-only
                                2 = read-write
				3 = append
                -> c=0 : x = file-nr 
                         if read-write, x = file-nr for reading, 
			 y = file-nr for writing.
			 (The y-register only applies if file-nrs
			 are always uni-directional. Otherwise x will
			 always hold the right file-nr.
			 See note)
                   c=1 : a = error code         E_NOTFOUND
                                                E_PERMISSION
                                                E_DIRECTORY
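
As a rough illustration of the parameter block that fopen expects, the following Python sketch builds the bytes an assembler would emit for '.byte mode, "filename"'. The constant names and the helper name are made up for this example. The trailing zero byte is an assumption as well: this definition does not say how the filename is terminated, so a zero terminator is assumed here by analogy with the "filename,0" notation used for fcmd and exec.

```python
# Hypothetical names for the mode values listed under fopen.
MODE_READ, MODE_WRITE, MODE_READWRITE, MODE_APPEND = 0, 1, 2, 3

def fopen_struct(mode, filename):
    """Return the bytes of '.byte mode, "filename"' plus an assumed 0 terminator."""
    if mode not in (MODE_READ, MODE_WRITE, MODE_READWRITE, MODE_APPEND):
        raise ValueError("unknown mode")
    name = filename.encode("ascii")
    # File names are restricted to codes 0x20..0x7f (see the naming rules
    # in this section).
    if any(c < 0x20 or c > 0x7F for c in name):
        raise ValueError("character outside 0x20..0x7f")
    return bytes([mode]) + name + b"\x00"
```

The address of the resulting block would then be passed in a/y.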

Files are named as: "Device:directory/filename", where Device depends on the OS. There might be OSes where it's a single character (like A: or 8:), others might have a real name. The lib needs to be able to parse its own, implementation-dependent namespace, but the application should not assume anything about the length of the device name. The directory separator is "/"; its escape sequence is "\/". After the colon, all names are assumed to be global to the device, even if the directory or filename doesn't start with a "/". The character set is PETSCII or ASCII (i.e. all codes between 0x20 and 0x7f must be usable, others are not allowed).

The filesystem might impose other limitations, for example on the filename length, on whether there are directories at all, or on the allowed characters. Wildcards are "*", which matches any string, and "?", which matches exactly one character. The interpretation might depend on the filesystem. Escape sequences are "\*" and "\?". The escape sequence for "\" is "\\".
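
The wildcard and escape rules can be sketched in Python as follows. This is only one possible reading, not a reference matcher: the function names are invented, "*" is assumed to match the empty string too, and the name being matched is taken to be already unescaped. A real filesystem may interpret patterns differently, as noted above.

```python
def tokenize(pattern):
    """Split a pattern into ('lit', ch), ('any',) and ('one',) tokens."""
    tokens, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # An escaped character ("\*", "\?", "\\", "\/") is a literal.
            tokens.append(("lit", pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            tokens.append(("any",))
        elif ch == "?":
            tokens.append(("one",))
        else:
            tokens.append(("lit", ch))
        i += 1
    return tokens

def matches(pattern, name):
    """Return True if the (unescaped) name matches the pattern."""
    tokens = tokenize(pattern)

    def walk(t, n):
        if t == len(tokens):
            return n == len(name)
        kind = tokens[t][0]
        if kind == "any":
            # "*" may swallow zero or more characters (assumption: zero is ok).
            return any(walk(t + 1, k) for k in range(n, len(name) + 1))
        if n == len(name):
            return False
        if kind == "one" or tokens[t][1] == name[n]:
            return walk(t + 1, n + 1)
        return False

    return walk(0, 0)
```

For example, matches("*.prg", "game.prg") holds, while the escaped pattern "a\*" only matches the two-character name "a*".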

   fclose       
		<- x = file-nr, 

		If file-nrs can be uni- and bi-directional, no other
		options are necessary (see also note)

		y = 0: close file for reading
		y = 1: close file for writing
		(when having bi-directional files, not having this distinction
		would behave differently from uni-directional files)

		But I am still looking for another solution, probably like this:
		When opening a file for read-write, the lib returns two
		identical file-nrs (if the OS has bi-directional file desc.)
		and sets some 'close counter' to 2. The application 
		would have to close both (identical) file-nrs, but only
		the second one would have any effect. To be sure all data
		has been written before such a close, an fcntl(PUSH)
		is done in each close call. 

   fgetc        
		<- x = file-nr
                   c=0 : return immediately, c=1 : block till byte
                -> c=0 : a = data byte
                   c=1 : a = error code         E_NOFILE
                                                E_EMPTY
                                                E_EOF
   fputc        
		<- x = file-nr, a = data byte
                   c=0 : return immediately, c=1 : block till byte
                -> c=0 : ??
                   c=1 : a = error code         E_NOFILE
                                                E_FULL     (E_TRYAGAIN?)
                                                E_NUL      (no one reads it)

   fread        
		<- x = file-nr, a/y = address of struct:
			.word address_of_buffer, length_of_buffer
		   c = 0 : return immediately, even if nothing read or
			   buffer only partially read.
		   c = 1 : wait till buffer is full or EOF (or error)
		-> c = 0 : ok, a/y = length of data read
			   struct given holds address+a/y, length-a/y,
			   such that it can be given directly to fread again.
		   c = 1 : error code		E_NOFILE
						E_EMPTY	   (E_TRYAGAIN?)
						E_EOF

   fwrite       
		<- x = file-nr, a/y = address of struct:
			.word address_of_buffer, length_of_buffer
		   c = 0 : return immediately, even if nothing written,
			   or buffer only partially written.
		   c = 1 : wait till buffer is empty (or error)
		-> c = 0 : ok, a/y = length of data written
			   struct given holds address+a/y, length-a/y,
			   such that it can be given directly to fwrite again.
		   c = 1 : error code		E_NOFILE
						E_FULL	   (E_TRYAGAIN?)
						E_NUL	   (no one reads it)
   fseek
		<- x = file-nr, a/y = address of struct:
			.byt mode ; offset is relative to
					0 = start of file
					1 = end of file
					2 = actual position
			.word 0,0 ; 32 bit offset
		-> c = 0 : ok;
		   c = 1 : error code		E_NOFILE
						E_NOSEEK
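
The fseek parameter block is five bytes long: one mode byte followed by the 32 bit offset. A minimal Python sketch of that layout is given below. The constant and function names are hypothetical, and the offset is assumed to be unsigned and stored low byte first (the usual 6502 byte order); this definition does not state either point explicitly.

```python
import struct

# Hypothetical names for the three fseek modes.
SEEK_START, SEEK_END, SEEK_CURRENT = 0, 1, 2

def fseek_struct(mode, offset):
    """Pack '.byt mode' followed by a 32 bit offset, low byte first (assumed)."""
    if mode not in (SEEK_START, SEEK_END, SEEK_CURRENT):
        raise ValueError("unknown mode")
    # "<" selects little-endian packing without padding: 1 + 4 bytes.
    return struct.pack("<BI", mode, offset)
```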

fgetc and fread, and fputc and fwrite can be used interchangeably. fread/fwrite don't guarantee that the whole buffer is really read/written, even with carry set. For this, see fcntl below. When a file is opened read-write, there always has to be an fseek operation when changing between reading and writing.
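
Because fread may return after a partial read, and because it updates the struct to address+n and length-n, a caller can simply call it again with the same struct until the length reaches zero. The following Python model sketches that loop. Everything in it is a stand-in: fake_fread only imitates the documented struct update, None models an E_EOF return, and max_chunk is an artificial limit used to force partial reads.

```python
class ReadRequest:
    """Models the struct '.word address_of_buffer, length_of_buffer'."""
    def __init__(self, address, length):
        self.address = address
        self.length = length

def fake_fread(source, memory, req, max_chunk):
    """Stand-in for fread: copy at most max_chunk bytes, then update req.

    Returns the number of bytes read, or None to model E_EOF.
    """
    if not source:
        return None
    n = min(max_chunk, req.length, len(source))
    for i in range(n):
        memory[req.address + i] = source.pop(0)
    req.address += n   # struct now holds address + n
    req.length -= n    # ... and length - n
    return n

def read_fully(source, memory, req, max_chunk=3):
    """Call fake_fread repeatedly until the buffer is full or EOF is hit."""
    total = 0
    while req.length > 0:
        n = fake_fread(source, memory, req, max_chunk)
        if n is None:
            break
        total += n
    return total
```

The same pattern would apply to fwrite, with the roles of buffer and file swapped.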

There are, however, files that cannot be seeked, namely character devices. If trying to use fseek on such a device, E_NOSEEK is returned. If a seekable file is given to STDIN and STDOUT/STDERR, the behaviour is not defined. Only non-seekable files should be given to STDIN and STDOUT/STDERR, when opened read-write.

    pipe	
		-> x = file-nr for reading
		   y = file-nr for writing

		opens a uni-directional pipe with two file numbers,
		one for writing, and one for reading. To close the pipe,
		each end has to be closed separately.

    flock	
		<- x = file-nr
		   a = operation: LOCK_SH, LOCK_EX, LOCK_UN
		   c = 0: don't block
		   c = 1: block till you get it
		-> c = 0: ok, got lock
		   c = 1: a = error code	E_NOTIMP
						E_NOFILE
						E_LOCKED

The flock call locks a file against access by other tasks. If locked shared, then other tasks may also acquire shared locks - for reading, for example. An exclusive lock can only be acquired by exactly one task at a time - for writing. The flock call is optional. If not implemented, it returns E_NOTIMP.

    fcntl	
		<- x = file-nr, a = operation
		  	a = FC_PUSH	all buffers are flushed and sent
			    FC_PULL	actively try to get everything 
					that has already been sent
		-> c = 0: ok
		   c = 1: a = error code	E_NOFILE
						E_NOTIMP

The fcntl return code should be ignored, as it is probably not implemented in most of the systems.

    fcmd
		<- x = operation, a/y = filename,0 [ , filename2, 0 ]
			x = FC_RENAME	filename -> filename2
			    FC_DELETE
			    FC_MKDIR
			    FC_RMDIR
			    FC_FORMAT	filename only to determine drive
			    FC_CHKDSK	 - " -

Other important calls are the stddup and the dup call.

    stddup
		<- x = old stdio file-nr (STDIN, STDOUT or STDERR)
		   y = new file-nr for stdio file.
		-> c = 0: ok, x = old stdio file
		   c = 1: a = error code	E_NOFILE

This call replaces a stdio file-nr (the pre-defined STDIN, STDOUT, and STDERR file-nrs) with a new file-nr.

     dup
		<- x = old file-nr
		-> c = 0: ok, x = new file-nr
		   c = 1: a = error code	E_NOFILE

This call 'reopens' a file, i.e. it returns a new file-nr that is used just like the old one. They share the same read/write pointers etc. Both file-nrs must be closed. This way the same file can be given to STDOUT and STDERR in a fork call, for example.

dup is currently defined for read-only and write-only files.

Here we have the same problem as with the close call. When dup'ing a read-write file, which end should be dup'ed?

Note on read-write file-nrs

Now should any file-nr as used in this lib be read-only or write-only? Systems with read-write OS file descriptors should then probably return write file-nrs with an offset, to move them out of the OS's valid file-nr range. That way they can be handled relatively easily.

Otherwise it is necessary to complicate calls like close or dup. But then, how does fseek now work?

I could move the read-write files out of OS range, or check them separately in other ways. Close would then close both sides, and dup would also dup both sides. When using a read-write file for one side of stdio, the other half of the communication channel would be left unused, however, which is a resource waste.

Probably the second version (file-nrs uni- and bi-directional) should be taken for the final lib definition, what do you think?

Directory Interface

The library maintains a path that is used for each file system operation. If a filename does not start with a "/" and not with a drive, the path is put in front of the filename. If the filename starts with a drive, it is always taken as an absolute filename, even if the "/" is missing. If the drive is missing, it is taken from the path.
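
The three rules above can be sketched as a small Python function. This is a simplified model under several assumptions: the saved path always carries a drive, the first ":" in a name separates the drive, the escape sequences and the "." and ".." entries are ignored, and a drive-qualified name without a leading "/" is normalized by adding one. The function names are invented for the example.

```python
def split_drive(name):
    """Split 'drive:rest' into (drive, rest); drive is None if absent."""
    drive, sep, rest = name.partition(":")
    if sep:
        return drive, rest
    return None, name

def resolve(path, filename):
    """Combine the saved path (e.g. '8:/games') with a filename."""
    path_drive, path_dir = split_drive(path)
    drive, rest = split_drive(filename)
    if drive is not None:
        # A drive makes the name absolute, even without a leading "/".
        return drive + ":/" + rest.lstrip("/")
    if rest.startswith("/"):
        # Absolute name without a drive: the drive is taken from the path.
        return path_drive + ":" + rest
    # Relative name: the whole path is put in front of the filename.
    return path_drive + ":" + path_dir.rstrip("/") + "/" + rest
```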

A special case is the directory call with a filename as "*:". It does not use the path, but returns, in each entry, an available device name. The length attribute should give the available amount of storage space on the device. A wildcard in the device field is not allowed otherwise.

    fopendir	
		<- a/y = address of filename
		-> c = 0: ok, x = file-nr
		   c = 1: a = error code		E_NOFILE
							E_NOTDIR

    freaddir	
		<- x = file-nr a/y = address of buffer
		   c = 0: don't block, 	c = 1: block till entry read

		reads _one_ directory entry into the buffer, which is
		of length (DIR_STRUCT_LEN + MAX_FILENAME)
		One entry consists of a directory struct

		.byt	0		; valid bits
		.byt	0		; permissions (drwxrwxrwx)
		.word	0,0		; file length in byte
		.byt	0,0,0,0,0,0	; last modification date
					; (year-1990, month, day, hr, min, sec)

		The valid bits say which entries in the struct are valid.
		bit 0 is for the permissions, bit 1 for the file length, bit 2
		for the date. The file length, if not zero, is an 
		approximate value (like the blocks *254 in a vc1541).
		This struct is followed by the null-terminated filename.
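
To make the entry layout concrete, here is a Python sketch that decodes one buffer as filled in by freaddir. It rests on a few assumptions that this definition does not state: DIR_STRUCT_LEN is taken to be 12 (1 + 1 + 4 + 6 bytes, derived from the struct above), the file length is stored low byte first, and the permissions byte is passed through undecoded. The VALID_* names are made up for the example.

```python
import struct

DIR_STRUCT_LEN = 12  # assumed: 1 + 1 + 4 + 6 bytes, derived from the layout
VALID_PERM, VALID_LEN, VALID_DATE = 0x01, 0x02, 0x04  # bits 0, 1, 2

def parse_dir_entry(buf):
    """Decode one directory entry; only fields marked valid are returned."""
    valid, perm, length = struct.unpack_from("<BBI", buf, 0)
    year, month, day, hour, minute, second = buf[6:12]
    # The null-terminated filename follows the fixed-size struct.
    end = buf.index(0, DIR_STRUCT_LEN)
    entry = {"name": bytes(buf[DIR_STRUCT_LEN:end]).decode("ascii")}
    if valid & VALID_PERM:
        entry["permissions"] = perm
    if valid & VALID_LEN:
        entry["length"] = length
    if valid & VALID_DATE:
        entry["date"] = (1990 + year, month, day, hour, minute, second)
    return entry
```

A caller would check which keys are present before using them, which mirrors the check of the valid bits that an assembler program has to do.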

    fgetattr	
		<- a/y = address of dir struct, incl filename (like in freaddir)
		-> c = 0: ok,
		   c = 1: a = error code	E_NOTIMP

		This tries to fill in the bits that are _not_ valid in a
		dir struct. For example, if freaddir returned
		the filelength only, but no permissions, then calling
		fgetattr should get the file permissions.
		But it is not guaranteed, that all fields are filled,
		as some are not implemented on a certain filesystem.
		So even after fgetattr, a check of the valid bits is needed.
		The filename must be completed with the device and path.

    fsetattr	
		<- a/y = address of dir struct, incl. filename
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP

		Tries to write new file attributes (the ones where the
		valid bits are set). Need not succeed. Clears the valid bits
		for the attributes it has successfully set.
		The filename must be completed with the device and path.

    chdir
		<- a/y = address of new path, relative to the old one
		-> c = 0: ok
		   c = 1: a = error code	E_ILLDRIVE
						E_ILLPATH

The chdir call changes the saved path in the library. A "." filename means the same directory, while ".." means the parent directory.

Network Interface

Network streams are used as well as any other file, so we only need opening calls. Currently only TCP/IP is defined and thought of, but there should be no problem allowing other networks.

    connect	
		<- a/y address of : byte length of address (incl. length byte), 
		   plus 4 byte inet addr (+2 byte port for TCP/UDP)
		   x = protocol (IPV4_TCP, IPV4_UDP,...)
		-> c = 0: x = file-nr for write (send)
		   	  y = file-nr for read (receive)
			  (for the y register see the same comment on
			  the read-write fopen call)
		   c = 1: a = error code	E_NOTIMP
						E_PROT
						E_NOROUTE
						E_NOPERM
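
For TCP or UDP over IPv4, the address block passed to connect is seven bytes long: the length byte (which counts itself), four address bytes and two port bytes. A Python sketch of that layout follows. The helper name is hypothetical, and the port is assumed to be stored in network byte order (high byte first); this definition does not specify the byte order.

```python
import struct

def inet_address(ip, port):
    """Build: length byte (incl. itself), 4 byte inet addr, 2 byte port."""
    # bytes() rejects values outside 0..255 with a ValueError.
    octets = bytes(int(part) for part in ip.split("."))
    if len(octets) != 4:
        raise ValueError("need exactly four address bytes")
    # Assumption: port in network byte order (big-endian).
    body = octets + struct.pack(">H", port)
    return bytes([len(body) + 1]) + body
```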

    listen	
		<- c = 0: a/y = addr of: 
			     byte length of port, 2 byte port number 
			     (for TCP/UDP on IP)
		          x = protocol
		-> c = 0: ok, x = listenport
		   c = 1: a = error code 	E_NOTIMP
						E_PROT
						E_PORTINUSE
		opens a port to listen at

		<- c = 1: x = listenport 
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP
						E_NOPORT
		closes the listenport again.


    accept	
		<- a/y = address of buffer for struct:
			 1 byte length of buffer (incl. length byte)
			 for TCP/IP and UDP/IP:
			 4 byte IP address + 
			 2 byte port, 
		   x = listenport
		   c = 0: don't block
		   c = 1: block
		-> c = 0: x = file-nr for write
			  y = file-nr for read
			  (for the y register see the same comment on
			  the read-write fopen call)
		   The buffer contains the address that the remote 
		   machine uses. The 1st byte contains the length of the 
		   address (should not differ from the length indicated
		   by the protocol number in listen).
		   c = 1: a = error code		E_NOTIMP
							E_ADDRLEN
							E_NOMEM

connect is something like Unix socket() and connect() together. listen is something like socket(), bind() and listen() together. listen tells the network layer that the application is going to accept connections on a certain port. Therefore, when a connection is requested from a remote machine, the network layer can already accept it and hold it "on line" until the task fetches the connection with accept. The maximum number of acceptable connections is implementation specific.
accept gets the first connection waiting for an accept. The other side's IP address and port are stored in the buffer given in a/y. If a connection is refused after checking IP or port, the 'accepted' connection should be closed immediately.

Memory Management

It is possible to have an allocation at byte boundaries, or at page boundaries - an application must not rely on a certain alignment!

   malloc       
		<- a/y length of block needed
                -> c=0: a/y address of block allocated
                   c=1: a = error code          E_NOMEM

   mfree        
		<- a/y address of block released
                -> c=0: ok
                   c=1: error code              E_ILLADR

Allocated memory blocks are automatically freed on process termination.

Process Management

Process management is a bit more complicated. Process ID interface is 16 bit, although they need not all be used, of course. Only processes are supported, no threads (so far).

   exec         
		<- a/y = addr of filename,0 [, parameter1, 0 ...] ,0
                -> c=0: new program starts and gets
			a/y = address of filename,0 [, parameter1, 0 ...],0
                   c=1: a = error code          E_NOTFOUND
                                                E_NOMEM
                allocates new environment and removes old environment.
                starts newly loaded o65 executable file.

   forkto    
		<- a/y addr of struct: '.byte STDIN, STDOUT, STDERR, exec_struct'
                -> c=0: x/y = child pid
                   c = 1: a = error code	E_NOMEM
						E_ILLSTR
						E_NOTFOUND

                This is not really a fork like in Unix, but it creates a
                new process, so it still 'forks'. The new process is 
		started with executing the file given in the exec_struct
		- which is the same struct as given to exec.
		The file-nrs given for STDIN, STDOUT and STDERR share the same
		read/write pointers as the ones in this process.
		They are internally 'duped', and the calling task has to 
		close them after calling forkto.

   forkthread
		-> c = 0: x = 0 for old thread, 1 for new thread
		   c = 1: a = error code	E_NOTIMP

		forkthread is a call closer to the Unix fork call.
		It duplicates the current thread's stack, and sets up
		a new thread to be executed (i.e. scheduled) in the 
		very same memory environment.
		The new thread is started directly after the forkthread
		call, just with a different x register value than
		the original thread.

   term         
		<- a = return code

   kill         
		<- a = return code, x/y = pid (or OWNTASK = myself -> suicide)
                -> c=0: ok (except for OWNTASK)
                   c=1: a = error code          E_ILLPID

The term call terminates the current thread only. The memory etc is only freed when all threads in this environment have terminated. Kill terminates all threads in the environment indicated by the process ID.


   getpid       
		-> x/y = own PID

When forking, the files still share the same seek pointer (address in the file where they read/write). When one process writes to a file, the other process's write pointer moves on too; the same holds for the read pointer. Otherwise file sharing with the 1541 would be impossible, for example.

STDIN/STDOUT and STDERR file-nrs appear to be opened before process start. They can be closed as any other file, though. When calling forkto, the file-nrs given to it are 'duped' internally, such that they have to be closed in the calling process, as well as in the newly created process.

All files opened by this task are closed when it terminates. All memory blocks allocated by this task are freed when it terminates.

The newly created process is started by calling the "main" function, with a/y pointing to a list of arguments:

	.byt "arg0",0, "arg1",0, ... ,"argn",0,0
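
The argument list is a sequence of zero-terminated strings, closed by one extra zero byte. The Python sketch below builds and walks such a block; the function names are invented for the example. One consequence of this layout is that an empty argument cannot be represented, since it would look like the end of the list.

```python
def build_args(args):
    """Encode ["arg0", "arg1", ...] as "arg0",0,"arg1",0,...,0."""
    out = bytearray()
    for arg in args:
        if not arg:
            raise ValueError("an empty argument would end the list")
        out += arg.encode("ascii") + b"\x00"
    out += b"\x00"  # the final zero closes the list
    return bytes(out)

def parse_args(block):
    """Walk the block the way a started program would, from a/y onward."""
    args, pos = [], 0
    while block[pos] != 0:
        end = block.index(0, pos)
        args.append(block[pos:end].decode("ascii"))
        pos = end + 1
    return args
```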

Interprocess Communication

Interprocess communication heavily depends on the system underneath the library, so it's not that easy. So far we handle semaphores, signals, and send/receive.

Semaphores

    semget	
		<- c = 0: don't block, c = 1: wait till you get one
		-> c = 0: ok, x = semaphore number
		   c = 1: a = error code	E_NOTIMP
						E_NOSEM
		gets a new semaphore 

    semfre	
		<- x = semaphore number
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP
						E_NOSEM
						E_INUSE
		releases a used semaphore. If a process is waiting for
		the semaphore, returns E_INUSE

    semgetnamed	
		<- c = 0: a/y = name of semaphore
		   	  x = 0 : if not found, return error,
		   	  x = 1 : if not found, alloc name and return ok
		-> c = 0 : ok, x = semaphore number
		   c = 1 : a = error code	E_NOTIMP
						E_NOTFOUND
						E_NOSEM
		This call tries to allocate a 'named' semaphore.
		If the name already exists, the associated semaphore number
		is returned. If the name doesn't exist, and x=0, then an
		error is returned. If a name doesn't exist, and x=1, then
		the new name is allocated, a semaphore is allocated and
		associated with the name.

		<- c = 1: a/y = name of semaphore
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP
						E_NOTFOUND
		The named semaphore is de-allocated. The named semaphore
		handler counts the number of allocations and frees a semaphore
		if the name is totally deallocated.

		With this call, one can system-independently allocate
		system and hardware resources, if they are protected by 
		semaphores. 
		predefined semaphore names are:

			SEM_C64_SERIEC, SEM_C64_PARIEC, SEM_C64_SID,
			SEM_C64_VID, SEM_C64_KEYBOARD,
			SEM_C64_CIA1TA, SEM_C64_CIA1TB, SEM_C64_CIA1TOD,
			SEM_C64_CIA2TA, SEM_C64_CIA2TB, SEM_C64_CIA2TOD
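
The allocation counting described for semgetnamed can be modelled in a few lines of Python. This is a sketch of the bookkeeping only, not of a real implementation: the class and method names are hypothetical, semaphore numbers are simply handed out in sequence, and every successful lookup of an existing name is assumed to count as one more allocation.

```python
class NotFound(Exception):
    """Models the E_NOTFOUND error code."""

class NamedSemaphores:
    def __init__(self):
        self._next = 0
        self._table = {}  # name -> [semaphore number, allocation count]

    def get(self, name, create):
        """c = 0 form: look up a name, optionally allocating it (x = 1)."""
        entry = self._table.get(name)
        if entry is None:
            if not create:
                raise NotFound(name)
            entry = self._table[name] = [self._next, 0]
            self._next += 1
        entry[1] += 1
        return entry[0]

    def release(self, name):
        """c = 1 form: drop one allocation; free the name at count zero."""
        entry = self._table.get(name)
        if entry is None:
            raise NotFound(name)
        entry[1] -= 1
        if entry[1] == 0:
            del self._table[name]

    def count(self, name):
        """Number of outstanding allocations for a name (0 if unknown)."""
        entry = self._table.get(name)
        return entry[1] if entry else 0
```

Two tasks asking for the same name get the same semaphore number, and the name only disappears after both have released it.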

    psem	
		<- x = semaphore number
		   c = 0: don't block; c = 1: wait till gotten
		-> c = 0: got semaphore
		   c = 1: a = error code	E_NOSEM
		Pass operation on a semaphore. Locks the semaphore.

    vsem	
		<- x = semaphore number
		Free semaphore.

All semaphores held by the process are freed when it terminates. In general, positive semaphore numbers are lib stuff, negative numbers are system stuff.

Signals

Signals are some kind of 'remote procedure call' - a signal handler for a certain signal is called upon another process' request.

    signal	
		<- x = signal-number
		   a/y = address of signal handler
		-> c = 0: ok
		   c = 1: a = error code	E_NOTIMP
						E_ILLSIG
		installs a signal handler for a signal
		signal handler address NULL de-installs a handler.

    sendsignal	
		<- a/y pid of receiving process
		   x = signal number
		-> c = 0: ok, sent
		   c = 1: a = error code	E_ILLPID
						E_ILLSIG
		sends a signal to another process. A signal is an emulated
		interrupt to the address specified as the signal handler
		address.

Predefined signals are

		SIG_TERM	calls signal handler, terminates if none.
		SIG_KILL	terminates process anyway
		SIG_HUP		calls signal handler, ignored if none.

In general, positive signal numbers are lib stuff, negative numbers are system stuff.

Send/Receive

This section is very preliminary, as the SEND/RECEIVE interface in OS/A65 is not really useable without MMU, and Lunix doesn't have SEND/RECEIVE.

    send	
		<- a/y = address of 
			.word receiver_pid
			.word address_of_data
			.word length_of_data
		   c = 0: don't block
		   c = 1: wait till accepted
		-> c = 0: block sent
		   c = 1: a = error code	E_ILLPID
						E_NOTIMP
		sends a message to another process. The data sent is not
		changed, or freed or whatever.

    receive	
		<- a/y = address of three words, second and third word
			 give address and length of receiver buffer
		   x = 0 : accept any sender
		   x = 1 : first word in (a/y) contains the sender 
		   c = 0 : don't block
		   c = 1 : wait till received
		-> c = 0 : message received, (a/y) has
			.word sender_pid
			.word address_of_data
			.word length_of_data 
		   c = 1 : a = error code	E_NOTIMP
						E_ILLPID
						E_NOMEM
		The data is stored in the buffer, and length_of_data is
		changed to the length actually received. If the buffer
		is too short, length_of_data is set to the length needed,
		and E_NOMEM is returned.

Implementation Notes

Implementation notes are currently available for the o65 file format only. This file format is rather flexible, and some of the ideas can be taken for other lib6502 file formats.

o65 file format

The o65 file format is defined in another file format specification. It allows the use of undefined references. In order to simplify the relocation procedure, lib6502 files have one undefined reference, namely "STDLIB". This reference defines the base of the lib6502 jump table. At STDLIB+0 there is a JMP opcode pointing to the code for fopen. At STDLIB+3 is a JMP opcode pointing to the code for fclose etc. The order is determined by the order given in the index of this definition.
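
Since every jump table entry is a three byte JMP instruction, the address of a call follows directly from its position in the index. The Python sketch below shows the arithmetic for the first few entries of the index; the function names and the example base address are made up.

```python
# First entries of the index, in the order that defines the jump table.
INDEX_ORDER = ["fopen", "fclose", "fgetc", "fputc", "fread", "fwrite", "fseek"]
JMP_SIZE = 3  # a 6502 absolute JMP: one opcode byte plus a 16 bit address

def stdlib_offset(name):
    """Offset of a call's JMP entry relative to STDLIB."""
    return INDEX_ORDER.index(name) * JMP_SIZE

def call_address(stdlib_base, name):
    """Absolute address to JSR to, once STDLIB has been resolved."""
    return (stdlib_base + stdlib_offset(name)) & 0xFFFF
```

With STDLIB resolved to a hypothetical base of 0xC000, fread would be called at 0xC00C.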

A global variable is the "main" address, which is the start address for any lib6502 executable.

The lib6502 file format allows the use of "header options", where some OS specific options may be saved. The lib6502 files can - but don't need to - use a lib6502 header option (as defined in the o65 file format specification). This lib6502 header option contains the following struct:

	.byt lib6502_major_version_nr, lib6502_minor_version_number
	.byt lib6502_needed_level, lib6502_possible_level

The version numbers are hints as to which library version the file is compiled with. The level is a new number that describes which functions are used, and which are not. A library may provide a certain number of lib calls in the library call table (STDLIB). The maximum number of calls used is given in the "possible_level" value. The maximum number that must be functional (and not just return "E_NOTIMP") is given by the "needed_level" number.

The level numbers are defined as:

Level 1: fopen - chdir
Level 2: connect - accept
Level 3: malloc - mfree
Level 4: exec - getpid
Level 5: semget - receive

If the file has a Possible Level greater than the level provided by the library, forkto should return an "E_LIBLEVEL" error code.

For example, a program that needs the file functions, and can optionally (i.e. if available) use the exec and fork calls, should have level 1 as its Needed Level and level 4 as its Possible Level.
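
A loader might evaluate the header option roughly as in the following Python sketch. It implements only the rule stated above (Possible Level compared against the level the library provides). The function names are hypothetical, and E_LIBLEVEL is represented by a placeholder string because its numeric value is not given in this definition.

```python
E_LIBLEVEL = "E_LIBLEVEL"  # placeholder; the numeric error code is not given

def pack_header_option(major, minor, needed_level, possible_level):
    """The four bytes of the lib6502 header option struct."""
    return bytes([major, minor, needed_level, possible_level])

def check_library_level(header_option, provided_level):
    """Return E_LIBLEVEL if the file may use calls beyond the library's table."""
    major, minor, needed_level, possible_level = header_option
    if possible_level > provided_level:
        return E_LIBLEVEL
    return None
```

For the example program (levels 1 and 4), a library providing level 5 would load the file, while a library providing only level 3 would refuse it.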

Index

fopen
fclose
fgetc
fputc
fread
fwrite
fseek
pipe
flock
fcntl
fcmd
stddup
dup

fopendir
freaddir
fgetattr
fsetattr
chdir

connect
listen
accept

malloc
mfree

exec
forkto
forkthread
term
kill
getpid

semget
semfre
semgetnamed
psem
vsem
signal
sendsignal
send
receive

Ideas

service numbers conversion from text to number by generalized getxxxfromname?
listen + accept to _one_ call with local space for remote address?

Acknowledgements

Acknowledgements go to

Daniel Dallmann, author of Lunix, for discussions and comments on the library
Craig Bruce, the author of ACE, for some comments and bugfixes