mirror of https://github.com/hsoft/collapseos.git synced 2024-11-02 06:40:56 +11:00

Compare commits

5 Commits

Author SHA1 Message Date
Virgil Dupras
415bd7a169 Support nested LOAD 2020-04-14 21:04:07 -04:00
Virgil Dupras
aec19e5c87 Add word "LOAD" 2020-04-14 18:15:07 -04:00
Virgil Dupras
a67101fb8b Add word "EMPTY" 2020-04-14 16:07:07 -04:00
Virgil Dupras
add3b6629b Make DO .. LOOP binary code more compact
Only a few bytes saved in forth1.bin, but the DO .. LOOP construct
isn't used much yet. It's still significant savings per LOOP call.
2020-04-14 14:59:01 -04:00
Virgil Dupras
b8dd86bd18 Move notes.txt in blk 2020-04-14 14:54:42 -04:00
34 changed files with 321 additions and 228 deletions

@ -42,8 +42,6 @@ also open `blk/000` in a modern text editor.
See `/emul/README.md` for getting an emulated system running.
There is also `/notes.txt` for implementation notes.
## Organisation of this repository
* `forth`: Forth is slowly taking over this project (see issue #4). It comes

@ -1,3 +1,4 @@
MASTER INDEX
2 Documentation
3 Usage 30 Dictionary
70 Implementation notes 100 Block explorer

@ -1,4 +0,0 @@
Documentation index
3 Usage
30 Dictionary

@ -12,5 +12,5 @@ ALLOT n -- Move HERE by n bytes
C, b -- Write byte b in HERE and advance it.
DELW a -- Delete wordref at a. If it shadows another
definition, that definition is unshadowed.
FORGET x -- Rewind the dictionary (both CURRENT and HERE) up to
x's previous entry. (cont.)
EMPTY -- Rewind HERE and CURRENT where they were at
system initialization. (cont.)

@ -1,5 +1,6 @@
(cont.)
PREV a -- a Return a wordref's previous entry.
WHLEN a -- n Get word header length from
wordref. That is, name length + 3.
a is a wordref
FORGET x -- Rewind the dictionary (both CURRENT and HERE)
up to x's previous entry.
PREV a -- a Return a wordref's previous entry.
WHLEN a -- n Get word header length from wordref. That is,
name length + 3. a is a wordref

@ -1,3 +1,4 @@
(cont.)
UNTIL f -- *I* Jump backwards to BEGIN if f is
false.
EXIT! -- Exit current INTERPRET loop.

@ -1,4 +1,6 @@
Disk
BLK> -- a Address of the current block variable.
LIST n -- Prints the contents of the block n on screen in the
form of 16 lines of 64 columns.
LOAD n -- Interprets Forth code from block n
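The 16 x 64 layout that LIST prints can be sketched as follows. This is only an illustration in Python, not the Forth implementation: `list_block` is a hypothetical helper, and the 1024-byte block size is the one mentioned in the BLK$ comment of this changeset.

```python
def list_block(block: bytes) -> list[str]:
    """Split a 1024-byte block into 16 lines of 64 columns each."""
    if len(block) != 1024:
        raise ValueError("a block is expected to hold exactly 1024 bytes")
    # 1024 / 64 == 16, so slicing every 64 bytes yields the 16 screen lines.
    return [block[i:i + 64].decode("ascii") for i in range(0, 1024, 64)]
```

Printing each returned string on its own line gives a screen similar to what LIST shows for a block.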

6
blk/070 Normal file
@ -0,0 +1,6 @@
Implementation notes
71 Execution model 73 Executing a word
75 Stack management 77 Dictionary
80 System variables 85 Word routines
89 Initialization sequence

11
blk/071 Normal file
@ -0,0 +1,11 @@
EXECUTION MODEL
After having read a line through readln, we want to interpret
it. As a general rule, we go like this:
1. read single word from line
2. Can we find the word in dict?
3. If yes, execute that word, goto 1
4. Is it a number?
5. If yes, push that number to PS, goto 1
6. Error: undefined word.
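The six steps above can be sketched in Python as a rough model. The names `interpret_line`, `dictionary` and `ps` are hypothetical, words are modelled as plain Python callables, and only decimal numbers are recognised; the real interpreter is written in Forth and Z80 code.

```python
def interpret_line(line, dictionary, ps):
    """Interpret one input line against a dict of word name -> callable."""
    for token in line.split():            # 1. read a single word from the line
        word = dictionary.get(token)      # 2. can we find the word in dict?
        if word is not None:
            word(ps)                      # 3. if yes, execute it, goto 1
            continue
        try:
            number = int(token)           # 4. is it a (decimal) number?
        except ValueError:
            # 6. neither a word nor a number
            raise LookupError(f"undefined word: {token}")
        ps.append(number)                 # 5. push that number to PS, goto 1
    return ps
```

With a dictionary such as `{"+": lambda ps: ps.append(ps.pop() + ps.pop())}`, interpreting `"2 3 +"` leaves a single value on the parameter stack.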

16
blk/073 Normal file
@ -0,0 +1,16 @@
EXECUTING A WORD
At its core, executing a word is pushing the wordref on PS and
calling EXECUTE. Then, we let the word do its thing. Some
words are special, but most of them are of the compiledWord
type, and it is their execution that we describe here.
First of all, at all times during execution, the Interpreter
Pointer (IP) points to the wordref we're executing next.
When we execute a compiledWord, the first thing we do is push
IP to the Return Stack (RS). Therefore, RS' top of stack will
contain a wordref to execute next, after we EXIT.
At the end of every compiledWord is an EXIT. This pops RS, sets
IP to it, and continues.
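The interplay between IP, RS and EXIT described above can be modelled with a short Python sketch. It rests on several assumptions: a compiledWord is a Python list of atoms, any other word is a Python callable, `EXIT` is a hypothetical marker object, and IP is a `(list, index)` pair instead of a memory address.

```python
EXIT = object()  # hypothetical marker standing in for the EXIT word

def execute(word, ps):
    """Run `word` against parameter stack `ps` and return `ps`."""
    rs = []        # Return Stack (RS)
    ip = None      # Interpreter Pointer; None means nothing is left to run
    current = word
    while True:
        if current is EXIT:
            ip = rs.pop()        # pop RS, set IP to it, and continue
        elif callable(current):
            current(ps)          # non-compiled word: just run it
        else:
            rs.append(ip)        # compiledWord: first push IP to RS...
            ip = (current, 0)    # ...then start walking its atom list
        if ip is None:
            return ps
        atoms, i = ip
        current = atoms[i]
        ip = (atoms, i + 1)      # IP points to the word we execute next
```

Nesting works because each compiledWord saves the caller's IP on RS before running, and its trailing EXIT restores it.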

14
blk/075 Normal file
@ -0,0 +1,14 @@
Stack management
The Parameter stack (PS) is maintained by SP and the Return
stack (RS) is maintained by IX. This allows us to generally use
push and pop freely because PS is the most frequently used.
However, this causes a problem with routine calls: because in
Forth, the stack isn't balanced within each call, our return
offset, when placed by a CALL, messes everything up. This is
one of the reasons why we need stack management routines below.
IX always points to RS' Top Of Stack (TOS).
This return stack contains "Interpreter pointers", that is a
pointer to the address of a word, as seen in a compiled list of
words.

16
blk/077 Normal file
@ -0,0 +1,16 @@
Dictionary
A dictionary entry has this structure:
- Xb name. Arbitrarily long number of characters (but can't be
bigger than input buffer, of course). Not null-terminated
- 2b prev offset
- 1b size + IMMEDIATE flag
- 2b code pointer
- Parameter field (PF)
The prev offset is the number of bytes between the prev field
and the previous word's code pointer.
The size + flag indicate the size of the name field, with the
7th bit being the IMMEDIATE flag. (cont.)
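As an illustration of the entry layout above, here is a Python sketch that decodes a header starting from a wordref. It makes several assumptions that the text does not state: the wordref is the address of the code pointer (consistent with WHLEN being name length + 3), the prev offset is stored little-endian as is usual on a Z80, "7th bit" means mask 0x80, and a prev offset of 0 marks the end of the chain. `read_header` is a hypothetical name.

```python
def read_header(mem, wordref):
    """Decode (name, immediate, previous wordref) for the entry at wordref."""
    size_flag = mem[wordref - 1]           # 1b size + IMMEDIATE flag
    name_len = size_flag & 0x7F
    immediate = bool(size_flag & 0x80)     # assumption: flag is the top bit
    prev_field = wordref - 3               # 2b prev offset sits before the size
    prev_offset = mem[prev_field] | (mem[prev_field + 1] << 8)
    name = bytes(mem[prev_field - name_len:prev_field]).decode("ascii")
    # The offset counts bytes between the prev field and the previous
    # word's code pointer; 0 is assumed to mean "no previous entry".
    prev_wordref = prev_field - prev_offset if prev_offset else None
    return name, immediate, prev_wordref
```

Walking the dictionary then amounts to calling `read_header` repeatedly on the returned previous wordref until it is `None`.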

10
blk/078 Normal file
@ -0,0 +1,10 @@
(cont.) The code pointer points to "word routines". These
routines expect to be called with IY pointing to the PF. They
themselves are expected to end by jumping to the address at
(IP). They will usually do so with "jp next".
That's for "regular" words (words that are part of the dict
chain). There are also "special words", for example NUMBER,
LIT, FBR, that have a slightly different structure. They're
also a pointer to an executable, but as for the other fields,
the only one they have is the "flags" field.

16
blk/080 Normal file
@ -0,0 +1,16 @@
System variables
There are some core variables in the core system that are
referred to directly by their address in memory throughout the
code. The place where they live is configurable by the RAMSTART
constant in conf.fs, but their relative offset is not. In fact,
they're mostly referred to directly as their numerical offset
along with a comment indicating what this offset refers to.
This system is a bit fragile because every time we change those
offsets, we have to be careful to adjust all system variables
offsets, but thankfully, there aren't many system variables.
Here's a list of them:
(cont.)

16
blk/081 Normal file
@ -0,0 +1,16 @@
(cont.)
RAMSTART INITIAL_SP +53 readln's variables
+02 CURRENT +55 adev's variables
+04 HERE +57 blk's variables
+06 IP +59 z80a's variables
+08 FLAGS +5b FUTURE USES
+0a PARSEPTR +70 DRIVERS
+0c CINPTR +80 RAMEND
+0e WORDBUF
+2e BOOT C< PTR
+4e INTJUMP
+51 CURRENTPTR
(cont.)
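The offset table above maps to absolute addresses by simple addition, which is what expressions such as `0x0c RAM+` do in the source. A Python sketch of that arithmetic follows; `ram_plus` and `span_of` are hypothetical helper names, and the RAMSTART value used in the usage note is only an example.

```python
# Offsets relative to RAMSTART, copied from the table above.
SYSVARS = [
    ("INITIAL_SP", 0x00), ("CURRENT", 0x02), ("HERE", 0x04), ("IP", 0x06),
    ("FLAGS", 0x08), ("PARSEPTR", 0x0A), ("CINPTR", 0x0C), ("WORDBUF", 0x0E),
    ("BOOT C< PTR", 0x2E), ("INTJUMP", 0x4E), ("CURRENTPTR", 0x51),
    ("readln", 0x53), ("adev", 0x55), ("blk", 0x57), ("z80a", 0x59),
    ("FUTURE USES", 0x5B), ("DRIVERS", 0x70), ("RAMEND", 0x80),
]
OFFSETS = dict(SYSVARS)

def ram_plus(ramstart, name):
    """Absolute address of a system variable for a given RAMSTART."""
    return ramstart + OFFSETS[name]

def span_of(name):
    """Bytes between this entry and the next one (an upper bound on its size)."""
    names = [n for n, _ in SYSVARS]
    i = names.index(name)
    return SYSVARS[i + 1][1] - SYSVARS[i][1]
```

For example, with a RAMSTART of 0x8000, `ram_plus(0x8000, "CINPTR")` is 0x800c, and `span_of("WORDBUF")` shows that 0x20 bytes are set aside for the WORD buffer.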

16
blk/082 Normal file
@ -0,0 +1,16 @@
(cont.) INITIAL_SP holds the initial Stack Pointer value so
that we know where to reset it on ABORT
CURRENT points to the last dict entry.
HERE points to current write offset.
IP is the Interpreter Pointer
FLAGS holds global flags. Only used for prompt output control
for now.
PARSEPTR holds routine address called on (parse)
CINPTR holds routine address called on C<
(cont.)

16
blk/083 Normal file
@ -0,0 +1,16 @@
(cont.) WORDBUF is the buffer used by WORD
BOOT C< PTR is used when Forth boots from in-memory
source. See "Initialization sequence" below.
INTJUMP All RST offsets (well, not *all* at this moment, I
still have to free those slots...) in boot binaries are made to
jump to this address. If you use one of those slots for an
interrupt, write a jump to the appropriate offset in that RAM
location.
CURRENTPTR points to current CURRENT. The Forth CURRENT word
doesn't return RAM+2 directly, but rather the value at this
address. Most of the time, it points to RAM+2, but sometimes,
when maintaining alternative dicts (during cross compilation
for example), it can point elsewhere. (cont.)

6
blk/084 Normal file
@ -0,0 +1,6 @@
(cont.) FUTURE USES section is unused for now.
DRIVERS section is reserved for recipe-specific
drivers. Here is a list of known usages:
* 0x70-0x78: ACIA buffer pointers in RC2014 recipes.

16
blk/085 Normal file
@ -0,0 +1,16 @@
Word routines
This is the description of all word routines you can encounter
in this Forth implementation. That is, a wordref will always
point to a memory offset containing one of these numbers.
0x17: nativeWord. This word's PFA contains native binary code
and is jumped to directly.
0x0e: compiledWord. This word's PFA contains an atom list and
its execution is described in "EXECUTING A WORD" above.
0x0b: cellWord. This word is usually followed by a 2-byte value
in its PFA. Upon execution, the *address* of the PFA is pushed
to PS.
(cont.)

16
blk/086 Normal file
@ -0,0 +1,16 @@
(cont.)
0x2b: doesWord. This word is created by "DOES>" and is followed
by a 2-byte value as well as the address where "DOES>" was
compiled. At that address is an atom list exactly like in a
compiled word. Upon execution, after having pushed its cell
addr to PSP, it executes its reference exactly like a
compiledWord.
0x20: numberWord. No word is actually compiled with this
routine, but atoms are. Atoms with a reference to the number
word's routine are followed, *in the atom list*, by a 2-byte
number. Upon execution, that number is fetched and IP is
advanced by an extra 2 bytes.
0x24: addrWord. Exactly like a numberWord, except that it is
treated differently by meta-tools. (cont.)

6
blk/087 Normal file
@ -0,0 +1,6 @@
(cont.)
0x22: litWord. Similar to a number word, except that instead of
being followed by a 2 byte number, it is followed by a
null-terminated string. Upon execution, the address of that
null-terminated string is pushed on the PSP and IP is advanced
to the address following the null.
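How numberWord and litWord advance IP past their inline data can be shown with a deliberately simplified Python sketch. The simplifications are assumptions, not the real layout: each atom is read as a little-endian 2-byte value and compared directly with the routine numbers 0x20 and 0x22 (in the real system an atom is a wordref pointing at such a number), and a hypothetical `END` atom of 0 stops the walk.

```python
NUMBER, LIT = 0x20, 0x22   # routine numbers from the list above
END = 0x00                 # hypothetical terminator for this sketch only

def walk(mem, ip, ps):
    """Walk an atom list in `mem` from `ip`; return the final IP."""
    while True:
        atom = mem[ip] | (mem[ip + 1] << 8)
        ip += 2
        if atom == NUMBER:
            # The 2-byte number follows in the atom list: fetch it and
            # advance IP by an extra 2 bytes.
            ps.append(mem[ip] | (mem[ip + 1] << 8))
            ip += 2
        elif atom == LIT:
            # Push the address of the null-terminated string, then move
            # IP to the address following the null.
            ps.append(ip)
            ip = mem.index(0, ip) + 1
        elif atom == END:
            return ip
        else:
            raise ValueError(f"atom {atom:#x} is not handled by this sketch")
```

The point of the sketch is only that both routines consume data stored in the atom list itself, so IP has to skip that data before the next atom is read.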

16
blk/089 Normal file
@ -0,0 +1,16 @@
Initialization sequence
On boot, we jump to the "main" routine in boot.fs which does
very few things.
1. Set SP to 0x10000-6
2. Set HERE to RAMEND (RAMSTART+0x80).
3. Set CURRENT to value of LATEST field in stable ABI.
4. Look for the word "BOOT" and call it.
In a normal system, BOOT is in icore and does a few things:
1. Find "(parse)" and set "(parse*)" to it.
2. Find "(c<)" and set CINPTR to it (what C< calls).
3. Write LATEST in SYSTEM SCRATCHPAD ( see below )
4. Find "INIT". If found, execute. Otherwise, "INTERPRET"(cont)

16
blk/090 Normal file
@ -0,0 +1,16 @@
(cont.) On a bare system (only boot+icore), this sequence will
result in "(parse)" reading only decimals and (c<) reading
characters from memory starting from CURRENT (this is why we
put CURRENT in SYSTEM SCRATCHPAD, it tracks current pos ).
This means that you can put initialization code in source form
right into your binary, right after your last compiled dict
entry and it's going to be executed as such until you set a new
(c<).
Note that there is no EMIT in a bare system. You have to take
care of supplying one before you load core.fs and its higher
levels.
(cont.)

7
blk/091 Normal file
@ -0,0 +1,7 @@
(cont.) In the "/emul" binaries, "HERE" is readjusted to
"CURRENT @" so that we don't have to relocate compiled dicts.
Note that in this context, the initialization code is fighting
for space with HERE: New entries to the dict will overwrite
that code! Also, because we're barebone, we can't have
comments. This can lead to peculiar code in this area where we
try to "waste" space in initialization code.

10
blk/100 Normal file
@ -0,0 +1,10 @@
Block explorer
This is an application to conveniently browse the contents of
the disk blocks. You can launch it with "102 LOAD".
USAGE: When loaded, the Forth interpreter is replaced by the
explorer interpreter. Typing "Q" quits the program.
Typing a decimal number followed by space or return lists the
contents of that block.

2
blk/102 Normal file
@ -0,0 +1,2 @@
: foo ." Hello world! " 42 . ;
foo

1
blk/103 Normal file
@ -0,0 +1 @@
42 . 102 LOAD 43 .

@ -20,11 +20,11 @@ BLKPACK = ../tools/blkpack
.PHONY: all
all: $(TARGETS)
$(STRIPFC):
$(SLATEST):
$(BIN2C):
$(BLKPACK):
$(MAKE) -C ../tools
$(STRIPFC): $(BLKPACK)
$(SLATEST): $(BLKPACK)
$(BIN2C): $(BLKPACK)
# z80c.bin is not in the prerequisites because it's a bootstrap
# binary that should be updated manually through make updatebootstrap.
@ -77,5 +77,5 @@ updatebootstrap: forth/stage2
.PHONY: clean
clean:
rm -f $(TARGETS) emul.o forth/*-bin.h forth/forth?.bin
rm -f $(TARGETS) emul.o forth/*-bin.h forth/forth?.bin blkfs
$(MAKE) -C ../tools clean

@ -11,5 +11,6 @@
['] EFS@ BLK@* !
RDLN$
Z80A$
LIT< _sys [entry]
INTERPRET
;

Binary file not shown.

@ -11,7 +11,11 @@
: BLK$
H@ 0x57 RAM+ !
( 1024 for the block, 6 for variables )
1030 ALLOT
( LOAD detects end of block with ASCII EOT. This is why
we write it there. EOT == 0x04 )
4 C,
-1 BLK> !
;
@ -30,3 +34,39 @@
CRLF
LOOP
;
: _
(boot<)
DUP 4 = IF
DROP
( We're finished interpreting )
EXIT!
THEN
;
: LOAD
( save BLK>, CINPTR and boot< ptr to RSP )
BLK> @ >R
0x0c RAM+ @ >R
0x2e RAM+ @ >R
BLK@
( Point to beginning of BLK )
BLK( 0x2e RAM+ !
( 0c == CINPTR )
['] _ 0x0c RAM+ !
INTERPRET
R> 0x2e RAM+ !
( Before we restore CINPTR, are we restoring it to "_"?
if yes, it means we're in a nested LOAD which means we
should also load back the saved BLK>. Otherwise, we can
ignore the BLK> from RSP. )
I 0x0c RAM+ @ = IF
( nested load )
R> DROP ( CINPTR )
R> BLK@
ELSE
( not nested )
R> 0x0c RAM+ !
R> DROP ( BLK> )
THEN
;
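The behaviour LOAD aims for, including the nested case exercised by blk/103, can be modelled in Python as a rough sketch. It is not a translation of the Forth above: the Python call stack stands in for the values saved on RSP, `load` and `blocks` are hypothetical names, and only numbers, `.` and `LOAD` are understood.

```python
EOT = "\x04"  # LOAD detects the end of a block with ASCII EOT

def load(blocks, n, ps, out):
    """Interpret block `n`; a nested LOAD resumes the outer block afterwards."""
    source = blocks[n].split(EOT, 1)[0]    # stop reading at the first EOT
    for token in source.split():
        if token == "LOAD":
            # The position in the outer block is kept by this loop while
            # the inner block runs, much like the saved input pointer.
            load(blocks, ps.pop(), ps, out)
        elif token == ".":
            out.append(ps.pop())
        else:
            ps.append(int(token))
    return out
```

Loading a block that contains `42 . 102 LOAD 43 .` should therefore produce 42, then whatever block 102 prints, then 43.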

@ -94,12 +94,20 @@
H@
; IMMEDIATE
( Increase loop counter and returns whether we should loop. )
: _
R> ( IP, keep for later )
R> 1 + ( ip i+1 )
DUP >R ( ip i )
I' = ( ip f )
SWAP >R ( f )
;
( One could think that we should have a sub word to avoid all
these COMPILE, but we can't because otherwise it messes with
the RS )
: LOOP
COMPILE R> 1 LITN COMPILE + COMPILE DUP COMPILE >R
COMPILE I' COMPILE = COMPILE (?br)
COMPILE _ COMPILE (?br)
H@ - ,
COMPILE R> COMPILE DROP COMPILE R> COMPILE DROP
; IMMEDIATE
@ -136,3 +144,17 @@
DUP WHLEN - HERE ! ( w )
PREV CURRENT !
;
: EMPTY
LIT< _sys (find) NOT IF ABORT THEN
DUP HERE ! CURRENT !
;
( Drop RSP until I-2 == INTERPRET. )
: EXIT!
['] INTERPRET ( I )
BEGIN ( I )
DUP ( I I )
R> DROP I 2 - @ ( I I a )
= UNTIL
;

@ -146,10 +146,9 @@
AGAIN
;
: (entry)
HERE @ ( h )
WORD ( h s )
SCPY ( h )
: [entry]
HERE @ ( w h )
SWAP SCPY ( h )
( Adjust HERE -1 because SCPY copies the null )
HERE @ 1 - ( h h' )
DUP HERE ! ( h h' )
@ -161,6 +160,8 @@
HERE @ CURRENT !
;
: (entry) WORD [entry] ;
: INTERPRET
BEGIN
WORD
@ -177,7 +178,7 @@
( system c< simply reads source from binary, starting at
LATEST. Convenient way to bootstrap a new system. )
: (c<)
: (boot<)
( 2e == BOOT C< PTR )
0x2e RAM+ @ ( a )
DUP C@ ( a c )
@ -191,7 +192,7 @@
( 2e == SYSTEM SCRATCHPAD )
CURRENT @ 0x2e RAM+ !
( 0c == CINPTR )
LIT< (c<) (find) DROP 0x0c RAM+ !
LIT< (boot<) (find) DROP 0x0c RAM+ !
LIT< INIT (find)
IF EXECUTE
ELSE DROP INTERPRET THEN

203
notes.txt
@ -1,203 +0,0 @@
Collapse OS' Forth implementation notes
*** EXECUTION MODEL
After having read a line through readln, we want to interpret it. As a general
rule, we go like this:
1. read single word from line
2. Can we find the word in dict?
3. If yes, execute that word, goto 1
4. Is it a number?
5. If yes, push that number to PS, goto 1
6. Error: undefined word.
*** EXECUTING A WORD
At its core, executing a word is pushing the wordref on PS and calling EXECUTE.
Then, we let the word do its thing. Some words are special, but most of them
are of the compiledWord type, and it is their execution that we describe here.
First of all, at all times during execution, the Interpreter Pointer (IP) points
to the wordref we're executing next.
When we execute a compiledWord, the first thing we do is push IP to the Return
Stack (RS). Therefore, RS' top of stack will contain a wordref to execute next,
after we EXIT.
At the end of every compiledWord is an EXIT. This pops RS, sets IP to it, and
continues.
*** Stack management
The Parameter stack (PS) is maintained by SP and the Return stack (RS) is
maintained by IX. This allows us to generally use push and pop freely because PS
is the most frequently used. However, this causes a problem with routine calls:
because in Forth, the stack isn't balanced within each call, our return offset,
when placed by a CALL, messes everything up. This is one of the reasons why we
need stack management routines below. IX always points to RS' Top Of Stack (TOS).
This return stack contains "Interpreter pointers", that is a pointer to the
address of a word, as seen in a compiled list of words.
*** Dictionary
A dictionary entry has this structure:
- Xb name. Arbitrarily long number of characters (but can't be bigger than
input buffer, of course). Not null-terminated
- 2b prev offset
- 1b size + IMMEDIATE flag
- 2b code pointer
- Parameter field (PF)
The prev offset is the number of bytes between the prev field and the previous
word's code pointer.
The size + flag indicate the size of the name field, with the 7th bit being the
IMMEDIATE flag.
The code pointer points to "word routines". These routines expect to be called
with IY pointing to the PF. They themselves are expected to end by jumping to
the address at (IP). They will usually do so with "jp next".
That's for "regular" words (words that are part of the dict chain). There are
also "special words", for example NUMBER, LIT, FBR, that have a slightly
different structure. They're also a pointer to an executable, but as for the
other fields, the only one they have is the "flags" field.
*** System variables
There are some core variables in the core system that are referred to directly
by their address in memory throughout the code. The place where they live is
configurable by the RAMSTART constant in conf.fs, but their relative offset is
not. In fact, they're mostly referred to directly as their numerical offset
along with a comment indicating what this offset refers to.
This system is a bit fragile because every time we change those offsets, we
have to be careful to adjust all system variables offsets, but thankfully,
there aren't many system variables. Here's a list of them:
RAMSTART INITIAL_SP
+02 CURRENT
+04 HERE
+06 IP
+08 FLAGS
+0a PARSEPTR
+0c CINPTR
+0e WORDBUF
+2e BOOT C< PTR
+4e INTJUMP
+51 CURRENTPTR
+53 readln's variables
+55 adev's variables
+57 blk's variables
+59 z80a's variables
+5b FUTURE USES
+70 DRIVERS
+80 RAMEND
INITIAL_SP holds the initial Stack Pointer value so that we know where to reset
it on ABORT
CURRENT points to the last dict entry.
HERE points to current write offset.
IP is the Interpreter Pointer
FLAGS holds global flags. Only used for prompt output control for now.
PARSEPTR holds routine address called on (parse)
CINPTR holds routine address called on C<
WORDBUF is the buffer used by WORD
BOOT C< PTR is used when Forth boots from in-memory source. See "Initialization
sequence" below.
INTJUMP All RST offsets (well, not *all* at this moment, I still have to free
those slots...) in boot binaries are made to jump to this address. If you use
one of those slots for an interrupt, write a jump to the appropriate offset in
that RAM location.
CURRENTPTR points to current CURRENT. The Forth CURRENT word doesn't return
RAM+2 directly, but rather the value at this address. Most of the time, it
points to RAM+2, but sometimes, when maintaining alternative dicts (during
cross compilation for example), it can point elsewhere.
FUTURE USES section is unused for now.
DRIVERS section is reserved for recipe-specific drivers. Here is a list of
known usages:
* 0x70-0x78: ACIA buffer pointers in RC2014 recipes.
*** Word routines
This is the description of all word routines you can encounter in this Forth
implementation. That is, a wordref will always point to a memory offset
containing one of these numbers.
0x17: nativeWord. This word's PFA contains native binary code and is jumped to
directly.
0x0e: compiledWord. This word's PFA contains an atom list and its execution is
described in "EXECUTING A WORD" above.
0x0b: cellWord. This word is usually followed by a 2-byte value in its PFA.
Upon execution, the *address* of the PFA is pushed to PS.
0x2b: doesWord. This word is created by "DOES>" and is followed by a 2-byte
value as well as the address where "DOES>" was compiled. At that address is an
atom list exactly like in a compiled word. Upon execution, after having pushed
its cell addr to PSP, it executes its reference exactly like a compiledWord.
0x20: numberWord. No word is actually compiled with this routine, but atoms are.
Atoms with a reference to the number word's routine are followed, *in the atom
list*, by a 2-byte number. Upon execution, that number is fetched and IP is
advanced by an extra 2 bytes.
0x24: addrWord. Exactly like a numberWord, except that it is treated
differently by meta-tools.
0x22: litWord. Similar to a number word, except that instead of being followed
by a 2 byte number, it is followed by a null-terminated string. Upon execution,
the address of that null-terminated string is pushed on the PSP and IP is
advanced to the address following the null.
*** Initialization sequence
On boot, we jump to the "main" routine in boot.fs which does very few things.
1. Set SP to 0x10000-6
2. Set HERE to RAMEND (RAMSTART+0x80).
3. Set CURRENT to value of LATEST field in stable ABI.
4. Look for the word "BOOT" and call it.
In a normal system, BOOT is in icore and does a few things:
1. Find "(parse)" and set "(parse*)" to it.
2. Find "(c<)" and set CINPTR to it (what C< calls).
3. Write LATEST in SYSTEM SCRATCHPAD ( see below )
4. Find "INIT". If found, execute. Otherwise, execute "INTERPRET"
On a bare system (only boot+icore), this sequence will result in "(parse)"
reading only decimals and (c<) reading characters from memory starting from
CURRENT (this is why we put CURRENT in SYSTEM SCRATCHPAD, it tracks current
pos ).
This means that you can put initialization code in source form right into your
binary, right after your last compiled dict entry and it's going to be executed
as such until you set a new (c<).
Note that there is no EMIT in a bare system. You have to take care of supplying
one before you load core.fs and its higher levels.
In the "/emul" binaries, "HERE" is readjusted to "CURRENT @" so that we don't
have to relocate compiled dicts. Note that in this context, the initialization
code is fighting for space with HERE: New entries to the dict will overwrite
that code! Also, because we're barebone, we can't have comments. This can lead
to peculiar code in this area where we try to "waste" space in initialization
code.