mirror of
https://github.com/hsoft/collapseos.git
synced 2024-11-08 06:38:08 +11:00
251 lines
7.9 KiB
Plaintext
251 lines
7.9 KiB
Plaintext
# Forth Primer
|
|
|
|
# First steps
|
|
|
|
Before you read this primer, let's try a few commands, just for
|
|
fun.
|
|
|
|
42 .
|
|
|
|
This will push the number 42 to the stack, then print the number
|
|
at the top of the stack.
|
|
|
|
4 2 + .
|
|
|
|
This pushes 4, then 2 to the stack, then adds the 2 numbers on
|
|
the top of the stack, then prints the result.
|
|
|
|
42 0x8000 C! 0x8000 C@ .
|
|
|
|
This writes the byte "42" at address 0x8000, and then reads
|
|
back that bytes from the same address and print it.
|
|
|
|
# Interpreter loop
|
|
|
|
Forth's main interpeter loop is very simple:
|
|
|
|
1. Read a word from input
|
|
2. Look it up in the dictionary
|
|
3. Found? Execute.
|
|
4. Not found?
|
|
4.1. Is it a number?
|
|
4.2. Yes? Parse and push on the Parameter Stack.
|
|
4.3. No? Error.
|
|
5. Repeat
|
|
|
|
# Word
|
|
|
|
A word is a string of non-whitepace characters. We consider that
|
|
we're finished reading a word when we encounter a whitespace
|
|
after having read at least one non-whitespace character
|
|
|
|
# Character encoding
|
|
|
|
Collapse OS doesn't support any other encoding than 7bit ASCII.
|
|
A character smaller than 0x21 is considered a whitespace,
|
|
others are considered non-whitespace.
|
|
|
|
Characters above 0x7f have no special meaning and can be used in
|
|
words (if your system has glyphs for them).
|
|
|
|
# Dictionary
|
|
|
|
Forth's dictionary link words to code. On boot, this dictionary
|
|
contains the system's words (look in dict.txt for a list of
|
|
them), but you can define new words with the ":" word. For
|
|
example:
|
|
|
|
: FOO 42 . ;
|
|
|
|
defines a new word "FOO" with the code "42 ." linked to it. The
|
|
word ";" closes the definition. Once defined, a word can be
|
|
executed like any other word.
|
|
|
|
You can define a word that already exists. In that case, the new
|
|
definition will overshadow the old one. However, any word def-
|
|
ined *before* the overshadowing took place will still use the
|
|
old word.
|
|
|
|
# Cell size
|
|
|
|
The cell size in Collapse OS is 16 bit, that is, each item in
|
|
stacks is 16 bit, @ and ! read and write 16 bit numbers.
|
|
Whenever we refer to a number, a pointer, we speak of 16 bit.
|
|
|
|
To read and write bytes, use C@ and C!.
|
|
|
|
# Number literals
|
|
|
|
Traditional Forth often uses HEX/DEC switches to go from deci-
|
|
mal to hexadecimal parsing. Collapse OS parses literals in a
|
|
way that is closer to C.
|
|
|
|
Straight numbers are decimals, numbers starting with "0x"
|
|
are hexadecimals (example "0x12ef"), "0b" prefixes indicate
|
|
binary (example "0b1010"), char literals are single characters
|
|
surrounded by ' (example 'X'). Char literals can't be used for
|
|
whitespaces.
|
|
|
|
# Parameter Stack
|
|
|
|
Unlike most programming languages, Forth execute words directly,
|
|
without arguments. The Parameter Stack (PS) replaces them. There
|
|
is only one, and we're constantly pushing to and popping from
|
|
it. All the time.
|
|
|
|
For example, the word "+" pops the 2 number on the Top Of Stack
|
|
(TOS), adds them, then pushes back the result on the same stack.
|
|
It thus has the "stack signature" of "a b -- n". Every word in
|
|
a dictionary specifies that signature because stack balance, as
|
|
you can guess, is paramount. It's easy to get confused so you
|
|
need to know the stack signature of words you use very well.
|
|
|
|
# Return Stack
|
|
|
|
There's a second stack, the Return Stack (RS), which is used to
|
|
keep track of execution, that is, to know where to go back after
|
|
we've executed a word. It is also used in other contexts, but
|
|
this is outside of the scope of this primer.
|
|
|
|
# Conditional execution
|
|
|
|
Code can be executed conditionally with IF/ELSE/THEN. IF pops
|
|
PS and checks whether its nonzero. If it is, it does nothing.
|
|
If it's zero, it jumps to the following ELSE or the following
|
|
THEN. Similarly, when ELSE is encountered in the context of a
|
|
nonzero IF, we jump to the following THEN.
|
|
|
|
Because IFs involve jumping, they only work inside word defin-
|
|
itions. You can't use IF directly in the interpreter loop.
|
|
|
|
Example usage:
|
|
|
|
: FOO IF 42 ELSE 43 THEN . ;
|
|
0 FOO --> 42
|
|
1 FOO --> 43
|
|
|
|
# Loops
|
|
|
|
Loops work a bit like conditionals, and there's 3 forms:
|
|
|
|
BEGIN..AGAIN --> Loop forever
|
|
BEGIN..UNTIL --> Loop conditionally
|
|
DO..LOOP --> Loop X times
|
|
|
|
UNTIL works exactly like IF, but instead of jumping forward to
|
|
THEN, it jumps backward to BEGIN.
|
|
|
|
DO pops the lower, then the higher bounds of the loop to be
|
|
executed, then pushes them on RS. Then, each time LOOP is
|
|
encountered, RS' TOS is increased. As long as the 2 numbers at
|
|
RS' TOS aren't equal, we jump back to DO.
|
|
|
|
The word "I" copies RS' TOS to PS, which can be used to get our
|
|
"loop counter".
|
|
|
|
Beware: the bounds arguments for DO are unintuitive. We begin
|
|
with the upper bound. Example:
|
|
|
|
42 0 DO I . SPC LOOP
|
|
|
|
Will print numbers 0 to 41, separated by a space.
|
|
|
|
# Memory access and HERE
|
|
|
|
We can read and write to arbitrary memory address with @ and !
|
|
(C@ and C! for bytes). For example, "1234 0x8000 !" writes the
|
|
word 1234 to address 0x8000. We call the @ and ! actions
|
|
"fetch" and "store".
|
|
|
|
There's a 3rd kind of memory-related action: "," (write). This
|
|
action stores value on PS at where a special variable called
|
|
"HERE" points to, and then advances HERE by 2 (there's also
|
|
"C," for bytes).
|
|
|
|
Note that the HERE word returns the address containing the
|
|
pointer (it doesn't change). There's a shortcut word for
|
|
"HERE @" named "H@", which is used much more often.
|
|
|
|
HERE is initialized at the first writable address in RAM, often
|
|
directly following the latest entry in the dictionary. Explain-
|
|
ing the "culture of HERE" is beyond the scope of this primer,
|
|
but know that it's a very important concept in Forth. For examp-
|
|
le, new word definitions are written to HERE.
|
|
|
|
# Variables
|
|
|
|
The word "VARIABLE" links a name to an address. For example,
|
|
"VARIABLE FOO" defines the word "FOO" and "reserves" 2 bytes of
|
|
memory. Then, when FOO is executed, it pushes the address of the
|
|
"reserved" area to PS.
|
|
|
|
For example, "1234 FOO !" writes 1234 to memory address reserved
|
|
for FOO.
|
|
|
|
Another way to create a variable is with the word CREATE, which
|
|
creates a variable entry without reserving anything for it: it's
|
|
your responsibility to reserve memory for it after you call it.
|
|
It can be useful for arrays. For example, look at VARIABLE's
|
|
definition:
|
|
|
|
: VARIABLE CREATE 2 ALLOT ;
|
|
|
|
# DOES>
|
|
|
|
Calling DOES> makes the newly created entry into a special
|
|
"does word" which behaves like a variable, that is, it pushes
|
|
the address of the "reserved" space to PS, but with additional
|
|
behavior attached to it.
|
|
|
|
DOES> must be called in the context of a word definition and
|
|
calling it stops the definition right there. Every word follow-
|
|
ing the DOES> is our new entry's behavior. For example, let's
|
|
look at CONSTANT's definition:
|
|
|
|
: CONSTANT CREATE , DOES> @ ;
|
|
|
|
A constant is created with "42 CONSTANT FOO" and FOO, instead
|
|
of putting FOO's address on PS, puts 42 on it.
|
|
|
|
You can see above that after we've created our FOO entry, we
|
|
write it to HERE and then assign the behavior "@" to it, which
|
|
means that it will transform the address currently on PS to its
|
|
value.
|
|
|
|
# IMMEDIATE
|
|
|
|
We approach the end of our primer. So far, we've covered the
|
|
"cute and cuddly" parts of the language. However, that's not
|
|
what makes Forth powerful. Forth becomes mind-bending when we
|
|
throw IMMEDIATE into the mix.
|
|
|
|
A word can be declared immediate thus:
|
|
|
|
: FOO ; IMMEDIATE
|
|
|
|
That is, when the IMMEDIATE word is executed, it makes the
|
|
latest defined word immediate.
|
|
|
|
An immediate word, when used in a definition, is executed
|
|
immediately instead of being compiled. This seemingly simple
|
|
mechanism (and it *is* simple) has very wide implications.
|
|
|
|
For example, The words "(" and ")" are comment indicators. In
|
|
the definition:
|
|
|
|
: FOO 42 ( this is a comment ) . ;
|
|
|
|
The word "(" is read like any other word. What prevents us from
|
|
trying to compile "this" and generate an error because the word
|
|
doesn't exist? Because "(" is immediate. Then, that word reads
|
|
from input stream until a ")" is met, and then returns to word
|
|
compilation.
|
|
|
|
Words like "IF", "DO", ";" are all regular Forth words, but
|
|
their "power" come from the fact that they're immediate.
|
|
|
|
Starting Forth by Leo Brodie explain all of this in details.
|
|
Read this if you can. If you can't, well, let this sink in for
|
|
a while, browse the dictionary (dict.txt) and try to understand
|
|
why this or that word is immediate. Good luck!
|