That's because of two main reasons. First, Forth relies heavily on passing control to the next word, instead of calling word implementations as subroutines. This is difficult to do correctly, and dynamically, in C. I suppose you could get by with computed gotos, but...
Secondly, Forth and assembly (or more generally any lower-level instruction/register view of what it's running on) go hand in hand. Forth words may carry their own assembly implementation within them, or they may be partially assembly.
Without those two things, you don't really have a Forth system, just a Forth interpreter. And you cannot define new primitive words, or words with very custom compilation semantics, from within Forth itself.
And therein also lies Forth's strength. It is very easy to extend on the fly, and it can even bootstrap itself. It is a great tool for machine-level exploration: you can do complex low-level things with a lot less boilerplate.
The author mentions that this second attempt is a "hacker level" implementation, but, short of reading the source code, the post does not seem to go into any detail about what exactly that means or how it's implemented?
The proposed definition is not particularly difficult to follow. This is a readability argument, and we know it is 90% subjective. OP could be a rookie Lisper and I could be a grey beard Lisper: I don't see a problem here, because I don't see the parens any more.
Second, Forth has 2 stacks, and it shouldn't be overlooked:
: +! dup >R @ + R> ! ;
Of course, this will still be opaque to those here not familiar with Forth, but basically this definition copies the address, which you need twice, to the "other" stack, fetches the memory value, adds, then retrieves the memory address to store the result.
Third, stack comments are as common as training wheels on bicycles. Ok, a bit more common actually. But you get the idea: you need them when you have not yet learned to feel forces and balance your weight accordingly. This is also a key point which is not visible because it is not syntax nor semantics, but design. Forth programmers are very insistent on simplification and factoring for this reason. Take them seriously. If you see 2+ instances of the same sequence of words, don't worry about execution speed, factor it.
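For readers who don't know Forth, the two-stack walk-through of the +! definition above can be traced in Go. This is only a sketch under assumptions: the vm type and its method names are made up for illustration, memory is a plain slice of cells, and there is no bounds or underflow checking.

```go
package main

import "fmt"

// vm is a hypothetical toy model, not a real Forth: a data stack,
// a return stack (the "other" stack), and cell-addressed memory.
type vm struct {
	data []int
	ret  []int
	mem  []int
}

func (v *vm) push(x int) { v.data = append(v.data, x) }

func (v *vm) pop() int {
	x := v.data[len(v.data)-1]
	v.data = v.data[:len(v.data)-1]
	return x
}

// dup duplicates the top of the data stack.
func (v *vm) dup() { x := v.pop(); v.push(x); v.push(x) }

// toR is >R: move the top of the data stack to the return stack.
func (v *vm) toR() { v.ret = append(v.ret, v.pop()) }

// fromR is R>: move the top of the return stack back to the data stack.
func (v *vm) fromR() {
	x := v.ret[len(v.ret)-1]
	v.ret = v.ret[:len(v.ret)-1]
	v.push(x)
}

// fetch is @: replace an address with the value stored there.
func (v *vm) fetch() { v.push(v.mem[v.pop()]) }

// add is +: replace the top two values with their sum.
func (v *vm) add() { b, a := v.pop(), v.pop(); v.push(a + b) }

// store is !: ( x addr -- ), the address is on top.
func (v *vm) store() { addr, x := v.pop(), v.pop(); v.mem[addr] = x }

// plusStore mirrors  : +! dup >R @ + R> ! ;  with stack effect ( n addr -- ).
func (v *vm) plusStore() {
	v.dup()   // n addr addr
	v.toR()   // n addr       R: addr
	v.fetch() // n old
	v.add()   // n+old
	v.fromR() // n+old addr
	v.store() // (empty)
}

func main() {
	v := &vm{mem: make([]int, 8)}
	v.mem[3] = 40
	v.push(2) // n
	v.push(3) // addr
	v.plusStore()
	fmt.Println(v.mem[3], len(v.data), len(v.ret)) // prints: 42 0 0
}
```

Both stacks end up empty, which matches the "takes 2 arguments and returns nothing" reading of the name.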
Moore explained that definitions are like abbreviations except they can take a parameter or two. Follow that. He also isn't joking when he says that messing with more than two parameters on the stack (like the proposed +!) is already too much. The author even agrees, in a way.
When you do all that, this is when you can make cherries rain on your cake: you can rewrite in machine code the critical definitions, and it is easy to do because they are so short and simple. You can make up for the cost of interpretation and the cost of convenience factoring, if needed.
> How can you know the arity of the functions without adding explicit comments?
Oh, the classic example with foo and bar that give zero clue about what they do. Compare with the "+!" proposed by the author himself - any Forth programmer instantly knows it most likely takes 2 arguments and returns nothing. If it doesn't, they will argue that it is a terrible name and you should know better.
This is one of the biggest challenges of Forth: stop FUDing yourself with "what ifs". They won't happen. Ok, sometimes they do happen. But by this time you will have built some confidence, stopped guarding against alien invasions from the next multiverse (which simplifies coding a lot), and you will be able to deal with those unexpected turns of events.
I won't pretend the problem doesn't exist at all, though. But it is far smaller than this type of argument makes it seem to be. It can be a problem when reading someone else's code who has an unfamiliar coding style, granted. But I have written thousands of lines of Forth and I rarely have this problem.
Also, a note on the C equivalent: again, it is focused on the syntax, but we know that semantics are where the fun is: do foo or bar have nasty side effects? Doesn't bar return a signed integer when foo expects an unsigned? I would argue that C's more "natural" syntax creates an illusion of understanding which makes it more friendly than it really is. Maybe you've felt it if you reviewed some code once, and then later had to debug or adapt it.
His implementation, while working for his use cases, is actually quite far from the originals. He uses separate Go structures for words, memory, for-loops, etc. Most of the default Forth implementations of words will fail, since they expect the heap and stack to behave in a certain way. Also, every major word is implemented directly in Go, with no bootstrapping.
> However, it's insufficient for the hacker level, because the host language interpreter (the one in Go) has all the control, so it's impossible to implement IF...THEN in Forth
Well, I did it with a ... heap []any ... data structure. Works well enough.
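I don't know how that heap was actually laid out, so the following is just one way it could work, with every name (machine, prim, zbranch, compileIf, compileThen) invented for the example. The idea: IF compiles a conditional branch plus a placeholder cell into a heap []any, and THEN patches that placeholder with the address to jump to. In a real Forth these two would be immediate words; here they are Go methods to keep the sketch short.

```go
package main

import "fmt"

// machine is a hypothetical toy, not the commenter's actual code.
type machine struct {
	heap  []any // compiled cells: prim values or int operands
	stack []int // data stack
	ctrl  []int // compile-time stack of heap slots awaiting a branch target
	ip    int   // index of the next heap cell to execute
}

type prim func(m *machine)

func (m *machine) push(x int) { m.stack = append(m.stack, x) }

func (m *machine) pop() int {
	x := m.stack[len(m.stack)-1]
	m.stack = m.stack[:len(m.stack)-1]
	return x
}

// lit pushes the int stored in the next heap cell.
func lit(m *machine) { m.push(m.heap[m.ip].(int)); m.ip++ }

// zbranch jumps to the target in the next cell when the top of stack is 0.
func zbranch(m *machine) {
	target := m.heap[m.ip].(int)
	m.ip++
	if m.pop() == 0 {
		m.ip = target
	}
}

func (m *machine) compile(cells ...any) { m.heap = append(m.heap, cells...) }

// compileIf emits zbranch plus a placeholder and remembers where it is.
func (m *machine) compileIf() {
	m.compile(prim(zbranch), -1)
	m.ctrl = append(m.ctrl, len(m.heap)-1)
}

// compileThen patches the pending placeholder with the current heap end.
func (m *machine) compileThen() {
	slot := m.ctrl[len(m.ctrl)-1]
	m.ctrl = m.ctrl[:len(m.ctrl)-1]
	m.heap[slot] = len(m.heap)
}

// run executes the heap from cell 0 until it falls off the end.
func (m *machine) run() {
	for m.ip = 0; m.ip < len(m.heap); {
		p := m.heap[m.ip].(prim)
		m.ip++
		p(m)
	}
}

// build compiles the equivalent of:  flag IF 42 THEN 7
func build(flag int) *machine {
	m := &machine{}
	m.compile(prim(lit), flag)
	m.compileIf()
	m.compile(prim(lit), 42)
	m.compileThen()
	m.compile(prim(lit), 7)
	return m
}

func main() {
	for _, flag := range []int{1, 0} {
		m := build(flag)
		m.run()
		fmt.Println(flag, m.stack) // prints "1 [42 7]" then "0 [7]"
	}
}
```

The host interpreter still owns the run loop, but the branch target lives in the heap, so the control flow is data that compiled code can set up.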
PowerPC Macs used Open Firmware[0] to define their firmware (BIOS); Intel Macs switched to EFI instead. Macs with M1[1] architectures and subsequent generations may use it too, but I cannot attest to this.
For those Macs which do use Open Firmware[0], a key implementation component of it is FCode[2]:
FCode is a Forth dialect compliant to ANS Forth, that is
available in two different forms: source and bytecode.
FCode bytecode is the compiled form of FCode source.
So if anyone asks, "who uses Forth anymore?", a correct answer is - a large percentage of everyday people.
0 - https://www.openbios.org/Open_Firmware.html
> How can you know the arity of the functions without adding explicit comments? Sure, if you have a handful of words like bar and foo you know like the back of your hand, this is easy. But imagine reading a large, unfamiliar code base full of code like this and trying to comprehend it.
Cool project by the way! Well done.
The second implementation - in C - does it the "right way", with both code and data living in the same memory, and thus allows all the usual Forth shenanigans. Again, this has all to do with the design approach and not the implementation language. Nothing precludes one from writing such an implementation in Go :-)
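To illustrate what "code and data living in the same memory" buys you, here is a small Go sketch. The opcodes and the run function are hypothetical and not taken from either implementation discussed; the point is only that a store instruction can overwrite a cell that is executed later.

```go
package main

import "fmt"

// Hypothetical opcodes for a toy machine; not taken from the project discussed.
const (
	opHalt  = iota // stop
	opLit          // push the next cell
	opStore        // ( x addr -- ) mem[addr] = x
	opEmit         // pop a value and record it as output
)

// run executes mem starting at cell 0 and returns the emitted values.
// Code and data share mem, so opStore can overwrite cells that run later.
func run(mem []int) []int {
	var stack, out []int
	pop := func() int {
		x := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return x
	}
	for ip := 0; ; {
		op := mem[ip]
		ip++
		switch op {
		case opHalt:
			return out
		case opLit:
			stack = append(stack, mem[ip])
			ip++
		case opStore:
			addr := pop()
			mem[addr] = pop()
		case opEmit:
			out = append(out, pop())
		}
	}
}

func main() {
	mem := []int{
		opLit, 99, // cells 0-1: push 99
		opLit, 6, // cells 2-3: push address 6
		opStore, // cell 4: overwrite cell 6
		opLit, 1, // cells 5-6: push 1 (operand patched to 99 before it runs)
		opEmit, // cell 7
		opHalt, // cell 8
	}
	fmt.Println(run(mem)) // prints: [99]
}
```

The program emits 99 rather than 1 because it rewrote its own literal operand, which is the kind of shenanigan a split code/data design rules out.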
BTW, have you heard of https://factorcode.org? You might like it!
( addr updated-value )
[1] This has nothing to do with threads in the sense of concurrency. Rather, it's thread like in sewing, where the elements of the list are all connected to each other as if with a thread. See this page for more details.
[2]
[3] Assuming the convention that multi-parameter functions have their parameters pushed to the stack from left to right.