The Inner Interpreters




(Ting 2013, p. 52) takes a wide (and very insightful) view of Forth inner interpreters, saying that they “are a set of execution procedures, usually in the machine code of the host computer, which execute various Forth words by processing the information stored in their parameter fields”. The address of such a procedure is stored in the code field of a word definition. Forth definitions of the same class have the same address in their code fields

“Two major inner interpreters are used to process code definitions defined by machine instructions and colon definitions defined in terms of other existing Forth words…”

I am not convinced that there are two code interpreters as Ting claims; I only see one and I see the other as a (re-)entry point to the address interpreter

A number of “other minor inner interpreters are used to process constants, variables, user variables and other types of data and structures” (Ting 2013, p. 52)

I like the way Ting perceives these Forth words as separate Interpreters (and Compilers) because that has enhanced my own understanding of them. So I will follow Ting’s example in my discussion of them in the next few sections…
My own view is that each such word’s compile-time code (the “Compiler”) compiles (inserts) a pointer to its run-time code (the “Interpreter”) into the new definition, and this is exactly how SPELL, my decompiler, exposes them

Code Interpreters

Forth executes its words by passing their code field address (“CFA”) via the stack to a word called EXECUTE. As you can see below in Image#1, all EXECUTE does is pop the CFA from the stack into the eZ80’s HL register pair and then jump to the NEXT1 (“RUN”) entry point of the address interpreter. This has the effect of executing the code pointed to by the CFA… This code will be another interpreter [1] of some kind…

Image#1. The Code Interpreter EXECUTE & the entry point NEXT
Image: (Husband 2011); eZ80 code based on the Intel 8080 figForth model

So execution is initiated by using EXECUTE, which has a header and is linked into the dictionary and so is available interactively

The System’s Start-up Code…

Image#2. Using the NEXT entry to execute COLD to kick-start a Forth system
Image: (Husband 2011); eZ80 code based on the Intel 8080 figForth model

More about the System’s Start-Up Code

So having executed a word, how is control returned to the Forth system?

If the word is a code definition (machine code primitive) such as EXECUTE itself or DROP both shown above in Image#1, the code must eventually end with a jump back into the address interpreter. This is normally to the entry point NEXT. See next section for a full discussion

If you know how to “prime” the appropriate Forth VM registers and pointers, you can execute a Forth word by performing a JP NEXT to the address interpreter. This method is used to initially “kick-start” a Forth system. See Image#2 above

Address Interpreter

The Address Interpreter forms part of the figForth Virtual Machine (“VM”) along with some other words…

Image#3. The Forth Address Interpreter in Pseudocode
Image: (Nordstrom 1990)

Image#4. A Simplified Forth Address Interpreter in Pseudocode
Image: (Rodriguez 1993)
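
As a rough illustration of what the address interpreter does, here is a minimal Python model of NEXT, NEXT1 and EXECUTE. It is a sketch under stated assumptions, not the eZ80 source: memory is a Python list with one cell per address, a code field holds a Python function where the real system holds a machine-code address, and HALT is a hypothetical word added only to stop the demonstration.

```python
# Toy model of NEXT, NEXT1 and EXECUTE for indirectly-threaded code.
# Assumptions (not from the figForth source): memory is a Python list with
# one cell per address, and a code field holds a Python function where the
# real system holds the address of machine code.

class VM:
    def __init__(self):
        self.mem = []         # dictionary space
        self.stack = []       # parameter stack
        self.ip = 0           # IP, the Interpretive Pointer
        self.w = 0            # W, the current word pointer (a CFA)
        self.running = False

    def next1(self):
        """NEXT1 ("RUN"): jump to the code named by the code field at W."""
        self.mem[self.w](self)

    def next(self):
        """NEXT: W <- (IP), advance IP by one cell, continue at NEXT1."""
        self.w = self.mem[self.ip]
        self.ip += 1
        self.next1()

    def run(self, ip):
        """Prime IP, then keep re-entering NEXT until a word stops the VM.
        A real primitive ends with a jump to NEXT; this loop models that."""
        self.ip, self.running = ip, True
        while self.running:
            self.next()


def dup(vm):
    vm.stack.append(vm.stack[-1])

def plus(vm):
    vm.stack.append(vm.stack.pop() + vm.stack.pop())

def halt(vm):                 # hypothetical word, only ends the demo
    vm.running = False

def execute(vm):              # EXECUTE: pop a CFA into W ...
    vm.w = vm.stack.pop()
    vm.next1()                # ... and enter the interpreter at NEXT1


vm = VM()

def code_word(fn):
    """Lay down a headerless code definition and return its CFA."""
    vm.mem.append(fn)
    return len(vm.mem) - 1

DUP, PLUS, HALT, EXECUTE = (code_word(f) for f in (dup, plus, halt, execute))

thread = len(vm.mem)                   # a list of CFAs, as in a parameter field
vm.mem += [EXECUTE, PLUS, HALT]

vm.stack = [21, DUP]                   # DUP's CFA is on top for EXECUTE
vm.run(thread)
print(vm.stack)
```

Calling run with a primed IP is the same trick as the JP NEXT used to kick-start the system: once IP points at a list of CFAs, NEXT does the rest.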

Address Interpreter in action

Image#5. The Forth Address (“Inner”) Interpreter in Pseudocode (illustrated)
Image: (Husband 2011), based upon material by (Ting 2013, pp. 27-29) & (Husband 2011)
Based on the Intel 8080 figForth model & using the “SPELL” Forth Decompiler Tool

Image#6. Some Address Interpreters implemented on various microprocessors
Image: (Husband 2011), based upon some work done since 1983 based on the figForth models
Zilog ZNEO based upon work by (Rodriguez 2006)

Forth VM Registers mapped to eZ80 Registers

Image#7. Forth VM Registers mapped to eZ80 Registers (Typical mapping)
Image: (Husband 2011), based upon (Ting 2013, p. 27)

The Colon Interpreter

DOCOL is the run-time code for all Colon Definitions inserted by the Colon Compiler

This code does three things:

  • It saves the contents of the Interpretive Pointer (“IP”) onto the Return Stack
  • Then it moves its parameter field address into IP, so pointing to the first word of the colon definition [2]
  • Finally, it jumps to NEXT, the normal entry point to the Address Interpreter, to execute that word…

Based upon (Brodie 1981, p. 226), (Ting 2013, p. 29)

Image#8. Pseudocode for the Colon Interpreter DOCOL
Image: (Husband 2011), based upon (Ting 2013, p. 29)

Image#9. COLON & DOCOL in eZ80 Code
Image: (Husband 2011), based on the Intel 8080 figForth model

Semicolon Interpreter

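Called Semis, this is the high-level equivalent of the microprocessor’s “Return from Call” instruction and does the opposite of the Colon Interpreter DOCOL: it pops the saved IP from the Return Stack and jumps to NEXT, so that execution resumes in the calling definition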
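
To show how DOCOL and Semis pair up as “call” and “return”, here is a minimal Python model. It is a sketch under assumptions: memory is cell-addressed (so the parameter field is at CFA + 1), Python functions stand in for machine code, and the words SQUARE, FOURTH and HALT are hypothetical examples.

```python
# Toy model of DOCOL and SEMIS. Assumptions: cell-addressed memory, so the
# PFA is CFA + 1 (CFA + 2 on a byte-addressed eZ80); Python functions stand
# in for machine code; HALT is a hypothetical word that ends the demo.

mem, stack, rstack = [], [], []
reg = {"ip": 0, "w": 0, "running": False}

def docol():
    rstack.append(reg["ip"])     # 1. save IP on the Return Stack
    reg["ip"] = reg["w"] + 1     # 2. IP <- PFA, the first word of the body
                                 # 3. the "jump to NEXT" is the loop in run()

def semis():
    reg["ip"] = rstack.pop()     # restore the caller's IP, then NEXT

def dup():
    stack.append(stack[-1])

def star():
    stack.append(stack.pop() * stack.pop())

def halt():
    reg["running"] = False

def define(code, *body):
    """Lay down a code field followed by a parameter field; return the CFA."""
    cfa = len(mem)
    mem.extend([code, *body])
    return cfa

DUP, STAR, SEMIS, HALT = (define(f) for f in (dup, star, semis, halt))
SQUARE = define(docol, DUP, STAR, SEMIS)        # : SQUARE  DUP * ;
FOURTH = define(docol, SQUARE, SQUARE, SEMIS)   # : FOURTH  SQUARE SQUARE ;

def run(thread):
    """Lay down a bare thread, prime IP and run until HALT."""
    start = len(mem)
    mem.extend(thread)
    reg["ip"], reg["running"] = start, True
    while reg["running"]:
        reg["w"] = mem[reg["ip"]]    # NEXT: W <- (IP)
        reg["ip"] += 1               #       IP <- IP + 1 cell
        mem[reg["w"]]()              #       jump through the code field

stack.append(3)
run([FOURTH, HALT])
print(stack, rstack)
```

The Return Stack grows by one entry for each level of nesting and is empty again once every colon definition has run its Semis.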

Image#10. The Forth Semicolon Interpreter
Image: (Husband 2011)

Constant Interpreter

DOCON is the run-time code for all Constant Definitions inserted by the Constant Compiler
This code does two things:

  • It uses the W register (current word pointer) to point to the parameter field of the constant definition
  • It fetches the contents of that parameter field and pushes them onto the stack
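
The two steps can be sketched in a few lines of Python. This is a model under assumptions, not the figForth source: memory is cell-addressed (PFA = CFA + 1), the string "DOCON" marks the code field where the real system stores the address of the DOCON machine code, and BL is just an example constant.

```python
# Toy model of CONSTANT and DOCON. Assumptions: cell-addressed memory and a
# string tag in place of the machine-code address held in the code field.

mem, stack = [], []

def constant(value):
    """Compile-time part (CONSTANT): a code field, then the value."""
    cfa = len(mem)
    mem.extend(["DOCON", value])
    return cfa

def docon(w):
    """Run-time part: W holds the CFA of the constant being executed."""
    pfa = w + 1                # 1. point at the parameter field
    stack.append(mem[pfa])     # 2. fetch its contents and push them

BL = constant(32)              # 32 CONSTANT BL
docon(BL)
print(stack)
```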

Image#11. Pseudocode for the Constant Interpreter DOCON
Image: (Husband 2011), based upon (Ting 2013, p. 74)

Image#12. CONSTANT & DOCON in eZ80 Code
Image: (Husband 2011), based on the Intel 8080 figForth model

Compare DOCON with DOVAR which is very subtly different… as indeed are the corresponding compilers…
So what is going on? How can something so simple be so fiendish…??

Variable Interpreter

DOVAR is the run-time code for all Variable Definitions inserted by the Variable Compiler

This code does only one thing:
It uses the W register (current word pointer) to point to the parameter field of the variable definition and pushes this parameter field address onto the stack

However, this is not the whole story, because DOVAR relies upon the neat way that the Variable Compiler VARIABLE uses the Constant Compiler CONSTANT at compile-time
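
The difference from DOCON, and the way VARIABLE can lean on CONSTANT, can be sketched in Python as follows. The assumptions are the same as before (cell-addressed memory, string tags in place of code addresses), and the code-field swap is my reading of how the figForth model finishes off VARIABLE; BASE here is only an example name.

```python
# Toy model of VARIABLE and DOVAR. Assumptions: cell-addressed memory
# (PFA = CFA + 1) and string tags in place of machine-code addresses.

mem, stack = [], []

def constant(value):
    """CONSTANT: a code field naming DOCON, then the value."""
    cfa = len(mem)
    mem.extend(["DOCON", value])
    return cfa

def variable(initial):
    """VARIABLE: let CONSTANT build the definition, then re-point the
    code field at DOVAR (assumed here to mirror the figForth approach)."""
    cfa = constant(initial)
    mem[cfa] = "DOVAR"
    return cfa

def dovar(w):
    """Run-time part: push the PFA itself, not its contents."""
    stack.append(w + 1)

BASE = variable(10)            # 10 VARIABLE BASE
dovar(BASE)                    # executing BASE leaves an address
addr = stack.pop()
mem[addr] = 16                 # what  16 BASE !  would do
stack.append(mem[addr])        # what  BASE @     would do
print(stack)
```

The only run-time difference is one fetch: DOCON pushes the contents of the parameter field, DOVAR pushes its address.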

Image#13. Pseudocode for the Variable Interpreter DOVAR
Image: (Husband 2011), based upon (Ting 2013, p. 74)

Image#14. VARIABLE & DOVAR in eZ80 Code
Image: (Husband 2011), based on the Intel 8080 figForth model

User Variable Interpreter

User Variables were originally defined for a multitasking or a multiuser Forth system [3] (Ting 2013, p. 55) but are vital for any embedded Forth because variables need to be held in RAM and not ROM [4]

DOUSE is the run-time code for all User Variable Definitions inserted by the User Variable Compiler USER

This code does four things:

  • It uses the W register (current word pointer) to point to the parameter field of the user variable definition
  • It fetches the contents of that parameter field (a constant) and treats it as an offset
  • It adds this offset to the User Pointer (“UP”) to calculate the address of the variable in the user area
  • It pushes this address onto the stack
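
The four steps can be sketched in Python. This is a model under assumptions: memory is cell-addressed, a string tag stands in for the code address, and the offset and the two UP values are hypothetical numbers chosen only for the demonstration.

```python
# Toy model of USER and DOUSE. Assumptions: cell-addressed memory, a string
# tag in place of the code address, and made-up offset and UP values.

mem, stack = [], []

def user(offset):
    """Compile-time part (USER): a code field, then the offset."""
    cfa = len(mem)
    mem.extend(["DOUSE", offset])
    return cfa

def douse(w, up):
    """Run-time part: W holds the CFA, up is the current User Pointer."""
    pfa = w + 1                # 1. point at the parameter field
    offset = mem[pfa]          # 2. fetch its contents as an offset
    address = up + offset      # 3. add the offset to the User Pointer
    stack.append(address)      # 4. push the resulting address

BASE = user(6)                 # hypothetical offset, not the figForth value
douse(BASE, up=0x4000)         # user area placed at one RAM address ...
douse(BASE, up=0x4100)         # ... and the same word with UP moved
print([hex(a) for a in stack])
```

Because the definition holds only an offset, the word itself can live in ROM while the data it names lives wherever UP points in RAM.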

Image#15. Pseudocode for the User Variable Interpreter DOUSE
Image: (Husband 2011), based upon (Ting 2013, p. 75)

Image#16. USER & DOUSE in eZ80 Code
Image: (Husband 2011), based on the Intel 8080 figForth model

Vocabulary Interpreter

A vocabulary in Forth is where the dictionary (which is a single-linked list) is split into a tree-like structure of other dictionaries. This is discussed in detail in Vocabulary Structure & Fig 180

The Vocabulary Interpreter DOVOC is relatively simple; the Vocabulary Compiler VOCABULARY is rather more complex

Image#17. DOVOC vocabulary interpreter source
Image: (Husband 2011), based on the Intel 8080 figForth model

The Vocabulary Compiler is a good example of a new defining word which uses the High Level Inner Interpreter <BUILDS … DOES> to insert the DOVOC run-time code into vocabularies so created. See Image#17 above

DOVOC is invoked when a word created with VOCABULARY is executed. In Image#18 below, APPLICATION is a vocabulary “switching” word

Image#18. Data contained by the vocabulary “switching” word APPLICATION
Image: (Husband 2011), based on the Intel 8080 figForth model & Forth archives

In Image#18 above, the structure of the word APPLICATION can be seen. This word was created by VOCABULARY which set up all the data you can see in APPLICATION’s parameter field which is entirely for the “benefit” of DOVOC (in Image#17)

Referring to Image#18 again, DOVOC takes the offset of “6” and adds it to VTABLE, which is a user variable and so is an alias for the RAM address UP0+36. It then fetches the contents of UP0+36+6 which, from Image#19 below, is WORDS-8 in this case: the link field address (“LFA”) of the top-most word in the APPLICATION vocabulary. This address is then placed into the system user variable CONTEXT so that any word subsequently searching the dictionary will start at the intended point
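
The lookup just described can be sketched in Python. This is only an illustration: every numeric address below is hypothetical (including where CONTEXT lives and the value standing in for WORDS-8), and only the offsets 36 and 6 come from the text.

```python
# Sketch of the DOVOC steps described in the text. All addresses are
# hypothetical; only the offsets 36 (VTABLE) and 6 (APPLICATION) are taken
# from the description.

ram = {}                        # sparse RAM model: address -> cell
UP0 = 0x8000                    # hypothetical base of the user area
VTABLE = UP0 + 36               # the user variable VTABLE
CONTEXT = UP0 + 32              # hypothetical location of CONTEXT
WORDS_LFA = 0x2F00              # stands in for WORDS-8, an LFA

ram[VTABLE + 6] = WORDS_LFA     # table entry for the APPLICATION vocabulary

def dovoc(offset):
    """Fetch the vocabulary's top-most LFA and make it the search start."""
    lfa = ram[VTABLE + offset]
    ram[CONTEXT] = lfa
    return lfa

dovoc(6)                        # what executing APPLICATION amounts to
print(hex(ram[CONTEXT]))
```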

Image#19. The Vocabulary Tables used to link various vocabularies into the root FORTH vocabulary
Image: (Husband 2011), based on the Intel 8080 figForth model

The mysterious Mr X makes a number of appearances. In this instance, he plays the role of a dummy header at the intersection of vocabularies (Ting, 1989, p. 59), because vocabularies must also be threaded into the root FORTH vocabulary without messing up the address threading mechanisms

He has one or two brothers, and is discussed in detail in “The mysterious Mr X”

High Level Inner Interpreter

Inner interpreters are normally coded in the host’s machine code so that they can be executed directly. However, Forth does offer the <BUILDS … DOES> high-level defining word construct. (Ting, 1986, p. 56; 1989, p. 73)

See <BUILDS … DOES> for a full discussion…

The mechanism that allows this type of inner interpreter to execute correctly is DODOE, shown below in Image#20 & Image#21
See section C.9.8 Vocabulary Interpreter to see DODOE being used in a new vocabulary defining word

Image#20. Pseudocode for the High-Level Interpreter DODOE
Image: (Husband 2011), based upon (Ting, 1989, p. 73)

Image#21. DODOE in eZ80 Code
Image: (Husband 2011), based on the Intel 8080 figForth model

In-line Data Interpreters

In the parameter field of a colon definition there is normally a list of execution addresses (code field addresses), which the address interpreter processes and executes… (Ting, 1986, p. 57; 1989, p. 30)

However, there are times when other data needs to be embedded, such as literal numbers, strings of characters, or addresses for routing program flow…

  Literal Interpreter

Image#22. Pseudocode for the In-Line Data Interpreter LIT
Image: (Husband 2011), based upon (Ting, 1989, p. 30)

Image#23. LIT in eZ80 Code
Image: (Husband 2011), based on the Intel 8080 figForth model

Although LIT does have a header and so is available to a user, it is normally compiled automagically by INTERPRET via LITERAL or DLITERAL

When LIT is executed, it fetches the value from the cell immediately following it, pushes it onto the stack and steps IP past it. To put it another way, LIT forces an override, causing the address interpreter to treat that cell as a numeric literal rather than an execution address
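
A minimal Python model makes the override visible. It is a sketch under assumptions: memory is cell-addressed, code fields hold Python functions in place of machine-code addresses, and HALT is a hypothetical word that ends the demonstration.

```python
# Toy model of LIT. Assumptions: cell-addressed memory, Python functions in
# place of machine code, and a hypothetical HALT word to stop the loop.

mem, stack = [], []
reg = {"ip": 0, "running": False}

def lit():                           # LIT
    stack.append(mem[reg["ip"]])     # push the in-line cell that IP points at
    reg["ip"] += 1                   # step over it so NEXT never executes it

def plus():
    stack.append(stack.pop() + stack.pop())

def halt():
    reg["running"] = False

def code_word(fn):
    """Lay down a headerless code definition and return its CFA."""
    mem.append(fn)
    return len(mem) - 1

LIT, PLUS, HALT = (code_word(f) for f in (lit, plus, halt))

start = len(mem)
mem.extend([LIT, 40, LIT, 2, PLUS, HALT])   # internal form of:  40 2 +

reg["ip"], reg["running"] = start, True
while reg["running"]:
    w = mem[reg["ip"]]               # NEXT: W <- (IP)
    reg["ip"] += 1                   #       IP <- IP + 1 cell
    mem[w]()                         #       jump through the code field
print(stack)
```

Without the LIT in front of them, the cells holding 40 and 2 would be taken for code field addresses.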

Image#25. Fragment of code from COLD showing the internal form of numeric literals compiled by LITERAL
Image: (Husband 2011), based on the Intel 8080 figForth model

  Text Interpreter [5]

(.") and the character string following it are compiled by the immediate word ."

Image#26. Example of (.") being used in the word ERROR
Image: (Husband 2011), based on the Intel 8080 figForth model

In Image#26 above is an example of (.") (“P-dot-Q”) being used to output the string “? Message ”. The string is prefixed with a one-byte length count

When (.") is executed, the count and the string are pulled out of the execution sequence and output, and then control is passed to the next word in the sequence. See Image#26 above & Image#27 below (Ting, 1986, pp. 57,58)
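
The run-time behaviour can be modelled in Python as below. This is a sketch under assumptions: the count and each character occupy one list cell apiece (the real system packs them as bytes), output is collected in a list instead of being sent to a terminal, and HALT and the example strings are hypothetical.

```python
# Toy model of the run-time word (.") and an in-line counted string.
# Assumptions: one list cell per count/character, output captured in a list.

mem, out = [], []
reg = {"ip": 0, "running": False}

def pdotq():                                   # run-time (.")
    ip = reg["ip"]
    count = mem[ip]                            # the in-line length count
    out.append("".join(mem[ip + 1: ip + 1 + count]))   # output the string
    reg["ip"] = ip + 1 + count                 # resume after the string

def halt():                                    # hypothetical, ends the demo
    reg["running"] = False

def code_word(fn):
    """Lay down a headerless code definition and return its CFA."""
    mem.append(fn)
    return len(mem) - 1

PDOTQ, HALT = code_word(pdotq), code_word(halt)

def dot_quote(text):
    """Compile-time part: the CFA of (."), the count, then the characters."""
    return [PDOTQ, len(text), *text]

start = len(mem)
mem.extend(dot_quote("OK") + dot_quote(" done") + [HALT])

reg["ip"], reg["running"] = start, True
while reg["running"]:
    w = mem[reg["ip"]]               # NEXT: W <- (IP)
    reg["ip"] += 1
    mem[w]()
print("".join(out))
```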

Image#27. Source for (.")
Image: (Husband 2011), based on the Intel 8080 figForth model

  Control Structure Interpreter

BRANCH & 0BRANCH

Forth is often described as a totally structured language with no “Goto’s”. Whilst this is true from a compiler point-of-view, the control structures actually compile only two kinds of “Goto” for use at execution time: an unconditional branch BRANCH and a conditional branch 0BRANCH

See C.12.10 Control Structure Compiler (Ting, 1986, pp. 58,59; 1989, pp. 78,79)

BRANCH is compiled by ELSE, AGAIN and REPEAT whereas 0BRANCH is compiled by IF, UNTIL and WHILE
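
As an illustration, here is a minimal Python model of BRANCH and 0BRANCH running the internal form of an IF … ELSE … THEN. It is a sketch under assumptions: memory is cell-addressed, the in-line offset is taken relative to the cell that holds it, and HALT and the values 111 and 222 are hypothetical.

```python
# Toy model of BRANCH (BRAN) and 0BRANCH (ZBRAN). Assumptions: cell-addressed
# memory, offsets relative to the offset cell itself, hypothetical HALT word.

mem, stack = [], []
reg = {"ip": 0, "running": False}

def branch():                        # BRANCH
    reg["ip"] += mem[reg["ip"]]      # IP <- IP + in-line offset

def zbranch():                       # 0BRANCH
    if stack.pop() == 0:
        branch()                     # false flag: reuse the BRANCH code
    else:
        reg["ip"] += 1               # true flag: step over the offset cell

def lit():
    stack.append(mem[reg["ip"]])
    reg["ip"] += 1

def halt():
    reg["running"] = False

def code_word(fn):
    """Lay down a headerless code definition and return its CFA."""
    mem.append(fn)
    return len(mem) - 1

BRAN, ZBRAN, LIT, HALT = (code_word(f) for f in (branch, zbranch, lit, halt))

# Internal form of:  IF 111 ELSE 222 THEN
START = len(mem)
mem.extend([
    ZBRAN, 5,      # cells 0-1: on a zero flag go to cell 6 (1 + 5)
    LIT, 111,      # cells 2-3: the true part
    BRAN, 3,       # cells 4-5: skip the false part, go to cell 8 (5 + 3)
    LIT, 222,      # cells 6-7: the false part
    HALT,          # cell 8
])

def run(flag):
    """Run the thread with the given flag on the stack; return the result."""
    stack[:] = [flag]
    reg["ip"], reg["running"] = START, True
    while reg["running"]:
        w = mem[reg["ip"]]           # NEXT: W <- (IP)
        reg["ip"] += 1
        mem[w]()
    return stack.pop()

print(run(1), run(0))
```

Having zbranch call branch on a false flag mirrors the economy of letting ZBRAN share the code of BRAN.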

Image#28. Pseudocode for the Control Structure Interpreter BRANCH
Image: (Husband 2011), based upon (Ting, 1989, p. 78)

Image#29. BRANCH (BRAN) in eZ80 Code
Image: (Husband 2011), based on the Intel 8080 figForth model

Image#30. Pseudocode for the Control Structure Interpreter 0BRANCH
Image: (Husband 2011), based upon (Ting, 1989, pp. 78,79)

Image#31. 0BRANCH (ZBRAN) in eZ80 Code
Image: (Husband 2011), based on the Intel 8080 figForth model

Note some economy of code/structure in ZBRAN with a branch back into BRAN…
To see examples of in-line code involving ZBRAN & BRAN see (highlighted in green) QUIT in Fig 189

  Iteration with (DO) & (LOOP)

The BRANCH and 0BRANCH structures are simple because the program flow is diverted around the colon definition (Ting, 1989, p. 81)

The DO-LOOP type of constructs are more complicated because additional mechanisms other than branching are needed to keep track of the loop limits and loop counts. (Ting, 1989, p. 82)

DO compiles its run-time word (DO), detailed in Image#32 below

Image#32. Pseudocode for the Control Structure Interpreter (DO)
Image: (Husband 2011), based upon (Ting, 1989, p. 82)

The current loop index can be accessed by using the word I, detailed in Image#33 below

Image#33. Pseudocode for the Iteration index accessor word I
Image: (Husband 2011), based upon (Ting, 1989, p. 82)

The word LEAVE can be used to force the loop to terminate at the next (LOOP) or (+LOOP), detailed in Image#34 below

Image#34. Pseudocode for the Control Structure forced exit word LEAVE
Image: (Husband 2011), based upon (Ting, 1989, p. 83)

LOOP compiles its run-time word (LOOP) which terminates the DO-LOOP structure, as detailed in Image#35 below


Image#35. Pseudocode for the Control Structure Interpreter (LOOP)
Image: (Husband 2011), based upon (Ting, 1989, p. 83)

+LOOP compiles its run-time word (+LOOP) which also terminates the DO-LOOP structure, and is used where the loop increment needs to be other than the default of one. Detailed in Image#36 below
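
To tie (DO), I, LEAVE, (LOOP) and (+LOOP) together, here is a minimal Python model. It is a sketch under assumptions: memory is cell-addressed, the limit and index are kept on the Return Stack with the index on top, branch offsets are relative to the offset cell, only positive increments are modelled for (+LOOP), and HALT is a hypothetical word.

```python
# Toy model of (DO), I, LEAVE, (LOOP) and (+LOOP). Assumptions: cell-addressed
# memory, index above limit on the Return Stack, positive increments only.

mem, stack, rstack = [], [], []
reg = {"ip": 0, "running": False}

def xdo():                           # (DO): move limit and index across
    index, limit = stack.pop(), stack.pop()
    rstack.extend([limit, index])

def i_word():                        # I: copy the current index
    stack.append(rstack[-1])

def leave():                         # LEAVE: limit <- index
    rstack[-2] = rstack[-1]

def loop_or_exit(again):
    if again:
        reg["ip"] += mem[reg["ip"]]  # branch back by the in-line offset
    else:
        del rstack[-2:]              # discard index and limit
        reg["ip"] += 1               # step over the offset cell

def xloop():                         # (LOOP): increment by one
    rstack[-1] += 1
    loop_or_exit(rstack[-1] < rstack[-2])

def xploop():                        # (+LOOP): increment taken from the stack
    rstack[-1] += stack.pop()        # negative steps are not modelled here
    loop_or_exit(rstack[-1] < rstack[-2])

def lit():
    stack.append(mem[reg["ip"]])
    reg["ip"] += 1

def plus():
    stack.append(stack.pop() + stack.pop())

def halt():
    reg["running"] = False

def code_word(fn):
    mem.append(fn)
    return len(mem) - 1

XDO, I_WORD, LEAVE, XLOOP, XPLOOP, LIT, PLUS, HALT = (
    code_word(f) for f in (xdo, i_word, leave, xloop, xploop, lit, plus, halt))

def run(thread, initial):
    """Lay down a thread, prime IP and the stack, run to HALT."""
    start = len(mem)
    mem.extend(thread)
    stack[:] = initial
    reg["ip"], reg["running"] = start, True
    while reg["running"]:
        w = mem[reg["ip"]]           # NEXT: W <- (IP)
        reg["ip"] += 1
        mem[w]()
    return stack.pop()

# 0 5 0 DO I + LOOP
total = run([XDO, I_WORD, PLUS, XLOOP, -3, HALT], [0, 5, 0])
# 0 10 0 DO I + 2 +LOOP
evens = run([XDO, I_WORD, PLUS, LIT, 2, XPLOOP, -5, HALT], [0, 10, 0])
# 0 10 3 DO I + LEAVE LOOP
once = run([XDO, I_WORD, PLUS, LEAVE, XLOOP, -4, HALT], [0, 10, 3])
print(total, evens, once)
```

LEAVE does not jump anywhere itself; it only makes the limit equal to the index so that the next (LOOP) or (+LOOP) falls through.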


Image#36. Pseudocode for the Control Structure Interpreter (+LOOP)
Image: (Husband 2011), based upon (Ting, 1989, p. 84)


Image#37. Example of (DO), I & (LOOP) being used in TYPE
Image: (Husband 2011), based on the Intel 8080 figForth model

Primitives

Summary

So, figForth indirectly-threaded code is a list of execution addresses but if it were just that the language would be very limited…

So other words have been designed (compilers & interpreters) to deviate from that very narrow requirement and Forth gives you the ability to define others to suit your problem-solving needs!

References:

Brodie, L., 1981. Starting FORTH : an introduction to the FORTH language and operating system for beginners and professionals [online]. Englewood Cliffs, N.J: Prentice-Hall. Available from: https://www.forth.com/starting-forth/.

Husband, D., 2011. M.Sc in IT (Software Engineering). Master’s thesis. University of Liverpool.

Nordstrom, D. J., 1990. Threading Lisp.

Rodriguez, B., 1993. Moving Forth. Available from: http://www.bradrodriguez.com/papers/moving1.htm.

Rodriguez, B. J., 2006. Camel_FORTH for ZNEO. Available from: http://www.hytherion.com/beattidp/comput/z80forth.htm.

Ting, C. H., 2013. Systems Guide to fig-Forth [online]. 3rd ed. San Mateo, CA 94402, USA: Offete Enterprises, Inc. Available from: http://figforth.org.uk/library/Systems.Guide.to.figForth.pdf.

  1. Most likely DOCOL if the word is a Colon Definition or $+2 to execute the code directly until a return to NEXT in the case of a Code Definition (“Primitive”)… 

  2. … because that is the next word that should be executed… 

  3. Multitasking and Multiuser abilities were removed from the figForth model – but are easy to put back in..! 

  4. See figForth in ROM for a discussion of this topic 

  5. Not to be confused with parsing that interprets input text but in a totally different way… 


Updated: 1st March 2022 by David Husband
© 2021 David Husband, a.k.a. Baremetal Engineer Extraordinaire