3. SORTs and TOKENs
- 3.1. Token applications and first-class SORTs
- 3.2. Token definitions and declarations
- 3.3. A simple use of a TOKEN
- 3.4. Second class SORTs
In the syntax of language like C or Pascal, we find various syntactic units like <Expression>, <Identifier> etc. A SORT bears the same relation to TDF as these syntactic units bear to the language; roughly speaking, the syntactic unit <Expression> corresponds to the SORT EXP and <Identifier> to TAG. However, instead of using BNF to compose syntactic units from others, TDF uses explicit constructors to compose its SORTs; each constructor uses other pieces of TDF of specified SORTs to make a piece of its result SORT. For example, the constructor plus uses an ERROR_TREATMENT and two EXPs to make another EXP.
At the moment, there are 58 different SORTS, from ACCESS to VARIETY given in tables 1 and 2. Some of these have familiar analogues in standard language construction as with EXP and TAG above. Others will be less familiar since TDF must concern itself with issues not normally addressed in language definitions. For example, the process of linking together TDF programs is at the root of the architecture neutrality of TDF and so must form an integral part of its definition. On the other hand, TDF is not meant to be a language readily accessible to the human reader or writer; computers handle it much more easily. Thus a great many choices have been made in the definition which would be intolerable in a standard language definition for the human programmer but which, paradoxically enough, make it much simpler for a computer to produce and analyse TDF.
The SORTs and constructors in effect form a multi-sorted algebra. There were two principal reasons for choosing this algebraic form of definition. First, it is easy to extend - a new operation on existing constructs simply requires a new constructor. Secondly, the algebraic form is highly amenable to the automatic construction of programs. Large parts of both TDF producers and TDF translators have been created by automatic transformation of the text of the specification document itself, by extracting the algebraic signature and constructing C program which can read or produce TDF. To this extent, one can regard the specification document as a formal description of the free algebra of TDF SORTs and constructors. Of course, most of the interesting parts of the definition of TDF lies in the equivalences of parts of TDF, so this formality only covers the easy bit.
Another distinction between the TDF definition and language syntactic description is that TDF is to some extent conscious of its own SORTs so that it can specify a new construction of a given SORT. The analogy in normal languages would be that one could define a new construction with new syntax and say this is an example of an <Expression>, for example; I don't know of any standard language which permits this, although those of you with a historical bent might remember Algol-N which made a valiant attempt at it. Of course, the algebraic method of description makes it much easier to specify, rather than having to give syntax to provide the syntax for the new construction in a language.
3.1. Token applications and first-class SORTs
A new construction is introduced by the SORT TOKEN; the constructors involving TOKENs allow one to give an expansion for the TOKEN in terms of other pieces of TDF, possibly including parameters. We can encapsulate a (possibly parameterised) fragment of TDF of a suitable SORT by giving it a TOKEN as identification. Not all of the SORTs are available for this kind of encapsulation - only those which have a SORTNAME constructor (from access to variety). These are the "first-class" SORTs given in table 1. Each of these have an appropriate _apply_token constructor (e.g. exp_apply_token) give the expansion. Most of these also have _cond constructors (e.g.see exp_cond in section 9.1) which allows translate time conditional expansion of the SORT.
SORT | Usage | SORT | Usage | |
---|---|---|---|---|
ACCESS | Properties of TAG s | AL_TAG | Name for alignment | |
ALIGNMENT | Abstraction of data alignment | BITFIELD_VARIETY | Gives number of bits in bit-field with sign | |
BOOL | true or false | ERROR_TREATMENT | How to handle errors in operations | |
EXP | Piece of TDF program, manipulating values | FLOATING_VARIETY | Kind of floating point number of unbounded size | |
LABEL | Mark on EXP to jump to | NAT | Non-negative static number of unbounded size | |
NTEST | Test in comparisons | PROCPROPS | Properties of calls and definitions of procedures | |
ROUNDING_MODE | How to round floating point operations | SHAPE | Abstraction of size and representation of values | |
SIGNED_NAT | Static number of unbounded size | STRING | Static string of n-bit integers | |
TAG | Name for a value in run-time program | TRANSFER_MODE | Controls special contents and assignment operations | |
TOKEN | Installation-time function | VARIETY | Kind of integer used in run-time program |
SORT
sEvery TOKEN has a result SORT, i.e. the SORT of its resulting expansion and before it can be expanded, one must have its parameter SORTs. Thus, you can regard a TOKEN as having a type defined by its result and parameter SORTs and the _apply_token as the operator which expands the encapsulation and substitutes the parameters.
However, if we look at the signature of exp_apply_token:
token_value: TOKEN token_args: BITSTREAM param_sorts(token_value) -> EXP x
we are confronted by the mysterious BITSTREAM where one might expect to find the actual parameters of the TOKEN.
To explain BITSTREAMs requires a diversion into the bit-encoding of TDF. Constructors for a particular SORT are represented in a number of bits depending on the number of constructors for that SORT; the context will determine the SORT required, so no more bits are required. Thus since there is only one constructor for UNITs, no bits are required to represent make_unit; there are about 120 different constructors for EXPs so 7 bits are required to cover all the EXPs. The parameters of each constructor have known SORTs and so their representations are just concatenated after the representation of the constructor. [a] While this is a very compact representation, it suffers from the defect that one must decode it even just to skip over it. This is very irksome is some applications, notably the TDF linker which is not interested detailed expansions. Similarly, in translators there are places where one wishes to skip over a token application without knowledge of the SORTs of its parameters. Thus a BITSTREAM is just an encoding of some TDF, preceded by the number of bits it occupies. Applications can then skip over BITSTREAMs trivially. Similar considerations apply to BYTESTREAMs used elsewhere; here the encoding is preceded by the number of bytes in the encoding and is aligned to a byte boundary to allow fast copying.
3.2. Token definitions and declarations
Thus the token_args
parameter of exp_apply_token is just the BITSTREAM formed from the actual parameters in the sequence described by the definition of the token_value
parameter. This will be given in a TOKEN_DEFN somewhere with constructor token_definition:
result_sort: SORTNAME tok_params: LIST(TOKFORMALS) body: result_sort -> TOKEN_DEFN
The result_sort
is the SORT of the construction of body
; e.g. if result_sort
is formed from exp then body
would be constructed using the EXP constructors and one would use exp_apply_token to give the expansion. The list tok_params
gives the formal parameters of the definition in terms of TOKFORMALS constructed using make_tok_formals:
sn: SORTNAME tk: TDFINT -> TOKFORMALS
The TDFINT tk
will be the integer representation of the formal parameter expressed as a TOKEN whose result sort is sn
(see more about name representation in section 3.1). To use the parameter in the body of the TOKEN_DEFN, one simply uses the _apply_token appropriate to sn
.Note that sn may be a TOKEN but the result_sort
may not.
Hence the BITSTREAM param_sorts
(token_value
) in the actual parameter of exp_apply_token above is simply formed by the catenation of constructions of the SORTs given by the SORTNAMEs in the tok_params
of the TOKEN being expanded.
Usually one gives a name to a TOKEN_DEFN to form a TOKDEF using make_tokdef:
tok: TDFINT signature: OPTION(STRING) def: BITSTREAM TOKEN_DEFN -> TOKDEF
Here, tok
gives the name that will be used to identify the TOKEN whose expansion is given by def
. Any use of this TOKEN (e.g. in exp_apply_token) will be given by make_token(tok
). Once again, a BITSTREAM is used to encapsulate the TOKEN_DEFN.
The significance of the signature parameter is discussed in section 3.2.2.
Often, one wishes a token without giving its definition - the definition could, for example, be platform-dependent. A TOKDEC introduces such a token using make_tokdec:
tok: TDFINT signature: OPTION(STRING) s: SORTNAME -> TOKDEC
Here the SORTNAME, s
, is given by token:
result: SORTNAME params: LIST(SORTNAME) -> SORTNAME
which gives the result and parameter SORTs of tok
.
One can also use a TOKEN_DEFN in an anonymous fashion by giving it as an actual parameter of a TOKEN which itself demands a TOKEN parameter. To do this one simply uses use_tokdef:
tdef: BITSTREAM TOKEN_DEFN -> TOKEN
3.3. A simple use of a TOKEN
The crucial use of TOKENs in TDF is to provide abstractions of APIs (see section 10) but they are also used as shorthand for commonly occurring constructions. For example, given the TDF constructor plus, mentioned above, we could define a plus with only two EXP parameters more suitable to C by using the wrap constructor as the ERROR_TREATMENT:
make_tokdef(C_plus, empty, token_definition(exp(), (make_tokformals(exp(), l), make_tokformals(exp(), r)), plus(wrap(), exp_apply_token(l, ()), exp_apply_token(r, ())))
3.4. Second class SORTs
Second class SORTs (given in table 2) cannot be TOKENised. These are the "syntactic units" of TDF which the user cannot extend; he can only produce them using the constructors defined in core-TDF.
Some of these constructors are implicit. For example, there are no explicit constructors for LIST or SLIST which are both used to form lists of SORTs; their construction is simply part of the encoding of TDF. However, it is forseen that LIST constructors would be highly desireable and there will probably extensions to TDF to promote LIST from a second-class SORT to a first-class one. This will not apply to SLIST or to the other SORTs which have implicit constructions. These include BITSTREAM, BYTESTREAM, TDFINT, TDFIDENT and TDFSTRING.
SORT | Usage | SORT | Usage | |
---|---|---|---|---|
AL_TAGDEF | Alignment name definition | AL_TAGDEF_PROPS | Body of UNIT containing AL_TAGDEF s | |
BITSTREAM | Encapsulation of a bit encoding | BYTESTREAM | Encapsulation of a byte encoding | |
CALLEES | Actual callee parameters | CAPSULE | Independent piece of TDF program | |
CAPSULE_LINE | Number and kind of linkable entities in CAPSULE | CASELIM | Bounds in case constructor | |
ERROR_CODE | Encoding for exceptions | EXTERNAL | External name used to connect CAPSULE name | |
EXTERN_LINK | List of LINKEXTERN s in CAPSULE | GROUP | List of UNIT s with same identification | |
LINK | Connects names in CAPSULE | LINKEXTERN | Used to connect CAPSULE names to outside world | |
LINKS | List of LINK s | LIST(AUX) | List of AUX SORT s; may have SORTNAME later | |
OTAGEXP | Describes a formal parameter | PROPS | Program info in a UNIT | |
SLIST(AUX) | List of AUX SORT s; will not have SORTNAME later | SORTNAME | SORT which can be parameter of TOKEN | |
TAGACC | Used in constructing proc formals | TAGDEC | Declaration of TAG at UNIT level | |
TAGDEC_PROPS | Body of UNIT containing TAGDEC s | TAGDEF | Definition of TAG at UNIT level | |
TAGDEF_PROPS | Body of UNIT containing TAG DEF s | TAGSHACC | A formal parameter | |
TDFBOOL | TDF encoding of a boolean | TDFIDENT | TDF encoding of a byte string | |
TDFINT | TDF encoding of an integer | TDFSTRING | TDF encoding of an n-bit byte string | |
TOKDEC | Declaration of a TOKEN | TOKDEC_PROPS | Body of UNIT contianing TOKDEC s | |
TOKDEF | Definition of a TOKEN | TOKDEF_PROPS | Body of UNIT contianing TOKDEF s | |
TOKEN_DEFN | Defines TOKEN expansion | TOKFORMALS | Sort and name for parameters in TOKEN_DEFN | |
UNIQUE | World-wide name | UNIT | Components of CAPSULE with LINK s to other UNIT s | |
VERSION | Version number of TDF | VERSION_PROPS | Body of UNIT contianing version information |
SORT
s without SORTNAME
s- [a]
There are facilities to allow extensions to the number of constructors, so it is not quite as simple as this.