3. SORTs and TOKENs

  1. 3.1. Token applications and first-class SORTs
  2. 3.2. Token definitions and declarations
  3. 3.3. A simple use of a TOKEN
  4. 3.4. Second class SORTs

In the syntax of language like C or Pascal, we find various syntactic units like <Expression>, <Identifier> etc. A SORT bears the same relation to TDF as these syntactic units bear to the language; roughly speaking, the syntactic unit <Expression> corresponds to the SORT EXP and <Identifier> to TAG. However, instead of using BNF to compose syntactic units from others, TDF uses explicit constructors to compose its SORTs; each constructor uses other pieces of TDF of specified SORTs to make a piece of its result SORT. For example, the constructor plus uses an ERROR_TREATMENT and two EXPs to make another EXP.

At the moment, there are 58 different SORTS, from ACCESS to VARIETY given in tables 1 and 2. Some of these have familiar analogues in standard language construction as with EXP and TAG above. Others will be less familiar since TDF must concern itself with issues not normally addressed in language definitions. For example, the process of linking together TDF programs is at the root of the architecture neutrality of TDF and so must form an integral part of its definition. On the other hand, TDF is not meant to be a language readily accessible to the human reader or writer; computers handle it much more easily. Thus a great many choices have been made in the definition which would be intolerable in a standard language definition for the human programmer but which, paradoxically enough, make it much simpler for a computer to produce and analyse TDF.

The SORTs and constructors in effect form a multi-sorted algebra. There were two principal reasons for choosing this algebraic form of definition. First, it is easy to extend - a new operation on existing constructs simply requires a new constructor. Secondly, the algebraic form is highly amenable to the automatic construction of programs. Large parts of both TDF producers and TDF translators have been created by automatic transformation of the text of the specification document itself, by extracting the algebraic signature and constructing C program which can read or produce TDF. To this extent, one can regard the specification document as a formal description of the free algebra of TDF SORTs and constructors. Of course, most of the interesting parts of the definition of TDF lies in the equivalences of parts of TDF, so this formality only covers the easy bit.

Another distinction between the TDF definition and language syntactic description is that TDF is to some extent conscious of its own SORTs so that it can specify a new construction of a given SORT. The analogy in normal languages would be that one could define a new construction with new syntax and say this is an example of an <Expression>, for example; I don't know of any standard language which permits this, although those of you with a historical bent might remember Algol-N which made a valiant attempt at it. Of course, the algebraic method of description makes it much easier to specify, rather than having to give syntax to provide the syntax for the new construction in a language.

3.1. Token applications and first-class SORTs

A new construction is introduced by the SORT TOKEN; the constructors involving TOKENs allow one to give an expansion for the TOKEN in terms of other pieces of TDF, possibly including parameters. We can encapsulate a (possibly parameterised) fragment of TDF of a suitable SORT by giving it a TOKEN as identification. Not all of the SORTs are available for this kind of encapsulation - only those which have a SORTNAME constructor (from access to variety). These are the "first-class" SORTs given in table 1. Each of these have an appropriate _apply_token constructor (e.g. exp_apply_token) give the expansion. Most of these also have _cond constructors (e.g.see exp_cond in section 9.1) which allows translate time conditional expansion of the SORT.

SORTUsageSORTUsage
ACCESSProperties of TAGsAL_TAGName for alignment
ALIGNMENTAbstraction of data alignmentBITFIELD_VARIETYGives number of bits in bit-field with sign
BOOLtrue or falseERROR_TREATMENTHow to handle errors in operations
EXPPiece of TDF program, manipulating valuesFLOATING_VARIETYKind of floating point number of unbounded size
LABELMark on EXP to jump toNATNon-negative static number of unbounded size
NTESTTest in comparisonsPROCPROPSProperties of calls and definitions of procedures
ROUNDING_MODEHow to round floating point operationsSHAPEAbstraction of size and representation of values
SIGNED_NATStatic number of unbounded sizeSTRINGStatic string of n-bit integers
TAGName for a value in run-time programTRANSFER_MODEControls special contents and assignment operations
TOKENInstallation-time functionVARIETYKind of integer used in run-time program
Table 6. First class SORTs

Every TOKEN has a result SORT, i.e. the SORT of its resulting expansion and before it can be expanded, one must have its parameter SORTs. Thus, you can regard a TOKEN as having a type defined by its result and parameter SORTs and the _apply_token as the operator which expands the encapsulation and substitutes the parameters.

However, if we look at the signature of exp_apply_token:

token_value: TOKEN
token_args:  BITSTREAM param_sorts(token_value)
             -> EXP x

we are confronted by the mysterious BITSTREAM where one might expect to find the actual parameters of the TOKEN.

To explain BITSTREAMs requires a diversion into the bit-encoding of TDF. Constructors for a particular SORT are represented in a number of bits depending on the number of constructors for that SORT; the context will determine the SORT required, so no more bits are required. Thus since there is only one constructor for UNITs, no bits are required to represent make_unit; there are about 120 different constructors for EXPs so 7 bits are required to cover all the EXPs. The parameters of each constructor have known SORTs and so their representations are just concatenated after the representation of the constructor. [a] While this is a very compact representation, it suffers from the defect that one must decode it even just to skip over it. This is very irksome is some applications, notably the TDF linker which is not interested detailed expansions. Similarly, in translators there are places where one wishes to skip over a token application without knowledge of the SORTs of its parameters. Thus a BITSTREAM is just an encoding of some TDF, preceded by the number of bits it occupies. Applications can then skip over BITSTREAMs trivially. Similar considerations apply to BYTESTREAMs used elsewhere; here the encoding is preceded by the number of bytes in the encoding and is aligned to a byte boundary to allow fast copying.

3.2. Token definitions and declarations

Thus the token_args parameter of exp_apply_token is just the BITSTREAM formed from the actual parameters in the sequence described by the definition of the token_value parameter. This will be given in a TOKEN_DEFN somewhere with constructor token_definition:

result_sort: SORTNAME
tok_params:  LIST(TOKFORMALS)
body:        result_sort
             -> TOKEN_DEFN

The result_sort is the SORT of the construction of body; e.g. if result_sort is formed from exp then body would be constructed using the EXP constructors and one would use exp_apply_token to give the expansion. The list tok_params gives the formal parameters of the definition in terms of TOKFORMALS constructed using make_tok_formals:

sn: SORTNAME
tk: TDFINT
    -> TOKFORMALS

The TDFINT tk will be the integer representation of the formal parameter expressed as a TOKEN whose result sort is sn (see more about name representation in section 3.1). To use the parameter in the body of the TOKEN_DEFN, one simply uses the _apply_token appropriate to sn.Note that sn may be a TOKEN but the result_sort may not.

Hence the BITSTREAM param_sorts(token_value) in the actual parameter of exp_apply_token above is simply formed by the catenation of constructions of the SORTs given by the SORTNAMEs in the tok_params of the TOKEN being expanded.

Usually one gives a name to a TOKEN_DEFN to form a TOKDEF using make_tokdef:

tok:       TDFINT
signature: OPTION(STRING)
def:       BITSTREAM TOKEN_DEFN
           -> TOKDEF

Here, tok gives the name that will be used to identify the TOKEN whose expansion is given by def. Any use of this TOKEN (e.g. in exp_apply_token) will be given by make_token(tok). Once again, a BITSTREAM is used to encapsulate the TOKEN_DEFN.

The significance of the signature parameter is discussed in section 3.2.2.

Often, one wishes a token without giving its definition - the definition could, for example, be platform-dependent. A TOKDEC introduces such a token using make_tokdec:

tok:       TDFINT
signature: OPTION(STRING)
s:         SORTNAME
           -> TOKDEC

Here the SORTNAME, s, is given by token:

result: SORTNAME
				params: LIST(SORTNAME)
						-> SORTNAME

which gives the result and parameter SORTs of tok.

One can also use a TOKEN_DEFN in an anonymous fashion by giving it as an actual parameter of a TOKEN which itself demands a TOKEN parameter. To do this one simply uses use_tokdef:

tdef: BITSTREAM TOKEN_DEFN
      -> TOKEN

3.3. A simple use of a TOKEN

The crucial use of TOKENs in TDF is to provide abstractions of APIs (see section 10) but they are also used as shorthand for commonly occurring constructions. For example, given the TDF constructor plus, mentioned above, we could define a plus with only two EXP parameters more suitable to C by using the wrap constructor as the ERROR_TREATMENT:

make_tokdef(C_plus, empty, token_definition(exp(), (make_tokformals(exp(), l),
                                                    make_tokformals(exp(), r)),
            plus(wrap(), exp_apply_token(l, ()), exp_apply_token(r, ())))

3.4. Second class SORTs

Second class SORTs (given in table 2) cannot be TOKENised. These are the "syntactic units" of TDF which the user cannot extend; he can only produce them using the constructors defined in core-TDF.

Some of these constructors are implicit. For example, there are no explicit constructors for LIST or SLIST which are both used to form lists of SORTs; their construction is simply part of the encoding of TDF. However, it is forseen that LIST constructors would be highly desireable and there will probably extensions to TDF to promote LIST from a second-class SORT to a first-class one. This will not apply to SLIST or to the other SORTs which have implicit constructions. These include BITSTREAM, BYTESTREAM, TDFINT, TDFIDENT and TDFSTRING.

SORTUsageSORTUsage
AL_TAGDEFAlignment name definitionAL_TAGDEF_PROPSBody of UNIT containing AL_TAGDEFs
BITSTREAMEncapsulation of a bit encodingBYTESTREAMEncapsulation of a byte encoding
CALLEESActual callee parametersCAPSULEIndependent piece of TDF program
CAPSULE_LINENumber and kind of linkable entities in CAPSULECASELIMBounds in case constructor
ERROR_CODEEncoding for exceptionsEXTERNALExternal name used to connect CAPSULE name
EXTERN_LINKList of LINKEXTERNs in CAPSULEGROUPList of UNITs with same identification
LINKConnects names in CAPSULELINKEXTERNUsed to connect CAPSULE names to outside world
LINKSList of LINKsLIST(AUX)List of AUX SORTs; may have SORTNAME later
OTAGEXPDescribes a formal parameterPROPSProgram info in a UNIT
SLIST(AUX)List of AUX SORTs; will not have SORTNAME laterSORTNAMESORT which can be parameter of TOKEN
TAGACCUsed in constructing proc formalsTAGDECDeclaration of TAG at UNIT level
TAGDEC_PROPSBody of UNIT containing TAGDECsTAGDEFDefinition of TAG at UNIT level
TAGDEF_PROPSBody of UNIT containing TAG DEFsTAGSHACCA formal parameter
TDFBOOLTDF encoding of a booleanTDFIDENTTDF encoding of a byte string
TDFINTTDF encoding of an integerTDFSTRINGTDF encoding of an n-bit byte string
TOKDECDeclaration of a TOKENTOKDEC_PROPSBody of UNIT contianing TOKDECs
TOKDEFDefinition of a TOKENTOKDEF_PROPSBody of UNIT contianing TOKDEFs
TOKEN_DEFNDefines TOKEN expansionTOKFORMALSSort and name for parameters in TOKEN_DEFN
UNIQUEWorld-wide nameUNITComponents of CAPSULE with LINKs to other UNITs
VERSIONVersion number of TDFVERSION_PROPSBody of UNIT contianing version information
Table 6. SORTs without SORTNAMEs
  1. [a]

    There are facilities to allow extensions to the number of constructors, so it is not quite as simple as this.