Input syntax – TenDRA

1. Input syntax

1.1. Primitives
1.2. Identities
1.3. Enumerations
1.4. Structures
1.5. Unions
1.6. Type constructors
1.7. Relations between algebras

The overall input file format is as follows:

algebra:
			ALGEBRA identifier version?: item-list?

version:
			(integer.integer)

item-list:
			item
			item-list item

item:
			primitive
			identity
			enumeration
			structure
			union

The initial identifier gives the overall name of the algebra. A version number may also be associated with the algebra (if this is omitted the version is assumed to be 1.0). The main body of the algebra definition consists of a list of items describing the primitives, the identities, the enumerations, the structures and the unions comprising the algebra.

Here identifier has the same meaning as in C. The only other significant lexical units are integer, which consists of a sequence of decimal digits, and string, which consists of any number of characters enclosed in double quotes. There are no escape sequences in strings. C-style comments may be used anywhere in the input. White space is not significant.
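As an illustration, a minimal input file following this grammar might look like the sketch below. The algebra name shapes is hypothetical, chosen only for this example.

```
/* C-style comments may appear anywhere in the input. */
ALGEBRA shapes (1.0):

/* The item-list is optional, so this algebra contains no items. */
```

Since the version is optional, writing ALGEBRA shapes: should describe the same algebra, with the version taken to be 1.0.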

1.1. Primitives

Primitives form the basic components from which the other types in the algebra are built up. They are described as follows:

primitive:
			object-identifier = quoted-type;

where the primitive identifier is given by:

object-identifier:
			#?:? identifier
			#?:? identifier(identifier)

and the primitive definition is a string which gives the C type corresponding to this primitive:

quoted-type:
			string

Note that each primitive (and also each identity, each enumeration, each structure and each union) has two names associated with it. The second name is optional; if it is not given then it is assumed to be the same as the first name. The first name is that which will be given to the corresponding type in the output file. The second is a short form of this name which will be used in forming constructor names etc. in the output.

The optional hash and colon which may be used to qualify an object identifier are provided for backwards compatibility only and are not used in the output routines.
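A short sketch of some primitive definitions, written against the grammar above, may make the two-name convention clearer. The algebra name and the primitive names (number, string, unsigned_long, ulong) are assumptions made for this example only.

```
ALGEBRA shapes (1.0):

/* One name only: the short form defaults to the same name. */
number = "int";
string = "char *";

/* Two names: unsigned_long names the output type, and ulong is
   the short form used when forming constructor names and so on. */
unsigned_long (ulong) = "unsigned long";
```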

1.2. Identities

Identities are used to associate a name with a particular type in the algebra. In this they correspond to typedefs in C. They are described as follows:

identity:
			object-identifier = type;

where type, the type being named, is as described in section 1.6 (Type constructors).
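For instance, an identity might be sketched as follows. The names used (string, NAME, name, NAME_LIST, names) are hypothetical and serve only to illustrate the grammar.

```
ALGEBRA shapes (1.0):

string = "char *";              /* a primitive */

NAME (name) = string;           /* NAME is another name for string */
NAME_LIST (names) = LIST NAME;  /* an identity for a constructed type */
```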

1.3. Enumerations

Enumerations are used to define types which can only take values from some finite set. They are described as follows:

enumeration:
			enum !? object-identifier = { enumerator-list };
			enum !? object-identifier = base-enumeration + { enumerator-list };

where:

base-enumeration:
			identifier

is the name of a previously defined enumeration type. The latter form is used to express enumeration types which extend a base enumeration. An enumeration type may be qualified by an exclamation mark to indicate that no lists of this type will be constructed.

The enumeration constants themselves are defined as follows:

enumerator:
			identifier
			identifier = enumerator-value

enumerator-list:
			enumerator
			enumerator-list, enumerator

Each enumerator is assigned a value in an ascending sequence, starting at zero. The next value to be assigned can be set using an enumerator-value. This is an expression formed from integers, identifiers representing previous enumerators from the same enumeration, and the question mark character which stands for the previous enumeration value. The normal C arithmetic operations can be applied to build up more complex enumerator-values. All enumerator evaluation is done in the unsigned long type of the host machine. Values containing more than 32 bits are not portable.

Enumerations thus correspond to enumeration types in C, except that they are genuinely distinct types.
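The rules above can be sketched with a small example. All names here are hypothetical. The values shown in comments follow from the ascending-sequence rule as stated above and have not been checked against calculus output. Referring to the base enumeration by its first name is also an assumption.

```
ALGEBRA shapes (1.0):

enum COLOUR (col) = {
    red,               /* 0: the sequence starts at zero */
    green,             /* 1 */
    blue = 8,          /* 8: the enumerator-value resets the sequence */
    navy,              /* 9: the sequence continues from 8 */
    dark = ? + 2,      /* the previous value plus 2 */
    mixed = red + blue /* built from earlier enumerators */
};

/* An extension of COLOUR which adds further enumerators. */
enum WIDE_COLOUR (wcol) = COLOUR + {
    magenta,
    yellow
};

/* The exclamation mark indicates that no lists of FLAG are built. */
enum !FLAG (flag) = {
    off,
    on
};
```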

1.4. Structures

Structures are used to build up composite types from other types in the algebra. They correspond to structures in C. They are described as follows:

structure:
			struct object-identifier = component-group;
			struct object-identifier = base-structure + component-group;

where:

base-structure:
			identifier

is the name of a previously defined structure type. The latter form is used to express (single) inheritance of structures. All components of the base structure also become components of the derived structure.

The structure components themselves are defined as follows:

component-group:
			{ component-list? }

component-list:
			component-declarations;
			component-list component-declarations;

component-declarations:
			type component-declarators

component-declarators:
			component-declarator
			component-declarators, component-declarator

component-declarator:
			identifier component-initialiser?

component-initialiser:
			= string

The optional component initialiser strings are explained below.

Structures are the only algebra construct which prevents the input from being a general graph. Unions may be defined in terms of themselves, but (as in C) pointers must be used to define structures in terms of themselves.
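The following sketch shows a plain structure, a derived structure and a structure which refers to itself through a pointer. The names are hypothetical, and the type constructor PTR is written as given in section 1.6.

```
ALGEBRA shapes (1.0):

number = "int";

struct POINT (pt) = {
    number x, y;       /* two declarators sharing one type */
};

/* POINT3 inherits the components x and y from POINT. */
struct POINT3 (pt3) = POINT + {
    number z;
};

/* A structure cannot contain itself directly, so the
   self-reference goes through a pointer. */
struct NODE (node) = {
    POINT pos;
    PTR NODE next;
};
```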

1.5. Unions

Unions are used to build up types which can hold a variety of information. They differ from C unions in that they are discriminated. They are described as follows:

union:
			union object-identifier = component-group + field-group map-group?;
			union object-identifier = base-union + field-group map-group?;

where:

base-union:
			identifier

is the name of a previously defined union type. The latter form is used to express (single) inheritance of unions. All components, fields and maps of the base union are inherited by the derived union. Note that the derived union can add only new fields and maps; it cannot add further common components.

The component-group gives a set of components which are common to all the different union cases. The cases themselves are described as follows:

field-group:
			{ field-list }

field:
			#? #? field-identifier-list->component-group
			#? #? field-identifier-list->base-field + component-group

base-field:
			identifier

field-list:
			field
			field-list, field

field-identifier:
			identifier

field-identifier-list:
			field-identifier
			field-identifier-list, field-identifier

The optional one or two hashes which may be used to qualify a list of field identifiers are used to indicate aliasing in the disk reading and writing routines. The base-field case is a notational convenience which allows one field in a union to inherit all the components of another field.

Note that a number of field identifiers may be associated with the same set of field components. Any such list containing more than one identifier forms a field identifier set, named after the first field identifier.

In addition a number of maps may be associated with a union. These maps correspond to functions which take the union, plus a number of other map parameter types, and return the map return type. They are described as follows:

map-group:
			:[map-list?]

map:
			extended-type #? identifier(parameter-list?)

map-list:
			map
			map-list map

where:

parameter-list:
			parameter-declarations
			parameter-list; parameter-declarations

parameter-declarations:
			extended-type parameter-declarators

parameter-declarators:
			identifier
			parameter-declarators, identifier

Note that the map parameter and return types are given by:

extended-type:
			type
			quoted-type

In addition to the types derived from the algebra it is possible to use quoted C types in this context.

A map may be qualified by means of a hash. This means that the associated function also takes a destructor function as a parameter.
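Putting the pieces of this section together, a union with common components, several fields and two maps might be sketched as below. Every name is hypothetical, and the layout simply follows the grammar as given above (including the position of the hash between the return type and the map name); it has not been run through calculus.

```
ALGEBRA shapes (1.0):

number = "int";

union SHAPE (shape) = {
    number id;                    /* common to every case */
} + {
    circle -> {
        number radius;
    },
    square, rectangle -> {        /* a field identifier set named square */
        number width, height;
    },
    rounded -> square + {         /* inherits the components of square */
        number corner;
    }
} :[
    number area()
    "void" # reset(number depth; "char *" label)
];

/* A derived union adds new fields only. */
union SOLID (solid) = SHAPE + {
    cube -> {
        number depth;
    }
};
```

Here area returns an algebra type, while reset returns a quoted C type and, being qualified by a hash, corresponds to a function which also takes a destructor function as a parameter.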

1.6. Type constructors

The types derived from the algebra may be described as follows:

type:
			identifier
			PTR type
			LIST type
			STACK type
			VEC type
			VEC_PTR type

The simple types correspond to primitive, identity, enumeration, structure or union names. It is possible for a type to be used before it is defined, but it must be defined at some point.

The derived type constructors correspond to pointers, lists, stacks, vectors and pointers into vectors. They may be used to build up further types from the basic algebra types.
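As a sketch, the constructors can be combined and nested as the grammar allows, and a type name can appear before its definition. The names below are hypothetical.

```
ALGEBRA catalogue (1.0):

string = "char *";

/* ENTRY is used here before it is defined further down. */
INDEX (idx) = LIST ENTRY;

struct ENTRY (ent) = {
    string name;
    PTR ENTRY parent;         /* pointer */
    LIST PTR ENTRY children;  /* list of pointers */
    STACK string history;     /* stack */
    VEC string tags;          /* vector */
    VEC_PTR string cursor;    /* pointer into a vector */
};
```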

1.7. Relations between algebras

As mentioned above, more than one input algebra may be specified to calculus. Each is processed separately, and output is generated for only one. By default this is the last algebra processed; however, a specific algebra can be selected using the command-line option -A name, where name is the name of the algebra to be used for output.

Types may be imported from one algebra to another by means of commands of the form:

import:
			IMPORT identifier;
			IMPORT identifier::identifier;

which fit into the main syntax as an item. The first form imports all the types from the algebra given by identifier into the current algebra. The second imports a single type, given by the second identifier, from the algebra given by the first identifier.

Note that importing a type in this way also imports all the types used in its construction. This includes such things as structure components and union fields and maps. Thus an algebra consisting just of import commands can be used to express subalgebras in a simple fashion.
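A subalgebra of this kind might be sketched as follows. The algebra names shapes and catalogue and the type name SHAPE are hypothetical, and the sketch assumes both algebras have already been specified to calculus as input.

```
ALGEBRA drawing (1.0):

/* Import a single type, together with the types used
   in its construction. */
IMPORT shapes::SHAPE;

/* Import every type from another algebra. */
IMPORT catalogue;
```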