TDF Diagnostic Specification
© , , The TenDRA Project.
© DERA.
First published .
Revision History
DERA | TenDRA 4.1.2 release. |
i. Introduction
The TDF diagnostic information is intended to convey all the information used by current source-level debuggers that would conventionally be part of an object file. Any particular installer will only use those parts of this information which its native object format can represent.
The version of the diagnostics described here is the first version. It has only been tested with TDF produced from C programs. There are known to be certain deficiencies relative to other languages (in particular to FORTRAN). A later version will correct these deficiencies. The changes already envisaged are detailed in §3, and would have minimal (if any) impact on C producers.
The diagnostic system introduces one new type of TDF linkable entity, and currently adds two new units to the bitstream representation of TDF.
Much of the actual annotation of procedure bodies is currently done by reserved TOKENs, which installers recognize specially. These TOKENs are described in §2.
There is a resemblance between the TDF diagnostic information and Unix International's DWARF format. DWARF has similar aims to the TDF diagnostics, and ensuring that complete DWARF information could be generated provided a useful check during the development of the TDF diagnostics. However, the TDF diagnostics are intended to be architecture (and format) neutral. No inference should be made about any link (present or future) between DWARF and TDF diagnostics.
1. Diagnostic SORTs
- 1.1. DIAG_DESCRIPTOR
- 1.2. DIAG_UNIT
- 1.3. DIAG_TAG
- 1.4. DIAG_TAGDEF
- 1.5. DIAG_TYPE_UNIT
- 1.6. DIAG_TYPE
- 1.7. ENUM_VALUES
- 1.8. DIAG_FIELD
- 1.9. DIAG_TQ
- 1.10. FILENAME
- 1.11. SOURCEMARK
As a summary of this section:
- DIAG_TYPEs describe programming language types (e.g. arrays, structs...). DIAG_TQs are qualifiers of DIAG_TYPEs used for attributes like volatile and const.
- FILENAMEs and SOURCEMARKs describe source files and locations within them.
- DIAG_TAGs associate integers with DIAG_TYPEs. They are used in a similar manner to normal TDF TAGs, and are held in a (TDF) linkable unit called a DIAG_TYPE_UNIT.
- DIAG_UNITs hold a collection of DIAG_DESCRIPTORs, used for information outside procedure bodies.
1.1. DIAG_DESCRIPTOR
Number of encoding bits: | 2 |
Is coding extendable: | yes |
DIAG_DESCRIPTORs are used to associate names in the source program with diagnostic items.
1.1.1. diag_desc_id
Encoding number: | 1 |
src_name: TDFSTRING(k, n) whence: SOURCEMARK found_at: EXP POINTER(al) type: DIAG_TYPE -> DIAG_DESCRIPTOR
Generates a descriptor for an identifier (of DIAG_TYPE type), whose source name was src_name from source location whence. The EXP found_at describes how to access the value. Note that the EXP need not be unique (e.g. FORTRAN EQUIVALENCE might be implemented this way).
1.1.2. diag_desc_struct
Encoding number: | 2 |
src_name: TDFSTRING(k, n) whence: SOURCEMARK new_type: DIAG_TYPE -> DIAG_DESCRIPTOR
Generates a descriptor whose source name was src_name. new_type must be either a DIAG_STRUCT, DIAG_UNION or DIAG_ENUM.
This construct is obsolete.
1.1.3. diag_desc_typedef
Encoding number: | 3 |
src_name: TDFSTRING(k, n) whence: SOURCEMARK new_type: DIAG_TYPE -> DIAG_DESCRIPTOR
Generates a descriptor for a type new_type whose source name was src_name. Note that diag_desc_typedef is used for associating a name with a type, rather than for any name given in the initial description of the type (e.g. in C this is used for typedef, not for struct/union/enum tags).
1.2. DIAG_UNIT
Number of encoding bits: | 0 |
Is coding extendable: | no |
Unit identification: | diagdef |
A DIAG_UNIT is a TDF unit containing DIAG_DESCRIPTORs. A DIAG_UNIT is used to contain descriptions of items outside procedure bodies (e.g. global variables, global type definitions).
1.2.1. build_diag_unit
Encoding number: | 0 |
no_labels: TDFINT descriptors: SLIST(DIAG_DESCRIPTOR) -> DIAG_UNIT
Create a DIAG_UNIT containing DIAG_DESCRIPTORs. no_labels is the number of local labels used in descriptors (for conditionals).
1.3. DIAG_TAG
Number of encoding bits: | 1 |
Is coding extendable: | yes |
Linkable entity identification: | diagtag |
DIAG_TAGs are used inter alia to break cyclic diagnostic types. They are (TDF) linkable entities. A DIAG_TAG is made from a number, and used in use_diag_tag to obtain the DIAG_TYPE associated with that number by make_diag_tagdef.
1.3.1. make_diag_tag
Encoding number: | 1 |
num: TDFINT -> DIAG_TAG
Create a DIAG_TAG from num.
1.4. DIAG_TAGDEF
Number of encoding bits: | 1 |
Is coding extendable: | yes |
DIAG_TAGDEFs associate DIAG_TAGs with DIAG_TYPEs.
1.4.1. make_diag_tagdef
Encoding number: | 1 |
tno: TDFINT dtype: DIAG_TYPE -> DIAG_TAGDEF
Associates tag number tno with dtype.
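As an illustration of how a DIAG_TAG can break a cycle, consider a C structure that contains a pointer to itself. The sketch below is a hypothetical Python model only: the class names, the `target` attribute and the `tagdefs` table are assumptions made for illustration, and are not part of the TDF encoding or of any TenDRA interface. The pointer field refers to the structure by tag number, and that number is resolved later through the table of tag definitions.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory model; the names are illustrative, not a TDF encoding.

@dataclass
class UseDiagTag:
    """Models use_diag_tag: refers to a type by its tag number."""
    num: int

@dataclass
class DiagPtr:
    """Models diag_ptr: a pointer to some diagnostic type."""
    target: object

@dataclass
class DiagStruct:
    """Models diag_struct: a source name plus (field_name, field_type) pairs."""
    src_name: str
    fields: list = field(default_factory=list)

# Models the collection of DIAG_TAGDEFs: tag number -> diagnostic type.
tagdefs = {}

def resolve(t):
    """Follow a UseDiagTag to the type recorded by its tag definition."""
    return tagdefs[t.num] if isinstance(t, UseDiagTag) else t

# struct node { struct node *next; };
# The field mentions tag 1 before tag 1 has been given a definition.
node = DiagStruct("node", [("next", DiagPtr(UseDiagTag(1)))])
tagdefs[1] = node  # plays the role of make_diag_tagdef(1, node)

next_type = node.fields[0][1]
pointee = resolve(next_type.target)
```

Because the structure only mentions the number 1, it can be built before the definition of tag 1 exists, and no type ever has to contain itself directly.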
1.5. DIAG_TYPE_UNIT
Number of encoding bits: | 0 |
Is coding extendable: | no |
Unit identification: | diagtype |
A DIAG_TYPE_UNIT is a TDF unit containing DIAG_TAGDEFs.
1.5.1. build_diagtype_unit
Encoding number: | 0 |
no_labels: TDFINT tagdefs: SLIST(DIAG_TAGDEF) -> DIAG_TYPE_UNIT
Create a DIAG_TYPE_UNIT containing DIAG_TAGDEFs. no_labels is the number of local labels used in tagdefs (for conditionals).
1.6. DIAG_TYPE
Sortname: | foreign_sort("diag_type") |
Number of encoding bits: | 4 |
Is coding extendable: | yes |
DIAG_TYPEs are used to provide diagnostic information about data types.
1.6.1. diag_type_apply_token
Encoding number: | 1 |
token_value: TOKEN token_args: BITSTREAM -> DIAG_TYPE
The token is applied to the arguments to give a DIAG_TYPE. If there is a definition for token_value in the CAPSULE then token_args is a BITSTREAM encoding of the SORTs of its parameters, in the order specified.
1.6.2. diag_array
Encoding number: | 2 |
element_type: DIAG_TYPE stride: EXP OFFSET(p,p) lower_bound: EXP INTEGER(v) upper_bound: EXP INTEGER(v) index_type: DIAG_TYPE -> DIAG_TYPE
An array of element_type objects. stride is the OFFSET between elements of the array (i.e. p is described by element_type). The bounds are in general not runtime constants, hence the values are EXPs (not, say, SIGNED_NATs). The VARIETY v is described by index_type. As in TDF there is no multi-dimensional array primitive.
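Since there is no multi-dimensional primitive, a declaration such as C's `int a[3][4]` would presumably be described as an array whose elements are themselves arrays. The following Python sketch models that nesting under several stated assumptions: the `DiagArray` class is hypothetical, bounds are plain integers rather than EXPs, both bounds are taken to be inclusive, and `int` is assumed to occupy 4 bytes.

```python
from dataclasses import dataclass

INT_SIZE = 4  # assumption: a 4-byte int on the target machine

@dataclass
class DiagArray:
    """Hypothetical stand-in for diag_array (index_type omitted for brevity)."""
    element_type: object
    stride: int        # bytes between consecutive elements (an EXP OFFSET in TDF)
    lower_bound: int   # plain integers here; EXPs in TDF
    upper_bound: int   # assumed inclusive

def size_of(t):
    """Total size in bytes of a modelled type."""
    if isinstance(t, DiagArray):
        return t.stride * (t.upper_bound - t.lower_bound + 1)
    return INT_SIZE

# int a[3][4]: three rows, each an array of four ints.
row = DiagArray("int", stride=INT_SIZE, lower_bound=0, upper_bound=3)
matrix = DiagArray(row, stride=size_of(row), lower_bound=0, upper_bound=2)

def element_offset(arr, indices):
    """Byte offset of arr[i][j]... found by walking one array level per index."""
    offset = 0
    t = arr
    for i in indices:
        offset += (i - t.lower_bound) * t.stride
        t = t.element_type
    return offset
```

In this model the stride of the outer array is simply the size of one row, which is how a debugger could locate `a[1][2]` without a dedicated two-dimensional construct.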
1.6.3. diag_bitfield
Encoding number: | 3 |
type: DIAG_TYPE number_of_bits: NAT -> DIAG_TYPE
Describes a bitfield of number_of_bits bits, which when extracted will have DIAG_TYPE type.
1.6.4. diag_enum
Encoding number: | 4 |
base_type: DIAG_TYPE enum_name: TDFSTRING(k, n) values: LIST(ENUM_VALUES) -> DIAG_TYPE
An enumeration to be stored in an object of type base_type. If enum_name is a string containing zero characters this signifies no source tag.
1.6.5. diag_floating_variety
Encoding number: | 5 |
var: FLOATING_VARIETY -> DIAG_TYPE
Creates a DIAG_TYPE to describe a FLOATING_VARIETY var.
1.6.6. diag_loc
Encoding number: | 6 |
object: DIAG_TYPE qualifier: DIAG_TQ -> DIAG_TYPE
Records the existence of an item of DIAG_TYPE object, qualified by qualifier. diag_loc is used for variables (which may of course not actually occupy a memory location).
1.6.7. diag_proc
Encoding number: | 7 |
params: LIST(DIAG_TYPE) optional_args: BOOL result_type: DIAG_TYPE -> DIAG_TYPE
Describes a procedure taking n parameters, whose types are listed in params, and returning a value of type result_type. optional_args is true if and only if the make_proc which this diag_proc describes had vartag present.
1.6.8. diag_ptr
Encoding number: | 8 |
object: DIAG_TYPE qualifier: DIAG_TQ -> DIAG_TYPE
Describes a pointer to an object of DIAG_TYPE object. The DIAG_TQ qualifier qualifies the pointer itself, not the object pointed to.
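The distinction matters when a debugger prints a declaration. The sketch below is a hypothetical Python model, not a TenDRA interface: qualifiers are represented as frozen sets built from functions named after the DIAG_TQ constructs, and `render` is an assumed helper that writes a pointer type in C syntax. A qualifier attached to the pointer appears after the `*`, as in C's `int * const`.

```python
from dataclasses import dataclass

# Hypothetical model of DIAG_TQ: a set of qualifier names.
def diag_tq_null():
    return frozenset()

def add_diag_const(qual):
    return qual | {"const"}

def add_diag_volatile(qual):
    return qual | {"volatile"}

@dataclass(frozen=True)
class DiagPtr:
    """Hypothetical stand-in for diag_ptr."""
    target: object        # the type pointed to
    qualifier: frozenset  # qualifies the pointer, not the target

def render(t):
    """Write a modelled type in C syntax; pointer qualifiers follow the '*'."""
    if isinstance(t, DiagPtr):
        quals = "".join(" " + q for q in sorted(t.qualifier))
        return render(t.target) + " *" + quals
    return str(t)

const_ptr = DiagPtr("int", add_diag_const(diag_tq_null()))
plain_ptr = DiagPtr("int", diag_tq_null())
both = DiagPtr("int", add_diag_volatile(add_diag_const(diag_tq_null())))
nested = DiagPtr(DiagPtr("char", diag_tq_null()), add_diag_const(diag_tq_null()))
```

How the pointed-to object is itself qualified is deliberately left out of this sketch.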
1.6.9. diag_struct
Encoding number: | 9 |
tdf_shape: SHAPE src_name: TDFSTRING(k, n) fields: LIST(DIAG_FIELD) -> DIAG_TYPE
Describes a structure. If src_name is a string containing zero characters this signifies no source tag for the whole structure. tdf_shape allows the total size to be computed.
1.6.10. diag_type_null
Encoding number: | 10 |
-> DIAG_TYPE
A null DIAG_TYPE.
1.6.11. diag_union
Encoding number: | 11 |
tdf_shape: SHAPE src_name: TDFSTRING(k, n) fields: LIST(DIAG_FIELD) -> DIAG_TYPE
Describes a union. If src_name is a string containing zero characters this signifies no source tag for the whole union. tdf_shape allows the total size to be computed.
1.6.12. diag_variety
Encoding number: | 12 |
var: VARIETY -> DIAG_TYPE
Creates a DIAG_TYPE to describe an integer VARIETY var.
1.6.13. use_diag_tag
Encoding number: | 13 |
dtag: DIAG_TAG -> DIAG_TYPE
Obtains the DIAG_TYPE associated with DIAG_TAG dtag.
1.7. ENUM_VALUES
Number of encoding bits: | 0 |
Is coding extendable: | no |
1.7.1. make_enum_values_list
Encoding number: | 0 |
value: EXP sh src_name: TDFSTRING(k, n) -> ENUM_VALUES
ENUM_VALUES describe elements of an enumerated type. src_name is the source language name. value evaluates to a value of SHAPE sh. Note that all members of a LIST(ENUM_VALUES) must have the same sh.
1.8. DIAG_FIELD
Number of encoding bits: | 0 |
Is coding extendable: | no |
1.8.1. make_diag_field
Encoding number: | 0 |
field_name: TDFSTRING(k, n) found_at: EXP OFFSET(ALIGNMENT whole, ALIGNMENT this_field) field_type: DIAG_TYPE -> DIAG_FIELD
DIAG_FIELDs describe one field of a structure or union. field_name is the source language name. found_at is the OFFSET between whole (the enclosing structure or union), and this field (this_field). field_type is the DIAG_TYPE of the field.
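To make the role of found_at concrete, the following sketch uses Python's standard `ctypes` module to lay out a structure and a union according to the host C ABI and then lists each field with its byte offset from the start of the enclosing object. The `Point` and `Value` types and the `diag_fields` helper are hypothetical; the offsets are those chosen by the host platform, whereas in TDF found_at is an architecture-neutral OFFSET expression.

```python
import ctypes

class Point(ctypes.Structure):
    # Hypothetical example: struct { char tag; int32_t x; int32_t y; }
    _fields_ = [
        ("tag", ctypes.c_char),
        ("x", ctypes.c_int32),
        ("y", ctypes.c_int32),
    ]

class Value(ctypes.Union):
    # Hypothetical example: union { int32_t i; float f; }
    _fields_ = [
        ("i", ctypes.c_int32),
        ("f", ctypes.c_float),
    ]

def diag_fields(aggregate):
    """Return (field_name, byte offset from the start of the aggregate) pairs."""
    return [(name, getattr(aggregate, name).offset)
            for name, _ctype in aggregate._fields_]

point_fields = diag_fields(Point)
value_fields = diag_fields(Value)
```

Every member of the union starts at offset zero, while the members of the structure are separated by whatever padding the ABI requires.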
1.9. DIAG_TQ
Number of encoding bits: | 2 |
Is coding extendable: | yes |
DIAG_TQs are type qualifiers, used to qualify DIAG_TYPEs. A DIAG_TQ is constructed from diag_tq_null and the various add_diag_XXX operations.
1.9.1. add_diag_const
Encoding number: | 1 |
qual: DIAG_TQ -> DIAG_TQ
Marks a DIAG_TQ type qualifier as being const in the ANSI C sense.
1.9.2. add_diag_volatile
Encoding number: | 2 |
qual: DIAG_TQ -> DIAG_TQ
Marks a DIAG_TQ type qualifier as being volatile in the ANSI C sense.
1.9.3. diag_tq_null
Encoding number: | 3 |
-> DIAG_TQ
Create a null DIAG_TQ type qualifier.
1.10. FILENAME
Sortname: | foreign_sort("~diag_file") |
Number of encoding bits: | 2 |
Is coding extendable: | yes |
FILENAMEs record details of source files used in producing a CAPSULE. They can be tokenised for abbreviation.
1.10.1. filename_apply_token
Encoding number: | 1 |
token_value: TOKEN token_args: BITSTREAM -> FILENAME
The token is applied to the arguments to give a FILENAME. If there is a definition for token_value in the CAPSULE then token_args is a BITSTREAM encoding of the SORTs of its parameters, in the order specified.
1.10.2. make_filename
Encoding number: | 2 |
date: NAT machine: TDFSTRING(k1, n1) file: TDFSTRING(k2, n2) -> FILENAME
Create a FILENAME for file file, dated date (a UNIX timestamp; seconds since 1 Jan 1970) on machine machine.
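A consumer of this information, such as a debugger, might present the date in readable form. The sketch below is a hypothetical Python model: `make_filename` here merely builds a dictionary, `describe` is an assumed helper, and the machine and file names are invented. It interprets date as seconds since the start of 1970 in UTC, the usual reading of a UNIX timestamp.

```python
from datetime import datetime, timezone

def make_filename(date, machine, file):
    """Hypothetical model of make_filename: just records its three arguments."""
    return {"date": date, "machine": machine, "file": file}

def describe(fn):
    """Format a modelled FILENAME, decoding the date as a UTC UNIX timestamp."""
    stamp = datetime.fromtimestamp(fn["date"], tz=timezone.utc)
    return f'{fn["machine"]}:{fn["file"]} ({stamp:%Y-%m-%d %H:%M:%S} UTC)'

epoch_file = make_filename(0, "host", "main.c")
next_day_file = make_filename(86400, "host", "util.c")
```

A debugger could compare the recorded date with the modification time of the file it finds on disk to warn that the source has changed since compilation.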
1.11. SOURCEMARK
Number of encoding bits: | 1 |
Is coding extendable: | yes |
A SOURCEMARK records a location in the source program. Present SOURCEMARKs assume that a location can be described by one or two numbers within a FILENAME.
1.11.1. make_sourcemark
Encoding number: | 1 |
file: FILENAME line_no: NAT char_offset: NAT -> SOURCEMARK
Create a SOURCEMARK referencing the char_offset'th character on line line_no in file file.
char_offset is counted from 1; a value of 0 means that no character offset is available.
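The special meaning of zero affects any code that consumes a SOURCEMARK. The following Python sketch is a hypothetical model in which a mark is a plain tuple; it also assumes that line_no is counted from 1, which the text above does not state. `format_sourcemark` and `char_at` are illustrative helper names.

```python
def make_sourcemark(file, line_no, char_offset=0):
    """Hypothetical model of make_sourcemark: a plain tuple."""
    return (file, line_no, char_offset)

def format_sourcemark(mark):
    """Render file:line, adding :column only when an offset is available."""
    file, line_no, char_offset = mark
    if char_offset == 0:  # 0 means no character offset is available
        return f"{file}:{line_no}"
    return f"{file}:{line_no}:{char_offset}"

def char_at(lines, mark):
    """Return the marked character, or None when no offset was recorded.

    Assumes line_no is 1-based; char_offset is 1-based as specified.
    """
    _file, line_no, char_offset = mark
    if char_offset == 0:
        return None
    return lines[line_no - 1][char_offset - 1]

source = ["int x;", "x = 1;"]
line_only = make_sourcemark("main.c", 2)
exact = make_sourcemark("main.c", 2, 3)
```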
2. Reserved diagnostic TOKENs
Reserved TOKENs were used for diagnostic extensions to EXPs, to avoid adding new constructs to the contents of an existing UNIT. All other parts of the diagnostic system occur in other UNITs.
2.1. ~exp_to_source
body: EXP sh from: SOURCEMARK to: SOURCEMARK -> EXP sh
Records that the EXP body arose from translating the part of the program between SOURCEMARK from and SOURCEMARK to (inclusive).
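One use of these ranges is mapping a source position back to the expression that covers it. The sketch below is a hypothetical Python model with several assumptions: positions are (line_no, char_offset) pairs within a single file, character offsets are available, and annotations are held as (from, to, label) tuples. `covers` and `innermost` are illustrative names, and choosing the covering range that starts latest is just one plausible way of picking the most specific annotation when ranges nest.

```python
def covers(frm, to, pos):
    """True if pos lies between frm and to, with both ends included."""
    return frm <= pos <= to

def innermost(annotations, pos):
    """Return the covering annotation that starts latest, or None if none covers pos."""
    hits = [a for a in annotations if covers(a[0], a[1], pos)]
    return max(hits, key=lambda a: a[0], default=None)

# Hypothetical annotations: a function body enclosing a loop.
annotations = [
    ((1, 1), (20, 1), "function body"),
    ((5, 1), (7, 10), "loop"),
]
```

Tuples compare lexicographically in Python, so a position on an earlier line always sorts before one on a later line, and the character offset only decides between positions on the same line.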
2.2. ~diag_id_source
body: EXP sh name: TDFSTRING(k, n) access: EXP POINTER(al) type: DIAG_TYPE -> EXP sh
Within the EXP body a variable named name of DIAG_TYPE type can be accessed via the EXP access.
2.3. ~diag_type_scope
body: EXP sh name: TDFSTRING(k, n) type: DIAG_TYPE -> EXP sh
Within the EXP body a source language type named name of DIAG_TYPE type is valid.
2.4. ~diag_tag_scope
body: EXP sh name: TDFSTRING(k, n) type: DIAG_TYPE -> EXP sh
This TOKEN is obsolete.
3. Proposed changes
It is thought likely that the new TDF entities described above will eventually be incorporated into the main TDF specification.
In several places below the absence of "standardised methods" is noted. These are cases where TDF can express some operation in several ways, and the installer cannot be expected to spot all of them and generate new diagnostic info.
3.1. Language features currently missing
The following sections list some of the language features known not to be supported by the current specification. The list is not intended to be exhaustive.
3.1.1. Data types
- Complex numbers.
- Fortran alternate RETURNs.
3.1.2. C++ requirements
- The reference type is not yet present.
- The accessibility attributes (public, private and protected) are not yet present.
- No member function information, and no specification of how to deal with name mangling. Pointer to member may need special recognition.
- No operations for describing classes and inheritance.
3.1.3. FORTRAN requirements
- Main PROGRAM attribute missing.
- Fortran optional parameters may need special treatment.
- Use of COMMON is not explicit in TDF.
- Fortran77 etc. has a string type, which could be implemented in several ways (other languages need this, but they may differ on the same machine).
3.1.4. Other requirements
- No standardised method for describing static link info. TDF can express such programs, but the link could be stored in several ways.
- No standardised method for describing arrays with non-constant bounds, and/or where the bounds are present in the running image. (The upper_bound and lower_bound EXPs are sufficiently powerful, but need some rules.)
- No way to name a lexical block.
- Formal parameters with default values cannot have the default made visible.
- Variables which are constant, and have been inlined everywhere, may be a problem.
- No standardised method of describing the discriminant part of a discriminated union (in TDF probably represented by a struct containing the discriminant and the union).
- The distinction between ANSI prototyped and non-prototyped functions (this is a real problem for functions taking float).
- No standardised method for PASCAL sets.
- No standardised method for subrange types.
- PASCAL and Modula have a WITH construct to change semantics of record field lookup. No standardised method for documenting this.
3.2. Areas for further abstraction
3.2.1. Compilation related
How a running program has been created from several components is of interest when debugging. The present system cannot record all details of how a program has been created. In particular there is no indication of the source language of any piece of TDF, nor of the full name of any of the source files.
3.2.2. C related
At present there is no defined link between the fundamental C types and the VARIETYs etc. used for them. Present installers for 32 bit machines cannot distinguish between int and long when generating diagnostics, other than by means of the standard token names which form part of the C producer language interface.
3.2.3. Naming of types
At present some DIAG_TYPEs have names and some do not. I suspect we should make a separate is_named operation and remove the other names.
3.3. Postscript - ANDF-DE
As this section makes clear, the TDF Diagnostic Specification was only ever really intended to deal with C. As of 1997, a more extensive diagnostic extension to TDF, ANDF-DE, is under development by DDC-I. This has been designed with the requirements of C, C++ and Ada in mind. It is intended that eventually ANDF-DE will be incorporated into the TDF specification, and the diagnostic format described here will be deprecated.