5. Configuration for types
- 5.1. The Portability Table
- 5.2. Specifying integer literal types
- 5.3. Extended integral types
- 5.4. Bitfield types
- 5.5. Type declarations
- 5.6. Type compatibility
- 5.7. Incomplete types
- 5.8. Built-in types
- 5.9. Sign of char
5.1. The Portability Table
The portability table is used by the checker to describe the minimum assumptions about the representation of the integral types. It contains information on the minimum integer sizes and the minimum range of values that can be represented by each integer type, the sign of plain char, and whether signed types can be assumed to be symmetric (for example, [-127,127]) or maximum (for example, [-128,127]). The format for this file is documented by tdfc2portability.
The minimum integer ranges are deduced from the minimum integer sizes as follows. Suppose b is the minimum number of bits that will be used to represent a certain integral type, then:
- For unsigned integer types the minimum range is $[0, 2^b - 1]$;
- For signed integer types, if signed_range is maximum the minimum range is $[-2^{b-1}, 2^{b-1} - 1]$. Otherwise, if signed_range is symmetric the minimum range is $[-(2^{b-1} - 1), 2^{b-1} - 1]$;
- For the type char which is not specified as signed or unsigned, if char_type is signed then char is treated as signed, if char_type is unsigned then char is treated as unsigned, and if char_type is either, the minimum range of char is the intersection of the minimum ranges of signed char and unsigned char.
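The derivation above can be sketched in C. This is an illustration only, not part of tdfc2: the names `range`, `signed_range_mode`, `unsigned_min_range`, `signed_min_range` and `either_char_min_range` are hypothetical, and b is assumed to lie between 1 and 62 so that every bound fits in a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum signed_range_mode { RANGE_SYMMETRIC, RANGE_MAXIMUM };

struct range {
    int64_t lo;
    int64_t hi;
};

/* Minimum range of an unsigned type with at least b bits: [0, 2^b - 1]. */
static struct range unsigned_min_range(unsigned b)
{
    assert(b >= 1 && b <= 62);
    struct range r = { 0, ((int64_t)1 << b) - 1 };
    return r;
}

/* Minimum range of a signed type with at least b bits. */
static struct range signed_min_range(unsigned b, enum signed_range_mode mode)
{
    assert(b >= 1 && b <= 62);
    int64_t hi = ((int64_t)1 << (b - 1)) - 1;   /* 2^(b-1) - 1 */
    struct range r;
    r.hi = hi;
    /* maximum: -2^(b-1); symmetric: -(2^(b-1) - 1) */
    r.lo = (mode == RANGE_MAXIMUM) ? -hi - 1 : -hi;
    return r;
}

/* Plain char when char_type is "either": intersect the two ranges. */
static struct range either_char_min_range(unsigned b,
                                          enum signed_range_mode mode)
{
    struct range s = signed_min_range(b, mode);
    struct range u = unsigned_min_range(b);
    struct range r;
    r.lo = (s.lo > u.lo) ? s.lo : u.lo;
    r.hi = (s.hi < u.hi) ? s.hi : u.hi;
    return r;
}
```

With b = 8 this gives [0, 255] for unsigned types, [-128, 127] or [-127, 127] for signed types, and [0, 127] for a plain char whose sign is either.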
5.2. Specifying integer literal types
By default, tdfc2 assumes that all integer ranges conform to the minimum ranges prescribed by the ISO C standard, i.e. char contains at least 8 bits, short and int contain at least 16 bits, and long contains at least 32 bits. If the -Y32bit flag is passed to the checker, it assumes that integers conform to the minimum ranges commonly found on most 32-bit machines, i.e. int contains at least 32 bits and int is strictly larger than short, so that the integral promotion of unsigned short is int under the ISO C standard integer promotion rules.
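The link between the relative sizes of short and int and the promotion of unsigned short can be made concrete with a small sketch. It models only the ISO C rule that unsigned short promotes to int when int can represent every unsigned short value, and to unsigned int otherwise. The function name and its bit-width parameters are hypothetical, and both types are assumed to have no padding bits.

```c
#include <assert.h>

enum promoted { PROMOTES_TO_INT, PROMOTES_TO_UNSIGNED_INT };

/* An int of int_bits bits (one of which is the sign bit) can hold every
   value of an unsigned short of short_bits bits only if int_bits is
   strictly greater than short_bits. */
static enum promoted promote_unsigned_short(unsigned short_bits,
                                            unsigned int_bits)
{
    assert(short_bits >= 16 && int_bits >= short_bits);
    return (int_bits > short_bits) ? PROMOTES_TO_INT
                                   : PROMOTES_TO_UNSIGNED_INT;
}
```

On a target where short and int are both 16 bits wide the promotion is unsigned int; under the -Y32bit assumption that int is strictly larger than short it is int.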
The integer literal pragmas are used to define the method of computing the type of an integer literal. Integer literals cannot be used in a program unless the class to which they belong has been described using an integer literal pragma. Each built-in checking mode includes some integer literal pragmas describing the semantics appropriate for that mode. If these built-in modes are inappropriate, then the user must describe the semantics using the pragma below:
#pragma integer literal literal_class lit_class_type_list
The literal_class identifies the type of literal integer involved. The possibilities are:
- decimal
- octal
- hexadecimal
Each of these types can optionally be followed by unsigned and/or long to specify an unsigned and/or long type respectively.
The values of the integer literals of any particular class are divided into contiguous sub-ranges specified by the lit_class_type_list, which takes the form below:
lit_class_type_list :
    * int_type_spec
    integer_constant int_type_spec | lit_class_type_list
int_type_spec :
    : type_name
    * warning? : identifier
    ** :
The first integer constant, i1 say, identifies the range [0, i1], the second, i2 say, identifies the range [i1 + 1, i2]. The symbol * specifies the unlimited range upwards from the last integer constant. Each integer constant must be strictly greater than its predecessor.
Associated with each sub-range is an int_type_spec which is either a type, a procedure token identifier with an optional warning (see G.9) or a failure. For each sub-range:
- If the int_type_spec is a type name, then it must be an integral type and specifies the type associated with literals in that sub-range.
- If the int_type_spec is an identifier, then the type of integer is computed by a procedure token of that name which takes the integer value as a parameter and delivers its type. The procedure token must have been declared previously as
#pragma token PROC ( VARIETY ) VARIETY
Since the type of the integer is computed by a procedure token which may be implemented differently on different targets, there is the option of producing a warning whenever the token is applied.
- If the int_type_spec is **, then any integer literal lying in the associated sub-range will cause the checker to raise an error.
For example:
#pragma integer literal decimal 0x7fff : int | 0x7fffffff : long | * : unsigned long
divides unsuffixed decimal literals into three ranges: literals in the range [0, 0x7fff] are of type int, literals in the range [0x8000, 0x7fffffff] are of type long, and the remainder are of type unsigned long.
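The sub-range lookup in this example can be modelled in a few lines of C. The sketch is not how tdfc2 is implemented: the table layout and the names `lit_type`, `lit_range`, `decimal_ranges` and `decimal_literal_type` are assumptions made for illustration. It assumes the list ends with a * entry, as in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum lit_type { LIT_INT, LIT_LONG, LIT_UNSIGNED_LONG };

/* One entry per sub-range: its inclusive upper bound and its type.
   has_bound == 0 stands for the '*' (unlimited) entry. */
struct lit_range {
    int has_bound;
    unsigned long long upper;
    enum lit_type type;
};

/* Mirrors: decimal 0x7fff : int | 0x7fffffff : long | * : unsigned long */
static const struct lit_range decimal_ranges[] = {
    { 1, 0x7fffULL,     LIT_INT },
    { 1, 0x7fffffffULL, LIT_LONG },
    { 0, 0,             LIT_UNSIGNED_LONG },
};

/* Return the type of the first sub-range that contains value. */
static enum lit_type decimal_literal_type(unsigned long long value)
{
    size_t n = sizeof decimal_ranges / sizeof decimal_ranges[0];

    /* The sketch relies on the table ending with a '*' entry. */
    assert(!decimal_ranges[n - 1].has_bound);

    for (size_t i = 0; i + 1 < n; i++) {
        if (value <= decimal_ranges[i].upper)
            return decimal_ranges[i].type;
    }
    return decimal_ranges[n - 1].type;
}
```

Because each bound is inclusive and the entries are tried in order, 0x7fff is classified as int while 0x8000 falls into the second sub-range and is classified as long.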
There are four pre-defined procedure tokens supplied with the compiler which are used in the startup files to provide the default specification for integer literals:
- ~lit_int is the external identification of a token that returns the integer type according to the rules of ISO C for an unsuffixed decimal;
- ~lit_hex is the external identification of a token that returns the integer type according to the rules of ISO C for an unsuffixed hexadecimal;
- ~lit_unsigned is the external identification of a token that returns the integer type according to the rules of ISO C for integers suffixed by U only;
- ~lit_long is the external identification of a token that returns the integer type according to the rules of ISO C for integers suffixed by L only.
5.3. Extended integral types
The long long integral types are not part of ISO C or C++ by default; however, support for them can be enabled using the directive:
#pragma TenDRA longlong type allow
This support includes allowing long long in type specifiers and allowing LL and ll as integer literal suffixes.
There is a further directive given by the two cases:
#pragma TenDRA set longlong type : long long
#pragma TenDRA set longlong type : long
which can be used to control the implementation of the long long types. Either they can be mapped to the default representation, which is guaranteed to contain at least 64 bits, or they can be mapped to the corresponding long types.
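As an illustration, a source file that relies on long long might begin as follows. Only the two directives come from the text above; the variable name and its value are hypothetical.

```c
/* Accept long long in type specifiers and LL/ll literal suffixes. */
#pragma TenDRA longlong type allow
/* Map long long onto the default representation (at least 64 bits). */
#pragma TenDRA set longlong type : long long

long long big_counter = 5000000000LL; /* does not fit in 32 bits */
```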
Because these long long types are not an intrinsic part of C++ the implementation does not integrate them into the language as fully as is possible. This is to prevent the presence or otherwise of long long types affecting the semantics of code which does not use them. For example, it would be possible to extend the rules for the types of integer literals, integer promotion types and arithmetic types to say that if the given value does not fit into the standard integral types then the extended types are tried. This has not been done, although these rules could be implemented by changing the definitions of the standard tokens used to determine these types. By default, only the rules for arithmetic types involving a long long operand and for LL integer literals mention long long types.
5.4. Bitfield types
The C++ rules on bitfield types differ slightly from the C rules. Firstly any integral or enumeration type is allowed in a bitfield, and secondly the bitfield width may exceed the underlying type size (the extra bits being treated as padding). These properties can be controlled using the directives:
#pragma TenDRA extra bitfield int type allow
#pragma TenDRA bitfield overflow allow
respectively.
The ISO C standard only allows signed int, unsigned int and their equivalent types as type specifiers in bitfields. Using the default checking profile, tdfc2 raises errors for other integral types used as type specifiers in bitfields. This behaviour may be modified using the pragma:
#pragma TenDRA extra int bitfield type permit
where permit is one of allow (no errors raised), warning (allow non-int bitfields through with a warning) or disallow (raise errors for non-int bitfields).
If non-int bitfields are allowed, the bitfield is treated as if it had been declared with an int type of the same signedness as the given type. The use of the type char as a bitfield type still generally causes an error, since whether a plain char is treated as signed or unsigned is implementation-dependent. The pragma:
#pragma TenDRA character set-sign
where set-sign is signed, unsigned or either, can be used to specify the signedness of a plain char bitfield. If set-sign is signed or unsigned, the bitfield is treated as though it were declared signed char or unsigned char respectively. If set-sign is either, the sign of the bitfield is target-dependent and the use of a plain char bitfield causes an error.
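A short fragment shows how these two pragmas might be combined. The directives are those given above; the structure and its member names are hypothetical.

```c
/* Let non-int bitfield types through with a warning, not an error. */
#pragma TenDRA extra int bitfield type warning
/* Treat plain char bitfields as unsigned char. */
#pragma TenDRA character unsigned

struct flags {
    unsigned short mode : 3; /* treated as if declared unsigned int */
    char tag : 4;            /* treated as if declared unsigned char */
};
```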
5.5. Type declarations
C does not allow multiple definitions of a typedef name, whereas C++ allows multiple consistent definitions. This behaviour can be controlled using the directive:
#pragma TenDRA extra type definition allow
In accordance with the ISO C standard, in default mode tdfc2 does not allow a type to be defined more than once using a typedef. This can be changed using the pragma:
#pragma TenDRA extra type definition permit
where permit is one of allow (silently accepts redefinitions, provided they are consistent), warning or disallow.
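For instance, the following fragment repeats a typedef consistently. The type name `word_t` is hypothetical.

```c
#pragma TenDRA extra type definition allow

typedef unsigned long word_t;
typedef unsigned long word_t; /* consistent redefinition: accepted silently */
```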
5.6. Type compatibility
The directive:
#pragma TenDRA incompatible type qualifier allow
allows objects to be redeclared with different cv-qualifiers (normally such redeclarations would be incompatible). The composite type is qualified using the join of the cv-qualifiers in the various redeclarations.
The directive:
#pragma TenDRA compatible type : type-id == type-id : allow
asserts that the given two types are compatible. Currently the only implemented version is char * == void *, which enables char * to be used as a generic pointer as it was in older dialects of C.
5.7. Incomplete types
Some dialects of C allow incomplete arrays as member types. These are generally used as a place-holder at the end of a structure to allow for the allocation of an arbitrarily sized array. Support for this feature can be enabled using the directive:
#pragma TenDRA incomplete type as object type allow
The ISO C standard (Section 6.1.2.5) states that an incomplete type, e.g. an undefined structure or union type, is not an object type and that array elements must be of object type. By default the checker raises errors when incomplete types are used to specify array element types. The pragma:
#pragma TenDRA incomplete type as object type permit
can be used to alter the treatment of array declarations with incomplete element types. permit is one of allow, disallow or warning as usual.
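The place-holder idiom mentioned at the start of this section looks roughly like this in practice. The structure and function names are hypothetical. The TenDRA directive is shown only in a comment so that the sketch also builds with other C99 compilers, where a trailing incomplete array member is standard.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* With tdfc2 the directive
       #pragma TenDRA incomplete type as object type allow
   would precede this declaration. */
struct message {
    size_t length;
    char text[];   /* incomplete array used as a trailing place-holder */
};

/* Allocate a message with room for length bytes copied from src.
   Returns NULL if the allocation fails; the caller frees the result. */
static struct message *message_new(const char *src, size_t length)
{
    assert(src != NULL);
    struct message *m = malloc(sizeof *m + length);
    if (m != NULL) {
        m->length = length;
        memcpy(m->text, src, length);
    }
    return m;
}
```

The structure itself has a fixed size; the storage for the array is supplied at allocation time by asking for sizeof *m plus the number of bytes needed.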
5.8. Built-in types
The definitions of implementation dependent integral types which arise naturally within the language - the type of the difference of two pointers, ptrdiff_t, and the type of the sizeof operator, size_t - given in the <stddef.h> header can be overridden using the directives:
#pragma TenDRA set ptrdiff_t : type-id
#pragma TenDRA set size_t : type-id
These directives are useful when targeting a specific machine on which the definitions of these types are known; while they may not affect the code generated, they can cut down on spurious conversion warnings. Note that although these types are built into the producer they are not visible to the user unless an appropriate header is included (with the exception of the keyword wchar_t in ISO C++); however, the directives:
#pragma TenDRA++ type identifier for type-name
can be used to make these types visible. They are equivalent to a typedef declaration of identifier as the given built-in type, ptrdiff_t, size_t or wchar_t.
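As an example, on a hypothetical target where size_t is known to be unsigned long and ptrdiff_t is known to be long, the overrides could be written as below. The choice of types is an assumption about that target, not a default.

```c
/* Assumed target: size_t is unsigned long, ptrdiff_t is long. */
#pragma TenDRA set size_t : unsigned long
#pragma TenDRA set ptrdiff_t : long
```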
5.9. Sign of char
Whether plain char is signed or unsigned is implementation dependent. By default the implementation is determined by the definition of the ~char token; however, this can be overridden in the producer either by means of the portability table or by the directive:
#pragma TenDRA character character-sign
where character-sign can be signed, unsigned or either (the default). Again this directive is useful primarily when targeting a specific machine on which the signedness of char is known.