3. Scalar types
3.1. Arithmetic types
The representations of the basic arithmetic types are target dependent, so, for example, an int may contain 16, 32, 64 or some other number of bits. Thus it is necessary to introduce a token to stand for each of the built-in arithmetic types (including the long long types). Each integral type is represented by a VARIETY token as follows:
Type | Token | Bitwise OR of | Encoding |
---|---|---|---|
char | ~char | 0 | 0 |
signed char | ~signed_char | 0, 4 | 4 |
unsigned char | ~unsigned_char | 0, 8 | 8 |
signed short | ~signed_short | 1, 4 | 5 |
unsigned short | ~unsigned_short | 1, 8 | 9 |
signed int | ~signed_int | 2, 4 | 6 |
unsigned int | ~unsigned_int | 2, 8 | 10 |
signed long | ~signed_long | 3, 4 | 7 |
unsigned long | ~unsigned_long | 3, 8 | 11 |
signed long long | ~signed_longlong | 3, 4, 16 | 23 |
unsigned long long | ~unsigned_longlong | 3, 8, 16 | 27 |
Similarly, each floating point type is represented by a FLOATING_VARIETY token:
Type | Token |
---|---|
float | ~float |
double | ~double |
long double | ~long_double |
Each integral type also has an encoding as a SIGNED_NAT, as shown above. This number is a bit pattern built up by a bitwise OR of the following values:
Type | Encoding |
---|---|
char | 0 |
short | 1 |
int | 2 |
long | 3 |
signed | 4 |
unsigned | 8 |
long long | 16 |
Any target dependent integral type can be represented by a SIGNED_NAT token using this encoding. This representation, rather than one based on VARIETYs, is used for ease of manipulation. The token:
~convert : ( SIGNED_NAT ) -> VARIETY
gives the mapping from the integral encoding to the representing variety. For example, it will map 6 to ~signed_int.
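The encoding scheme can be illustrated with a small C sketch. This is only an illustration of the bit pattern described above, not the producer's own code: the names encode_type and variety_token_name are hypothetical, and the real ~convert token yields a VARIETY rather than a string.

```c
#include <assert.h>
#include <stddef.h>

/* Component values taken from the encoding table above. */
enum {
    ENC_CHAR = 0,
    ENC_SHORT = 1,
    ENC_INT = 2,
    ENC_LONG = 3,
    ENC_SIGNED = 4,
    ENC_UNSIGNED = 8,
    ENC_LONGLONG = 16
};

/* Hypothetical helper: build an encoding by OR-ing a base type (0..3),
   an optional sign (0, ENC_SIGNED or ENC_UNSIGNED) and, for the
   long long types, the ENC_LONGLONG flag. */
static int encode_type(int base, int sign, int is_longlong)
{
    assert(base >= ENC_CHAR && base <= ENC_LONG);
    assert(sign == 0 || sign == ENC_SIGNED || sign == ENC_UNSIGNED);
    return base | sign | (is_longlong ? ENC_LONGLONG : 0);
}

/* Hypothetical stand-in for ~convert: map an encoding to the name of
   the VARIETY token listed in the first table, or NULL if the value
   is not one of the listed built-in encodings. */
static const char *variety_token_name(int encoding)
{
    switch (encoding) {
    case 0:  return "~char";
    case 4:  return "~signed_char";
    case 8:  return "~unsigned_char";
    case 5:  return "~signed_short";
    case 9:  return "~unsigned_short";
    case 6:  return "~signed_int";
    case 10: return "~unsigned_int";
    case 7:  return "~signed_long";
    case 11: return "~unsigned_long";
    case 23: return "~signed_longlong";
    case 27: return "~unsigned_longlong";
    default: return NULL;
    }
}
```

With these definitions, encode_type ( ENC_INT, ENC_SIGNED, 0 ) gives 6, and variety_token_name ( 6 ) gives the name ~signed_int, matching the example above.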
The token:
~promote : ( SIGNED_NAT ) -> SIGNED_NAT
describes how to form the promotion of an integral type according to the ISO C/C++ value preserving rules, and is used by the producer to represent target dependent promotion types. For example, the promotion of unsigned short may be int or unsigned int depending on the representation of these types; that is to say, ~promote ( 9 ) will be 6 on some machines and 10 on others. Although ~promote is used by default, a program may specify another token with the same sort signature to be used in its place by means of the directive:
#pragma TenDRA compute promote identifier
For example, a standard token ~sign_promote is defined which gives the older C sign preserving promotion rules. In addition, the promotion of an individual type can be specified using:
#pragma TenDRA promoted type-id : promoted-type-id
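As a rough illustration of the value preserving rule that ~promote stands for, the following C sketch computes the promoted encoding for a given target. The struct promo_target description and the function promote_encoding are hypothetical names introduced here, and the sketch assumes integer types without padding bits; the actual token definition is supplied per target and may differ.

```c
#include <assert.h>

/* Hypothetical description of a target (not a TenDRA structure).
   Widths are in bits; padding bits are assumed to be absent. */
struct promo_target {
    int char_bits;
    int short_bits;
    int int_bits;
    int char_is_signed;   /* non-zero if plain char is signed */
};

/* Sketch of the value preserving promotion on encodings:
   6 is signed int, 10 is unsigned int. */
static int promote_encoding(int enc, const struct promo_target *t)
{
    int bits;
    int is_unsigned;

    assert(t->char_bits <= t->short_bits && t->short_bits <= t->int_bits);

    if (enc & 16) return enc;              /* long long types are unchanged */
    switch (enc & 3) {
    case 0:  bits = t->char_bits;  break;  /* char types */
    case 1:  bits = t->short_bits; break;  /* short types */
    default: return enc;                   /* int and long are unchanged */
    }

    if (enc & 8) {
        is_unsigned = 1;
    } else if (enc & 4) {
        is_unsigned = 0;
    } else if ((enc & 3) == 0) {
        is_unsigned = !t->char_is_signed;  /* plain char */
    } else {
        is_unsigned = 0;                   /* plain short is signed */
    }

    /* A signed int of N bits has N-1 value bits, so it can hold every
       value of an unsigned type only if that type is strictly narrower. */
    if (is_unsigned && bits >= t->int_bits) return 10;
    return 6;
}
```

On a hypothetical target where short and int are both 16 bits wide, promote_encoding ( 9, ... ) gives 10; where int is 32 bits wide it gives 6, which is the behaviour described for ~promote ( 9 ) above.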
The token:
~arith_type : ( SIGNED_NAT, SIGNED_NAT ) -> SIGNED_NAT
similarly describes how to form the usual arithmetic result type from two promoted integral operand types. For example, the arithmetic type of long and unsigned int may be long or unsigned long depending on the representation of these types; that is to say, ~arith_type ( 7, 10 ) will be 7 on some machines and 11 on others.
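The target dependence of ~arith_type can be sketched in C as follows. The sketch covers only the four promoted types int, unsigned int, long and unsigned long, and follows the usual arithmetic conversions for those types; struct arith_target and arith_encoding are hypothetical names, and padding bits are assumed to be absent.

```c
#include <assert.h>

/* Encodings of the promoted types handled by this sketch. */
enum { T_INT = 6, T_LONG = 7, T_UINT = 10, T_ULONG = 11 };

/* Hypothetical description of a target: widths in bits, no padding. */
struct arith_target {
    int int_bits;
    int long_bits;
};

static int is_handled(int enc)
{
    return enc == T_INT || enc == T_LONG || enc == T_UINT || enc == T_ULONG;
}

/* Sketch of the usual arithmetic conversions on encodings. */
static int arith_encoding(int a, int b, const struct arith_target *t)
{
    assert(is_handled(a) && is_handled(b));
    assert(t->long_bits >= t->int_bits);

    if (a == T_ULONG || b == T_ULONG) return T_ULONG;

    if ((a == T_LONG && b == T_UINT) || (a == T_UINT && b == T_LONG)) {
        /* long is the result only if it can hold every unsigned int value,
           which requires long to be strictly wider than int. */
        return (t->long_bits > t->int_bits) ? T_LONG : T_ULONG;
    }

    if (a == T_LONG || b == T_LONG) return T_LONG;
    if (a == T_UINT || b == T_UINT) return T_UINT;
    return T_INT;
}
```

For a hypothetical target with 32 bit int and 32 bit long, arith_encoding ( 7, 10, ... ) gives 11; with a 64 bit long it gives 7.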
Any tokenised type declared using:
#pragma token VARIETY v # tv
will be represented by a SIGNED_NAT token with external name tv corresponding to the encoding of v. Special cases of this are the implementation dependent integral types which arise naturally within the language. The external token names for these types are given below:
Type | Token |
---|---|
bool | ~cpp.bool |
ptrdiff_t | ptrdiff_t |
size_t | size_t |
wchar_t | wchar_t |
So, for example, a sizeof expression has shape ~convert ( size_t ). The token ~cpp.bool is defined in the default implementation, but the other tokens are defined according to their definitions on the target machine in the normal API library building mechanism.
3.2. Integer literal types
The type of an integer literal is chosen from a list of possible integral types: the first type in the list in which the literal value can be represented gives the type of the literal. For small literals it is possible to work out the type exactly; for larger literals, however, the result is target dependent. For example, the literal 50000 will have type int on machines in which 50000 fits into an int, and long otherwise. This target dependent mapping is given by a series of tokens of the form:
~lit_* : ( SIGNED_NAT ) -> SIGNED_NAT
which map a literal value to the representation of an integral type. The token used depends on the list of possible types, which in turn depends on the base used to represent the literal and the integer suffix used, as given in the following table:
Base | Suffix | Token | Types |
---|---|---|---|
decimal | (none) | ~lit_int | int, long, unsigned long |
octal | (none) | ~lit_hex | int, unsigned int, long, unsigned long |
hexadecimal | (none) | ~lit_hex | int, unsigned int, long, unsigned long |
any | U | ~lit_unsigned | unsigned int, unsigned long |
any | L | ~lit_long | long |
any | UL | ~lit_ulong | unsigned long |
any | LL | ~lit_longlong | long long, unsigned long long |
any | ULL | ~lit_ulonglong | unsigned long long |
Thus, for example, the shape of the integer literal 50000 is:
~convert ( ~lit_int ( 50000 ) )
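The "first type that fits" rule behind the ~lit_* tokens might be sketched in C as below. This is an illustration under stated assumptions rather than the actual token definitions: struct lit_target, first_fitting, lit_int and lit_hex are hypothetical names, only the int and long based types are handled, padding bits are assumed to be absent, and the literal value is assumed to fit into the host's unsigned long long.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a target: widths in bits, no padding. */
struct lit_target {
    int int_bits;
    int long_bits;
};

/* Largest value representable in the type with the given encoding
   (6 int, 10 unsigned int, 7 long, 11 unsigned long). */
static unsigned long long max_value(int enc, const struct lit_target *t)
{
    int bits, value_bits;

    assert(enc == 6 || enc == 10 || enc == 7 || enc == 11);
    bits = ((enc & 3) == 2) ? t->int_bits : t->long_bits;
    value_bits = (enc & 8) ? bits : bits - 1;   /* signed types lose one bit */
    assert(value_bits >= 1 && value_bits <= 64);
    return (value_bits == 64) ? ~0ULL : (1ULL << value_bits) - 1;
}

/* Return the first candidate encoding able to hold value, or -1 if none. */
static int first_fitting(unsigned long long value, const int *cands,
                         size_t n, const struct lit_target *t)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (value <= max_value(cands[i], t)) return cands[i];
    }
    return -1;
}

/* Candidate list for an unsuffixed decimal literal: int, long, unsigned long. */
static int lit_int(unsigned long long value, const struct lit_target *t)
{
    static const int cands[] = { 6, 7, 11 };
    return first_fitting(value, cands, 3, t);
}

/* Candidate list for an unsuffixed octal or hexadecimal literal. */
static int lit_hex(unsigned long long value, const struct lit_target *t)
{
    static const int cands[] = { 6, 10, 7, 11 };
    return first_fitting(value, cands, 4, t);
}
```

For a hypothetical target with a 16 bit int and a 32 bit long, lit_int ( 50000, ... ) gives 7 (long), whereas with a 32 bit int it gives 6 (int), in line with the example in the text.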
3.3. Bitfield types
The sign of a plain bitfield type, declared without using signed or unsigned, is left unspecified in C and C++. The token:
~cpp.bitf_sign : ( SIGNED_NAT ) -> BOOL
is used to give a mapping from integral types to the sign of a plain bitfield of that type, in a form suitable for use in the TDF bfvar_bits construct. (Note that ~cpp.bitf_sign should have been a standard C token but was omitted.)
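The kind of target dependent answer that this token abstracts can be observed from C itself. The following sketch is not part of the producer and uses hypothetical names; it stores -1 in a two bit field and checks whether the value reads back as negative, which is well defined whichever sign the implementation chooses for the plain bitfield.

```c
#include <assert.h>

/* Three probes: a plain bitfield and its explicitly signed and
   unsigned counterparts. */
struct plain_probe    { int field : 2; };
struct signed_probe   { signed int field : 2; };
struct unsigned_probe { unsigned int field : 2; };

/* Returns 1 if a plain int bitfield is signed on this implementation,
   0 if it is unsigned. */
static int plain_bitfield_is_signed(void)
{
    struct plain_probe p;
    p.field = -1;          /* stays -1 if signed, becomes 3 if unsigned */
    return p.field < 0;
}

/* Always 1: the field is explicitly signed. */
static int signed_bitfield_is_signed(void)
{
    struct signed_probe p;
    p.field = -1;
    return p.field < 0;
}

/* Always 0: -1 is converted modulo 4 to 3. */
static int unsigned_bitfield_is_signed(void)
{
    struct unsigned_probe p;
    p.field = -1;
    return p.field < 0;
}
```

Only the result of plain_bitfield_is_signed varies between implementations; the other two functions are included for comparison.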