8. Run-time type information
Each C++ type can be associated with a run-time type information structure giving information about that type. These type information structures have shape given by the token:
~cpp.typeid.type : () -> SHAPE
which corresponds to the representation for the standard type std::type_info declared in the header <typeinfo>.
Each type information structure consists of a tag number, giving information on the kind of type represented, a string literal, giving the name of the type, and a pointer to a list of base type information structures. These are combined to give a type information structure using the token:
~cpp.typeid.make : ( SIGNED_NAT, EXP, EXP ) -> EXP ti
Each base type information structure has shape given by the token:
~cpp.baseid.type : () -> SHAPE
It consists of a pointer to a type information structure, an expression used to describe the offset of a base class, a pointer to the next base type information structure in the list, and two integers giving information on type qualifiers etc. These are combined to give a base type information structure using the token:
~cpp.baseid.make : ( EXP, EXP, EXP, SIGNED_NAT, SIGNED_NAT ) -> EXP bi
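As a rough illustration of the two structures described above, the following C++ sketch models them as plain structs. The struct names, field names and field order are hypothetical and are not taken from the default implementation; in particular the base class offset is an expression in the token interface and is only approximated here by a std::ptrdiff_t.

```cpp
#include <cstddef>

struct BaseInfo;  // forward declaration: the two structures refer to each other

// Hypothetical layout for the shape ~cpp.typeid.type.
struct TypeInfo {
    int tag;               // kind of type represented
    const char *name;      // name of the type
    const BaseInfo *base;  // head of the base type information list, or null
};

// Hypothetical layout for the shape ~cpp.baseid.type.
struct BaseInfo {
    const TypeInfo *type;   // type information for the referenced type
    std::ptrdiff_t offset;  // stand-in for the base class offset expression
    const BaseInfo *next;   // next entry in the list, or null
    int info1;              // first integer (qualifiers, access, ...)
    int info2;              // second integer (virtual flag, array bound, ...)
};

// Count the entries in a type's base type information list.
inline int base_count(const TypeInfo &ti) {
    int n = 0;
    for (const BaseInfo *b = ti.base; b != nullptr; b = b->next) ++n;
    return n;
}
```

With these definitions a pointer type would carry a one-element list whose entry refers to the pointed-to type, while a built-in type would carry an empty list.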
The following table gives the various tag numbers used in type information structures plus a list of the base type information structures associated with each type. Macros giving these tag numbers are provided in the default implementation in a header, interface.h, which is shared by the C++ producer.
Type | Form | Tag | Base information |
---|---|---|---|
integer | – | 0 | – |
floating point | – | 1 | – |
void | – | 2 | – |
class or struct | class T | 3 | [base,access,virtual], .... |
union | union T | 4 | – |
enumeration | enum T | 5 | – |
pointer | cv T * | 6 | [T,cv,0] |
reference | cv T & | 7 | [T,cv,0] |
pointer to member | cv T S::* | 8 | [S,0,0], [T,cv,0] |
array | cv T [n] | 9 | [T,cv,n] |
bitfield | cv T : n | 10 | [T,cv,n] |
C++ function | cv T ( S1, ..., Sn ) | 11 | [T,cv,0], [S1,0,0], ...., [Sn,0,0] |
C function | cv T ( S1, ..., Sn ) | 12 | [T,cv,0], [S1,0,0], ...., [Sn,0,0] |
In the form column, cv T is used to denote not only the normal cv-qualifiers but, when T is a function type, the member function cv-qualifiers. Arrays with an unspecified bound are treated as if their bound were zero. Functions with ellipsis are treated as if they had an extra parameter of a dummy type named ... (see below). Note the distinction between C++ and C function types.
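The tag numbers in the table can be written down as an enumeration. The enumerator names below are hypothetical; the macro names actually used in interface.h are not shown in this section. The helper fixed_base_count is likewise an illustrative assumption that only restates the "Base information" column.

```cpp
// Tag numbers from the table above (enumerator names are hypothetical).
enum TypeTag {
    TAG_INTEGER = 0,
    TAG_FLOAT = 1,
    TAG_VOID = 2,
    TAG_CLASS = 3,
    TAG_UNION = 4,
    TAG_ENUM = 5,
    TAG_POINTER = 6,
    TAG_REFERENCE = 7,
    TAG_PTR_MEMBER = 8,
    TAG_ARRAY = 9,
    TAG_BITFIELD = 10,
    TAG_CPP_FUNCTION = 11,
    TAG_C_FUNCTION = 12
};

// Number of base entries listed in the table for each kind of type.
// Returns -1 where the count depends on the type itself (direct base
// classes for a class, return type plus parameters for a function).
inline int fixed_base_count(TypeTag tag) {
    switch (tag) {
    case TAG_POINTER:
    case TAG_REFERENCE:
    case TAG_ARRAY:
    case TAG_BITFIELD:
        return 1;
    case TAG_PTR_MEMBER:
        return 2;
    case TAG_CLASS:
    case TAG_CPP_FUNCTION:
    case TAG_C_FUNCTION:
        return -1;
    default:
        return 0;
    }
}
```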
Each base type information structure is described as a triple consisting of a type and two integers. One of these integers may be used to encode a type qualifier, cv, as follows:
Qualifier | Encoding |
---|---|
(none) | 0 |
const | 1 |
volatile | 2 |
const volatile | 3 |
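The values in this table combine like bit flags: const contributes 1 and volatile contributes 2. A minimal sketch of computing the encoding from a C++ type follows; the function name cv_encoding is hypothetical and is not part of the token interface.

```cpp
#include <type_traits>

// Encode the cv-qualifiers of T using the values from the table:
// const contributes 1 and volatile contributes 2.
template <typename T>
constexpr int cv_encoding() {
    return (std::is_const<T>::value ? 1 : 0) |
           (std::is_volatile<T>::value ? 2 : 0);
}
```

For a pointer type cv T *, the qualifiers recorded in the triple [T,cv,0] are those of the pointed-to type, so for const int * the value would be cv_encoding applied to const int.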
The base type information for a class consists of information on each of its direct base classes. This includes the offset of the base within the class (for a virtual base class this is the offset of the corresponding ptr field), whether the base is virtual (1) or not (0), and the base class access, encoded as follows:
Access | Encoding |
---|---|
public | 0 |
protected | 1 |
private | 2 |
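As a small sketch of how the [base,access,virtual] triples for a class might be formed, the code below uses a hypothetical ClassBase struct in which the base class is identified by name rather than by a pointer to its type information, and the offset is omitted.

```cpp
// Access encodings from the table above (enumerator names are hypothetical).
enum Access { ACCESS_PUBLIC = 0, ACCESS_PROTECTED = 1, ACCESS_PRIVATE = 2 };

// One [base,access,virtual] triple; the base is named by a string here
// purely for illustration.
struct ClassBase {
    const char *base;
    int access;
    int is_virtual;
};

// Build the triple for one direct base class.
inline ClassBase class_base(const char *base, Access access, bool is_virtual) {
    ClassBase b = {base, static_cast<int>(access), is_virtual ? 1 : 0};
    return b;
}
```

Under these assumptions a class declared as struct D : public B, private virtual C would have the base list [B,0,0], [C,2,1].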
For example, the run-time type information structures for the classes declared in the diamond lattice above can be represented as follows:
8.1. Defining run-time type information structures
For built-in types, the run-time type information structure may be referenced by the token:
~cpp.typeid.basic : ( SIGNED_NAT ) -> EXP pti
where the argument gives the encoding of the type as given in the following table:
Type | Encoding | Type | Encoding |
---|---|---|---|
char | 0 | unsigned long | 11 |
(error) | 1 | float | 12 |
void | 2 | double | 13 |
(bottom) | 3 | long double | 14 |
signed char | 4 | wchar_t | 16 |
signed short | 5 | bool | 17 |
signed int | 6 | (ptrdiff_t) | 18 |
signed long | 7 | (size_t) | 19 |
unsigned char | 8 | (...) | 20 |
unsigned short | 9 | signed long long | 23 |
unsigned int | 10 | unsigned long long | 27 |
Note that the encoding for the basic integral types is the same as that given above. The other types are assigned to unused values. Note that the encodings for ptrdiff_t and size_t are not used; instead, the encoding of the type that implements each of them is used (using the standard tokens ptrdiff_t and size_t). The encodings for bool and wchar_t are used because they are conceptually distinct types even though they are implemented as one of the basic integral types. The type labelled ... is the dummy used in the representation of ellipsis functions. The default implementation uses an array of type information structures, __TCPPLUS_typeid, to implement ~cpp.typeid.basic.
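The array-based lookup can be pictured with the following sketch. It is not the default implementation: the table basic_table merely stands in for __TCPPLUS_typeid, only a handful of entries are filled in, the two-field TypeInfo struct is hypothetical, and the range and null checks in typeid_basic are additions for the sake of the example.

```cpp
#include <array>

// Cut-down, hypothetical type information structure (no base list).
struct TypeInfo {
    int tag;           // 0 = integer, 1 = floating point, 2 = void
    const char *name;  // null for slots left unfilled in this sketch
};

constexpr int kBasicCount = 28;  // encodings in the table run from 0 to 27

// Stand-in for the __TCPPLUS_typeid array, indexed by basic type encoding.
inline const std::array<TypeInfo, kBasicCount> &basic_table() {
    static const std::array<TypeInfo, kBasicCount> table = [] {
        std::array<TypeInfo, kBasicCount> t{};  // all slots start empty
        t[0] = {0, "char"};
        t[2] = {2, "void"};
        t[6] = {0, "signed int"};
        t[12] = {1, "float"};
        t[13] = {1, "double"};
        t[27] = {0, "unsigned long long"};
        return t;
    }();
    return table;
}

// Sketch of ~cpp.typeid.basic: map an encoding to a type information pointer.
inline const TypeInfo *typeid_basic(int enc) {
    if (enc < 0 || enc >= kBasicCount) return nullptr;
    const TypeInfo &ti = basic_table()[enc];
    return ti.name ? &ti : nullptr;
}
```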
The run-time type information structures for classes are defined in the same place as their virtual function tables. Other run-time type information structures are defined in whatever modules require them. In the former case the type information structure will have an external tag name; in the latter case it will be an internal tag.
8.2. Accessing run-time type information
The primary means of accessing the run-time type information for an object is using the typeid construct. In cases where the operand type can be determined statically, the address of the corresponding type information structure is returned. In other cases the token:
~cpp.typeid.ref : ( EXP ppvt ) -> EXP pti
is used, where the argument gives a reference to the vptr field of the object being checked. From this information it is trivial to trace the corresponding type information.
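A minimal sketch of this lookup is given below. It assumes, purely for illustration, that the virtual function table holds a pointer to the type information structure in a field called rtti; the actual table layout is not described in this section, and the names VTable, rtti and typeid_ref are hypothetical.

```cpp
// Cut-down, hypothetical type information structure.
struct TypeInfo {
    int tag;
    const char *name;
};

// Assumed virtual function table layout: only the field needed here.
struct VTable {
    const TypeInfo *rtti;  // type information for the most complete object
};

// Sketch of ~cpp.typeid.ref: ppvt refers to the vptr field of the object,
// so one dereference reaches the table and a second reaches the type
// information.
inline const TypeInfo *typeid_ref(const VTable *const *ppvt) {
    return (*ppvt)->rtti;
}
```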
Another means of querying the run-time type information for an object is using the dynamic_cast construct. When the result cannot be determined statically, this is implemented using the token:
~cpp.dynam.cast : ( EXP ppvt, EXP pti ) -> EXP pv
where the first expression gives a reference to the vptr field of the object being cast and the second gives the run-time type information for the type being cast to. In the default implementation this token is implemented by the procedure __TCPPLUS_dynamic_cast. The key point to note is that the virtual function table contains the offset, voff, of the vptr field from the start of the most complete object. Thus it is possible to find the address of the most complete object. The run-time type information contains enough information to determine whether this object has a sub-object of the type being cast to, and if so, how to find the address of this sub-object. The result is returned as a void *, with the null pointer indicating that the conversion is not possible.
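The two steps described above (recover the most complete object from voff, then search its base information for the target type) can be sketched as follows. This is not the code of __TCPPLUS_dynamic_cast. All struct and function names are hypothetical, the table layout (rtti plus voff) is an assumption, types are compared by the address of their type information, and the sketch deliberately ignores virtual bases, access checks and ambiguous bases.

```cpp
#include <cstddef>

struct BaseInfo;

struct TypeInfo {
    int tag;
    const char *name;
    const BaseInfo *base;  // list of direct bases, or null
};

struct BaseInfo {
    const TypeInfo *type;
    std::ptrdiff_t offset;  // offset of the base within the derived class
    const BaseInfo *next;
    int access;
    int is_virtual;
};

// Assumed virtual function table layout.
struct VTable {
    const TypeInfo *rtti;  // type information for the most complete object
    std::ptrdiff_t voff;   // offset of the vptr field from the object start
};

// Depth-first search for a sub-object of type target inside an object of
// type ti located at obj. Virtual bases are skipped in this sketch.
static void *find_subobject(char *obj, const TypeInfo *ti,
                            const TypeInfo *target) {
    if (ti == target) return obj;
    for (const BaseInfo *b = ti->base; b != nullptr; b = b->next) {
        if (b->is_virtual) continue;
        void *p = find_subobject(obj + b->offset, b->type, target);
        if (p != nullptr) return p;
    }
    return nullptr;
}

// Sketch of ~cpp.dynam.cast: ppvt refers to the vptr field of the object
// being cast, target is the type information for the destination type.
inline void *dynamic_cast_sketch(const VTable **ppvt, const TypeInfo *target) {
    const VTable *vt = *ppvt;
    // Step back from the vptr field to the start of the most complete object.
    char *complete = reinterpret_cast<char *>(ppvt) - vt->voff;
    return find_subobject(complete, vt->rtti, target);
}
```

Because the search always starts from the most complete object, the same routine handles both down-casts and cross-casts between sibling bases; a null result corresponds to the failed conversion described above.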