C/C++ Producer Configuration Guide
- i. Introduction
- 1. Configuring the Compiler
- 2. Implementation limits
- 3. Configuration for lexical analysis
- 4. Configuration for the preprocessor
- 5. Configuration for types
- 6. Configuration for literals
- 7. Configuration for declarations
- 8. Configuration for initialisers
- 9. Configuration for expressions
- 10. Configuration for functions
- 11. Configuration for linkage
- A. Standard library
© The TenDRA Project.
© DERA.
Revision History
kate | Merged in Integral Type Specification and Dialect Features from C/C++ Checker Reference Manual, and moved out sections relevant to checking to C/C++ Checker Reference Manual. Moved out documentation for the supplied portability tables from the C/C++ Producer Configuration Guide. Moved out portability table syntax to create a tdfc2portability manpage. Moved out compilation scheme for C++ spec file linking to C/C++ Checker Reference Manual. | |
kate | Restructured C/C++ Producer Configuration Guide. | |
kate | Some normalisation for makefile variables, now I'm done moving things around; this marks the start of clearing up the post-restructuring aftermath. The various Hopefully this should be a bit simpler for package maintainers to configure, by overriding whatever they wish. | |
kate | Move out the C++ LPI token implementations and the C++ (minimal) standard library to the producer project. | |
kate | Moved out the DRA producers as a standalone tool. | |
kate | Moved out the “invocation” chapter and related content to the tcpplus manpage. Moved out the tdfc2dump symbol table dump syntax into a separate manpage, tdfc2dump. Moved out a description of the symbol table semantics into a separate document, The C/C++ Symbol Table Dump. Moved out The Pragma Token Syntax into a separate document. | |
kate | Moved out the C/C++ Producer Implementation into a separate document. | |
truedfx | Suppose three files are being used. a.c: #include "a.h" int main(void) {} a.h: #include "b.h" b.h: extern int unused; tendra outputs line directives based on the tokens that are present after preprocessing. Since no such tokens are found in a.h, there is no mention of a.h in the preprocessed output. This causes some configure scripts to fail. By printing the full | |
truedfx | Allow
# pragma TenDRA keyword __literal for keyword literal where | |
truedfx | Support anonymous unions as an extension in C. The code to handle these already exists in tendra, but is hidden in # pragma TenDRA anonymous union ... to allow this feature to be enabled or disabled during compilation, and default to an error for C, and no error for C++, to preserve the existing behaviour. | |
DERA | tcpplus 1.8.2; TenDRA 4.1.2 release. |
i. Introduction
This document is designed as a technical overview of the usage of the TenDRA C++ to TDF/ANDF producer. It also describes the public interfaces to the producer.
Whereas the interface description contains most of the information which would be required in a users' guide, it is not necessarily in a readily digestible form. The C++ producer is designed to complement the existing TenDRA C to TDF producer; although they are completely distinct programs, the same design philosophy underlies both and they share a number of common interfaces. There are no radical differences between the two producers, besides the fact that the C++ producer covers a vastly larger and more complex language. This means that much of the documentation on the C producer can be taken as also applying to the C++ producer. This document tries to make clear where the C++ producer extends the C producer's interfaces, and those portions of these interfaces which are not directly applicable to C++.
A familiarity with both C++ and TDF is assumed. The version of C++ implemented is that given by the draft ISO C++ standard. All references to "ISO C++" within the document should strictly be qualified using the word "draft", but for convenience this has been left implicit. The C++ producer has a number of switches which allow it to be configured for older dialects of C++. In particular, the version of C++ described in the ARM (Annotated Reference Manual) is fully supported.
The TDF Specification (version 4.0) may be consulted for a description of the compiler intermediate language used. The paper TDF and Portability provides a useful (if slightly old) introduction to some of the ideas relating to static program analysis and interface checking which underlie the whole TenDRA compilation system.
Since this document was originally written, the old C producer, tdfc, has been replaced by a new C producer, tdfc2, which is just a modified version of the C++ producer, tcpplus. All C producer documentation continues to apply to the new C producer, but the new C producer also has many of the features described in this document as only applying to the C++ producer.
ii. Interface descriptions
The most important public interfaces of the C++ producer are the ISO C++ standard and the TDF 4.0 specification; however there are other interfaces, mostly common to both the C and C++ producers, which are described in this section.
An important design criterion of the C++ producer was that it should be strictly ISO conformant by default, but have a method whereby dialect features and extra static program analysis can be enabled. This compiler configuration is controlled by the #pragma TenDRA directives described in the first section.
The requirement that the C and C++ producers should be able to translate portable C or C++ programs into target independent TDF requires a mechanism whereby the target dependent implementations of APIs can be represented. This mechanism, the #pragma token syntax, is described in The Pragma Token Syntax. Note that at present this mechanism only contains support for C APIs; it is considered that the C++ language itself contains sufficient interface mechanisms for C++ APIs to be described.
The C and C++ producers provide two mechanisms whereby type and declaration information derived from a translation unit can be stored to a file for post-processing by other tools. The first is the symbol table dump, which is a public interface designed for use by third party tools. The second is the C/C++ spec file, which is designed for ease of reading and writing by the producers themselves, and is used for intermodule analysis.
The mapping from C++ to TDF implemented by the C++ producer is largely straightforward. There are however target dependencies arising within the language itself which require special handling. These are represented by certain standard tokens which the producer requires to be defined on the target machine. These tokens are also used to describe the interface between the producer and the run-time system. Note that the C++ producer is primarily concerned with the C++ language, not with the standard C++ library. An example implementation of those library components which are required as an integral part of the language (memory allocation, exception handling, run-time type information etc.) is provided. Otherwise, libraries should be obtained from third parties. A number of hints on integrating such libraries with the C++ producer are given.
1. Configuring the Compiler
This document describes the capabilities of the TenDRA C checker for enforcing the ISO C standard as well as features for detecting areas left undefined by the standard. It also lists the non-ISO dialect features supported by the checker in order to provide compatibility with older versions of C and allow the use of third-party source which may contain non-standard constructs.
The majority of this document describes how the C++ producer can be configured to apply extra static checks or to support various dialects of C++. In all cases the default behaviour is precisely that specified in the ISO C++ standard with no extra checks.
1.1. Configuration files
Certain basic type information is specified using a portability table, which may be specified to the producer using the -n option. The syntax for this file is documented by tdfc2portability.
Mappings to arbitrary execution character sets may be specified using the -C option. The default is to use the same character set as the host system. The syntax for this file is documented by tdfc2charset.
The tcc frontend is typically responsible for providing these files; see the TCC Users' Guide for details.
1.2. Low level configuration
The primary method of configuration is by means of #pragma directives. These directives may be placed within the program itself, however it is generally more convenient to group them into a start-up file in order to create a user-defined compilation profile (see the -X option for tcc). The #pragma directives recognised by the C++ producer have one of the equivalent forms:
#pragma TenDRA ....
#pragma TenDRA++ ....
Some of these are common to the C and C++ producers (although often with differing default behaviour). The C producer will ignore any TenDRA++ directives, so these may be used in compilation profiles which are to be used by both producers. In the descriptions below, the presence of a ++ is used to indicate a directive which is C++ specific; the other directives are common to both producers.
Within the description of the #pragma syntax, on stands for on, off or warning, allow stands for allow, disallow or warning, string-literal is any string literal, integer-literal is any integer literal, identifier is any simple, unqualified identifier name, and type-id is any type identifier. Other syntactic items are described in the text. A complete grammar for the #pragma directives accepted by the C++ producer is given in The Pragma Token Syntax.
The simplest level of configuration is to reset the severity level of a particular error message using:
#pragma TenDRA++ error string-literal on
#pragma TenDRA++ error string-literal allow
The given string-literal should name an error from the make_err error catalogue. A severity of on or disallow indicates that the associated diagnostic message should be an error, which causes the compilation to fail. A severity of warning indicates that the associated diagnostic message should be a warning, which is printed but allows the compilation to continue. A severity of off or allow indicates that the associated error should be ignored. Reducing the severity of any error from its default value, other than via one of the dialect directives described in this section, results in undefined behaviour.
The next level of configuration is to reset the severity level of a particular compiler option using:
#pragma TenDRA++ option string-literal on
#pragma TenDRA++ option string-literal allow
The given string-literal should name an option from the option catalogue. The simplest form of compiler option just sets the severity level of one or more error messages. Some of these options may require additional processing to be applied.
It is possible to link a particular error message to a particular compiler option using:
#pragma TenDRA++ error string-literal as option string-literal
Note that the directive:
#pragma TenDRA++ use error string-literal
can be used to raise a given error at any point in a translation unit in a similar fashion to the #error directive. The values of any parameters for this error are unspecified.
The directives just described give the primitive operations on error messages and compiler options. Many of the remaining directives in this section are merely higher level ways of expressing these primitives.
1.3. Scoping options
A new checking scope may be started by inserting the pragma:
#pragma TenDRA begin
at the outermost level. The scope runs until the matching:
#pragma TenDRA end
directive, or to the end of the translation unit (the ISO C standard definition of a translation unit as being a source file, together with any headers or source files included using the #include preprocessing directive, less any source lines skipped by any of the conditional inclusion preprocessing directives, is used throughout this document).
Checking scopes may be nested in the obvious way.
Each new checking scope inherits its initial set of checks from the checking scope which immediately contains it (this includes the implicit main checking scope consisting of the entire source file). Any checks switched on or off within the scope apply only to that scope and any scope it contains. The set of checks applied reverts to its previous state at the end of a scope. Thus, for example:
#pragma TenDRA variable analysis on
/* Variable analysis is on here */
#pragma TenDRA begin
#pragma TenDRA variable analysis off
/* Variable analysis is off here */
#pragma TenDRA end
/* Variable analysis is on again here */
Once a check has been set any attempt to change its status within the same scope is flagged as an error. If checks need to be switched on and off in the same source file, they must be properly scoped. The built-in compilation modes have the entire source file as their scope.
The method of applying different checking profiles to different parts of a program clearly needs to take into account those properties of C which can circumvent such scoping. Consider for example:
#pragma TenDRA begin
#pragma TenDRA unknown escape allow
#define STRING "hello\!"
#pragma TenDRA end
char * f () {
    return ( STRING ) ;
}
The macro STRING is defined in an area where unknown escape sequences, such as \!, are allowed, but it is expanded in an area where they are not allowed (this is the default setting). The conventional approach to macro expansion would lead to the unknown escape sequence being flagged as an error, even though the user probably intended to avoid this. The checker therefore expands all macros using the checking profile in which they were defined, rather than the current checking scope.
The directives describing the user's desired checking profile could be included directly in the program itself, ideally in some configuration file which is #include'd in all source files. It is however perhaps more appropriate to store the directives as a startup file, file say, which is passed to the checker using the -f file command line option. It should be noted that user-defined compilation modes are defined on top of a built-in mode base (normally Xc, the default mode). It is therefore important to scope the new checking profile as described above.
Names may be associated with checking scopes by using an alternative form of the begin directive:
#pragma TenDRA begin name environment identifier
where identifier is any valid C identifier. Thereafter a statement of the form:
#pragma TenDRA use environment identifier
changes the current checking environment to the environment associated with identifier.
Sometimes it may be desirable to use different checking profiles for different parts of a translation unit, e.g. applying less strict checks to any system headers which may be included. The checker can be configured to apply a named checking scope, env_name, to any files included from a directory which has been named dir_name, using:
#pragma TenDRA directory dir_name use environment env_name
The directory name must be passed to the checker using the -N dir_name : dir -I dir command line option. This is equivalent to the usual -Idir option for specifying include paths, except that it also attaches the name dir_name to the directory.
Most compiler options are scoped. A checking scope may be defined by enclosing a list of declarations within:
#pragma TenDRA begin
....
#pragma TenDRA end
If the final end directive is omitted then the scope ends at the end of the translation unit. Checking scopes may be nested in the obvious way. A checking scope inherits its initial set of checks from its enclosing scope (this includes the implicit main checking scope consisting of the entire input file). Any checks switched on or off within a scope apply only to the remainder of that scope and any scope it contains. A particular check can only be set once in a given scope. The set of applied checks reverts to its previous state at the end of the scope.
A checking scope can be named using the directives:
#pragma TenDRA begin name environment identifier
....
#pragma TenDRA end
Checking scope names occupy a namespace distinct from any other namespace within the translation unit. A named scope defines a set of modifications to the current checking scope. These modifications may be reapplied within a different scope using:
#pragma TenDRA use environment identifier
The default behaviour is not to allow checks set in the named checking scope to be reset in the current scope. This can however be modified using:
#pragma TenDRA use environment identifier reset allow
Another use of a named checking scope is to associate a checking scope with a named include file directory. This is done using:
#pragma TenDRA directory identifier use environment identifier
where the directory name is one introduced via a -N command-line option. The effect of this directive, if a #include directive is found to resolve to a file from the given directory, is as if the file was enclosed in directives of the form:
#pragma TenDRA begin
#pragma TenDRA use environment identifier reset allow
....
#pragma TenDRA end
The checks applied to the expansion of a macro definition are those from the scope in which the macro was defined, not that in which it was expanded. The macro arguments are checked in the scope in which they are specified, that is to say, the scope in which the macro is expanded. This enables macro definitions to remain localised with respect to checking scopes.
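As an illustration of how named scopes and directory association fit together, the following start-up file fragment is a minimal sketch rather than a recommended profile. The environment name relaxed and the directory name vendor are hypothetical; vendor is assumed to have been introduced with a -N command-line option, and the two dialect directives inside the scope are simply ones that appear elsewhere in this document.

```c
/* Hypothetical start-up file: "relaxed" and "vendor" are illustrative names. */

/* Define a named checking scope that tolerates two dialect features. */
#pragma TenDRA begin name environment relaxed
#pragma TenDRA dollar as ident allow
#pragma TenDRA unknown escape allow
#pragma TenDRA end

/* Apply that scope to any file included from the directory named "vendor".
   The name is assumed to have been attached with a -N option. */
#pragma TenDRA directory vendor use environment relaxed
```

With a profile of this shape, headers resolved from the named directory would be checked under the relaxed settings, while the rest of the translation unit keeps whatever checks are in force in its own scope.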
2. Implementation limits
This table gives the default implementation limits imposed by the C++ producer for the various implementation quantities listed in Annex B of the ISO C++ standard, together with the minimum limits allowed in ISO C and C++. A default limit of none means that the quantity is limited only by the size of the host machine (either ULONG_MAX or until it runs out of memory). A limit of target means that while no limit is imposed by the C++ front-end, particular target machines may impose such limits.
Quantity identifier | Min C limit | Min C++ limit | Default limit |
---|---|---|---|
statement_depth | 15 | 256 | none |
hash_if_depth | 8 | 256 | none |
declarator_max | 12 | 256 | none |
paren_depth | 32 | 256 | none |
name_limit | 31 | 1024 | none |
extern_name_limit | 6 | 1024 | target |
external_ids | 511 | 65536 | target |
block_ids | 127 | 1024 | none |
macro_ids | 1024 | 65536 | none |
func_pars | 31 | 256 | none |
func_args | 31 | 256 | none |
macro_pars | 31 | 256 | none |
macro_args | 31 | 256 | none |
line_length | 509 | 65536 | none |
string_length | 509 | 65536 | none |
sizeof_object | 32767 | 262144 | target |
include_depth | 8 | 256 | 256 |
switch_cases | 257 | 16384 | none |
data_members | 127 | 16384 | none |
enum_consts | 127 | 4096 | none |
nested_class | 15 | 256 | none |
atexit_funcs | 32 | 32 | target |
base_classes | N/A | 16384 | none |
direct_bases | N/A | 1024 | none |
class_members | N/A | 4096 | none |
virtual_funcs | N/A | 16384 | none |
virtual_bases | N/A | 1024 | none |
static_members | N/A | 1024 | none |
friends | N/A | 4096 | none |
access_declarations | N/A | 4096 | none |
ctor_initializers | N/A | 6144 | none |
scope_qualifiers | N/A | 256 | none |
external_specs | N/A | 1024 | none |
template_pars | N/A | 1024 | none |
instance_depth | N/A | 17 | 17 |
exception_handlers | N/A | 256 | none |
exception_specs | N/A | 256 | none |
It is possible to impose lower limits on most of the quantities listed above by means of the directive:
#pragma TenDRA++ option value string-literal integer-literal
where string-literal gives one of the quantity identifiers listed above and integer-literal gives the limit to be imposed. An error is reported if the quantity exceeds this limit (note however that checks have not yet been implemented for all of the quantities listed). Note that the name_limit and include_depth implementation limits can be set using dedicated directives.
The maximum number of errors allowed before the producer bails out can be set using the directive:
#pragma TenDRA++ set error limit integer-literal
The default value is 32.
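The directives above might be combined in a compilation profile along the following lines. This is only a sketch: the chosen quantities and values are arbitrary examples (here the ISO C minimum limits from the table), and, as noted above, a check may not yet be implemented for every quantity.

```c
/* Illustrative profile fragment for the C++ producer; values are examples only. */

/* Report code that nests statements more deeply than the ISO C minimum. */
#pragma TenDRA++ option value "statement_depth" 15

/* Report functions declared with more parameters than the ISO C minimum. */
#pragma TenDRA++ option value "func_pars" 31

/* Give up after 10 errors instead of the default of 32. */
#pragma TenDRA++ set error limit 10
```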
3. Configuration for lexical analysis
- 3.1. Lexical analysis
- 3.2. Keywords
- 3.3. Nested comments
- 3.4. Identifier names
- 3.5. Identifier name length
3.1. Lexical analysis
During lexical analysis, a source file which is not empty should end in a newline character. It is possible to relax this constraint using the directive:
#pragma TenDRA no nline after file end allow
3.2. Keywords
In several places in this section it is described how to introduce keywords for TenDRA language extensions. By default, no such extra keywords are defined. There are also low-level directives for defining and undefining keywords. The directive:
#pragma TenDRA++ keyword identifier for keyword identifier
can be used to introduce a keyword (the first identifier) standing for the standard C++ keyword given by the second identifier. The directive:
#pragma TenDRA++ keyword identifier for operator operator
can similarly be used to introduce a keyword giving an alternative representation for the given operator or punctuator, as, for example, in:
#pragma TenDRA++ keyword and for operator &&
Finally the directive:
#pragma TenDRA++ undef keyword identifier
can be used to undefine a keyword.
3.3. Nested comments
C-style comments do not nest. The directive:
#pragma TenDRA nested comment analysis on
enables a check for the characters /* within C-style comments.
The occurrence of the /* characters inside a C comment, i.e. text surrounded by the /* and */ symbols, is usually a mistake and can lead to the unexpected termination of a comment. By default such nested comments are processed silently; however, an error or warning can be produced by setting:
#pragma TenDRA nested comment analysis status
with status as on or warning. If status is off the default behaviour is restored.
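A small sketch of the kind of source this check is aimed at is given below. The variable names are hypothetical; the point is that the first */ closes the whole comment, so the second declaration is live code even though the author may have meant to comment it out as well.

```c
#pragma TenDRA nested comment analysis warning

/* Temporarily disabled:
   int retries = 3;   /* number of attempts */
int timeout = 30;     /* still compiled: the comment ended on the line above */
```

With the check set to warning, the /* on the third line would be expected to produce a diagnostic while compilation continues.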
3.4. Identifier names
During lexical analysis, each character in the source file has an associated look-up value which is used to determine whether the character can be used in an identifier name, is a white space character etc. These values are stored in a simple look-up table. It is possible to set the look-up value using:
#pragma TenDRA++ character character-literal as character-literal allow
which sets the look-up for the first character to be the default look-up for the second character. The form:
#pragma TenDRA++ character character-literal disallow
sets the look-up of the character to be that of an invalid character. The forms:
#pragma TenDRA++ character string-literal as character-literal allow
#pragma TenDRA++ character string-literal disallow
can be used to modify the look-up values for the set of characters given by the string literal. For example:
#pragma TenDRA character '$' as 'a' allow
#pragma TenDRA character '\r' as ' ' allow
allows $ to be used in identifier names (like a) and carriage return to be a white space character. The former is a common dialect feature and can also be controlled by the directive:
#pragma TenDRA dollar as ident allow
The ISO C standard (Section 6.1) states that the use of the character $ in identifier names is illegal. The pragma:
#pragma TenDRA dollar as ident allow
can be used to allow such identifiers, which by default are flagged as errors. There is also a disallow variant which restores the default behaviour.
3.5. Identifier name length
Under the ISO C standard rules on identifier name length, an implementation is only required to treat the first 31 characters of an internal name and the first 6 characters of an external name as significant. The TenDRA C checker provides a facility for users to specify the maximum number of characters allowed in an identifier name, to prevent unexpected results when the application is moved to a new implementation.
The maximum number of characters allowed in an identifier name can be set using the directives:
#pragma TenDRA set name limit integer-literal
#pragma TenDRA++ set name limit integer-literal warning
This length is given by the name_limit implementation quantity mentioned above. Identifiers which exceed this length raise an error or a warning, but are not truncated.
#pragma TenDRA set name limit integer_constant
There is currently no distinction made between external and internal names for length checking. Identifier name lengths are not checked in the default mode.
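For instance, a profile that enforces the ISO C minimum of 31 significant characters might look like the sketch below. The identifiers are hypothetical and exist only to show one name within the limit and one beyond it.

```c
#pragma TenDRA set name limit 31

int packet_count;                                          /* 12 characters: within the limit */
int this_identifier_is_deliberately_longer_than_31_chars;  /* exceeds 31: reported, not truncated */
```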
4. Configuration for the preprocessor
4.1. Preprocessing directives
Non-standard preprocessing directives can be controlled using the directives:
#pragma TenDRA directive ppdir allow
#pragma TenDRA directive ppdir (ignore) allow
where ppdir can be assert, file, ident, import (C++ only), include_next (C++ only), unassert, warning (C++ only) or weak. The second form causes the directive to be processed but ignored (note that there is no (ignore) disallow form). The treatment of other unknown preprocessing directives can be controlled using:
#pragma TenDRA unknown directive allow
Cases where the token following the # in a preprocessing directive is not an identifier can be controlled using:
#pragma TenDRA no directive/nline after ident allow
When permitted, unknown preprocessing directives are ignored.
By default, unknown #pragma directives are ignored without comment, however this behaviour can be modified using the directive:
#pragma TenDRA unknown pragma allow
Note that any unknown #pragma TenDRA directives always give an error.
Older preprocessors allowed text after #else and #endif directives. The following directive can be used to enable such behaviour:
#pragma TenDRA text after directive allow
Such text after a directive is ignored.
Some older preprocessors have problems with white space in preprocessing directives - whether at the start of the line, before the initial #, or between the # and the directive identifier. Such white space can be detected using the directives:
#pragma TenDRA indented # directive allow
#pragma TenDRA indented directive after # allow
respectively.
4.2. File inclusion directives
There is a maximum depth of nested #include directives allowed by the C++ producer. This depth is given by the include_depth implementation quantity mentioned above. Its value is fairly small in order to detect recursive inclusions. The maximum depth can be set using:
#pragma TenDRA includes depth integer-literal
A further check, for full pathnames in #include directives (which may not be portable), can be enabled using the directive:
#pragma TenDRA++ complete file includes allow
4.3. Macro definitions
By default, multiple consistent definitions of a macro are allowed. This behaviour can be controlled using the directive:
#pragma TenDRA extra macro definition allow
The ISO C/C++ rules for determining whether two macro definitions are consistent are fairly restrictive. A more relaxed rule allowing for consistent renaming of macro parameters can be enabled using:
#pragma TenDRA weak macro equality allow
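To make the difference concrete, the sketch below shows two definitions that differ only in the name of the macro parameter. The macro SQUARE is a hypothetical example; under the strict ISO rule the second definition is not consistent with the first, whereas under the relaxed rule it would be treated as a consistent redefinition.

```c
#pragma TenDRA weak macro equality allow

#define SQUARE(x) ((x) * (x))
#define SQUARE(y) ((y) * (y))   /* same replacement list up to renaming x to y */
```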
In the definition of macros with parameters, a # in the replacement list must be followed by a parameter name, indicating the stringising operation. This behaviour can be controlled by the directive:
#pragma TenDRA no ident after # allow
which allows a # which is not followed by a parameter name to be treated as a normal preprocessing token.
In a list of macro arguments, the effect of a sequence of preprocessing tokens which otherwise resembles a preprocessing directive is undefined. The C++ producer treats such directives as normal sequences of preprocessing tokens, but can be made to report such behaviour using:
#pragma TenDRA directive as macro argument allow
5. Configuration for types
- 5.1. The Portability Table
- 5.2. Specifying integer literal types
- 5.3. Extended integral types
- 5.4. Bitfield types
- 5.5. Type declarations
- 5.6. Type compatibility
- 5.7. Incomplete types
- 5.8. Built-in types
- 5.9. Sign of char
5.1. The Portability Table
The portability table is used by the checker to describe the minimum assumptions about the representation of the integral types. It contains information on the minimum integer sizes and the minimum range of values that can be represented by each integer type, the sign of plain char, and whether signed types can be assumed to be symmetric (for example, [-127,127]) or maximum (for example, [-128,127]). The format for this file is documented by tdfc2portability.
The minimum integer ranges are deduced from the minimum integer sizes as follows. Suppose b is the minimum number of bits that will be used to represent a certain integral type, then:
- For unsigned integer types the minimum range is $[0, 2^{b}-1]$;
- For signed integer types, if signed_range is maximum the minimum range is $[-2^{b-1}, 2^{b-1}-1]$. Otherwise, if signed_range is symmetric the minimum range is $[-(2^{b-1}-1), 2^{b-1}-1]$;
- For the type char which is not specified as signed or unsigned, if char_type is signed then char is treated as signed, if char_type is unsigned then char is treated as unsigned, and if char_type is either, the minimum range of char is the intersection of the minimum ranges of signed char and unsigned char.
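The rules above can be sketched in C as follows. This is an illustrative calculation only, not code from the producer: the type int_range and the functions unsigned_min_range, signed_min_range and intersect_range are hypothetical names, and the bit count is restricted to at most 62 so that every bound fits in a 64-bit signed integer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: an inclusive range [lo, hi]. */
typedef struct {
    int64_t lo;
    int64_t hi;
} int_range;

/* Unsigned type of b bits: [0, 2^b - 1]. */
int_range unsigned_min_range(int b)
{
    assert(b >= 1 && b <= 62);
    int_range r = { 0, ((int64_t)1 << b) - 1 };
    return r;
}

/* Signed type of b bits.
   signed_range maximum:   [-2^(b-1), 2^(b-1) - 1]
   signed_range symmetric: [-(2^(b-1) - 1), 2^(b-1) - 1] */
int_range signed_min_range(int b, int symmetric)
{
    assert(b >= 2 && b <= 62);
    int64_t half = (int64_t)1 << (b - 1);
    int_range r = { symmetric ? -(half - 1) : -half, half - 1 };
    return r;
}

/* Intersection of two ranges, as used for plain char when char_type is either. */
int_range intersect_range(int_range a, int_range b)
{
    int_range r = { a.lo > b.lo ? a.lo : b.lo, a.hi < b.hi ? a.hi : b.hi };
    return r;
}
```

For an 8-bit char with char_type either, intersecting the signed and unsigned ranges leaves [0, 127], which is the set of values that is safe whichever way the target treats plain char.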
5.2. Specifying integer literal types
By default tdfc2 assumes that all integer ranges conform to the minimum ranges prescribed by the ISO C standard, i.e. char contains at least 8 bits, short and int contain at least 16 bits and long contains at least 32 bits. If the -Y32bit flag is passed to the checker it assumes that integers conform to the minimum ranges commonly found on most 32 bit machines, i.e. int contains at least 32 bits and int is strictly larger than short so that the integral promotion of unsigned short is int under the ISO C standard integer promotion rules.
The integer literal pragmas are used to define the method of computing the type of an integer literal. Integer literals cannot be used in a program unless the class to which they belong has been described using an integer literal pragma. Each built-in checking mode includes some integer literal pragmas describing the semantics appropriate for that mode. If these built-in modes are inappropriate, then the user must describe the semantics using the pragma below:
#pragma integer literal literal_class lit_class_type_list
The literal_class identifies the type of literal integer involved. The possibilities are:
- decimal
- octal
- hexadecimal
Each of these types can optionally be followed by unsigned and/or long to specify an unsigned and/or long type respectively.
The values of the integer literals of any particular class are divided into contiguous sub-ranges specified by the lit_class_type_list which takes the form below:
lit_class_type_list :
    * int_type_spec
    integer_constant int_type_spec | lit_class_type_list
int_type_spec :
    : type_name
    * warning? : identifier
    ** :
The first integer constant, i1 say, identifies the range [0, i1], the second, i2 say, identifies the range [i1 + 1, i2]. The symbol * specifies the unlimited range upwards from the last integer constant. Each integer constant must be strictly greater than its predecessor.
Associated with each sub-range is an int_type_spec which is either a type, a procedure token identifier with an optional warning (see G.9) or a failure. For each sub-range:
- If the int_type_spec is a type name, then it must be an integral type and specifies the type associated with literals in that sub-range.
- If the int_type_spec is an identifier, then the type of integer is computed by a procedure token of that name which takes the integer value as a parameter and delivers its type. The procedure token must have been declared previously as:
  #pragma token PROC ( VARIETY ) VARIETY
  Since the type of the integer is computed by a procedure token which may be implemented differently on different targets, there is the option of producing a warning whenever the token is applied.
- If the int_type_spec is **, then any integer literal lying in the associated sub-range will cause the checker to raise an error.
For example:
#pragma integer literal decimal 0x7fff : int | 0x7fffffff : long | * : unsigned long
divides unsuffixed decimal literals into three ranges: literals in the range [0, 0x7fff]
are of type int
, integer literals in the range [0x8000, 0x7fffffff]
are of type long
and the remainder are of type unsigned long
.
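The sub-range lookup performed for this example can be modelled in ordinary C. The sketch below is illustrative only: `range_literal_type` and the `lit_type` enumeration are hypothetical names, not part of the producer, and the three ranges are hard-coded from the directive above.

```c
enum lit_type { LIT_INT, LIT_LONG, LIT_UNSIGNED_LONG };

/* Hypothetical model of the directive:
 *   0x7fff : int | 0x7fffffff : long | * : unsigned long
 * Each integer constant closes a sub-range; the next sub-range
 * starts one above it. */
enum lit_type
range_literal_type(unsigned long value)
{
	if (value <= 0x7fffUL) {
		return LIT_INT;             /* [0, 0x7fff] */
	}
	if (value <= 0x7fffffffUL) {
		return LIT_LONG;            /* [0x8000, 0x7fffffff] */
	}
	return LIT_UNSIGNED_LONG;       /* the unlimited * range */
}
```

Under this model the boundary value 0x7fff is still an int, while 0x8000 is the first value to be given type long.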
There are four pre-defined procedure tokens supplied with the compiler which are used in the startup files to provide the default specification for integer literals:
-
~lit_int
is the external identification of a token that returns the integer type according to the rules of ISO C for an unsuffixed decimal;
-
~lit_hex
is the external identification of a token that returns the integer type according to the rules of ISO C for an unsuffixed hexadecimal;
-
~lit_unsigned
is the external identification of a token that returns the integer type according to the rules of ISO C for integers suffixed by U only;
-
~lit_long
is the external identification of a token that returns the integer type according to the rules of ISO C for integers suffixed by L only.
5.3. Extended integral types
The long long
integral types are not part of ISO C or C++ by default, however support for them can be enabled using the directive:
#pragma TenDRA longlong type allow
This support includes allowing long long
in type specifiers and allowing LL
and ll
as integer literal suffixes.
There is a further directive given by the two cases:
#pragma TenDRA set longlong type : long long
#pragma TenDRA set longlong type : long
which can be used to control the implementation of the long long
types. Either they can be mapped to the default representation, which is guaranteed to contain at least 64 bits, or they can be mapped to the corresponding long
types.
Because these long long
types are not an intrinsic part of C++ the implementation does not integrate them into the language as fully as is possible. This is to prevent the presence or otherwise of long long
types affecting the semantics of code which does not use them. For example, it would be possible to extend the rules for the types of integer literals, integer promotion types and arithmetic types to say that if the given value does not fit into the standard integral types then the extended types are tried. This has not been done, although these rules could be implemented by changing the definitions of the standard tokens used to determine these types. By default, only the rules for arithmetic types involving a long long
operand and for LL
integer literals mention long long
types.
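As a minimal sketch of the two directives in use, the fragment below enables the extension and selects the default representation. The function names are hypothetical; compilers other than tdfc2 should simply ignore the pragmas, since ISO C requires unrecognised pragmas to be ignored.

```c
#pragma TenDRA longlong type allow
#pragma TenDRA set longlong type : long long

/* With the extension enabled, long long may appear in type specifiers
 * and the LL suffix may appear on integer literals. */
long long
seconds_to_micro(long long seconds)
{
	return seconds * 1000000LL;
}

/* An arithmetic expression with a long long operand has type long long. */
int
sum_is_long_long(void)
{
	return sizeof (1LL + 1) == sizeof (long long);
}
```

Code which never mentions long long or an LL literal is unaffected, in line with the design described above.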
5.4. Bitfield types
The C++ rules on bitfield types differ slightly from the C rules. Firstly any integral or enumeration type is allowed in a bitfield, and secondly the bitfield width may exceed the underlying type size (the extra bits being treated as padding). These properties can be controlled using the directives:
#pragma TenDRA extra bitfield int type allow
#pragma TenDRA bitfield overflow allow
respectively.
The ISO C standard only allows signed int
, unsigned int
and their equivalent types as type specifiers in bitfields. Using the default checking profile, tdfc2 raises errors for other integral types used as type specifiers in bitfields. This behaviour may be modified using the pragma:
#pragma TenDRA extra int bitfield type permit
where permit is one of allow
(no errors raised), warning
(allow non-int bitfields through with a warning) or disallow
(raise errors for non-int bitfields).
If non-int bitfields are allowed, the bitfield is treated as if it had been declared with an int
type of the same signedness as the given type. The use of the type char
as a bitfield type still generally causes an error, since whether a plain char
is treated as signed
or unsigned
is implementation-dependent. The pragma:
#pragma TenDRA character set-sign
where set-sign is signed
, unsigned
or either
, can be used to specify the signedness of a plain char
bitfield. If set-sign is signed
or unsigned
, the bitfield is treated as though it were declared signed char
or unsigned char
respectively. If set-sign is either
, the sign of the bitfield is target-dependent and the use of a plain char
bitfield causes an error.
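A small illustration of a non-int bitfield, assuming the allow form of the pragma has been given. The structure and function names are hypothetical. The member is declared with unsigned short, so under the rule above it behaves as an unsigned int bitfield of the same width.

```c
#pragma TenDRA extra int bitfield type allow

struct flags {
	unsigned short mode : 3;    /* treated like: unsigned int mode : 3 */
};

/* Store a value in the 3-bit field and read it back. */
unsigned int
store_mode(unsigned int value)
{
	struct flags f;

	f.mode = value;             /* unsigned, so reduced modulo 8 */
	return f.mode;
}
```

Because the field is unsigned, storing 9 yields 1; a plain char bitfield would not give such a predictable result unless its signedness had been fixed with the character directive.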
5.5. Type declarations
C does not allow multiple definitions of a typedef
name, whereas C++ allows multiple consistent definitions. This behaviour can be controlled using the directive:
#pragma TenDRA extra type definition allow
In accordance with the ISO C standard, in default mode tdfc2 does not allow a type to be defined more than once using a typedef
. This behaviour can be changed using the pragma:
#pragma TenDRA extra type definition permit
where permit is allow
(silently accepts redefinitions, provided they are consistent), warning
or disallow
.
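For instance, a consistent redefinition of the kind accepted under allow might look like the following sketch. The names counter_t and next_count are hypothetical. Compilers following C11 or C++ accept the repeated typedef without any directive.

```c
#pragma TenDRA extra type definition allow

typedef unsigned long counter_t;
typedef unsigned long counter_t;    /* consistent redefinition */

counter_t
next_count(counter_t c)
{
	return c + 1;
}
```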
5.6. Type compatibility
The directive:
#pragma TenDRA incompatible type qualifier allow
allows objects to be redeclared with different cv-qualifiers (normally such redeclarations would be incompatible). The composite type is qualified using the join of the cv-qualifiers in the various redeclarations.
The directive:
#pragma TenDRA compatible type : type-id == type-id : allow
asserts that the given two types are compatible. Currently the only implemented version is char * == void *
which enables char *
to be used as a generic pointer as it was in older dialects of C.
5.7. Incomplete types
Some dialects of C allow incomplete arrays as member types. These are generally used as a place-holder at the end of a structure to allow for the allocation of an arbitrarily sized array. Support for this feature can be enabled using the directive:
#pragma TenDRA incomplete type as object type allow
The ISO C standard (Section 6.1.2.5) states that an incomplete type, e.g. an undefined structure or union type, is not an object type and that array elements must be of object type. The default behaviour of the checker causes errors when incomplete types are used to specify array element types. The pragma:
#pragma TenDRA incomplete type as object type permit
can be used to alter the treatment of array declarations with incomplete element types. permit is one of allow
, disallow
or warning
as usual.
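The place-holder idiom mentioned at the start of this section can be sketched as follows. The structure and function names are hypothetical, and the example assumes the allow form of the pragma; C99 and later accept the incomplete array member as a standard feature.

```c
#include <stdlib.h>

#pragma TenDRA incomplete type as object type allow

struct packet {
	size_t length;
	unsigned char payload[];    /* incomplete array as the last member */
};

/* Allocate a zero-filled packet with room for n payload bytes.
 * Returns NULL if the allocation fails. */
struct packet *
packet_new(size_t n)
{
	struct packet *p = calloc(1, sizeof *p + n);

	if (p != NULL) {
		p->length = n;
	}
	return p;
}
```

The caller owns the returned storage and releases it with free.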
5.8. Built-in types
The definitions of implementation dependent integral types which arise naturally within the language - the type of the difference of two pointers, ptrdiff_t
, and the type of the sizeof
operator, size_t
- given in the <stddef.h>
header can be overridden using the directives:
#pragma TenDRA set ptrdiff_t : type-id
#pragma TenDRA set size_t : type-id
These directives are useful when targeting a specific machine on which the definitions of these types are known; while they may not affect the code generated they can cut down on spurious conversion warnings. Note that although these types are built into the producer they are not visible to the user unless an appropriate header is included (with the exception of the keyword wchar_t
in ISO C++), however the directives:
#pragma TenDRA++ type identifier for type-name
can be used to make these types visible. They are equivalent to a typedef
declaration of identifier as the given built-in type, ptrdiff_t
, size_t
or wchar_t
.
5.9. Sign of char
Whether plain char
is signed or unsigned is implementation dependent. By default the implementation is determined by the definition of the ~char
token, however this can be overridden in the producer either by means of the portability table or by the directive:
#pragma TenDRA character character-sign
where character-sign can be signed
, unsigned
or either
(the default). Again this directive is useful primarily when targeting a specific machine on which the signedness of char
is known.
6. Configuration for literals
- 6.1. Integer literals
- 6.2. Character literals
- 6.3. Writeable String literals
- 6.4. Concatenation of character string literals and wide character string literals
- 6.5. Escape sequences
6.1. Integer literals
The rules for finding the type of an integer literal can be described using directives of the form:
#pragma TenDRA integer literal literal-spec
where:
literal-spec :
    literal-base literal-suffix? literal-type-list

literal-base :
    octal
    decimal
    hexadecimal

literal-suffix :
    unsigned
    long
    unsigned long
    long long
    unsigned long long

literal-type-list :
    * literal-type-spec
    integer-literal literal-type-spec | literal-type-list
    ? literal-type-spec | literal-type-list

literal-type-spec :
    : type-id
    * allow? : identifier
    * * allow? :
Each directive gives a literal base and suffix, describing the form of an integer literal, and a list of possible types for literals of this form. This list gives a mapping from the value of the literal to the type to be used to represent the literal. There are three cases for the literal type: it may be a given integral type, it may be calculated using a given literal type token (see C/C++ Producer Implementation), or it may cause an error to be raised. There are also three cases for describing a literal range: it may be given by values less than or equal to a given integer literal, it may be given by values which are guaranteed to fit into a given integral type, or it may match any value. For example:
#pragma token PROC ( VARIETY c ) VARIETY l_i # ~lit_int
#pragma TenDRA integer literal decimal 32767 : int | ** : l_i
describes how to find the type of a decimal literal with no suffix. Values less than or equal to 32767 have type int
; larger values have target dependent type calculated using the token ~lit_int
. Introducing a warning
into the directive will cause a warning to be printed if the token is used to calculate the type.
Note that this scheme extends that implemented by the C producer, because of the need for more accurate information in the C++ producer. For example, the specification above does not fully express the ISO rule that the type of a decimal integer is the first of the types int
, long
and unsigned long
which it fits into (it only expresses the first step). However with the C++ extensions it is possible to write:
#pragma token PROC ( VARIETY c ) VARIETY l_i # ~lit_int
#pragma TenDRA integer literal decimal ? : int | ? : long |\
    ? : unsigned long | ** : l_i
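The effect of the ? ranges can be modelled in plain C as a "first type that fits" search. This is only a sketch: `decimal_type_by_fit` and the enumeration are hypothetical, and the limits of the compiling machine from <limits.h> stand in for the ranges the producer regards as guaranteed to fit, which is an assumption.

```c
#include <limits.h>

enum lit_type { LIT_INT, LIT_LONG, LIT_UNSIGNED_LONG, LIT_TOKEN };

/* Hypothetical model of:
 *   ? : int | ? : long | ? : unsigned long | ** : l_i
 * The first type able to hold the value is chosen; anything larger
 * is left to the ~lit_int token. */
enum lit_type
decimal_type_by_fit(unsigned long long value)
{
	if (value <= (unsigned long long) INT_MAX) {
		return LIT_INT;
	}
	if (value <= (unsigned long long) LONG_MAX) {
		return LIT_LONG;
	}
	if (value <= (unsigned long long) ULONG_MAX) {
		return LIT_UNSIGNED_LONG;
	}
	return LIT_TOKEN;
}
```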
6.2. Character literals
By default, a simple character literal has type int
in C and type char
in C++. The type of such literals can be controlled using the directive:
#pragma TenDRA++ set character literal : type-id
The type of a wide character literal is given by the implementation defined type wchar_t
. By default, the definition of this type is taken from the target machine's <stddef.h>
C header (note that in ISO C++, wchar_t
is actually a keyword, but its underlying representation must be the same as in C). This definition can be overridden in the producer by means of the directive:
#pragma TenDRA set wchar_t : type-id
for an integral type type-id.
6.3. Writeable String literals
By default, character string literals have type char [n]
in C and older dialects of C++, but type const char [n]
in ISO C++. Similarly wide string literals have type wchar_t [n]
or const wchar_t [n]
. Whether string literals are const
or not can be controlled using the two directives:
#pragma TenDRA++ set string literal : const
#pragma TenDRA++ set string literal : no const
In the case where literals are const
, the array-to-pointer conversion is allowed to cast away the const
to allow for a degree of backwards compatibility. The status of this deprecated conversion can be controlled using the directive:
#pragma TenDRA writeable string literal allow
(yes, I know that that should be writable
). Note that this directive has a slightly different meaning in the C producer.
The ISO C standard, section 6.1.4, states that if the program attempts to modify a string literal of either form, the behaviour is undefined
. Assignments to string literals of the form:
"abc" = '3';
always result in errors. Other attempts to modify members of string literals, e.g.
"abc"[1] = '3';
are permitted in the default checking mode. This behaviour can be changed using:
#pragma TenDRA writeable string literal permit
where permit may be allow
, warning
or disallow
.
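Code which must pass with the disallow setting can copy the literal into a modifiable array and edit the copy instead. The following is a sketch with a hypothetical function name; the pragma is shown with the meaning it has in the C producer.

```c
#include <string.h>

#pragma TenDRA writeable string literal disallow

/* Copy the literal into caller-supplied storage, then modify the copy.
 * dst must have room for at least 4 characters. */
char *
patched_copy(char *dst)
{
	strcpy(dst, "abc");
	dst[1] = '3';       /* well defined: dst is not a string literal */
	return dst;
}
```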
6.4. Concatenation of character string literals and wide character string literals
Adjacent string literal tokens of similar types (either both character string literals or both wide string literals) are concatenated at an early stage in the parser; however, it is unspecified what happens if a character string literal token is adjacent to a wide string literal token. By default this gives an error, but the directive:
#pragma TenDRA unify incompatible string literal allow
can be used to enable the strings to be concatenated to give a wide string literal.
If a '
or "
character does not have a matching closing quote on the same line then it is undefined whether an implementation should report an unterminated string or treat the quote as a single unknown character. By default, the C++ producer treats this as an unterminated string, but this behaviour can be controlled using the directive:
#pragma TenDRA unmatched quote allow
The ISO C standard, section 6.1.4, states that if a character string literal is adjacent to a wide character string literal, the behaviour is undefined. By default, this is flagged as an error by the checker. If the pragma:
#pragma TenDRA unify incompatible string literal permit
is used, with permit set to allow
or warning
, the character string literal is converted to a wide character string literal and the strings are concatenated, although in the warning
case a warning is output. The disallow
version of the pragma restores the default behaviour.
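Both kinds of concatenation are shown in the sketch below, which assumes the allow form of the pragma. The object and function names are hypothetical. C99 and later define the mixed case to give a wide string literal, which matches the behaviour enabled here.

```c
#include <string.h>
#include <wchar.h>

#pragma TenDRA unify incompatible string literal allow

/* Two character string literals: always concatenated. */
static const char narrow[] = "Hello, " "world";

/* A wide literal next to a character literal: with the directive
 * above the result is a single wide string literal. */
static const wchar_t mixed[] = L"wide" "narrow";

size_t
narrow_length(void)
{
	return strlen(narrow);
}

size_t
mixed_length(void)
{
	return wcslen(mixed);
}
```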
6.5. Escape sequences
By default, if the character following the \
in an escape sequence is not one of those listed in the ISO C or C++ standards then an error is given. This behaviour, which is left unspecified by the standards, can be controlled by the directive:
#pragma TenDRA unknown escape allow
The result is that the \
in unknown escape sequences is ignored, so that \z
is interpreted as z
, for example. Individual escape sequences can be enabled or disabled using the directives:
#pragma TenDRA++ escape character-literal as character-literal allow
#pragma TenDRA++ escape character-literal disallow
so that, for example:
#pragma TenDRA++ escape 'e' as '\033' allow
#pragma TenDRA++ escape 'a' disallow
sets \e
to be the ASCII escape character and disables the alert character \a
.
By default, if the value of a character, given for example by a \x
escape sequence, does not fit into its type then an error is given. This implementation dependent behaviour can however be controlled by the directive:
#pragma TenDRA character escape overflow allow
the value being converted to its type in the normal way.
The ISO C standard specifies a small set of escape sequences in strings, for example \n
as newline. Unknown escape sequences lead to an error in the default mode; however, the severity of the error may be altered using:
#pragma TenDRA unknown escape permit
where permit is allow
(silently replaces the unknown escape sequence, \z say, by z
), warning
or disallow
.
7. Configuration for declarations
- 7.1. Empty source files
- 7.2. Untagged compound types
- 7.3. Empty declarations
- 7.4. Unifying the tag name space
- 7.5. Extra commas
- 7.6. Implicit int
- 7.7. Implicit function declarations
- 7.8. Forward enumeration declarations
- 7.9. Variable scope in for statements
- 7.10. Anonymous unions
7.1. Empty source files
ISO C requires that a translation unit should contain at least one declaration. C++ and older dialects of C allow translation units which contain no declarations. This behaviour can be controlled using the directive:
#pragma TenDRA no external declaration allow
The ISO standard states that each source file should contain at least one declaration or definition. Source files which contain no external declarations or definitions are flagged as errors by the checker in default mode. The severity of the error may be altered using:
#pragma TenDRA no external declaration permit
where the options for permit are allow
(no errors raised), warning
or disallow
.
7.2. Untagged compound types
ISO C++ requires every declaration or member declaration to introduce one or more names into the program. The directive:
#pragma TenDRA unknown struct/union allow
can be used to relax one particular instance of this rule, by allowing anonymous class definitions (recall that anonymous unions are objects, not types, in C++ and so are not covered by this rule).
The ISO C standard states that a declaration must declare at least a declarator, a tag or the members of an enumeration. The checker detects such declarations and, by default, raises an error. The severity of the errors can be altered by:
#pragma TenDRA unknown struct/union permit
where permit may be allow
to allow code such as:
struct { int i; int j; };
through without errors (statements such as this occur in some system headers) or disallow
to restore the default behaviour.
7.3. Empty declarations
The C++ grammar also allows a solitary semicolon as a declaration or member declaration; however such a declaration does not introduce a name and so contravenes the rule above. The rule can be relaxed in this case using the directive:
#pragma TenDRA extra ; allow
Note that the C++ grammar explicitly allows for an extra semicolon following an inline member function definition, but that semicolons following other function definitions are actually empty declarations of the form above. A solitary semicolon in a statement is interpreted as an empty expression statement rather than an empty declaration statement.
Some dialects of C allow extra semicolons at the external declaration and definition level in contravention of the ISO C standard. For example, the program:
int f () { return ( 0 ); };
is not ISO compliant. The checker enforces the ISO rules by default, but the errors raised may be reduced to warning or suppressed entirely using:
#pragma TenDRA extra ; permit
with permit as warning
or allow
. The disallow
option restores the default behaviour.
7.4. Unifying the tag name space
Each object in the tag name space is associated with a classification (struct
, union
or enum
) of the type to which it refers. If such a tag is used, it must be preceded by the correct classification, otherwise the checker produces an error by default. However, the pragma:
#pragma TenDRA ignore struct/union/enum tag status
may be used to change the severity of the error. The options for status are: on
(allows a tag to be used with any of the three classifications, the correct classification being inferred from the type definition), warning
or off
.
7.5. Extra commas
The ISO C standard does not allow extra commas in enumeration type declarations e.g.
enum e { red , orange , yellow , };
The extra comma at the end of the declaration is flagged as an error by default, but this behaviour may be changed by using:
#pragma TenDRA extra , permit
where permit has the usual allow
, disallow
and warning
options.
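A short sketch of a declaration that relies on the allow setting follows; the enumeration and function names are hypothetical. C99 and later accept the extra comma as standard.

```c
#pragma TenDRA extra , allow

enum colour {
	red,
	orange,
	yellow,     /* extra comma after the last enumerator */
};

int
last_colour(void)
{
	return yellow;
}
```

The extra comma does not introduce a further enumerator, so yellow keeps the value 2.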
7.6. Implicit int
The C "implicit int
" rule, whereby a type of int
is inferred in a list of type or declaration specifiers which does not contain a type name, has been removed in ISO C++, although it was supported in older dialects of C++. This check is controlled by the directive:
#pragma TenDRA++ implicit int type allow
Partial relaxations of this rule are allowed. The directive:
#pragma TenDRA++ implicit int type for const/volatile allow
will allow for implicit int
when the list of type specifiers contains a cv-qualifier. Similarly the directive:
#pragma TenDRA implicit int type for function return allow
will allow for implicit int
in the return type of a function definition (this excludes constructors, destructors and conversion functions, where special rules apply). A function definition is the only kind of declaration in ISO C where a declaration specifier is not required. Older dialects of C allowed declaration specifiers to be omitted in other cases. Support for this behaviour can be enabled using:
#pragma TenDRA implicit int type for external declaration allow
The four cases can be demonstrated in the following example:
extern a ;          // implicit int
const b = 1 ;       // implicit const int
f ()                // implicit function return
{
    return 2 ;
}
c = 3 ;             // error: not allowed in C++
Older C dialects allow external variables to be specified without a type, the type int
being inferred. Thus, for example:
a, b;
is equivalent to:
int a, b;
By default these inferred declarations are not permitted, though tdfc2's behaviour can be modified using:
#pragma TenDRA implicit int type for external declaration permit
where permit is allow
, warning
or disallow
.
A more common feature, allowed by the ISO C standard, but considered bad style by some, is the inference of an int return type for functions defined in the form:
f ( int n ) { .... }
The checker's treatment of such functions can be determined using:
#pragma TenDRA implicit int type for function return permit
where permit can be allow
, warning
or disallow
.
7.7. Implicit function declarations
C, but not C++, allows calls to undeclared functions, the function being declared implicitly. It is possible to enable support for implicit function declarations using the directive:
#pragma TenDRA implicit function declaration on
Such implicitly declared functions have C linkage and type int ( ... )
.
7.8. Forward enumeration declarations
The ISO C Standard (Section 6.5.2.3) states that the first introduction of an enumeration tag shall declare the constants associated with that tag. This rule is enforced by the checker in default mode, however it can be relaxed using the pragma:
#pragma TenDRA forward enum declaration permit
where replacing permit by allow
permits the declaration and use of an enumeration tag before the declaration of its associated enumeration constants. A disallow
variant which restores the default behaviour is also available.
7.9. Variable scope in for
statements
In ISO C++ the scope of a variable declared in a for-init-statement is the body of the for
statement; in older dialects it extended to the end of the enclosing block. So:
for ( int i = 0 ; i < 10 ; i++ ) {
    // for statement body
}
return i ;          // OK in older dialects, error in ISO C++
This behaviour is controlled by the directive:
#pragma TenDRA++ for initialization block on
a state of on
corresponding to the ISO rules and off
to the older rules. Perhaps most useful is the warning
state which implements the old rules but gives a warning if a variable declared in a for-init-statement is used outside the corresponding for
statement body. A program which does not give such warnings should compile correctly under either set of rules.
7.10. Anonymous unions
A union declared without introducing a tag or identifier is termed an anonymous union. Members populate the scope where the union itself is declared. For example, this may be a surrounding struct, or a block:
union {
    int a;
    int b;
};
a = 5;
The ISO C Standard (Section 6.5.2.1) states that a union declaration must contain a tag or identifier. Several compilers permit this as an extension to C, and the later C11 standard formalises this as a required feature. Permissibility may be controlled using the pragma:
#pragma TenDRA anonymous union permit
where replacing permit by allow
permits the declaration of an anonymous union, and warning will allow the declaration but produce a warning. A disallow
variant which restores the default behaviour is also available.
By default anonymous unions are disallowed for C.
Anonymous unions are required to be supported by C++, and setting this pragma has no effect. For C++, an anonymous union cannot have private or protected members or member functions (in addition, no union can have static data members).
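The case of an anonymous union inside a surrounding struct can be sketched as below, assuming the allow form of the pragma. The type and function names are hypothetical.

```c
#pragma TenDRA anonymous union allow

struct value {
	int is_float;
	union {             /* no tag and no identifier */
		int   i;
		float f;
	};
};

int
stored_int(int n)
{
	struct value v;

	v.is_float = 0;
	v.i = n;            /* member named as if it belonged to the struct */
	return v.i;
}
```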
8. Configuration for initialisers
8.1. Initialisation of compound types
Many older C dialects do not allow the initialisation of automatic variables of compound type. Thus, for example:
void f () {
    struct {
        int a;
        int b;
    } x = { 3, 2 };
}
would not be allowed by some older compilers, although by default tdfc2 does not raise any errors since the code is legal according to the ISO C standard. The checker's behaviour may be changed using:
#pragma TenDRA initialization of struct/union (auto) permit
where permit is allow
, warning
or disallow
. This feature is particularly useful when developing a program which is intended to be compiled with a compiler which does not support automatic compound initialisations.
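As an illustration, the warning setting could be used as in the following sketch to flag each automatic compound initialiser while still compiling the code. The function name is hypothetical.

```c
#pragma TenDRA initialization of struct/union (auto) warning

int
sum_pair(void)
{
	/* automatic variable of compound type with an initialiser */
	struct { int a; int b; } x = { 3, 2 };

	return x.a + x.b;
}
```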
8.2. Variable initialisation
The ISO C standard (Section 6.5.7) states that all expressions in an initialiser for an object that has static storage duration or in an initialiser-list for an object that has aggregate or union type shall be constant expressions. The pragma:
#pragma TenDRA variable initialization permit
may be used to allow non-constant initialisers if permit is replaced by allow
. The other option for permit is disallow
which restores the default behaviour of flagging non-constant initialisers for objects of static storage duration as errors.
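The aggregate case can be illustrated with an automatic array whose initialiser-list is not constant. This is a sketch with a hypothetical function name, assuming the allow form of the pragma; C99 and later accept such initialisers for automatic objects as standard.

```c
#pragma TenDRA variable initialization allow

int
second_of_pair(int n)
{
	int pair[2] = { n, n + 1 };     /* non-constant initialiser-list */

	return pair[1];
}
```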
9. Configuration for expressions
9.1. Cast expressions
ISO C++ introduces the constructs static_cast
, const_cast
and reinterpret_cast
, which can be used in various contexts where an old style explicit cast would previously have been used. By default, an explicit cast can perform any combination of the conversions performed by these three constructs. To aid migration to the new style casts the directives:
#pragma TenDRA++ explicit cast as cast-state allow
#pragma TenDRA++ explicit cast allow
where cast-state is defined as follows:
cast-state :
    static_cast
    const_cast
    reinterpret_cast
    static_cast | cast-state
    const_cast | cast-state
    reinterpret_cast | cast-state
can be used to restrict the conversions which can be performed using explicit casts. The first form sets the interpretation of explicit cast to be combinations of the given constructs; the second resets the interpretation to the default. For example:
#pragma TenDRA++ explicit cast as static_cast | const_cast allow
means that conversions requiring reinterpret_cast
(the most unportable conversions) will not be allowed to be performed using explicit casts, but will have to be given as a reinterpret_cast
construct. Changing allow
to warning
will also cause a warning to be issued for every explicit cast expression.
9.2. Initialiser expressions
C, but not C++, only allows constant expressions in static initialisers. The directive:
#pragma TenDRA variable initialization allow
can be used to enable support for C++-style dynamic initialisers. Conversely, it can be used in C++ to detect such dynamic initialisers.
In older dialects of C it was not possible to initialise an automatic variable of structure or union type. This can be checked for using the directive:
#pragma TenDRA initialization of struct/union (auto) allow
The directive:
#pragma TenDRA++ complete initialization analysis on
can be used to check aggregate initialisers. The initialiser should be fully bracketed (i.e. with no elision of braces), and should have an entry for each member of the structure or array.
9.3. Lvalue expressions
C++ defines the results of several operations to be lvalues, whereas they are rvalues in C. The directive:
#pragma TenDRA conditional lvalue allow
is used to apply the C++ rules for lvalues in conditional (?:
) expressions.
Older dialects of C++ allowed this
to be treated as an lvalue. It is possible to enable support for this dialect feature using the directive:
#pragma TenDRA++ this lvalue allow
however it is recommended that programs using this feature should be modified.
The ?
operator cannot normally be used to define an lvalue, so that for example, the program:
struct s { int a, b; };
void f ( int n, struct s *s1, struct s *s2 ) {
    ( n ? *s1 : *s2 ).a = 0;
}
is not allowed in ISO C. The pragma:
#pragma TenDRA conditional lvalue allow
allows conditional lvalues if:
-
Both options of the conditional operator have compatible compound types;
-
Both options of the conditional are lvalues.
(there is also a disallow
variant, but warning
is not permitted in this case).
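Where conditional lvalues are disallowed, the same effect can be obtained by letting the conditional choose between addresses and assigning through the result. The function below is a hypothetical sketch of that rewrite.

```c
struct s { int a, b; };

/* Select the address of the member, then assign through it. The
 * conditional yields a pointer value, so no conditional lvalue is
 * involved. */
void
clear_a(int n, struct s *s1, struct s *s2)
{
	*(n ? &s1->a : &s2->a) = 0;
}
```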
10. Configuration for functions
10.1. Ellipsis in function calls
The directive:
#pragma TenDRA ident ... allow
may be used to enable or disable the use of ...
as a primary expression in a function defined with ellipsis. The type of such an expression is implementation defined. This expression is used in the definition of the va_start
macro in the <stdarg.h>
header. This header automatically enables this switch.
An ellipsis is not an identifier and should not be used in a function call, even if, as in the program below, the function prototype contains an ellipsis:
int f (int a, ...) {
    return 1;
}
int main() {
    int x, y;
    x = f(y, ...);
    return 1;
}
In default mode the checker raises an error if an ellipsis is used as a parameter in a function call. The severity of this error can be modified by using:
#pragma TenDRA ident ... permit
If permit is replaced by allow
the ellipsis is ignored, if warning
is used tdfc2 produces a warning and if disallow
is used the default behaviour is restored.
10.2. Static block level functions
The ISO C standard (Section 6.5.1) states that the declaration of an identifier for a function that has block scope shall have no explicit storage-class specifier other than extern. By default, tdfc2 raises an error for declarations which do not conform to this rule. The behaviour can be modified using:
#pragma TenDRA block function static permit
where permit is allow
(accept block scope function declarations with other storage-class specifiers), disallow
or warning
.
11. Configuration for linkage
- 11.1. Default linkage
- 11.2. Identifier linkage
- 11.3. Static identifiers
- 11.4. External volatility
- 11.5. Function linkage
- 11.6. Resolving linkage problems
11.1. Default linkage
It is possible to set the default language linkage using the directive:
#pragma TenDRA++ external linkage string-literal
This is equivalent to enclosing the rest of the current checking scope in:
extern string-literal { .... }
It is unspecified what happens if such a directive is used within an explicit linkage specification and does not nest correctly. This directive is particularly useful when used in a named environment associated with an include directory. For example, it can be used to express the fact that all the objects declared in headers included from that directory have C linkage.
11.2. Identifier linkage
The ISO C standard, section 6.1.2.2, states that if, within a translation unit, an identifier appears with both internal and external linkage, the behaviour is undefined
. By default, the checker silently declares the variable with external linkage. The check to detect variables which are redeclared with incompatible linkage is controlled using:
#pragma TenDRA incompatible linkage permit
where permit may be allow
(default mode), warning
(warn about incompatible linkage) or disallow
(raise errors for redeclarations with incompatible linkage).
If an object is declared with both external and internal linkage in the same translation unit then, by default, an error is given. This behaviour can be changed using the directive:
#pragma TenDRA incompatible linkage allow
When incompatible linkages are allowed, whether the resultant identifier has external or internal linkage can be set using one of the directives:
#pragma TenDRA linkage resolution : off
#pragma TenDRA linkage resolution : (external) on
#pragma TenDRA linkage resolution : (internal) on
It is possible to declare objects with external linkage in a block. C leaves it undefined whether declarations of the same object in different blocks, such as:
void f () {
    extern int a ;
    ....
}
void g () {
    extern double a ;
    ....
}
are checked for compatibility. However in C++ the one definition rule implies that such declarations are indeed checked for compatibility. The status of this check can be set using the directive:
#pragma TenDRA unify external linkage on
11.3. Static identifiers
By default, objects and functions with internal linkage are mapped to tags without external names in the output TDF capsule. Thus such names are not available to the installer and it needs to make up internal names to represent such objects in its output. This is not desirable in such operations as profiling, where a meaningful internal name is needed to make sense of the output. The directive:
#pragma TenDRA preserve identifier-list
can be used to preserve the names of the given list of identifiers with internal linkage. This is done using the static_name_def TDF construct. The form:
#pragma TenDRA preserve *
will preserve the names of all identifiers with internal linkage in this way.
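A minimal sketch of the second form follows. The identifiers hit_count, record_hit and record_two_hits are invented for illustration, and placing the directive ahead of the declarations it is meant to cover is an assumption rather than a documented requirement.

```c
/* Assumption: the directive is given before the declarations it should cover. */
#pragma TenDRA preserve *

static int hit_count = 0;       /* internal linkage object */

static int record_hit(void)     /* internal linkage function */
{
    return ++hit_count;
}

int record_two_hits(void)
{
    record_hit();
    return record_hit();
}
```

With the names preserved, a profiler would be expected to report record_hit under that name rather than under one made up by the installer.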
11.4. External volatility
Older dialects of C treated all identifiers with external linkage as if they had been declared volatile (i.e. by being conservative in optimising such values). This behaviour can be enabled using the directive:
#pragma TenDRA external volatile_t
Once given, this directive instructs the checker to treat any object declared with external linkage (ISO C standard, section 6.1.2.2) as if it were volatile (ISO C standard, section 6.5.3). This was a feature of some traditional C dialects. In the default mode, objects with external linkage are treated as volatile only if they were declared with the volatile type qualifier.
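The effect can be sketched as follows, assuming the directive is given at the start of the file. The names tick_count, local_ticks, on_tick and ticks_seen are hypothetical.

```c
#pragma TenDRA external volatile_t

int tick_count = 0;             /* external linkage: treated as if volatile */
static int local_ticks = 0;     /* internal linkage: not affected */

void on_tick(void)
{
    tick_count++;
    local_ticks++;
}

int ticks_seen(void)
{
    return tick_count;
}
```

In the default mode the same treatment would require writing volatile int tick_count explicitly.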
11.5. Function linkage
A change in ISO C++ relative to older dialects is that the language linkage of a function now forms part of the function type. For example:
extern "C" int f ( int ) ;
int ( *pf ) ( int ) = f ; // error
The directive:
#pragma TenDRA++ external function linkage on
can be used to control whether function types with differing language linkages, but which are otherwise compatible, are considered compatible or not.
Note that it is not possible in ISO C or C++ to declare objects or functions with internal linkage in a block. While static object definitions in a block have a specific meaning, there is no real reason why static functions should not be declared in a block. This behaviour can be enabled using the directive:
#pragma TenDRA block function static allow
Inline functions have external linkage by default in ISO C++, but internal linkage in older dialects. The default linkage can be set using the directive:
#pragma TenDRA++ inline linkage linkage-spec
where linkage-spec can be external or internal. Similarly, const objects have internal linkage by default in C++, but external linkage in C. The default linkage can be set using the directive:
#pragma TenDRA++ const linkage linkage-spec
11.6. Resolving linkage problems
Often the way that identifier names are resolved can alter the semantics of a program. For example, in:
void f () {
    {
        extern void g ();
        g ( 3 );
    }
    g ( 7 );
}
the external declaration of g is only in scope in the inner block of f. Thus, at the second call of g, it is not in scope, and so is inferred to have declaration:
extern int g ();
(see 3.4). This conflicts with the previous declaration of g which, although not in scope, has been registered in the external namespace. The pragma:
#pragma TenDRA unify external linkage on
modifies the algorithm for resolving external linkage by searching the external namespace before inferring a declaration. In the example above, this results in the second use of g being resolved to the previous external declaration. The on can be replaced by warning to give a warning when such resolutions are detected, or off to switch this feature off.
Another linkage problem, which is left undefined in the ISO C standard, is illustrated by the following program:
int f () {
    extern int g ();
    return ( g () );
}
static int g () {
    return ( 0 );
}
Is the external function g (the declaration of which would be inferred if it was omitted) the same as the static function g? Had the order of the two functions been reversed, there would be no doubt that they were; in the given case, however, it is undefined. By default, the linkage is resolved externally, so that the two uses of g are not identified. However, the checker can be made to resolve its linkage internally, so that the two uses of g are identified. The resolution algorithm can be set using:
#pragma TenDRA linkage resolution : action
where action can be one of:
- (internal) on
- (internal) warning
- (external) on
- (external) warning
- off
depending on whether the linkage resolution is internal, external, or default, and whether a warning message is required. The most useful behaviour is to issue a warning for all such occurrences (by setting action to (internal) warning, for example) so that the programmer can be alerted to clarify what was intended.
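For comparison, the reversed ordering mentioned above is well defined in ISO C and needs no resolution directive: the block-scope extern declaration takes on the linkage of the static declaration already in scope. The sketch below reuses the names f and g from the example and adds prototypes, which are not in the original.

```c
static int g(void)
{
    return 0;
}

int f(void)
{
    extern int g(void);     /* refers to the static g defined above */
    return g();
}
```

Reordering the definitions in this way is one means of making the intent explicit once the warning has drawn attention to the problem.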
A. Standard library
At present the default implementation contains only a very small fraction of the ISO C++ library, namely those headers - <exception>, <new> and <typeinfo> - which are an integral part of the language specification. These headers are also those which require the most cooperation between the producer and the library implementation, as described in C/C++ Producer Implementation.
It is suggested that if further library components are required then they be acquired from third parties. It should be noted, however, that such libraries may require some effort to be ported to an ISO-compliant compiler; for example, some information on porting the libio component of libg++, which contains some very compiler-dependent code, is given in the C++ and Portability document. Libraries compiled with other C++ compilers may not link correctly with modules compiled using tcc.