8. Preprocessing checks
- 8.1. Preprocessor directives
- 8.2. Indented Preprocessing Directives
- 8.3. Multiple macro definitions
- 8.4. Macro arguments
- 8.5. Unmatched quotes
- 8.6. Include depth
- 8.7. Text after #endif
- 8.8. Text after #
- 8.9. New line at end of file
- 8.10. Conditional Compilation
- 8.11. Target dependent conditional inclusion
- 8.12. Unused headers
This chapter describes tdfc2's capabilities for checking the preprocessing constructs that arise in C.
8.1. Preprocessor directives
By default, the TenDRA C checker understands those preprocessor directives specified by the ISO C standard, section 6.8, i.e. #if, #ifdef, #ifndef, #elif, #else, #endif, #error, #line and #pragma. As has been mentioned, #pragma statements play a significant role in the checker. While any recognised #pragma statements are processed, all unknown pragma statements are ignored by default. The check to detect unknown pragma statements is controlled by:
#pragma TenDRA unknown pragma permit
The options for permit are disallow (raise an error if an unknown pragma is encountered), warning (allow unknown pragmas with a warning), or allow (allow unknown pragmas without comment).
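As a minimal sketch of how this might look in a source file, the fragment below selects the warning option and then uses a pragma the checker is not expected to recognise. The name vendor_hint and the function twice are made up for illustration; other compilers generally skip pragmas they do not know, so the fragment should still compile elsewhere.

```c
/* Ask the checker to warn about, rather than silently ignore,
   pragmas it does not recognise. */
#pragma TenDRA unknown pragma warning

/* Hypothetical vendor pragma, assumed to be unknown to the checker. */
#pragma vendor_hint fast_math

/* Ordinary code is unaffected by either pragma. */
int twice(int x)
{
    return 2 * x;
}
```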
In addition, the common non-ISO preprocessor directives, #file, #ident, #assert, #unassert and #weak may be permitted using:
#pragma TenDRA directive dir allow
where dir is one of file, ident, assert, unassert or weak. If allow is replaced by warning then the directive is allowed, but a warning is issued. In either case, the modifier (ignore) may be added to indicate that, although the directive is allowed, its effect is ignored. Thus for example:
#pragma TenDRA directive ident (ignore) allow
causes the checker to ignore any #ident directives without raising any errors.
Finally, the directive dir can be disallowed using:
#pragma TenDRA directive dir disallow
Any other unknown preprocessing directives cause the checker to raise an error in the default mode. The pragma:
#pragma TenDRA unknown directive allow
may be used to force the checker to ignore such directives without raising any errors. disallow and warning variants are also available.
8.2. Indented Preprocessing Directives
The ISO C standard allows white space to occur before the # in a preprocessing directive, and between the # and the directive name. Many older preprocessors have problems with such directives. The checker's treatment of such directives can be set using:
#pragma TenDRA indented # directive permit
which detects white space before the # and:
#pragma TenDRA indented directive after # permit
which detects white space between the # and the directive name. The options for permit are allow, warning or disallow as usual. The default checking profile allows both forms of indented directives.
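The two forms of indentation can be sketched as follows, with both checks set to warning. The macro names HYPOTHETICAL_DEBUG and LOG_LEVEL and the function log_level are illustrative assumptions, not part of the checker.

```c
#pragma TenDRA indented # directive warning
#pragma TenDRA indented directive after # warning

#ifdef HYPOTHETICAL_DEBUG
    #define LOG_LEVEL 2   /* white space before the # */
#else
#   define LOG_LEVEL 0    /* white space between the # and the name */
#endif

/* Returns whichever level the conditional above selected. */
int log_level(void)
{
    return LOG_LEVEL;
}
```

Both forms are valid ISO C, so the fragment compiles either way; the pragmas only change whether the checker comments on them.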
8.3. Multiple macro definitions
The ISO C standard states that, for two definitions of a function-like macro to be equal, both the spelling of the parameters and the macro definition must be equal. Thus, for example, in:
#define f( x ) ( x )
#define f( y ) ( y )
the two definitions of f are not equal, despite the fact that they are clearly equivalent. Tchk has an alternative definition of macro equality which allows for consistent substitution of parameter names. The type of macro equality used is controlled by:
#pragma TenDRA weak macro equality permit
where permit is allow (use alternative definition of macro equality), warning (as for allow but raise a warning), or disallow (use the ISO C definition of macro equality - this is the default setting).
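To make the difference between the two notions of equality concrete, here is a rough sketch that compares two macro definitions both ways. It is not the checker's implementation: the struct macro_def representation (arrays of token spellings) and the function names are assumptions made for illustration only.

```c
#include <string.h>

/* Hypothetical representation: parameter spellings plus the
   replacement list, each as an array of token spellings. */
struct macro_def {
    int nparams;
    const char **params;
    int ntokens;
    const char **tokens;
};

/* Position of tok in the parameter list, or -1 if it is not a parameter. */
static int param_index(const struct macro_def *m, const char *tok)
{
    int i;
    for (i = 0; i < m->nparams; i++) {
        if (strcmp(m->params[i], tok) == 0) return i;
    }
    return -1;
}

/* ISO-style equality: parameter spellings and replacement lists match. */
int strict_macro_equal(const struct macro_def *a, const struct macro_def *b)
{
    int i;
    if (a->nparams != b->nparams || a->ntokens != b->ntokens) return 0;
    for (i = 0; i < a->nparams; i++) {
        if (strcmp(a->params[i], b->params[i]) != 0) return 0;
    }
    for (i = 0; i < a->ntokens; i++) {
        if (strcmp(a->tokens[i], b->tokens[i]) != 0) return 0;
    }
    return 1;
}

/* Weak equality: parameters are compared by position, so a consistent
   renaming of parameters does not make the definitions differ. */
int weak_macro_equal(const struct macro_def *a, const struct macro_def *b)
{
    int i;
    if (a->nparams != b->nparams || a->ntokens != b->ntokens) return 0;
    for (i = 0; i < a->ntokens; i++) {
        int pa = param_index(a, a->tokens[i]);
        int pb = param_index(b, b->tokens[i]);
        if (pa != pb) return 0;
        if (pa < 0 && strcmp(a->tokens[i], b->tokens[i]) != 0) return 0;
    }
    return 1;
}
```

Applied to the two definitions of f above, the strict comparison reports a mismatch because x and y are spelled differently, while the weak comparison treats them as the same macro.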
More generally, the pragma:
#pragma TenDRA extra macro definition allow
allows macros to be redefined, both consistently and inconsistently. If the definitions are incompatible, the first definition is overwritten. This pragma has a disallow variant, which resets the check to its default mode.
8.4. Macro arguments
According to the ISO C standard, section 6.8.3, if a macro argument contains a sequence of preprocessing tokens that would otherwise act as a preprocessing directive, the behaviour is undefined. Tchk allows preprocessing directives in macro arguments by default. The check to detect such macro arguments is controlled by:
#pragma TenDRA directive as macro argument permit
where permit is allow, warning or disallow.
The ISO C standard, section 6.8.3.2, also states that each # preprocessing token in the replacement list for a function-like macro shall be followed by a parameter as the next preprocessing token in the replacement list. By default, if tdfc2 encounters a # in a function-like macro replacement list which is not followed by a parameter of the macro, an error is raised. The checker's behaviour in this situation is controlled by:
#pragma TenDRA no ident after # permit
where the options for permit are allow (do not raise errors), disallow (default mode) and warning (raise warnings instead of errors).
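For contrast, the conforming use of # looks like the sketch below, where the # is immediately followed by the macro's parameter. The names STR, BAD and stringized are illustrative only.

```c
/* Conforming: the # in the replacement list is followed by a parameter. */
#define STR(x) #x

/* By contrast, a definition such as
       #define BAD(x) # 1
   has a # that is not followed by a parameter of the macro, which is
   the situation the pragma above controls. */

/* Returns the spelling of the argument as a string literal. */
const char *stringized(void)
{
    return STR(hello);
}
```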
8.5. Unmatched quotes
The ISO C standard, section 6.1, states that if a ' or " character matches the category of preprocessing tokens described as single non-whitespace-characters that do not lexically match the other preprocessing token categories, then the behaviour is undefined. For example:
#define a 'b
would result in undefined behaviour. By default the ' character is ignored by tdfc2. A check to detect such statements may be controlled by:
#pragma TenDRA unmatched quote permit
The usual allow, warning and disallow options are available.
8.6. Include depth
Most preprocessors set a maximum depth for #include directives (which may be limited by the maximum number of files which can be open on the host system). By default, the checker supports a depth equal to this maximum number. However, a smaller maximum depth can be set using:
#pragma TenDRA includes depth n
where n can be any positive integral constant.
8.7. Text after #endif
The ISO C standard, section 6.8, specifies that #endif and #else preprocessor directives do not take any arguments, but should be followed by a newline. In the default checking mode, tdfc2 raises an error when #endif or #else statements are not directly followed by a new line. This behaviour may be modified using:
#pragma TenDRA text after directive permit
where permit is allow (no errors are raised and any text on the same line as the #endif or #else statement is ignored), warning or disallow.
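Where the trailing text is only there to label the conditional, a portable alternative to relaxing the check is to put the label in a comment, since comments are removed before directives are processed. A small sketch, with FEATURE_X, FEATURE_NAME and feature_name as made-up names:

```c
#define FEATURE_X 1   /* hypothetical feature switch */

#if FEATURE_X
#define FEATURE_NAME "x"
#else /* !FEATURE_X */
#define FEATURE_NAME "none"
#endif /* FEATURE_X */

/* Returns the name selected by the conditional above. */
const char *feature_name(void)
{
    return FEATURE_NAME;
}
```

Writing the label as bare text instead (for example #endif FEATURE_X) is the form that the pragma above would be needed to accept.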
8.8. Text after #
The ISO C standard specifies that a # occurring outside of a macro replacement list must be followed by a new line or by a preprocessing directive and this is enforced by the checker in default mode. The check is controlled by:
#pragma TenDRA no directive/nline after ident permit
where permit may be allow, disallow or warning.
8.9. New line at end of file
The ISO C standard, section 5.1.1.2, states that source files must end with new lines. Files which do not end in new lines are flagged as errors by the checker in default mode. The behaviour can be modified using:
#pragma TenDRA no nline after file end permit
where permit has the usual allow, disallow and warning options.
8.10. Conditional Compilation
Tchk generally treats conditional compilation in the same way as other compilers and checkers. For example, consider:
#if expr
.... /* First branch */
#else
.... /* Second branch */
#endif
the expression, expr, is evaluated: if it is non-zero the first branch of the conditional is processed; if it is zero the second branch is processed instead.
Sometimes, however, tdfc2 may be unable to evaluate the expression statically because of the abstract types and expressions which arise from the minimum integer range assumptions or the abstract standard headers used by the tool (see target-dependent types in section 4.5). For example, consider the following ISO compliant program:
#include <stdio.h>
#include <limits.h>
int main () {
#if ( CHAR_MIN == 0 )
    puts ("char is unsigned");
#else
    puts ("char is signed");
#endif
    return ( 0 );
}
The TenDRA representation of the ISO API merely states that CHAR_MIN - the least value which fits into a char - is a target dependent integral constant. Hence, whether or not it equals zero is again target dependent, so the checker needs to maintain both branches. By contrast, any conventional compiler is compiling to a particular target machine on which CHAR_MIN is a specific integral constant. It can therefore always determine which branch of the conditional it should compile.
In order to allow both branches to be maintained in these cases, it has been necessary for tdfc2 to impose certain restrictions on the form of the conditional branches and the positions in which such target-dependent conditionals may occur. These may be summarised as:
- Target-dependent conditionals may not appear at the outer level. If the checker encounters a target-dependent conditional at the outer level an error is produced. In order to continue checking in the rest of the file an arbitrary assumption must be made about which branch of the conditional to process; tdfc2 assumes that the conditional is true and the first branch is used;
- The branches of allowable target-dependent conditionals may not contain declarations or definitions.
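The two restrictions can be illustrated with a sketch like the following. It assumes that INT_MAX is treated as target dependent in the same way as CHAR_MIN in the example above; the names counter_t and counter_limit and the numeric limits are invented for illustration.

```c
#include <limits.h>

/*
 * Rejected under the restrictions above: the conditional is at the
 * outer level and its branches contain declarations.
 *
 *   #if ( INT_MAX > 32767 )
 *   typedef int counter_t;
 *   #else
 *   typedef long counter_t;
 *   #endif
 */

/* Accepted form: the conditional sits inside a function body and its
   branches contain only statements. */
long counter_limit(void)
{
    long limit;
#if ( INT_MAX > 32767 )
    limit = 100000L;
#else
    limit = 30000L;
#endif
    return limit;
}
```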
8.11. Target dependent conditional inclusion
One of the effects of trying to compile code in a target independent manner is that it is not always possible to completely evaluate the condition in a #if directive. Thus the conditional inclusion needs to be preserved until the installer phase. This can only be done if the target dependent #if is more structured than is normally required for preprocessing directives. There are two cases; in the first, where the #if appears in a statement, it is treated as if it were an if statement with braces including its branches; that is:
#if cond
    true_statements
#else
    false_statements
#endif
maps to:
if ( cond ) {
    true_statements
} else {
    false_statements
}
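A concrete instance of this mapping, written as two functions so the forms can be compared side by side, might look like the sketch below. The function names are illustrative; on any one target both functions return the same value.

```c
#include <limits.h>

/* Preprocessor form: the conditional appears where a statement could. */
int char_is_unsigned_pp(void)
{
    int result;
#if ( CHAR_MIN == 0 )
    result = 1;
#else
    result = 0;
#endif
    return result;
}

/* Statement form that the conditional above is treated as. */
int char_is_unsigned_if(void)
{
    int result;
    if ( CHAR_MIN == 0 ) {
        result = 1;
    } else {
        result = 0;
    }
    return result;
}
```

Note that the braces introduce a scope, which is one reason declarations inside the branches do not carry over to the code that follows.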
In the second case, where the #if appears in a list of declarations, the checker normally gives an error. This can however be overridden by the directive:
#pragma TenDRA++ conditional declaration allow
which causes both branches of the #if to be analysed.
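As a sketch of the second case, the fragment below places a target-dependent #if in a list of declarations after the directive. The type name byte_like and the helper byte_like_size are hypothetical, and how the checker reconciles the two declarations is not shown here.

```c
#include <limits.h>
#include <stddef.h>

#pragma TenDRA++ conditional declaration allow

/* Target-dependent conditional in a list of declarations:
   with the pragma above, both branches are analysed. */
#if ( CHAR_MIN == 0 )
typedef unsigned char byte_like;
#else
typedef signed char byte_like;
#endif

/* Either branch yields a character type, so the size is always 1. */
size_t byte_like_size(void)
{
    return sizeof(byte_like);
}
```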
8.12. Unused headers
Header files which are included but from which nothing is used within the other source files comprising the translation unit might just as well not have been included. Tchk can detect top level include files which are unnecessary, by analysing the tdfc2dump output for the file. This check is enabled by passing the -Wd,-H command line flag to tcc. Errors are written to stderr in a simple ASCII form by default, or to the unified dump file in dump format if the -D command line option is used.
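A translation unit of the kind this check is aimed at could look like the following sketch, in which one of the two top level headers contributes nothing. The function print_greeting is illustrative, and the exact wording of the resulting report is not reproduced here.

```c
#include <stdio.h>
#include <string.h>   /* nothing from this header is used below */

/* Uses puts from <stdio.h> only; returns 0 on success, -1 on failure. */
int print_greeting(void)
{
    return puts("hello") < 0 ? -1 : 0;
}
```

With the -Wd,-H flag described above, <string.h> would be the candidate for an unused header report, while <stdio.h> is needed for puts.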