Wednesday 30 March 2016

Preprocessor Directives: C++

Pre-processor directives may not be a popular part of the language, but they are very important and useful. Here are some of their uses, in no particular order.

Each pre-processor directive must be on its own line. Unlike ordinary statements, which can share a line when separated by semicolons, two or more pre-processor directives cannot be squeezed onto one line. For example, the following would not work:

#include <stdio.h> #include <stdlib.h>

Here are some of these directives in some useful depth:

#define :

The #define directive defines two things: an identifier and a character sequence. The identifier is then used in the rest of the program much like a normal variable, and the character sequence is its value. Before the compiler proper ever sees the code, the pre-processor replaces every occurrence of the identifier with this character sequence, so it is the character sequence that takes part in every expression where the identifier appears. To distinguish an identifier defined this way from an ordinary one, it is called a macro.
The syntax is very intuitive:
#define LENGTH 20

Notice that there is no semicolon in this statement. There may be any number of spaces between the identifier and the character sequence, but once the character sequence begins, it is terminated only by a newline.
Once a macro name has been defined, it may be used as part of the definition of other macro names. For example, this code defines the values of ONE, TWO, and THREE:

#define ONE 1

#define TWO ONE+ONE
#define THREE ONE+TWO
Macro substitution is simply the replacement of an identifier by the character sequence associated with it. Therefore, if you wish to define a standard error message, you might write something like this:

#define E_MS "standard error on input\n"

/* ... */
printf(E_MS);
If the character sequence is longer than one line, you may continue it on the next by placing a backslash at the end of the line, as shown here:

#define LONG_STRING "this is a very long \
string that is used as an example"
Everything above is how you would do it in C. In C++ it is not the only way. For named constants, C++ code usually prefers the const keyword (and, since C++11, constexpr): unlike a macro, a const variable has a type and obeys the scope rules of the language.
The #define directive has another powerful feature: the macro name can have arguments. Each time the macro name is encountered, the arguments used in its definition are replaced by the actual arguments found in the program. This form of a macro is called a function-like macro. Example :

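#define ABC(a) ((a) > 0 ? 1 : -1)

The parentheses around the argument and around the whole expression ensure that the substitution still behaves correctly when the macro is used inside a larger expression.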
And now, to somewhat repeat a pattern within the article: the text immediately above describes the traditional way to define and use such macros in C. C++ offers an alternative. The inline keyword lets you write a real function and suggests to the compiler that, instead of emitting a normal call, it copy the function body into the statement in which the function is invoked.

#if #elif:

The following code snippets, together with the syntax, serve as enough explanation for the #if and #elif directives.

#if expression
statement(s)
#elif expression
statement(s)
.
.
.
#endif
The #endif at the end may seem abnormal, but every #if must be closed by one.

#else :
Apart from being used with the #elif directive, the #if directive can also be used with the #else directive.
#if expression
statements
#else
statements
#endif

#endif :

As you have already seen, this directive is used with the #if, #else, and #elif directives. It has no statements of its own to follow it. It simply marks the end of a conditional block.

#error :

The #error directive forces the compiler to stop compilation. It is used primarily for debugging. The general form of the #error directive is:

#error error-message
The error-message is not between double quotes. When the #error directive is encountered, the error message is displayed, possibly along with other information defined by the compiler. Compilation fails, so no program is produced to run.

At this point, after understanding some of these conditional directives, one may ask: why do we even need them? The question is valid and indeed a good one, since the language already has a simple, flexible and powerful if-else system for this type of decision making.

The difference between the two ways of doing apparently the same thing is when the decision is made and how much code needs compilation. If we use if and else blocks like most would do, then regardless of which condition is true, every block must be compiled. The compiler must scan, parse and process each statement, even the ones that will never run because the condition guarding them is false. This is fine for most programs, small or big. But compilation takes time, and if even the tiniest of times count, then #if and #else can help.

With pre-processor directives, only the statements under the condition that turns out to be true are passed on to the compiler. The condition is evaluated before compilation, so it can only depend on things known at that point, such as macros defined in the source or on the compiler command line, and not on values the program computes while running. The statements in any block whose condition is false are bluntly ignored by the compiler for that particular build of the source code.
This is a big advantage when writing code for big projects. In fact I've seen from experience how heavily these pre-processor directives are used in bigger projects, projects seemingly out of my comprehension. 
#ifdef :
This is like a combination of if and define directives, obvious I guess. Consider the following code:
#ifdef macro_name
statements
#endif
The code between #ifdef and #endif will be compiled only if the macro name has previously been defined using the #define directive. This may seem like a very specific use, and in fact it is. It is really a shorthand for something that could otherwise be done with a little more effort.
Consider the following macro definition using the #define directive:
#define HEY

Then the condition of the following directive is true, and hence the enclosed statements will be compiled:
#ifdef HEY
statements
#endif

The longer way of doing this uses the #if directive and the defined
operator (a keyword, if you will).
The following is an expression, so it can either be true or false.
defined HEY
Of course it will be true if the identifier HEY has been defined
previously using the #define directive. And it will be false otherwise.
Since this is an expression, it will go rather nicely with our very own
#if directive as:
#if defined HEY
statements
#endif

The above expression, defined HEY, can also be preceded by a '!' to negate it.

Again, as a shorthand for using the ! before the expression, the following directive can be used:
#ifndef HEY 
statements
#endif
So, to state the obvious, the code between #ifndef and #endif will be compiled only if the macro name has NOT been previously defined using the #define directive.

#include:
The #include directive instructs the compiler to read another source file in addition to the one that contains the #include directive. The name of the additional source file must be enclosed between double quotes or angle brackets. For example,
#include "stdio.h"
#include <stdio.h>
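The angle brackets are for headers that come with your compiler or system, such as the standard library. The pre-processor looks for them in its standard include directories. Basically you have nothing to do with these except for using them.
The double quotes are for header files written by you, the creative programmer. Most compilers look for these in the directory of the including file first, and fall back to the standard directories if nothing is found there.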

#undef :
Enough construction for now, let's destroy something. Well to be fair I'll just be undoing something. As the name suggests, this directive un-defines a macro (for lack of a better word).
#undef macro_name
Before we move to the following directives, we need to know about two identifiers that are predefined. So they are just there, you just have to use them. And if you want to use them, you better get to know them first. Let's have the introductions:
__LINE__ expands to the line number of the line of code currently being compiled, as an integer.
__FILE__ expands to the name of the file being compiled, as a string.

And now for the useful part, here is how you ought to use these:
#line:

This directive is used to change the value of the two identifiers described
above. Here is how:
#line number "filename"
The number is any positive integer, which becomes the line number of the line immediately following the directive (and so the new value of __LINE__ there). The "filename" is optional; it is any valid file identifier, and it becomes the new value of __FILE__.
This does not change the order of compilation. It only changes the line and file information reported for the code currently being compiled, for example in compiler error messages.

There are a couple of operators that go with these pre-processor directives:

# Operator:
The # operator is an entirely different thing from the pound sign used in all the directives. It works on the arguments passed to a macro, and its function is to convert whatever is passed to it into a string.
#define convrt(a) #a
In the above example, the argument a is fed to the # operator, which converts a into a string. This string is what the macro expands to wherever it is used.
So when the following line of code is written following the above directive:
cout<<convrt(hey i am 007);
It equates to the following:
cout<<"hey i am 007";

## Operator:

The ## operator concatenates. But beware, it is very different from your daily normal use string concatenation. ## operator literally concatenates things to give a name of an identifier. To see what I mean, consider the following:
#define concat(a, b) a##b
So whenever concat is called with two parameters (each being a collection of characters), the macro will return another collection of characters which is the concatenation of the parameters. This returned value can in fact be used as an identifier. So if we have an identifier with a big name like: Sum_Of_Numbers:
int Sum_Of_Numbers=5;
And now think you need to print the value of this variable. The obvious (or sane) way of doing this would be:
cout<<Sum_Of_Numbers;
But no, we must use cryptic code else how will people be impressed. Following shows how we would do this using our parameterized directive:
cout<<concat(Sum_Of, _Numbers);
The macro will produce a concatenated sequence of characters (not the data type string), and this result will replace the macro call. Hence the above statement will literally change to the following:
cout<<Sum_Of_Numbers;
Which is what we wanted to do in the first place.
Too many ingredients for a simple recipe. 

Predefined Stuff in Macros:
Along the lines of __LINE__ (see what I did there?) and __FILE__,
there are some more of these predefined identifier things. Some of
them are:

__DATE__:

This one contains a string of the form "Mmm dd yyyy" (for example
"Mar 30 2016"), which is the date of the translation of the source
file into object code.

__TIME__:

This one contains the time at which the program was compiled. The
time is contained in a string of the form: hour:minute:second.

__STDC__:

In C, __STDC__ is defined as 1 when the compiler conforms to the C
standard. In C++, whether it is defined at all, and to what value, is
implementation-defined, which means each compiler makes its own choice
and documents it. Generally, if __STDC__ is defined, the compiler is
claiming to run in a standard-conforming mode.

__cplusplus:
This one is defined by every C++ compiler and contains a value that identifies the version of the C++ standard the compiler conforms to. A conforming compiler gives it at least six digits, encoding the year and month of the standard: 199711L for C++98, 201103L for C++11 and 201402L for C++14. If the value has fewer than six digits, you have a nonconforming compiler in hand.
