Notes on Programming Practices
Here are a few examples that record what, in my opinion, makes up good
C style. If you remember anything from this guide, remember this: The
code's looks should reflect its intent.
More Purity
Bad: if ((i == 2) && (j == 3)) ...
Good: if (i == 2 && i == 3) ...
Equality has precedence over logical &&, and the parentheses only
clutter the line. Memorize precedence rules and don't over- or
under-parenthesise. Exceptions are cases which are usually
gotten wrong (see next example).
Bad: a = b + c << 2;
Good: a = (b + c) << 2; /* Same thing */
Most people might think that shift has higher precedence than addition
because it is usually used as fast multiplication, but it's actually
lower. Use parentheses here to clarify the semantics of the expression.
Same with bit-wise operators (|, &, ^), which people often get wrong,
and the ternary ?: operator, which has very low precedence but should
have the first operator parenthesised for clarity.
Bad: return (a);
Good: return a;
There's no need for parentheses with "return" and it only clutters
the line.
Bad: #define ELEMENT_SIZE 4
Good: #define ELEMENT_SIZE sizeof (long)
This is not just a portability issue, since longs are likely to stay
4-bytes. It's more for the reader who gets no information from the
"4", whereas understands right away that an element is a "long" in
the second case.
Bad: i = *(s + j);
Good: i = s[j];
The code is equally fast and is more clear. Remember that "a[b]" is
semantically equivalent to "*((a) + (b))".
Bad: int strcpy (char *, char *); /* Prototype */
Good: int strcpy (char *dest, char *src);
Use variable names in prototypes of header files so that the user can
use it as a quick reference.
Bad: if (i == 5)
a++;
Good: if (i == 5) {
a++;
}
It is always a good idea to put braces around blocks, even if they are
only one line long. This can catch two kinds of bugs: a line can be
added in the "if" without realizing that braces should be added; and
the line "a++;" can be commented out without realizing that the "if"
will now apply to the next statement.
Bad: if (!strcmp (a, b)) ...
Good: if (strcmp (a, b) == 0) ...
Don't use the ! operator on values that are not booleans. The "strcmp"
function returns an integer which is less than 0 if a < b, 0 if a == b,
and greater than 0 if a > b. (Note: it does not return -1, 0, 1.)
The ! operator here is not faster and is misleading because the line
reads, "if not string compare" whereas it means "if the strings
compare." The same is true about pointers -- don't say "if (!ptr)"
when "if (ptr == NULL)" is just as fast and much more readable, and
same with "if (ptr)" when "if (ptr != NULL)" is better.
Bad: while (*s++);
Good: while (*s++) {
/* Empty Body */
}
Make it clear that you intended for the body to be empty and that it's
not just a bug.
Bad: case FOO: blah ();
bleh ();
case BAR: pizza ();
break;
Good: case FOO: blah ();
bleh ();
/* Fall through */ /* or */
/*FALLTHROUGH*/ /* lint syntax */
case BAR: pizza ();
break;
Make it clear that you intended to fall through and that it's not just a
bug.
Bad: char ch = 0;
Good: char ch = '\0';
Use character constants whenever possible. This include \n, \r, \t, etc.
This makes it clear to the reader what you meant. Remember you want
the syntax to mirror your intent.
Bad: #endif
Good: #endif /* KERNEL */
Always document #endif's because they are often hard to match to their
#ifdef.
Bad: for (;;) ...
Good: while (1) ...
Better:while (TRUE) ...
The "while" is clearer. The "for" loop should be reserved when looping
a variable from one value to another. It is safe to define FALSE as 0
and TRUE as 1.
Bad: int done; /* Variable local to this source file */
Bad: void copy() ... /* Function local to this source file */
Good: static int done;
Good: static void copy() ...
If a global variable or function is only needed by one object file, then
make it static to reduce the size of the symbol table and the
possibilities of conflicts, and to give the compiler more information
for optimizations.
Bad: fouri = i << 2;
Good: fouri = i * 4;
Every compiler should be able to optimize multiplications into shifts
when possible. Dennis Richie's first C compiler that fit in 4k of
RAM on a PDP-7 even did this. Make the code clearer by making an
explicit multiply. Same for divide. You want the code to reflect the
intent.
Bad: int * x, y, z;
Good: int *x, y, z;
The former implies that all three variables are pointers. Be careful
when you see this in someone else's code -- they might have meant for
all three to be pointers and wrote it wrong.
Bad: int count = 5; /* Local variable */
Good: int count; /* Local variable */
count = 5;
Don't combine the initialization of a local (or global, really) variable
with its definition. This is bug-prone because if, say, you add a loop
inside the function to do everything several times, you may forget
that you have to initialize it at the beginning of each loop. Stick
the initialization just before the variable is first used and it will avoid
problems if the code is later modified or reused.
Bad: if ((i = evaluate (x, y, z)) == -1) { ...
Good: i = evaluate (x, y, z);
if (i == -1) { ...
Don't bother sticking both assignment and check into one expression. It
is error-prone and hard to read. Sometimes this is necessary, such as:
while ((i = evaluate (x, y, z)) != -1) { ...
which is cleaner than calling the function before the loop and at the
end of it. In general avoid obfuscating the code. There are some
exceptions, such as:
if ((f = fopen ("file.txt", "r")) == NULL) { ...
that are so common that they're easy to read this way. Under no
circumstances should you make the comparison implicit:
if (len = read (f, buf, BUFLEN)) { ...
The above line should generate a warning on decent compilers, and rightly
so. Take the assignment out of the conditional and test "len" seperately.
Bad: f = (float)((float)x / (float)y);
Good: f = (float)x / y;
Read and memorize the coercion rules. In the above example, only one of
the operands of "/" needs to be converted to floating point for the
other (and the result) to be a float. The other coercions only clutter
the code. Keep in mind that (float)(x / y) is wrong if x and y are
integers, and that z * (x / y) is also wrong if x and y are integers,
even if z is a float. (Wrong in the sense that the divide will be an
integer divide -- this may be what you want, but not likely.)
More Speed
Bad: long i; /* For small loops */
Bad: short i; /* For small loops */
Good: int i;
For small loops where the loop variable is not likely to exceed 2^32,
use "int" since it's usually the native integer size of the machine.
This allows the compiler to generate the fastest code whereas it might
have to generate a lot of extra code (or slower instructions) for
unnatural sizes.
Bad: int i; /* For positive numbers */
Good: unsigned int i;
For some operations, the compiler has to generate extra code to check
for negative numbers, etc. You can avoid that if you know for sure that
the number will never be negative.
Bad: for (i = 0; i < 10; i++) {
printf ("Hello "); /* Body loop does not use "i" */
}
Good: for (i = 10; i > 0; i--) {
printf ("Hello "); /* Body loop does not use "i" */
}
It is easier for the compiler to compare against zero than against 10
(or a more complex expression involving variables). Count backwards if
you don't care about the loop variable. Use
for (i = 9; i >= 0; i--)
if you don't case about the direction of the loop variable (e.g., clearing
an array), but make sure to use a signed integer.
More Correctness
Bad: #define sqr(x) (x*x)
Good: #define sqr(x) ((x)*(x))
Since the parameter is expanded textually, you'll get incorrect
results if, say, sqr(a+b) is called.
Bad: #define sqr(x) (x)*(x)
Good: #define sqr(x) ((x)*(x))
You want the pseudo-function sqr() to be atomic in case it is used
next to another operator of equal or greater precedence. For example,
the expressions a/sqr(b) would be incorrect in the bad case.
Bad: i = (int)(f * 255); /* Convert 0.0 - 1.0 to 0 - 255 */
Good: i = (int)(f * 255 + 0.5);
Better:i = (int)(f * 256);
if (i == 256) {
i = 255;
}
Best: i = (int)(f * 256);
i -= (i == 256); /* Clamp to 255 */
The first case only generates 255 if "f" is 1.0, which is not quite right;
it should round up so as to not bias the other numbers too much. The
"better" solution is most accurate, but requires the extra conditional.
The last solution is a horrible hack, but may avoid a jump on some
machines. Make sure to comment.
Bad: char s[20];
Good: unsigned char s[20]; /* String */
Good: signed char s[20]; /* Array of small integers */
Don't assume anything about the sign of "char" because it's compiler-
dependent. Usually unsigned characters are what's needed for strings.
Even if you're sure that the sign of the characters doesn't matter (e.g.,
they will only be assigned to each other and never to an int), still
explicitly declare the sign to minimize future bugs were things to change.
Bad: void main (void) ...
Good: int main (void) ...
Good: int main (int argc, char *argv[]) ...
"main" should always return an integer. You never know when it will be
used in a Makefile. Always return 0 on success and non-0 on failure.
Bad: if (bool == TRUE) ...
Good: if (bool) ...
Don't compare booleans directly to TRUE (1), because a boolean can
be true whenever it's not zero. It's safe to compare it to FALSE (0),
but (!bool) is clearer. Logical operators (&&, ||, ==, etc.) are
guaranteed to return 0 or 1, but functions (isatty(), etc.) are
not.
Bad: i = (getchar () << 8) | getchar ();
Good: i = getchar () << 8;
i |= getchar ();
You're not guaranteed anything about the order of evaluation, so the
getchar's may actually be called in the wrong order. Split the
calls up to be sure. The only operators that guarantee that the
left side will be evaluated before the right are &&, ||, and comma
(,).
Bad: double x, y; if (x == y) ...
Good: double x, y; if (fabs (x - y) < EPS) ...
Better:double x, y; if (-EPS < x - y && x - y < EPS) ...
Don't compare floats or doubles to each other for equality because they
are unlikely to be exactly the same. Use a small epsilon, such as
0.0001, for comparison. The value of EPS depends on the size of the
reals and the application itself. The second good method avoids a
function call but is messy and should be put in a macro.
Bad: if (bytesleft < sizeof (struct vertex)) ...
Good: if (bytesleft < (int)sizeof (struct vertex)) ...
The sizeof() operator usually returns an unsigned quantity (size_t) and
this will cause the "bytesleft" variable to be cast to an unsigned
variable as well. (If an unsigned and a signed quantity of the same
size are operands to an operator, the signed value is cast to an
unsigned.) Negative values for "bytesleft" will then cause the
comparison to fail, which is probably not what's desired. Cast
sizeof() operators to integers just to be safe. In general, be
careful about comparisons between signed and unsigned values.