Code Style and Documentation Standards


Overview

This document specifies the style and documentation policy for students in the Department of Computer Science and Applied Mathematics at IIT. Students taking courses in the Department must adhere to these guidelines unless exempted by the instructor. Exemptions will be granted to students who are required to follow some other explicit set of C/C++ style and documentation guidelines in real life, if they prefer to conform to those guidelines and provide them to the instructor. The purpose of these guidelines is threefold:

Indentation

All indentation levels should be the same size, between three and eight spaces. Indentation must consist of some combination of spaces and tab characters. If your editor allows you to set the amount of horizontal space taken up by a tab character (the distance between tab stops), set the tab size to eight spaces -- your printer will interpret the tab stops to be eight spaces, so make your editor match. With tab stops eight spaces apart, you may find it convenient to use four spaces (half a tab stop) as your basic indentation level size. No line of code should extend beyond the 80th column.

Variable Names

Variable names should be descriptive, all lowercase, and no more than 31 characters. Underscores may be used to enhance readability. Variable names used only as loop counters may be i, j, etc., if a more descriptive name does not seem appropriate to the student. All other variable names must be descriptive names, not just a single letter. The left side of comments should line up vertically whenever possible. In practice, experienced C++ programmers start most of their comments at about 40 columns from the left margin (at about the middle of the page or screen).

Examples:

                       
    int    i;                        // loop counter
    char   buff[BUF_SIZE];           // buffer to hold disk sector
    char  *str_ptr;                  // pointer to current position in
                                     // ... string

    for (i=0 ; i<BUF_SIZE ; i++)
        buff[i] = *str_ptr;          // fill buffer with current char

Global variables should be avoided, unless specified by the instructor in the assignment. In practice, even large software projects with dozens of source files can be written without global variables. Global variables produce an undesirable coupling of the functions in a program via the common shared global data items. One has to scan all the lines in all the functions in all the files of the program to find where the global data is used, since globals are visible everywhere. The primary methods of avoiding global variables are:

  1. declare struct's in header files,
  2. pass data items, or pointers to them -- including struct's -- as arguments to functions (using struct's can reduce the number of arguments)
  3. define certain data items as static extern data.

Let's elaborate on this last technique. In practice, large C++ projects are usually divided into multiple source files on the basis of what data needs to be referenced. The functions that access the same data items are put into the same source file. In C++, if we declare as static a variable/struct/array/etc. defined outside of any function (that would be global), then it is visible only in the source file in which it is defined (for example static int i_have_file_scope;). Sometimes, that is exactly what you want. For instance, you might have a source file that opens and closes files and does file I/O. All the functions in that file need to see the file pointers (e.g. static FSTREAM *in_fp, *out_fp;).
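As a minimal sketch of this technique, the fragment below keeps a counter at file scope and lets other files reach it only through two functions. The file name and the names error_count, record_error, and get_error_count are hypothetical, chosen only for illustration, and the function header comments are omitted to keep the sketch short.

```cpp
//  error_log.cpp (hypothetical file name, for illustration only)

static int  error_count = 0;            // visible only in this source file


void
record_error(void)
{
    error_count++;                      // only functions in this file can
                                        // ... change the counter
}


int
get_error_count(void)
{
    return error_count;
}
```

Code in other source files can call record_error() and get_error_count(), but it cannot name error_count directly, so every use of the counter can be found by reading this one file.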

Constants

#define creates a named macro. When doing systems programming in C++, #define is used extensively to make code more readable. If you need a constant, const int (or float, ...) should be used instead of #define.

Use self-explanatory named constants like LINELEN instead of confusing numeric literals ("magic numbers") like 80. No magic numbers are to be placed in code as literals. There are a few exceptions, such as comparing against zero in a for loop; use your judgement to decide what's reasonable.

The constant name should be all uppercase and meaningful. Any constants used in only one file should be defined in that file. Constants used in more than one file should be put in a .h file and #include'd. If you use #define, note that a #define creates a macro, which works by blind string substitution done by a pre-processing step at the very beginning of compiling your file. It does not (and should not be used to) define a variable. By contrast, the const keyword is part of a variable definition. Normally, one declares something to be const static, so the variable won't be global. Because it defines a variable, the const keyword does not get used in .h files.
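A small sketch of a const static constant, assuming the LINELEN name mentioned above and a hypothetical function line_fits that exists only for this illustration:

```cpp
const static int  LINELEN = 80;         // maximum characters on one line


//  Returns true if a line of the given length fits within LINELEN.
bool
line_fits(int length)
{
    return length <= LINELEN;
}
```

Unlike a #define, LINELEN here is a typed variable, and because it is static it is visible only in the file that defines it.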

Bad example of the use of #define:

    #define  ONE     1

Good examples of the use of #define:

    #define  BUFFERSIZE           512
    #define  ERROR_OVERFLOW         0
    #define  ERROR_BAD_MEDIA        1
    #define  CONTROL(letter)     ((letter) - '@')
    #define  MAIN_DOS_INTERRUPT  0x21

The CONTROL macro #define'd above takes an upper-case letter as its argument and lets you write

    if (keystroke == CONTROL('K'))      // self-explanatory

instead of

    if (keystroke == 11)                // a mystery
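The two tests are equivalent only because of how the character set is numbered. The sketch below assumes ASCII, where '@' is 64 and 'K' is 75, so CONTROL('K') expands to 75 - 64, which is 11. The function name is_control_k is hypothetical.

```cpp
#define  CONTROL(letter)     ((letter) - '@')


//  Returns true if keystroke is the control code for the letter K.
//  Assumes the ASCII character set.
bool
is_control_k(int keystroke)
{
    return keystroke == CONTROL('K');   // self-explanatory
}
```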

Functions

Function names should be descriptive, lowercase, and no more than 31 characters. Each function must be preceded by a standard documentation header.

//****************************************************************************
//
//  function_name (or FunctionName)
//  A one line statement of what the function does.
//
//  A more complete description of the function, its role in your program
//  and/or anything else you think ought to be said about this function.
//  This description may be omitted if you don't have anything that needs
//  to be said here.  On the other hand, the one-line description above must
//  be given.  If you can't figure out how to say, in one line, what your
//  function does, then you should take that as a big hint that your function
//  probably could be split into two or more functions.
//
//  The typical well-conceived function can be described by just its name.
//
//  Input Params:     
//  a list of the types, names, and descriptions of all those parameters that
//  are not modified by the function
//
//  Output Params:    
//  a list of the types, names, and descriptions of all those parameters that
//  are modified by the function
//
//  Returns:
//  A statement of the type and meaning of what the function returns, including
//  both the normal return value and any error codes.  This "returns" part of
//  the function header may be omitted if the function is of type void.
//
//  Calls/References:
//  A list of all other functions this one calls, and all the global or static
//  extern data items, file pointers, etc. that this function references that
//  are not passed as arguments to it.  This "calls/references" part of the
//  function header may be omitted if nothing is called/referenced.
//
//****************************************************************************

return_type
function_name( param1_type   param1[], param2_type  *param2,      
               param3_type   param3 )     
{
    some_type     local_var1[],            // what it is and how used
                  local_var2;              // comment
    local3_type   *local_var3;             // comment

    cout << "This is the first statement of the function" << endl;
    return some_expression;
}

You should follow the above format fairly closely, including indentation, blank lines, and the function header comment. The header comment may be distinguished by using either the C++ "//" comment style or the traditional C style of "/* ... */". The header comment must be preceded either by at least three blank lines or by a form-feed (a form-feed is just ordinary whitespace). This makes it easier to find the beginning of your function. Following the function header, there is one blank line and then the actual function, formatted as shown above. Note the following points:

For example:
void
myfunc(void)
{
    local variables    // Comments explaining variables

    code
}
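
To make the template concrete, here is one way a filled-in header and function might look. The function sum_buffer and its parameters are hypothetical and exist only to illustrate the layout; the optional long description and the Calls/References part are omitted because there is nothing to say in them.

```cpp
//****************************************************************************
//
//  sum_buffer
//  Adds up the first count elements of an integer array.
//
//  Input Params:
//  const int  values[]    the numbers to add up
//  int        count       how many elements of values[] to use
//
//  Output Params:
//  none
//
//  Returns:
//  int -- the sum of the elements; 0 if count is zero or negative.
//
//****************************************************************************

int
sum_buffer( const int   values[],
            int         count )
{
    int    i;                           // loop counter
    int    sum = 0;                     // running total

    for (i = 0; i < count; i++)
        sum += values[i];
    return sum;
}
```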

All functions must be prototyped. The prototypes for those functions that may be called from other source files should be placed in a header file, and the header file should be #include'd in the source files that contain the actual code of the functions, as well as the source files which call them. The list of function prototypes in a header file should be grouped by the source files which actually contain the functions (there should be a comment for each group telling which source file the group is for). Each group of function prototypes should be alphabetized by function name. Grouping and alphabetizing will significantly help in understanding a large program with many source files and many functions.

You should declare as static any functions which should not be callable from other source files. This too will significantly help in understanding a large program with many source files.
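
The sketch below shows these two rules together in a single listing. The file names shapes.h and shapes.cpp and all three function names are hypothetical; in a real program the first group of prototypes would sit in the header file and the rest in the source file. Function header comments are omitted to keep the sketch short.

```cpp
//  ---- would appear in shapes.h: functions defined in shapes.cpp ----
int  rect_area(int width, int height);
int  rect_perimeter(int width, int height);

//  ---- would appear in shapes.cpp: not callable from other files ----
static int  twice(int value);


static int
twice(int value)
{
    return value + value;
}


int
rect_area(int width, int height)
{
    return width * height;
}


int
rect_perimeter(int width, int height)
{
    return twice(width + height);
}
```

Because twice() is static, a reader of any other source file knows immediately that it cannot be called from there.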

Source File Layout

Each source file should be laid out as follows:

Put three or more blank lines (or a form-feed) at each place where there is a blank line above. The source file header comment is of the following form:

//************************************************************************
//
//  filename.ext
//
//  Written by:    your name
//  Date:          dates written and modified, earliest first
//  Environment:   Turbo C++, MSDOS 6.2
//
//  Description of the overall goal of this source file.
//
//  Entry Points:
//      List of functions available externally.
//
//  Local Functions:
//      List functions not used externally and static functions.
//
//  Control Flow:
//      Description of the interrelations of the functions in
//      this file.  This section may be omitted if the
//      relationships among the functions in this file (which
//      motivated you to put them in the same file) are something
//      other than flow of control -- for example, the functions
//      share some data representation.  At least the file
//      containing the main() function should have one of these
//      "Control Flow" comments describing how the program goes
//      together.
//
//************************************************************************

Formatted Comments

The programs which reformat C++ source files recognize comments beginning with /** or with /*- as being comments which the program should not reformat. All comments in which you have an intentional two-dimensional format on the screen or page are to begin with /** possibly followed by additional asterisks. The comment block that begins each source file and the comment block that precedes each function are examples. Another example is a comment in which you draw a diagram or picture of some sort, using the ASCII character set. You may of course come up with some similar system using the "//" comment style (see example above).

General Spacing Guidelines

In general, each language keyword, arithmetic operator, and assignment operator should be preceded and followed by one space. A detailed study of the examples below is recommended.

A space or spaces may be omitted to emphasize the associativity and precedence in an expression, or to make the structure of a for loop statement visually obvious, as is shown below. The left-parenthesis of the set required by if, while, for, and do/while statements should be preceded by one space. The right-parenthesis of the required set should be followed either by a newline or by a space, left-curly-brace, and newline. If a condition inside the set of parentheses is too long to fit on one line, it may be continued on the following line(s), indented past the left-parenthesis. Operators should be lined up vertically in such cases. Examples:

    if ( system_status == OK
         && motor_on == TRUE
         && disk_read == TRUE )
    {
        do the read
    }

    sum += new_value;

    for (i=0 ; i<BUFFER_SIZE ; i++)
        if (sum < 200)
            return status_value;

    while (i < BUFFER_SIZE)
        buffer[pos] = input_char;
    idx++;

Note the use of spacing and parentheses for readability below.

    if ((cnt + pos*3 - 1) > limit) {
        do some calculation
    }

Each new block is to be indented one indentation level. The size of indentation levels must be equal. The size should be a constant, four to eight spaces. Lines are to be no more than 80 columns wide. Tab characters (tab stops) are to be taken as eight characters, because that is how printers will interpret them.

Formatting of the Basic Statements of the Language

Your general formatting of each of the basic statements of the language should conform to one of the models below. If you declare additional local variables after the opening curly brace associated with some statement, there must be a blank line after those local variable declarations. Note that there are two general styles of curly brace use. You must pick just one curly brace style and use it consistently. Note that in both styles the closing curly brace is always exactly aligned with the beginning of the associated keyword.

    for (i = 0; i < BUFFER_SIZE; i++) {   // Kernighan & Ritchie
        ....                              // ... brace style
    }

    for (i = 0; i < BUFFER_SIZE; i++)     // Alternate brace style
    {                                     // that is also common.
        ....
    }

If you want, in a for loop you can omit the spaces in each of the three main parts inside the parentheses at the top and/or put a space before the semicolons between them.

    for (i=0 ; i<BUFFER_SIZE ; i++)

You have two indentation options in a switch statement:

switch (expression) {              switch (expression) {
case 0 :  code;                        case 0 :
          break;       OR                  code;
case 1 :  code;                            break;
          break;                       case 1 :
default:  code;                            code;
}                                          break;
                                       default :
                                           code;
                                   }
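
As an illustration of the second layout in compilable form, the sketch below maps two error codes to text. The function error_text and its messages are hypothetical; the constants reuse the names and values from the #define examples earlier but are written as const static variables.

```cpp
#include <string>

const static int  ERROR_OVERFLOW  = 0;  // buffer could not hold the data
const static int  ERROR_BAD_MEDIA = 1;  // disk could not be read


//  Returns a short description of error_code.
std::string
error_text(int error_code)
{
    std::string  text;                  // message handed back to the caller

    switch (error_code) {
        case ERROR_OVERFLOW :
            text = "overflow";
            break;
        case ERROR_BAD_MEDIA :
            text = "bad media";
            break;
        default :
            text = "unknown error";
    }
    return text;
}
```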

Of course, you can pick either style of curly brace use:

if (condition1)         OR       if (condition1) {
{                                    <body>
    <body>                           <more body>
    <more body>                  }
}                                else if (condition2) { 
else if (condition2)                 <something>
{                                }
    <something>                  else {
}                                    <continuing>
else                                 <out>
{                                }
    <continuing>
    <out>
}

Since the do/while statement and the while statement look very similar, it is customary to leave a blank line before and after each do/while loop, to set it off visually. Experience shows it is also important to put the while condition on the same line as the ending curly brace, and to never omit the braces:

do                                 do {
{                                      <body>
    <body>                             <more body>
    <more body>                    }while (condition);               
}while (condition);
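
A small compilable sketch of the second model, using a hypothetical function count_digits. A do/while fits here because the body must run at least once: the number 0 still has one digit.

```cpp
const static int  DECIMAL_BASE = 10;    // digits are counted in base 10


//  Counts the decimal digits in value, which is assumed to be non-negative.
int
count_digits(int value)
{
    int    digits = 0;                  // digits seen so far

    do {
        digits++;
        value /= DECIMAL_BASE;
    }while (value > 0);

    return digits;
}
```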

Except in do loops, if the body is a single statement, curly braces can be omitted:

    if (idx_pos < BUFFER_SIZE)
        <body>
    while (idx_pos < BUFFER_SIZE)
        <body>

The empty statement always goes on a line by itself:

    while (TRUE)
        ;
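
The same rule applies to any loop whose work is done entirely in its heading. In the sketch below, the hypothetical function string_length does all of its scanning in the for heading, so the loop body is an empty statement on its own line.

```cpp
//  Returns the number of characters in text before the terminating '\0'.
int
string_length(const char *text)
{
    const char  *end;                   // scans forward to the terminator

    for (end = text; *end != '\0'; end++)
        ;
    return (int)(end - text);
}
```

Putting the semicolon on its own line makes it obvious that the empty body is intentional and not a stray semicolon at the end of the for line.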

This document was originally written by Stefan Brandle and used for the cs351 class.
Last modified: March 23, 1997 by Virgil Bistriceanu