Why Worry About Program Style, Part 1

Programming style is much like writing style. Each person tends to have their own. Programming is, however, somewhat different from writing because programs typically do not belong to the individual programmer. Instead, they belong to the company or organization that employs the programmer. Indeed, a single programmer almost never writes a program alone, and the program must be maintained (corrected, modified, etc.) by programmers other than its creators. For these reasons, companies and organizations often have style guides that their programmers are expected to follow.

Programming style is considered important because good style is supposed to make programs more readable and, therefore, more understandable. Understanding program code is required if one is to make reasoned changes to it. That understanding is needed in various situations, for example, when the programmer is finding and correcting errors, when the programmer is seeking help in identifying errors or problems, or when the program is being modified to accomplish a different task.

When reading a document or program, one's mind immediately interprets the words or code encountered in a certain way. Characteristics of the document or code can make that interpretation easier or harder. Those same characteristics also make the document more or less likely to be misinterpreted. Good programming style is supposed to allow for easier and correct interpretation.

Programming style and the way a programmer approaches the programming process are interrelated. Adhering to a personal style is a habit of mind that indicates a disciplined approach to the work of programming. Programs exhibiting good style are thought to be more likely to be correct due to the thought and effort (presumably) indicated by the good style.

Finally, from a student perspective, it is useful to develop good programming style because teachers are likely to be favorably influenced by code they can readily understand, i.e., by students using good programming style.

What is "Good" Programming Style

There are at least two aspects of program style. One relates only or primarily to the code itself. The other relates primarily to the algorithm underlying the code—the program design. These two aspects are discussed separately below.

Code Style

Code or coding style is concerned with the content and appearance of the code. Aspects like variable names, code spacing, and documentation are commonly discussed under this topic.

Naming (of variables)

Computer programming must make use of variables. Another activity that uses variables is mathematics. In mathematics, variables are often single characters such as x, y, etc. In computer programming a single letter can be used but that practice is discouraged. In programming, the advice is to use meaningful variable names.
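As a small illustration of this advice, the Python sketch below performs the same calculation twice, once with single-letter names and once with descriptive names. The function names, variable names, and numbers are all invented for this example.

```python
# Single-letter names: correct, but the reader must guess what each letter means.
def f(h, r):
    return h * r


# Meaningful names: the same calculation, readable without extra explanation.
# (All names here are hypothetical, chosen only for illustration.)
def compute_weekly_pay(hours_worked, hourly_rate):
    return hours_worked * hourly_rate


weekly_pay = compute_weekly_pay(40, 15)
print(weekly_pay)
```

Both functions return the same value; only the second tells the reader what that value represents.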

So, what makes a variable name meaningful? Some of the following should probably be considered.

Readability (of variable names)

In addition to choosing or making up variable names, good coding includes another aspect related to variable names: readability. The discussion above suggests that variable names will often need to be multiple words. When we read English, the words have spaces between them. In a programming language that is not feasible, because a variable name generally cannot contain spaces. So, we need to help the reader/programmer distinguish the individual words. The two most widely accepted means of doing this are camel case and underscores.

Camel case is the capitalization of each word (except the first). Thus we have variable names such as accumulatedTotal, totalFromUser, and sumFromUser. Note that the programmer can choose words that aid readability when camel case is used. sumFromUser might be considered more readable than totalFromUser because of the difference between the m and l before the F in From. Of course the context would also affect the choice of words.
(The first word in multiple-word names is typically not capitalized. Some possible reasons for this are that programmers are lazy or that variable names start with lower case letters and other names start with upper case letters or for consistency, e.g., single word variables are lower case so multi-word variables should start with a lower case letter.)

Underscores are also commonly used between words. (In some languages, one could hyphenate variable names but other languages would interpret the hyphen as a minus sign and try to do subtraction. So, for generality, we suggest not using hyphens, even when they are legal.) The underscore may make variable names even more readable, e.g., total_from_user vs totalFromUser. But this lengthens the name and adds a tiny bit of work when typing it.

Choosing to use camel case or underscores is a matter of personal taste but probably one or the other should be used.
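To compare the two conventions side by side, here is a short Python sketch that accumulates the same total twice, once with camel case names and once with underscores. The names and the input values are made up for this example.

```python
# Camel case: each word after the first starts with a capital letter.
accumulatedTotal = 0
for amountFromUser in [5, 10, 20]:
    accumulatedTotal = accumulatedTotal + amountFromUser

# Underscores: the same names with an underscore between the words.
accumulated_total = 0
for amount_from_user in [5, 10, 20]:
    accumulated_total = accumulated_total + amount_from_user

print(accumulatedTotal, accumulated_total)
```

Both versions run identically. Note that some languages have their own conventions: Python's style guide (PEP 8) recommends underscores for variable names, while Java code conventionally uses camel case. When a language or an organization has such a convention, following it is usually the safer choice.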

Uniqueness/Dissimilarity (of variable names)

Another consideration when naming variables has to do with avoiding confusion between similar names. In general, variable names should be unique and readily distinguishable from one another. For example, in a case-sensitive language total and Total are unique but are not readily distinguishable. Similarly, you would not want to use totalA and totalB. Names that are similar will lead to confusion unless the programmer is very, very attuned to the context of the code.
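The Python sketch below contrasts names that are easy to confuse with names that are easy to tell apart. The quantities (income, expenses, savings) and the numbers are hypothetical, chosen only to make the point.

```python
# Hard to tell apart: the names differ by one character or only by case.
totalA = 120
totalB = 45
Total = totalA - totalB

# Easier to tell apart: each name says what the value represents.
total_income = 120
total_expenses = 45
net_savings = total_income - total_expenses

print(Total, net_savings)
```

In the first version, swapping totalA and totalB by mistake would be easy to write and hard to spot. In the second, the same mistake reads as "expenses minus income" and is more likely to be noticed.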

The one time when it is okay to use duplicate variable names is when values are passed to a module (function) as parameters. Usually, the parameter has exactly the same meaning as the value passed in, and the code inside the module works with its own local version of that value. Giving the parameter the same name as the original makes sense with respect to understandability and lack of confusion. Note that it is the local copy, not the shared name, that protects the original: in languages that pass such values by copy, work done on the local version does not change the original value.
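A minimal Python sketch of a parameter that shares its name with the value passed in is shown below. The function apply_discount and the amounts are assumptions made for this example. In Python, assigning a new value to a parameter only rebinds the local name, so the caller's variable is untouched; this holds for simple values such as numbers, whereas a list or other mutable object modified in place inside the function would be changed for the caller too.

```python
# Hypothetical function: the parameter names match the caller's variable names
# because they mean exactly the same thing.
def apply_discount(price_in_cents, discount_in_cents):
    # This assignment rebinds the local name only; the caller's
    # price_in_cents keeps its original value.
    price_in_cents = price_in_cents - discount_in_cents
    return price_in_cents


price_in_cents = 1000
discount_in_cents = 150
discounted_price_in_cents = apply_discount(price_in_cents, discount_in_cents)

print(price_in_cents)             # the original value, unchanged
print(discounted_price_in_cents)  # the value computed inside the function
```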

FYI

Some background/historical information for the curious. Early in computer history, main memory (RAM) was very limited. Programmers were advised to use single letters or a letter and a digit as variable names. That was because the program was read into memory and translated from the programming language into machine language. The letters used in variable names increased the size of the program, perhaps to the extent that either the translation process was slowed down or the program and the translation program could not both fit in computer memory at one time. Also, for your information, in machine language the variables are just numbers indicating where the value is stored in memory, so the length of variable names makes no difference in the size of the machine language code. Now that RAM is essentially unlimited (for most applications), the number of characters used in variable names no longer matters insofar as the computer is concerned. So, variables can be named for the benefit of humans.

Code Format/Layout

Code format refers to the white space used in your program. White space consists of blanks and tabs entered in the code between the various elements of the program—variable names, operators, function names, etc. White space also refers to blank lines (the newline characters used to provide blank lines).

The idea is to insert spaces to make the code more readable. Different people think different aspects of spacing make the code more readable. Some of the possibilities are:
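As one illustration, the Python sketch below writes the same averaging function twice, first with almost no white space and then with blanks around operators and a blank line between steps. The function names and the sample scores are invented for this example. Note that in Python the indentation at the start of a line is part of the language's syntax, so only the spacing within lines and the blank lines are a matter of style.

```python
# Cramped: legal, but the pieces of each statement run together.
def average_cramped(scores):
    total=0
    for score in scores:total=total+score
    return total/len(scores)


# Spaced: blanks around operators and a blank line separating the steps.
def average_spaced(scores):
    total = 0
    for score in scores:
        total = total + score

    return total / len(scores)


print(average_cramped([80, 90, 100]))
print(average_spaced([80, 90, 100]))
```

The two functions compute the same result; the difference is only in how quickly a reader can see the structure.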

Documentation

Documentation is descriptive information about program code. It is typically included in the program using comments that are preceded by or included in special (sets of) characters. Professional/industrial programs often have external documentation that exists outside the program and is typically written in English prose. For our purposes, program documentation is internal to the program and consists of program, module, task, or statement level documentation.

In the ideal situation a program has no documentation: the code is fully understandable because it has been well designed and the names of variables and modules have been carefully selected. The ideal situation (fully self-documenting code) may occur for small programs or program segments but typically does not occur otherwise. The discussion below covers documentation from the bottom up.

Again, the goal is that programs be self-documenting. Programmers need to consider whether that goal has been met and when it is not, provide appropriate, but minimal program documentation. The use of well-named modules to accomplish nearly all program tasks can minimize the documentation that is needed.
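The idea of self-documenting code can be sketched in Python as follows. The first function needs a comment to be understood; the second does the same work in a well-named module with well-named variables and needs little or no comment. All names, and the passing threshold of 60, are hypothetical choices made for this example.

```python
# Needs a comment to be understood:
# p returns the count of scores in s that are 60 or higher (passing).
def p(s):
    c = 0
    for x in s:
        if x >= 60:
            c = c + 1
    return c


# Largely self-documenting: the names carry the explanation.
PASSING_SCORE = 60  # hypothetical threshold chosen for this example


def count_passing_scores(scores):
    passing_count = 0
    for score in scores:
        if score >= PASSING_SCORE:
            passing_count = passing_count + 1
    return passing_count


print(count_passing_scores([55, 60, 72]))
```

The one comment that remains in the second version explains a decision (where the threshold comes from) rather than restating what the code does, which is the kind of minimal documentation described above.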