15-121 Java Style Guide

1 Introduction

This document serves as the complete definition of coding standards for source code in 15-121 at Carnegie Mellon University in Qatar. A Java source file is described as following the style guide if and only if it adheres to the rules herein.

Like other programming style guides, the issues covered span not only aesthetic issues of formatting, but other types of conventions or coding standards as well. However, this document focuses primarily on the hard-and-fast rules that we follow universally, and avoids giving advice that isn't clearly enforceable (whether by human or tool).

This style guide is an adapted version of Google's Java Style Guide, released under the CC-By 3.0 License, which encourages you to share these documents. See https://creativecommons.org/licenses/by/3.0/ for more details.

1.1 Terminology notes

In this document, unless otherwise clarified:

  1. The term class is used inclusively to mean an "ordinary" class, enum class, interface or annotation type (@interface).
  2. The term member (of a class) is used inclusively to mean a nested class, field, method, or constructor; that is, all top-level contents of a class except initializers and comments.

Other "terminology notes" will appear occasionally throughout the document.

1.2 Guide notes

Example code in this document is non-normative. That is, while the examples follow the style guide, they may not illustrate the only stylish way to represent the code. Optional formatting choices made in examples should not be enforced as rules.

2 Source file basics

2.1 File name

The source file name consists of the case-sensitive name of the top-level class it contains (of which there is exactly one), plus the .java extension.

2.2 Special characters

2.2.1 Whitespace characters

You may use either whitespace characters or tabs for indentation, but whatever you use must be consistent throughout the entire file.

2.2.2 Special escape sequences

For any character that has a special escape sequence (\b, \t, \n, \f, \r, \", \' and \\), that sequence is used rather than the corresponding octal (e.g. \012) or Unicode (e.g. \u000a) escape.

2.2.3 Non-ASCII characters

For the remaining non-ASCII characters, either the actual Unicode character (e.g. ) or the equivalent Unicode escape (e.g. \u221e) is used. The choice depends only on which makes the code easier to read and understand, although Unicode escapes outside string literals and comments are strongly discouraged.

Tip: In the Unicode escape case, and occasionally even when actual Unicode characters are used, an explanatory comment can be very helpful.

Examples:

Example Discussion
String unitAbbrev = "μs"; Best: perfectly clear even without a comment.
String unitAbbrev = "\u03bcs"; // "μs" Allowed, but there's no reason to do this.
String unitAbbrev = "\u03bcs"; // Greek letter mu, "s" Allowed, but awkward and prone to mistakes.
String unitAbbrev = "\u03bcs"; Poor: the reader has no idea what this is.
return '\ufeff' + content; // byte order mark Good: use escapes for non-printable characters, and comment if necessary.

Tip: Never make your code less readable simply out of fear that some programs might not handle non-ASCII characters properly. If that should happen, those programs are broken and they must be fixed.

3 Source file structure

A source file consists of, in order:

  1. Comment containing your name and Andrew ID
  2. Package statement
  3. Import statements
  4. Exactly one top-level class

Exactly one blank line separates each section that is present.

3.1 Name and Andrew ID

The name and Andrew ID of the author of the file goes here.

3.2 Package statement

The package statement is not line-wrapped. The column limit (Section 4.4, Column limit: 100) does not apply to package statements.

3.3 Import statements

3.3.1 No wildcard imports

Wildcard imports are not used.

Examples:

import java.util.Random; // Good
import java.util.*; // Bad

3.3.2 No line-wrapping

Import statements are not line-wrapped. The column limit (Section 4.4, Column limit: 100) does not apply to import statements.

3.4 Class declaration

3.4.1 Exactly one top-level class declaration

Each top-level class resides in a source file of its own.

3.4.2 Ordering of class contents

The order you choose for the members and initializers of your class can have a great effect on learnability. However, there's no single correct recipe for how to do it; different classes may order their contents in different ways.

What is important is that each class uses some logical order, which its maintainer could explain if asked. For example, new methods are not just habitually added to the end of the class, as that would yield "chronological by date added" ordering, which is not a logical ordering.

3.4.2.1 Overloads: never split

When a class has multiple constructors, or multiple methods with the same name, these appear sequentially, with no other code in between (not even private members).

4 Formatting

Terminology Note: block-like construct refers to the body of a class, method or constructor. Note that, by Section 4.7.2 on array initializers, any array initializer may optionally be treated as if it were a block-like construct.

4.1 Braces

4.1.1 Braces are used where optional

Braces are used with if, else, for, do and while statements, even when the body is empty or contains only a single statement.

4.1.2 Nonempty blocks: K & R style

Braces follow the Kernighan and Ritchie style ("Egyptian brackets") for nonempty blocks and block-like constructs:

Examples:

return () -> {
  while (condition()) {
    method();
  }
};

return new MyClass() {
  @Override public void method() {
    if (condition()) {
      try {
        something();
      } catch (ProblemException e) {
        recover();
      }
    } else if (otherCondition()) {
      somethingElse();
    } else {
      lastThing();
    }
  }
};

4.1.3 Empty blocks: may be concise

An empty block or block-like construct may be in K & R style (as described in Section 4.1.2). Alternatively, it may be closed immediately after it is opened, with no characters or line break in between ({}).

Examples:

  // This is acceptable
  void doNothing() {}

  // This is equally acceptable
  void doNothingElse() {
  }

4.2 Block indentation

Each time a new block or block-like construct is opened, the indent increases. When the block ends, the indent returns to the previous indent level. The indent level applies to both code and comments throughout the block. (See the example in Section 4.1.2, Nonempty blocks: K & R Style.)

4.3 One statement per line

Each statement is followed by a line break.

4.4 Column limit: 100

You should apply a line limit of 100 characters. A "character" means any Unicode code point except tab, which counts as 8 characters. Except as noted below, any line that would exceed this limit must be line-wrapped, as explained in Section 4.5, Line-wrapping.

Each Unicode code point counts as one character, even if its display width is greater or less. For example, if using fullwidth characters, you may choose to wrap the line earlier than where this rule strictly requires.

Exceptions:

  1. Lines where obeying the column limit is not possible (for example, a long URL in Javadoc, or a long JSNI method reference).
  2. package and import statements (see Sections 3.2 Package statement and 3.3 Import statements).
  3. Command lines in a comment that may be cut-and-pasted into a shell.

4.5 Line-wrapping

Terminology Note: When code that might otherwise legally occupy a single line is divided into multiple lines, this activity is called line-wrapping.

There is no comprehensive, deterministic formula showing exactly how to line-wrap in every situation. Very often there are several valid ways to line-wrap the same piece of code.

Note: While the typical reason for line-wrapping is to avoid overflowing the column limit, even code that would in fact fit within the column limit may be line-wrapped at the author's discretion.

Tip: Extracting a method or local variable may solve the problem without the need to line-wrap.

4.5.1 Where to break

The prime directive of line-wrapping is: prefer to break at a higher syntactic level. Also:

  1. When a line is broken at a non-assignment operator the break comes before the symbol. (Note that this is not the same practice used in Google style for other languages, such as C++ and JavaScript.)
    • This also applies to the following "operator-like" symbols:
      • the dot separator (.)
      • the two colons of a method reference (::)
      • an ampersand in a type bound (<T extends Foo & Bar>)
      • a pipe in a catch block (catch (FooException | BarException e)).
  2. When a line is broken at an assignment operator the break typically comes after the symbol, but either way is acceptable.
    • This also applies to the "assignment-operator-like" colon in an enhanced for ("foreach") statement.
  3. A method or constructor name stays attached to the open parenthesis (() that follows it.
  4. A comma (,) stays attached to the token that precedes it.
  5. A line is never broken adjacent to the arrow in a lambda, except that a break may come immediately after the arrow if the body of the lambda consists of a single unbraced expression. Examples:
    MyLambda<String, Long, Object> lambda =
        (String label, Long value, Object obj) -> {
            ...
        };
    
    Predicate<String> predicate = str ->
        longExpressionInvolving(str);
    

Note: The primary goal for line wrapping is to have clear code, not necessarily code that fits in the smallest number of lines.

4.5.2 Indent continuation lines at least +4 spaces or one tab

When line-wrapping, each line after the first (each continuation line) is indented at least +4 spaces or 1 tab from the original line.

When there are multiple continuation lines, indentation may be varied beyond this as desired. In general, two continuation lines use the same indentation level if and only if they begin with syntactically parallel elements.

4.6 Whitespace

4.6.1 Vertical Whitespace

A single blank line always appears:

  1. Between consecutive members or initializers of a class: fields, constructors, methods, nested classes, static initializers, and instance initializers.
    • Exception: A blank line between two consecutive fields (having no other code between them) is optional. Such blank lines are used as needed to create logical groupings of fields.
  2. As required by other sections of this document (such as Section 3, Source file structure, and Section 3.3, Import statements).

A single blank line may also appear anywhere it improves readability, for example between statements to organize the code into logical subsections. A blank line before the first member or initializer, or after the last member or initializer of the class, is neither encouraged nor discouraged.

Multiple consecutive blank lines are permitted, but never required (or encouraged).

4.6.2 Horizontal whitespace

Beyond where required by the language or other style rules, and apart from literals, comments and Javadoc, a single ASCII space also appears in the following places only.

  1. Separating any reserved word, such as if, for or catch, from an open parenthesis (() that follows it on that line
  2. Separating any reserved word, such as else or catch, from a closing curly brace (}) that precedes it on that line
  3. Before any open curly brace ({), with two exceptions:
    • @SomeAnnotation({a, b}) (no space is used)
    • String[][] x = {{"foo"}}; (no space is required between {{, by item 8 below)
  4. On both sides of any binary or ternary operator. This also applies to the following "operator-like" symbols:
    • the ampersand in a conjunctive type bound: <T extends Foo & Bar>
    • the pipe for a catch block that handles multiple exceptions: catch (FooException | BarException e)
    • the colon (:) in an enhanced for ("foreach") statement
    • the arrow in a lambda expression: (String str) -> str.length()
    but not
    • the two colons (::) of a method reference, which is written like Object::toString
    • the dot separator (.), which is written like object.toString()
  5. After ,:; or the closing parenthesis ()) of a cast
  6. On both sides of the double slash (//) that begins an end-of-line comment. Here, multiple spaces are allowed, but not required.
  7. Between the type and variable of a declaration: List<String> list
  8. Optional just inside both braces of an array initializer
    • new int[] {5, 6} and new int[] { 5, 6 } are both valid
  9. Between a type annotation and [] or ....

This rule is never interpreted as requiring or forbidding additional space at the start or end of a line; it addresses only interior space.

4.6.3 Horizontal alignment [Optional]

Terminology Note: Horizontal alignment is the practice of adding a variable number of additional spaces in your code with the goal of making certain tokens appear directly below certain other tokens on previous lines.

This practice is permitted, but is never required.

Here is an example without alignment, then using alignment:

private int x; // this is fine
private Color color; // this too

private int   x;      // permitted, but future edits
private Color color;  // may leave it unaligned

Tip: Alignment can aid readability, but it creates problems for future maintenance. Consider a future change that needs to touch just one line. This change may leave the formerly-pleasing formatting mangled, and that is allowed. More often it prompts the coder (perhaps you) to adjust whitespace on nearby lines as well, possibly triggering a cascading series of reformattings. That one-line change now has a "blast radius." This can at worst result in pointless busywork, but at best it still corrupts version history information, slows down reviewers and exacerbates merge conflicts.

4.7 Specific constructs

4.7.1 One variable per declaration

Every variable declaration (field or local) declares only one variable: declarations such as int a, b; are not used.

Exception: Multiple variable declarations are acceptable in the header of a for loop.

4.7.2 Array initializers: can be "block-like"

Any array initializer may optionally be formatted as if it were a "block-like construct." For example, the following are all valid (not an exhaustive list):

new int[] {           new int[] {
  0, 1, 2, 3            0,
}                       1,
                        2,
new int[] {             3,
  0, 1,               }
  2, 3
}                     new int[]
                          {0, 1, 2, 3}

4.7.3 Comments

This section addresses implementation comments.

Any line break may be preceded by arbitrary whitespace followed by an implementation comment. Such a comment renders the line non-blank.

4.7.3.1 Block comment style

Block comments are indented at the same level as the surrounding code. They may be in /* ... */ style or // ... style. For multi-line /* ... */ comments, subsequent lines must start with * aligned with the * on the previous line.

/*
 * This is          // And so           /* Or you can
 * okay.            // is this.          * even do this. */
 */

Comments are not enclosed in boxes drawn with asterisks or other characters.

Tip: When writing multi-line comments, use the /* ... */ style if you want automatic code formatters to re-wrap the lines when necessary (paragraph-style). Most formatters don't re-wrap lines in // ... style comment blocks.

4.7.4 Numeric Literals

long-valued integer literals use an uppercase L suffix, never lowercase (to avoid confusion with the digit 1). For example, 3000000000L rather than 3000000000l.

5 Naming

5.1 Rules common to all identifiers

Identifiers use only ASCII letters and digits, and, in a small number of cases noted below, underscores. Thus each valid identifier name is matched by the regular expression \w+ .

5.2 Rules by identifier type

5.2.1 Package names

Package names are all lowercase, with consecutive words simply concatenated together (no underscores). For example, com.example.deepspace, not com.example.deepSpace or com.example.deep_space.

5.2.2 Class names

Class names are written in UpperCamelCase.

Class names are typically nouns or noun phrases. For example, Character or ImmutableList. Interface names may also be nouns or noun phrases (for example, List), but may sometimes be adjectives or adjective phrases instead (for example, Readable).

There are no specific rules or even well-established conventions for naming annotation types.

Test classes are named starting with the name of the class they are testing, and ending with Test. For example, HashTest or HashIntegrationTest.

5.2.3 Method names

Method names are written in lowerCamelCase.

Method names are typically verbs or verb phrases. For example, sendMessage or stop.

Underscores may appear in JUnit test method names to separate logical components of the name, with each component written in lowerCamelCase. One typical pattern is <methodUnderTest>_<state>, for example pop_emptyStack. There is no One Correct Way to name test methods.

5.2.4 Constant names

Constant names use CONSTANT_CASE: all uppercase letters, with each word separated from the next by a single underscore. But what is a constant, exactly?

Constants are static final fields whose contents are deeply immutable and whose methods have no detectable side effects. This includes primitives, Strings, immutable types, and immutable collections of immutable types. If any of the instance's observable state can change, it is not a constant. Merely intending to never mutate the object is not enough. Examples:

// Constants
static final int NUMBER = 5;
static final ImmutableList<String> NAMES = ImmutableList.of("Ed", "Ann");
static final ImmutableMap<String, Integer> AGES = ImmutableMap.of("Ed", 35, "Ann", 32);
static final Joiner COMMA_JOINER = Joiner.on(','); // because Joiner is immutable
static final SomeMutableType[] EMPTY_ARRAY = {};
enum SomeEnum { ENUM_CONSTANT }

// Not constants
static String nonFinal = "non-final";
final String nonStatic = "non-static";
static final Set<String> mutableCollection = new HashSet<String>();
static final ImmutableSet<SomeMutableType> mutableElements = ImmutableSet.of(mutable);
static final ImmutableMap<String, SomeMutableType> mutableValues =
    ImmutableMap.of("Ed", mutableInstance, "Ann", mutableInstance2);
static final Logger logger = Logger.getLogger(MyClass.getName());
static final String[] nonEmptyArray = {"these", "can", "change"};

These names are typically nouns or noun phrases.

5.2.5 Non-constant field names

Non-constant field names (static or otherwise) are written in lowerCamelCase.

These names are typically nouns or noun phrases. For example, computedValues or index.

5.2.6 Parameter names

Parameter names are written in lowerCamelCase.

One-character parameter names in public methods should be avoided.

5.2.7 Local variable names

Local variable names are written in lowerCamelCase.

Even when final and immutable, local variables are not considered to be constants, and should not be styled as constants.

5.2.8 Type variable names

Each type variable is named in one of two styles:

5.3 Camel case: defined

Sometimes there is more than one reasonable way to convert an English phrase into camel case, such as when acronyms or unusual constructs like "IPv6" or "iOS" are present. To improve predictability, 15-121 Style specifies the following (nearly) deterministic scheme.

Beginning with the prose form of the name:

  1. Convert the phrase to plain ASCII and remove any apostrophes. For example, "Müller's algorithm" might become "Muellers algorithm".
  2. Divide this result into words, splitting on spaces and any remaining punctuation (typically hyphens).
    • Recommended: if any word already has a conventional camel-case appearance in common usage, split this into its constituent parts (e.g., "AdWords" becomes "ad words"). Note that a word such as "iOS" is not really in camel case per se; it defies any convention, so this recommendation does not apply.
  3. Now lowercase everything (including acronyms), then uppercase only the first character of:
    • ... each word, to yield upper camel case, or
    • ... each word except the first, to yield lower camel case
  4. Finally, join all the words into a single identifier.

Note that the casing of the original words is almost entirely disregarded. Examples:

Prose form Correct Incorrect
"XML HTTP request" XmlHttpRequest XMLHTTPRequest
"new customer ID" newCustomerId newCustomerID
"inner stopwatch" innerStopwatch innerStopWatch
"supports IPv6 on iOS?" supportsIpv6OnIos supportsIPv6OnIOS
"YouTube importer" YouTubeImporter
YoutubeImporter*

*Acceptable, but not recommended.

Note: Some words are ambiguously hyphenated in the English language: for example "nonempty" and "non-empty" are both correct, so the method names checkNonempty and checkNonEmpty are likewise both correct.