15-121 Java Style Guide
1 Introduction
This document serves as the complete definition of coding standards for source code in 15-121 at Carnegie Mellon University in Qatar. A Java source file is described as following the style guide if and only if it adheres to the rules herein.
Like other programming style guides, the issues covered span not only aesthetic issues of formatting, but other types of conventions or coding standards as well. However, this document focuses primarily on the hard-and-fast rules that we follow universally, and avoids giving advice that isn't clearly enforceable (whether by human or tool).
This style guide is an adapted version of Google's Java Style Guide, released under the CC-BY 3.0 License, which encourages you to share these documents. See https://creativecommons.org/licenses/by/3.0/ for more details.
1.1 Terminology notes
In this document, unless otherwise clarified:
- The term class is used inclusively to mean an "ordinary" class, enum class, interface or annotation type (@interface).
- The term member (of a class) is used inclusively to mean a nested class, field, method, or constructor; that is, all top-level contents of a class except initializers and comments.
Other "terminology notes" will appear occasionally throughout the document.
1.2 Guide notes
Example code in this document is non-normative. That is, while the examples follow the style guide, they may not illustrate the only stylish way to represent the code. Optional formatting choices made in examples should not be enforced as rules.
2 Source file basics
2.1 File name
The source file name consists of the case-sensitive name of the top-level class it contains (of which there is exactly one), plus the .java extension.
2.2 Special characters
2.2.1 Whitespace characters
You may use either spaces or tabs for indentation, but whichever you choose must be used consistently throughout the entire file.
2.2.2 Special escape sequences
For any character that has a special escape sequence (\b, \t, \n, \f, \r, \", \' and \\), that sequence is used rather than the corresponding octal (e.g. \012) or Unicode (e.g. \u000a) escape.
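As a minimal sketch of this rule, the three constants below hold the same text, a tab between two words. The class and constant names are hypothetical and exist only for illustration. The example uses the tab character because a Unicode escape for a line feed cannot appear inside a Java string literal at all.

```java
final class EscapeDemo {
    // Preferred: the special escape sequence for a tab.
    static final String PREFERRED = "name\tvalue";

    // Same character written as an octal escape; legal, but not used in this style.
    static final String OCTAL = "name\011value";

    // Same character written as a Unicode escape; also legal, also not used.
    static final String UNICODE = "name\u0009value";

    static boolean allEqual() {
        return PREFERRED.equals(OCTAL) && PREFERRED.equals(UNICODE);
    }
}
```

All three forms compile to identical strings, so the choice affects only readability.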
2.2.3 Non-ASCII characters
For the remaining non-ASCII characters, either the actual Unicode character (e.g. ∞) or the equivalent Unicode escape (e.g. \u221e) is used. The choice depends only on which makes the code easier to read and understand, although Unicode escapes outside string literals and comments are strongly discouraged.
Tip: In the Unicode escape case, and occasionally even when actual Unicode characters are used, an explanatory comment can be very helpful.
Examples:
Example | Discussion |
---|---|
String unitAbbrev = "μs"; | Best: perfectly clear even without a comment. |
String unitAbbrev = "\u03bcs"; // "μs" | Allowed, but there's no reason to do this. |
String unitAbbrev = "\u03bcs"; // Greek letter mu, "s" | Allowed, but awkward and prone to mistakes. |
String unitAbbrev = "\u03bcs"; | Poor: the reader has no idea what this is. |
return '\ufeff' + content; // byte order mark | Good: use escapes for non-printable characters, and comment if necessary. |
Tip: Never make your code less readable simply out of fear that some programs might not handle non-ASCII characters properly. If that should happen, those programs are broken and they must be fixed.
3 Source file structure
A source file consists of, in order:
- Comment containing your name and Andrew ID
- Package statement
- Import statements
- Exactly one top-level class
Exactly one blank line separates each section that is present.
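The layout above can be sketched as follows. The author name, Andrew ID, and class are hypothetical placeholders, and the package statement is omitted, so only the sections that are present appear, each separated by exactly one blank line.

```java
// Name: Jane Doe (hypothetical)
// Andrew ID: jdoe (hypothetical)

import java.util.Random;

final class DiceRoller {
    // Returns a value from 1 to 6 using the supplied random source.
    static int roll(Random random) {
        return random.nextInt(6) + 1;
    }
}
```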
3.1 Name and Andrew ID
The name and Andrew ID of the file's author go here, in a comment at the top of the file.
3.2 Package statement
The package statement is not line-wrapped. The column limit (Section 4.4, Column limit: 100) does not apply to package statements.
3.3 Import statements
3.3.1 No wildcard imports
Wildcard imports are not used.
Examples:
    import java.util.Random; // Good
    import java.util.*; // Bad
3.3.2 No line-wrapping
Import statements are not line-wrapped. The column limit (Section 4.4, Column limit: 100) does not apply to import statements.
3.4 Class declaration
3.4.1 Exactly one top-level class declaration
Each top-level class resides in a source file of its own.
3.4.2 Ordering of class contents
The order you choose for the members and initializers of your class can have a great effect on learnability. However, there's no single correct recipe for how to do it; different classes may order their contents in different ways.
What is important is that each class uses some logical order, which its maintainer could explain if asked. For example, new methods are not just habitually added to the end of the class, as that would yield "chronological by date added" ordering, which is not a logical ordering.
3.4.2.1 Overloads: never split
When a class has multiple constructors, or multiple methods with the same name, these appear sequentially, with no other code in between (not even private members).
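A minimal sketch of this rule, using a hypothetical Greeter class: both constructors sit together, and both greet overloads sit together, with nothing in between.

```java
final class Greeter {
    private final String greeting;

    Greeter() {
        this("Hello");
    }

    Greeter(String greeting) {
        this.greeting = greeting;
    }

    String greet(String name) {
        return greet(name, "!");
    }

    String greet(String name, String punctuation) {
        return greeting + ", " + name + punctuation;
    }
}
```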
4 Formatting
Terminology Note: block-like construct refers to the body of a class, method or constructor. Note that, by Section 4.7.2 on array initializers, any array initializer may optionally be treated as if it were a block-like construct.
4.1 Braces
4.1.1 Braces are used where optional
Braces are used with if, else, for, do and while statements, even when the body is empty or contains only a single statement.
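As an illustration, the hypothetical method below keeps braces on a for loop and an if statement even though each body is a single statement.

```java
final class BraceDemo {
    static int countPositive(int[] values) {
        int count = 0;
        for (int value : values) {
            // A one-statement body still gets braces.
            if (value > 0) {
                count++;
            }
        }
        return count;
    }
}
```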
4.1.2 Nonempty blocks: K & R style
Braces follow the Kernighan and Ritchie style ("Egyptian brackets") for nonempty blocks and block-like constructs:
- No line break before the opening brace.
- Line break after the opening brace.
- Line break before the closing brace.
- Line break after the closing brace, only if that brace terminates a statement or terminates the body of a method, constructor, or named class. For example, there is no line break after the brace if it is followed by else or a comma.
Examples:
    return () -> {
        while (condition()) {
            method();
        }
    };

    return new MyClass() {
        @Override public void method() {
            if (condition()) {
                try {
                    something();
                } catch (ProblemException e) {
                    recover();
                }
            } else if (otherCondition()) {
                somethingElse();
            } else {
                lastThing();
            }
        }
    };
4.1.3 Empty blocks: may be concise
An empty block or block-like construct may be in K & R style (as described in Section 4.1.2). Alternatively, it may be closed immediately after it is opened, with no characters or line break in between ({}).
Examples:
    // This is acceptable
    void doNothing() {}

    // This is equally acceptable
    void doNothingElse() {
    }
4.2 Block indentation
Each time a new block or block-like construct is opened, the indent increases. When the block ends, the indent returns to the previous indent level. The indent level applies to both code and comments throughout the block. (See the example in Section 4.1.2, Nonempty blocks: K & R Style.)
4.3 One statement per line
Each statement is followed by a line break.
4.4 Column limit: 100
You should apply a line limit of 100 characters. A "character" means any Unicode code point except tab, which counts as 8 characters. Except as noted below, any line that would exceed this limit must be line-wrapped, as explained in Section 4.5, Line-wrapping.
Each Unicode code point counts as one character, even if its display width is greater or less. For example, if using fullwidth characters, you may choose to wrap the line earlier than where this rule strictly requires.
Exceptions:
- Lines where obeying the column limit is not possible (for example, a long URL in Javadoc, or a long JSNI method reference).
- package and import statements (see Sections 3.2 Package statement and 3.3 Import statements).
- Command lines in a comment that may be cut-and-pasted into a shell.
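The counting rule in this section can be sketched as a small helper. This is an illustration only, not a tool used in the course. The class and method names are hypothetical, and the exceptions listed above are not handled.

```java
final class ColumnLimit {
    static final int LIMIT = 100;
    static final int TAB_WIDTH = 8;

    // Counts one column per Unicode code point, except a tab, which counts as 8.
    static int lineWidth(String line) {
        int width = 0;
        int index = 0;
        while (index < line.length()) {
            int codePoint = line.codePointAt(index);
            width += (codePoint == '\t') ? TAB_WIDTH : 1;
            // Advance by one or two chars so surrogate pairs count once.
            index += Character.charCount(codePoint);
        }
        return width;
    }

    static boolean exceedsLimit(String line) {
        return lineWidth(line) > LIMIT;
    }
}
```

Counting code points rather than char values matters for characters outside the Basic Multilingual Plane, which occupy two char values in a Java String but count as one character here.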
4.5 Line-wrapping
Terminology Note: When code that might otherwise legally occupy a single line is divided into multiple lines, this activity is called line-wrapping.
There is no comprehensive, deterministic formula showing exactly how to line-wrap in every situation. Very often there are several valid ways to line-wrap the same piece of code.
Note: While the typical reason for line-wrapping is to avoid overflowing the column limit, even code that would in fact fit within the column limit may be line-wrapped at the author's discretion.
Tip: Extracting a method or local variable may solve the problem without the need to line-wrap.
4.5.1 Where to break
The prime directive of line-wrapping is: prefer to break at a higher syntactic level. Also:
- When a line is broken at a non-assignment operator the break comes before the symbol. (Note that this is not the same practice used in Google style for other languages, such as C++ and JavaScript.)
  - This also applies to the following "operator-like" symbols:
    - the dot separator (.)
    - the two colons of a method reference (::)
    - an ampersand in a type bound (<T extends Foo & Bar>)
    - a pipe in a catch block (catch (FooException | BarException e)).
- When a line is broken at an assignment operator the break typically comes after the symbol, but either way is acceptable.
  - This also applies to the "assignment-operator-like" colon in an enhanced for ("foreach") statement.
- A method or constructor name stays attached to the open parenthesis (() that follows it.
- A comma (,) stays attached to the token that precedes it.
- A line is never broken adjacent to the arrow in a lambda, except that a break may come immediately after the arrow if the body of the lambda consists of a single unbraced expression. Examples:

      MyLambda<String, Long, Object> lambda =
          (String label, Long value, Object obj) -> {
              ...
          };

      Predicate<String> predicate = str ->
          longExpressionInvolving(str);
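A further sketch of the break-placement rules, using a hypothetical method: the break comes after the assignment operator, before each + operator, and before each dot separator. The lines are short enough to fit unwrapped and are wrapped only to show where breaks go.

```java
final class WrapDemo {
    static String describe(String firstName, String lastName, int age) {
        // Break after "=", then before each non-assignment operator.
        String summary =
            firstName
                + " "
                + lastName
                + " ("
                + age
                + ")";
        // Break before the dot separator.
        return summary
            .trim()
            .toUpperCase();
    }
}
```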
Note: The primary goal for line wrapping is to have clear code, not necessarily code that fits in the smallest number of lines.
4.5.2 Indent continuation lines at least +4 spaces or one tab
When line-wrapping, each line after the first (each continuation line) is indented at least +4 spaces or 1 tab from the original line.
When there are multiple continuation lines, indentation may be varied beyond this as desired. In general, two continuation lines use the same indentation level if and only if they begin with syntactically parallel elements.
4.6 Whitespace
4.6.1 Vertical Whitespace
A single blank line always appears:
- Between consecutive members or initializers of a class: fields, constructors,
methods, nested classes, static initializers, and instance initializers.
- Exception: A blank line between two consecutive fields (having no other code between them) is optional. Such blank lines are used as needed to create logical groupings of fields.
- As required by other sections of this document (such as Section 3, Source file structure, and Section 3.3, Import statements).
A single blank line may also appear anywhere it improves readability, for example between statements to organize the code into logical subsections. A blank line before the first member or initializer, or after the last member or initializer of the class, is neither encouraged nor discouraged.
Multiple consecutive blank lines are permitted, but never required (or encouraged).
4.6.2 Horizontal whitespace
Beyond where required by the language or other style rules, and apart from literals, comments and Javadoc, a single ASCII space also appears in the following places only.
- Separating any reserved word, such as if, for or catch, from an open parenthesis (() that follows it on that line
- Separating any reserved word, such as else or catch, from a closing curly brace (}) that precedes it on that line
- Before any open curly brace ({), with two exceptions:
  - @SomeAnnotation({a, b}) (no space is used)
  - String[][] x = {{"foo"}}; (no space is required between {{, by the array-initializer item below)
- On both sides of any binary or ternary operator. This also applies to the following "operator-like" symbols:
  - the ampersand in a conjunctive type bound: <T extends Foo & Bar>
  - the pipe for a catch block that handles multiple exceptions: catch (FooException | BarException e)
  - the colon (:) in an enhanced for ("foreach") statement
  - the arrow in a lambda expression: (String str) -> str.length()
- However, no space appears on either side of the following symbols:
  - the two colons (::) of a method reference, which is written like Object::toString
  - the dot separator (.), which is written like object.toString()
- After ,:; or the closing parenthesis ()) of a cast
- On both sides of the double slash (//) that begins an end-of-line comment. Here, multiple spaces are allowed, but not required.
- Between the type and variable of a declaration: List<String> list
- Optional just inside both braces of an array initializer
  - new int[] {5, 6} and new int[] { 5, 6 } are both valid
- Between a type annotation and [] or the varargs ellipsis (...).
This rule is never interpreted as requiring or forbidding additional space at the start or end of a line; it addresses only interior space.
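Several of these spacing rules appear together in the short sketch below. The class and method are hypothetical and the comments point out which rule each line illustrates.

```java
import java.util.List;
import java.util.function.Function;

final class SpacingDemo {
    static int totalLength(List<String> words) {
        // No space around "::"; one space on each side of "=".
        Function<String, Integer> toLength = String::length;
        int total = 0;
        // Space after "for", on both sides of the colon, and before the open brace.
        for (String word : words) {
            // Space after the closing parenthesis of the cast; none around the dot.
            total += (int) toLength.apply(word);
        }
        return total;
    }
}
```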
4.6.3 Horizontal alignment [Optional]
Terminology Note: Horizontal alignment is the practice of adding a variable number of additional spaces in your code with the goal of making certain tokens appear directly below certain other tokens on previous lines.
This practice is permitted, but is never required.
Here is an example without alignment, then using alignment:
    private int x; // this is fine
    private Color color; // this too

    private int   x;      // permitted, but future edits
    private Color color;  // may leave it unaligned
Tip: Alignment can aid readability, but it creates problems for future maintenance. Consider a future change that needs to touch just one line. This change may leave the formerly-pleasing formatting mangled, and that is allowed. More often it prompts the coder (perhaps you) to adjust whitespace on nearby lines as well, possibly triggering a cascading series of reformattings. That one-line change now has a "blast radius." This can at worst result in pointless busywork, but at best it still corrupts version history information, slows down reviewers and exacerbates merge conflicts.
4.7 Specific constructs
4.7.1 One variable per declaration
Every variable declaration (field or local) declares only one variable: declarations such as int a, b; are not used.
Exception: Multiple variable declarations are acceptable in the header of a for loop.
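A minimal sketch of the exception, using a hypothetical palindrome check: the two index variables are declared together in the for header, while every other declaration stands alone.

```java
final class DeclarationDemo {
    static boolean isPalindrome(String text) {
        // Two loop variables in one for header are acceptable.
        for (int left = 0, right = text.length() - 1; left < right; left++, right--) {
            if (text.charAt(left) != text.charAt(right)) {
                return false;
            }
        }
        return true;
    }
}
```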
4.7.2 Array initializers: can be "block-like"
Any array initializer may optionally be formatted as if it were a "block-like construct." For example, the following are all valid (not an exhaustive list):
    new int[] {
        0, 1, 2, 3
    }

    new int[] {
        0, 1,
        2, 3
    }

    new int[] {
        0,
        1,
        2,
        3,
    }

    new int[] {0, 1, 2, 3}
4.7.3 Comments
This section addresses implementation comments.
Any line break may be preceded by arbitrary whitespace followed by an implementation comment. Such a comment renders the line non-blank.
4.7.3.1 Block comment style
Block comments are indented at the same level as the surrounding code. They may be in /* ... */ style or // ... style. For multi-line /* ... */ comments, subsequent lines must start with * aligned with the * on the previous line.

    /*
     * This is
     * okay.
     */

    // And so
    // is this.

    /* Or you can
     * even do this. */
Comments are not enclosed in boxes drawn with asterisks or other characters.
Tip: When writing multi-line comments, use the /* ... */ style if you want automatic code formatters to re-wrap the lines when necessary (paragraph-style). Most formatters don't re-wrap lines in // ... style comment blocks.
4.7.4 Numeric Literals
long-valued integer literals use an uppercase L suffix, never lowercase (to avoid confusion with the digit 1). For example, 3000000000L rather than 3000000000l.
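For illustration, a hypothetical constant written with the uppercase suffix. The value is too large for an int, so the suffix is required for the literal to compile at all.

```java
final class LiteralDemo {
    // Uppercase L: clearly a long literal, not a number ending in the digit 1.
    static final long LARGE_COUNT = 3000000000L;

    static long doubled() {
        return LARGE_COUNT * 2L;
    }
}
```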
5 Naming
5.1 Rules common to all identifiers
Identifiers use only ASCII letters and digits, and, in a small number of cases noted below, underscores. Thus each valid identifier name is matched by the regular expression \w+.
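The character-set rule can be checked with the regular expression quoted above. This is a sketch under one assumption: it tests only which characters appear, so a string such as 9lives passes even though Java itself rejects identifiers that start with a digit. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class IdentifierCheck {
    // In java.util.regex, \w matches ASCII letters, digits and underscore by default.
    private static final Pattern ALLOWED = Pattern.compile("\\w+");

    static boolean usesAllowedCharacters(String name) {
        return ALLOWED.matcher(name).matches();
    }
}
```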
5.2 Rules by identifier type
5.2.1 Package names
Package names are all lowercase, with consecutive words simply concatenated together (no underscores). For example, com.example.deepspace, not com.example.deepSpace or com.example.deep_space.
5.2.2 Class names
Class names are written in UpperCamelCase.
Class names are typically nouns or noun phrases. For example, Character or ImmutableList. Interface names may also be nouns or noun phrases (for example, List), but may sometimes be adjectives or adjective phrases instead (for example, Readable).
There are no specific rules or even well-established conventions for naming annotation types.
Test classes are named starting with the name of the class they are testing, and ending with Test. For example, HashTest or HashIntegrationTest.
5.2.3 Method names
Method names are written in lowerCamelCase.
Method names are typically verbs or verb phrases. For example, sendMessage or stop.
Underscores may appear in JUnit test method names to separate logical components of the name, with each component written in lowerCamelCase. One typical pattern is <methodUnderTest>_<state>, for example pop_emptyStack. There is no One Correct Way to name test methods.
5.2.4 Constant names
Constant names use CONSTANT_CASE: all uppercase letters, with each word separated from the next by a single underscore. But what is a constant, exactly?
Constants are static final fields whose contents are deeply immutable and whose methods have no detectable side effects. This includes primitives, Strings, immutable types, and immutable collections of immutable types. If any of the instance's observable state can change, it is not a constant. Merely intending to never mutate the object is not enough. Examples:
    // Constants
    static final int NUMBER = 5;
    static final ImmutableList<String> NAMES = ImmutableList.of("Ed", "Ann");
    static final ImmutableMap<String, Integer> AGES = ImmutableMap.of("Ed", 35, "Ann", 32);
    static final Joiner COMMA_JOINER = Joiner.on(','); // because Joiner is immutable
    static final SomeMutableType[] EMPTY_ARRAY = {};
    enum SomeEnum { ENUM_CONSTANT }

    // Not constants
    static String nonFinal = "non-final";
    final String nonStatic = "non-static";
    static final Set<String> mutableCollection = new HashSet<String>();
    static final ImmutableSet<SomeMutableType> mutableElements = ImmutableSet.of(mutable);
    static final ImmutableMap<String, SomeMutableType> mutableValues =
        ImmutableMap.of("Ed", mutableInstance, "Ann", mutableInstance2);
    static final Logger logger = Logger.getLogger(MyClass.class.getName());
    static final String[] nonEmptyArray = {"these", "can", "change"};
These names are typically nouns or noun phrases.
5.2.5 Non-constant field names
Non-constant field names (static or otherwise) are written in lowerCamelCase.
These names are typically nouns or noun phrases. For example, computedValues or index.
5.2.6 Parameter names
Parameter names are written in lowerCamelCase.
One-character parameter names in public methods should be avoided.
5.2.7 Local variable names
Local variable names are written in lowerCamelCase.
Even when final and immutable, local variables are not considered to be constants, and should not be styled as constants.
5.2.8 Type variable names
Each type variable is named in one of two styles:
- A single capital letter, optionally followed by a single numeral (such as E, T, X, T2)
- A name in the form used for classes (see Section 5.2.2, Class names), followed by the capital letter T (examples: RequestT, FooBarT).
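The naming rules in Section 5.2 can be seen together in one small sketch. The class and all of its members are hypothetical and the comments name the rule each identifier follows.

```java
import java.util.ArrayList;
import java.util.List;

// Class name: UpperCamelCase. Type variable: a single capital letter.
final class BoundedStack<E> {
    // Constant: static final and deeply immutable, so CONSTANT_CASE.
    static final int DEFAULT_CAPACITY = 16;

    // Non-constant fields: lowerCamelCase, even when declared final.
    private final List<E> storedItems = new ArrayList<>();
    private final int maxSize;

    BoundedStack() {
        this(DEFAULT_CAPACITY);
    }

    BoundedStack(int maxSize) {
        this.maxSize = maxSize;
    }

    // Method and parameter names: lowerCamelCase.
    boolean pushItem(E newItem) {
        // Local variable: lowerCamelCase.
        boolean hasRoom = storedItems.size() < maxSize;
        if (hasRoom) {
            storedItems.add(newItem);
        }
        return hasRoom;
    }

    int countItems() {
        return storedItems.size();
    }
}
```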
5.3 Camel case: defined
Sometimes there is more than one reasonable way to convert an English phrase into camel case, such as when acronyms or unusual constructs like "IPv6" or "iOS" are present. To improve predictability, 15-121 Style specifies the following (nearly) deterministic scheme.
Beginning with the prose form of the name:
- Convert the phrase to plain ASCII and remove any apostrophes. For example, "Müller's algorithm" might become "Muellers algorithm".
- Divide this result into words, splitting on spaces and any remaining punctuation (typically hyphens).
  - Recommended: if any word already has a conventional camel-case appearance in common usage, split this into its constituent parts (e.g., "AdWords" becomes "ad words"). Note that a word such as "iOS" is not really in camel case per se; it defies any convention, so this recommendation does not apply.
- Now lowercase everything (including acronyms), then uppercase only the first character of:
  - ... each word, to yield upper camel case, or
  - ... each word except the first, to yield lower camel case
- Finally, join all the words into a single identifier.
Note that the casing of the original words is almost entirely disregarded. Examples:
Prose form | Correct | Incorrect |
---|---|---|
"XML HTTP request" | XmlHttpRequest | XMLHTTPRequest |
"new customer ID" | newCustomerId | newCustomerID |
"inner stopwatch" | innerStopwatch | innerStopWatch |
"supports IPv6 on iOS?" | supportsIpv6OnIos | supportsIPv6OnIOS |
"YouTube importer" | YouTubeImporter, YoutubeImporter* | |

*Acceptable, but not recommended.
Note: Some words are ambiguously hyphenated in the English language: for example "nonempty" and "non-empty" are both correct, so the method names checkNonempty and checkNonEmpty are likewise both correct.
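The conversion scheme in Section 5.3 can be sketched in code. This is a simplified illustration, not a definitive implementation, and it rests on two assumptions: the input is already plain ASCII, and the recommended step of splitting words that are already in camel case (such as "AdWords") is skipped. The class and method names are hypothetical.

```java
import java.util.Locale;

final class CamelCase {
    private CamelCase() {}

    // Converts a prose phrase to upper or lower camel case.
    static String toCamelCase(String prose, boolean upperFirst) {
        // Remove apostrophes (input is assumed to be ASCII already).
        String withoutApostrophes = prose.replace("'", "");
        // Split on spaces and any remaining punctuation.
        String[] words = withoutApostrophes.split("[^A-Za-z0-9]+");
        StringBuilder result = new StringBuilder();
        for (String word : words) {
            if (word.isEmpty()) {
                continue;
            }
            // Lowercase everything, including acronyms.
            String lower = word.toLowerCase(Locale.ROOT);
            if (result.length() == 0 && !upperFirst) {
                // Lower camel case leaves the first word entirely lowercase.
                result.append(lower);
            } else {
                result.append(Character.toUpperCase(lower.charAt(0)));
                result.append(lower.substring(1));
            }
        }
        // The words have been joined into a single identifier.
        return result.toString();
    }
}
```

Because the camel-case splitting step is skipped, "YouTube importer" becomes YoutubeImporter here, which the table marks as acceptable but not recommended.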