R Coding Standards
We will follow the https://style.tidyverse.org/ style guide with very few changes to benefit from two R packages supporting this style guide:
These coding standards outline the more important aspects of that style.
Modifications from tidyverse Coding Standards
Naming will use camelCase instead of snake_case.
Favor usage of return() even when the return value does not need to be specified explicitly.
RStudio IDE Settings
Indentation of 2
Use spaces instead of tabs
Naming Convention
Use meaningful and understandable names. Code should read as a story, and only well-known abbreviations (such as pk) should be used.
Files
Names of files containing source code (/R) and tests (/tests) should follow the kebab-case naming convention and have the .R extension.
Do not use special characters (e.g. “µ”, “ß”, …) or blanks in file names.
Object names
Variable and function names should use only letters and numbers. Use camelCase, starting with a lowercase letter, to separate words within a name.
Class names, on the other hand, should use PascalCase.
True constants should use ALL_CAPS.
Do not use Hungarian notation (e.g., g for global, b for Boolean, s for strings, etc.).
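The naming rules above can be sketched as follows. All names in this block (SECONDS_PER_MINUTE, infusionTimeMinutes, convertMinutesToSeconds(), DosingSchedule) are hypothetical and exist only to show the casing; the class uses a reference class from the base {methods} package as one possible choice.

```r
library(methods)

# True constant: ALL_CAPS
SECONDS_PER_MINUTE <- 60

# Variable: camelCase, no type prefix
infusionTimeMinutes <- 30

# Function: camelCase, with a name that says what it does
convertMinutesToSeconds <- function(minutes) {
  return(minutes * SECONDS_PER_MINUTE)
}

# Class: PascalCase
DosingSchedule <- setRefClass(
  "DosingSchedule",
  fields = list(infusionTimeSeconds = "numeric")
)

schedule <- DosingSchedule$new(
  infusionTimeSeconds = convertMinutesToSeconds(infusionTimeMinutes)
)
```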
Functions
Prefer using return() for explicitly returning the result, although you can rely on R to implicitly return the result of the last evaluated expression in a function.
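As a small illustration of the difference, the two hypothetical functions below compute the same value; the first follows the preferred explicit style.

```r
# Preferred: the returned value is stated explicitly.
calculateBodyMassIndex <- function(weightKg, heightM) {
  bodyMassIndex <- weightKg / heightM^2
  return(bodyMassIndex)
}

# Works as well, but relies on the implicit return of the last expression.
calculateBodyMassIndexImplicit <- function(weightKg, heightM) {
  weightKg / heightM^2
}
```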
Comments
Do not comment the obvious.
Use comments to explain the why, and not the what or how.
Indent comments at the same level as the code they document.
All comments must be written in English.
Do not generate comments automatically.
Do comment algorithm specifics. For example, why would you start a loop at index 1 and not at 0, etc.
If a lot of comments are required to make a method easier to understand, break the method down into smaller methods.
Really, do not comment the obvious.
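A minimal sketch of a comment that explains the why rather than the what. The function sumExcludingBaseline() and its baseline convention are assumptions made up for this example.

```r
sumExcludingBaseline <- function(values) {
  total <- 0
  # The first element is the baseline measurement and must not be counted,
  # so the loop starts at index 2 (R vectors are 1-based).
  for (index in seq_along(values)[-1]) {
    total <- total + values[index]
  }
  return(total)
}
```

A comment such as "loop over the values and add them to the total" would only restate the code and should be left out.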
Documentation
Use roxygen comments (#') as described here. Do not include empty lines between the function code and its documentation.
Internal functions, if documented, should use the tag #' @keywords internal. This makes sure that package websites don't include these internal functions.
Prefer using markdown syntax to write roxygen documentation (e.g. use ** instead of \bold{}).
To automate the conversion of existing documentation to use markdown syntax, install the roxygen2md package and run roxygen2md::roxygen2md() in the package root directory and carefully check the conversion.
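The points above could look like this for a documented internal function. The function convertGramsToMilligrams() is a hypothetical example; the sketch also assumes markdown support is switched on for roxygen2 (typically via Roxygen: list(markdown = TRUE) in the DESCRIPTION file).

```r
#' Convert a mass from grams to milligrams
#'
#' @param massInGrams Numeric vector of masses in **grams**.
#'
#' @return Numeric vector of masses in **milligrams**.
#'
#' @keywords internal
convertGramsToMilligrams <- function(massInGrams) {
  return(massInGrams * 1000)
}
```

Note that there is no empty line between the roxygen block and the function definition.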
Conventions
Function names as code with parentheses (good: dplyr::mutate(), mutate() in code font; bad: mutate without parentheses or outside code font)
Variable and (R6/S3/S4) object names as code (good: x in code font; bad: x as plain or emphasized text)
Package names as code with curly braces (good: {dplyr}; bad: dplyr without braces, or as plain or emphasized text)
Programming language names as code (e.g. markdown, C++)
Note that these conventions are adopted to facilitate (auto-generated) cross-linking in {pkgdown} websites.
Documenting functions
http://r-pkgs.had.co.nz/man.html#man-functions
Documenting classes
Reference classes (RC) differ from S3 and S4 because methods are associated with classes, not generics. RC also has a special convention for documenting methods: the docstring. The docstring is a string placed inside the definition of the method which briefly describes what it does. This makes documenting RC simpler than S4 because you only need one roxygen block per class.
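A minimal sketch of the docstring convention, assuming a hypothetical Person class with a hair field and a setHair() method (named in camelCase per the rules of this guide). The single roxygen block documents the class, and the string at the top of the method body is the docstring.

```r
library(methods)

#' A person with a hair color
#'
#' @field hair Hair color of the person.
Person <- setRefClass(
  "Person",
  fields = list(hair = "character"),
  methods = list(
    setHair = function(newHair) {
      "Set the hair color of the person."
      hair <<- newHair
      return(invisible(.self))
    }
  )
)

person <- Person$new(hair = "brown")
person$setHair("red")
```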
When referring to the class property ($name) or method ($set_hair()) in package vignettes, use the $ sign to highlight that they belong to an object. Note that the method always has parentheses to distinguish it from a property.
If a class has a private method, its name should start with . to highlight this (e.g. $.set_hair_color()).
Syntax
Spacing
Use the styler add-in for RStudio. It will style the files for you. For more, see here.
Global Variables and Constants
Except for program constants or truly global states, never use global variables. If a global object is required, this must be discussed with the team.
Do not use hard-coded strings or magic numbers. Declare a constant instead.
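For instance, a magic number and a hard-coded string can be replaced by named constants as sketched below. SIGNIFICANCE_LEVEL, DEFAULT_CONCENTRATION_UNIT, and both functions are hypothetical names chosen for illustration.

```r
# Avoid: the meaning of 0.05 and "ng/ml" is not visible where they are used.
# isSignificant <- pValue < 0.05

SIGNIFICANCE_LEVEL <- 0.05
DEFAULT_CONCENTRATION_UNIT <- "ng/ml"

isSignificant <- function(pValue) {
  return(pValue < SIGNIFICANCE_LEVEL)
}

formatConcentration <- function(value, unit = DEFAULT_CONCENTRATION_UNIT) {
  return(paste(value, unit))
}
```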
Booleans
Avoid using boolean abbreviations (T and F). Instead, use TRUE and FALSE (respectively).
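One reason for this rule is that T and F are ordinary variables that can be overwritten, whereas TRUE and FALSE are reserved words. The sketch below demonstrates this; the variable names are made up for the example.

```r
# T is an ordinary variable, so it can be overwritten by accident.
T <- 0
abbreviationStillTrue <- isTRUE(T)

# TRUE is a reserved word; trying to assign to it fails with an error.
assignmentFails <- inherits(
  try(eval(parse(text = "TRUE <- 0")), silent = TRUE),
  "try-error"
)

# Remove the shadowing variable so that T refers to base::T again.
rm(T)
```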
Style
Long Lines
Strive to limit your code (including comments and roxygen documentation) to 80 characters per line.
Assignments
Use <-, not =, for assignment.
Semicolons
Don't put ; at the end of a line, and don't use ; to put multiple commands on one line.
Note: All these styling issues, and much more, are corrected automatically with {styler}.
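Besides the RStudio add-in, {styler} can also be called from code. The sketch below assumes the package is installed and skips the call otherwise; unstyledCode is a made-up snippet that breaks several of the rules above.

```r
# Deliberately badly styled input: = for assignment, ; between commands,
# and no spacing.
unstyledCode <- "bodyWeight=70;if(bodyWeight>50){print('ok')}"

# Only run when {styler} is available.
if (requireNamespace("styler", quietly = TRUE)) {
  styledCode <- styler::style_text(unstyledCode)
  print(styledCode)
}
```

For whole files or packages, styler::style_file() and styler::style_pkg() work the same way on paths instead of text.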
Code blocks
{ should be the last character on the line. Related code (e.g., an if clause, a function declaration, a trailing comma, etc.) must be on the same line as the opening brace.
The contents should be indented.
} should be the first character on the line.
It is OK to drop the curly braces for very simple statements that fit on one line, as long as they don't have side-effects.
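The brace rules above can be sketched with two hypothetical functions; the second one drops the braces for a simple one-line if/else that has no side effects.

```r
describeSign <- function(value) {
  if (value < 0) {
    sign <- "negative"
  } else if (value == 0) {
    sign <- "zero"
  } else {
    sign <- "positive"
  }
  return(sign)
}

absoluteValue <- function(value) {
  # Simple, side-effect-free statement that fits on one line: no braces.
  result <- if (value < 0) -value else value
  return(result)
}
```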
Tests
Refer to chapter Tests
Error messages
Refer to chapter Errors
Rmarkdown
Package vignettes are written using the {rmarkdown} package. Here are some good practices to follow while writing these documents:
It is strongly recommended that only alphanumeric characters (a-z, A-Z and 0-9) and dashes (-) are used in chunk labels, because they are not special characters and will surely work for all output formats. Other characters, spaces and underscores in particular, may cause trouble in certain packages, such as {bookdown} and {styler}. Ref: https://bookdown.org/yihui/rmarkdown/r-code.html
Let your rmarkdown breathe. You should use blank lines to separate different elements to avoid ambiguity. Ref: https://yihui.org/en/2021/06/markdown-breath/
Code complexity
R provides quality-checking tools that can also measure code complexity. For example, the {cyclocomp} package calculates the cyclomatic complexity of a function or a package (see https://en.wikipedia.org/wiki/Cyclomatic_complexity):
cyclocomp::cyclocomp(<function_name>)
OR
cyclocomp::cyclocomp_package(<package_name>)
OR
cyclocomp::cyclocomp_q(<R_expression>)
Example:
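A minimal sketch of such a check, assuming {cyclocomp} is installed (the call is skipped otherwise). The function classifyDose() and the constant MAX_CYCLOMATIC_COMPLEXITY are hypothetical names introduced for illustration; the limit of 15 mirrors the general advice in this section.

```r
# Hypothetical limit, mirroring the general advice of 15.
MAX_CYCLOMATIC_COMPLEXITY <- 15

# Hypothetical function whose complexity is measured.
classifyDose <- function(dose) {
  if (dose < 0) {
    stop("Dose must be non-negative.")
  }
  if (dose == 0) {
    return("none")
  }
  return("some")
}

# Only run the check when {cyclocomp} is available.
if (requireNamespace("cyclocomp", quietly = TRUE)) {
  complexity <- cyclocomp::cyclocomp(classifyDose)
  if (complexity > MAX_CYCLOMATIC_COMPLEXITY) {
    warning("classifyDose() is too complex; consider splitting it.")
  }
}
```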
As a general rule, the cyclomatic complexity of a function should not exceed 15 (see https://en.wikipedia.org/wiki/Cyclomatic_complexity#Limiting_complexity_during_development).
See also
A more comprehensive list of tools helpful for package development can be found in this resource.