Skip to content

Use static analysis to find bugs in our R code before production #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
b-long opened this issue Oct 5, 2015 · 3 comments
Closed

Use static analysis to find bugs in our R code before production #3

b-long opened this issue Oct 5, 2015 · 3 comments

Comments

@b-long
Copy link
Contributor

b-long commented Oct 5, 2015

Similar to #2, I'd like to see the introduction of static analysis to BBS. Perhaps we could use @jimhester 's lintr . Presumably, lintr will point out many code blocks as in need of improvement. Surely there will be some lintr results that are controversial, but perhaps those code blocks can be brought up for discussion among the Bioconductor core team to identify the canonical approach to solve some problem.

@jimhester
Copy link
Contributor

lintr mainly does style linting at the moment. It runs some of the checks in codetools and also generates errors if there are syntax errors, but both of these types of checks are already run from R CMD check.

The default code style used by lintr (Hadley's style guide is also different than the Bioconductor style. This could be customized but it has not been done yet. On top of that there are widely different styles used in Bioconductor packages currently, and no style is being enforced for current or new packages.

That being said, there are some trivial cases we could add to lintr that I see commonly when reviewing packages.

  • return() calls at the end of functions (result of last statement is always returned, so the explicit return() is redundant (and also slower as it requires an additional function call).
  • use of 1:length(x) in for loops (dangerous if length(x) can be 0) as 1:0 returns c(1, 0), should use seq_len(length(x)) instead.
  • neglecting to use drop = FALSE in two (or more) argument form of [, i.e. x[a, b], by default drop = TRUE this will coerce the result to the lowest possible dimension, which can produce unexpected results if both a and b are length 1 for instance. (This feature is nice interactively, but usually undesirable within functions).
  • Iterative appending to vectors within loops rather than pre-allocation the vectors and filling. (the former is quadratic in time complexity, as R has to keep re-sizing the vector on every assignment.
# bad
for (i in seq_len(10)) {
  x <- c(x, i)
}

# good
x <- integer(10)
for (i in seq_len(10)) {
  x[i] <- i
}
  • Changing global state within functions without resetting it properly (or using https://github.com/jimhester/withr). setwd(), options(), par() are common functions which should not be used (or at least use should be examined)
  • Avoid Functions that use Non-Standard-Evaluation within functions (with(), subset(), transform() etc)
  • Avoid functions that return different types of output (e.g. sapply(), mapply())

I have a open issue at r-lib/lintr#48 for some of these, just haven't had the time to work on them.

@b-long
Copy link
Contributor Author

b-long commented Oct 5, 2015

After a quick discussion with Jim, I just want to highlight that I meant analyzing BBS itself (rather than packages built by BBS). Of course, a majority of the code is written in shell scripts (already addressed by issue #2). This issue is just meant to offer a pathway for us to measure current code quality and then improve on it. Implementation ideas are very much welcome! :)

@mtmorgan mtmorgan changed the title Use static analysis to find bugs in R before production Use static analysis to find bugs in our R code before production Jan 13, 2016
@b-long
Copy link
Contributor Author

b-long commented Jan 13, 2016

Currently this isn't possible (R CMD check can be run on a package, but not stand-alone scripts). We may consider moving scripts to a package at some later point.

@b-long b-long closed this as completed Jan 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants