Jan 27, 2017 - Stancon

Stan and NIMBLE so that the computer does the hard part

Erin and I made a trip out to New York last weekend so that I could catch the STAN conference. There were some great talks, especially the “come to Jesus” seminar with Michael Betancourt on typical sets of high-dimensional posteriors and the previously cryptic warnings that STAN’s sampler sometimes emits in my models. I was also pleased that STAN developers are thinking seriously about sustainability beyond its current academic home at Columbia.

STAN has some competition in the Bayesian programming language domain. NIMBLE is a project primarily based out of UC Berkeley that also offers a modern, actively developed domain-specific language for specifying Bayesian models. NIMBLE and STAN have similar goals, but differ in methodology.

Similarities

  • Both compile to C++
  • Both use a domain-specific modeling language

Differences

  • NIMBLE explicitly infers the directed graph of the model, while STAN is imperative. So you can make pretty pictures of the model more easily with NIMBLE, but STAN probably allows more sophisticated programming.
  • NIMBLE uses an extended version of the BUGS/JAGS language, so if you already know that, there are fewer things to learn.
  • STAN offers fancy auto-tuning Hamiltonian Monte Carlo. NIMBLE offers a number of samplers, but probably nothing as advanced. STAN is probably faster per effective sample (taking autocorrelation into account).
  • STAN has an API offering programmatic evaluation of log-likelihoods and gradients
  • STAN does not allow for discrete parameters (data are fine). If you’ve got a discrete parameter, you need to figure out how to marginalize it out.
  • STAN seems to have a larger library of mathematical functions (Gaussian processes, matrix distributions, differential equations, and more)
  • STAN automatically handles constrained parameters (non-negativity, positive definite or interval) by transforming them to be unconstrained
  • STAN can do MAP and variational inference. NIMBLE has some interesting-looking particle filters.
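
To make the point about discrete parameters concrete, here is a minimal worked example of marginalizing one out. It assumes a two-component normal mixture, which is my choice of illustration and not a model from the conference; the symbols z_i (the discrete component indicator), theta (the mixing weight), mu_1, mu_2 and sigma are all introduced here for the sketch.

```latex
% Assumed model: z_i \in \{1, 2\} is the discrete indicator we cannot sample in STAN.
%   P(z_i = 1) = \theta, \qquad y_i \mid z_i = k \sim \mathcal{N}(\mu_k, \sigma)
%
% Summing over the two values of z_i removes it from the likelihood:
\begin{align}
p(y_i \mid \theta, \mu, \sigma)
  &= \sum_{k=1}^{2} P(z_i = k)\, \mathcal{N}(y_i \mid \mu_k, \sigma) \\
  &= \theta\, \mathcal{N}(y_i \mid \mu_1, \sigma)
     + (1 - \theta)\, \mathcal{N}(y_i \mid \mu_2, \sigma)
\end{align}
% On the log scale, which is what the sampler works with:
\begin{equation}
\log p(y_i \mid \theta, \mu, \sigma)
  = \log\Bigl( \exp\bigl[\log\theta + \log\mathcal{N}(y_i \mid \mu_1, \sigma)\bigr]
             + \exp\bigl[\log(1-\theta) + \log\mathcal{N}(y_i \mid \mu_2, \sigma)\bigr] \Bigr)
\end{equation}
```

Only the continuous parameters remain after the sum, so the gradient-based sampler can handle the model. The cost is that you have to write the sum yourself, and it grows with the number of discrete states.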

A rainy day in Olmsted's original

Sep 20, 2016 - Joys Of Multiple R Environments

On the joys of running multiple R environments

Once you have reached a certain level of depravity, it no longer suffices to have a single R environment, where the environment can be defined as the R version, the version of all packages, and possibly the version of the compiler and linear algebra libraries that compiled the whole mess to begin with. Such perversion arises when you are developing a package and need to develop against R/Bioconductor development, or you need to run a legacy version of R.

Herein find some notes:

  1. You can tell R where to look for its package libraries by setting the environment variable R_LIBS_USER to a non-default directory, eg, R_LIBS_USER=~/R/x86_64-unknown-linux-gnu-library/3.3-bioc-release tells R to put ~/R/x86_64-unknown-linux-gnu-library/3.3-bioc-release first in the places it looks for packages (the .libPaths() search order).
  2. Standard advice is to make an alias, ie, in .bashrc:

    alias R-devel='R_LIBS_USER=~/R/x86_64-unknown-linux-gnu-library/3.4-bioc-devel R'
    
  3. But this doesn’t work easily or well if you are trying to run R remotely, say via ESS/TRAMP in Emacs. Instead it is better to write a few shell scripts:

    #!/bin/bash
    #put this in a file called R-devel accessible on your path
    R_LIBS_USER=~/R/x86_64-unknown-linux-gnu-library/3.4-bioc-devel R "$@"
    
  4. This seems not to be entirely respected by devtools 1.12.0. At least when you call check, it has complicated ways of mangling the library paths that break this, I believe during the “checking whether package XXX can be installed” phase. So you will probably have to call R-devel CMD check myPackage.tar.gz from the shell. Details currently under investigation.
  5. It’s possible that using packrat might obviate the custom .libPaths() situation, but not the multiple versions of R situation. But I haven’t had great luck with packrat (probably my own fault).
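
If you end up with more than a couple of these wrapper scripts, the idea in item 3 can be folded into one small helper. This is only a sketch under my own assumptions: the function name run_with_r_libs and the variable R_LIBS_BASE are invented for illustration, and the default base directory just mirrors the paths used above. The mkdir is there because R only adds R_LIBS_USER directories to .libPaths() if they already exist.

```shell
# Hypothetical helper (name and R_LIBS_BASE are illustrative, not part of R).
# Usage: run_with_r_libs LABEL COMMAND [ARGS...]
# Runs COMMAND with R_LIBS_USER pointing at a per-environment library directory.
run_with_r_libs() {
    rwl_label="$1"
    shift
    # Base directory defaults to the layout used in the notes above.
    rwl_lib="${R_LIBS_BASE:-$HOME/R/x86_64-unknown-linux-gnu-library}/$rwl_label"
    # R drops library directories that do not exist, so create it first.
    mkdir -p "$rwl_lib"
    # Set the variable for this one command only, then pass all arguments through.
    R_LIBS_USER="$rwl_lib" "$@"
}
```

For example, run_with_r_libs 3.4-bioc-devel R --vanilla -e '.libPaths()' should list the devel library first, assuming R is on your PATH. Note that a shell function has the same limitation as the alias when running remotely, so for ESS/TRAMP you would still put the body in a script on the remote path.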

2017.01.04 Addendum

Working with multiple R versions via ESS/TRAMP requires even more bizarre incantations. ESS is supposed to discover R versions by searching the Emacs variable exec-path for commands starting with the strings in the list ess-r-versions, though even this doesn’t seem to work on macOS. It definitely doesn’t simply work with TRAMP, which sets its own remote PATH based on the variable tramp-remote-path. This, in turn, takes default values that generally don’t reflect the PATH you’d get with a standard login shell on the remote system. Instead, to find a version of R, you may alter tramp-remote-path to put the directory containing your desired version first in the path list, eg, with M-x customize-variable tramp-remote-path.