Avoiding using global objects when building an R package with multiple separate functions
I have built an R package that runs a complex Bayesian model (a Dirichlet process mixture model on spatial data), including MCMC, thinning, validation, and an interface to Google Maps. I'm happy with the performance, and it runs without problems. The issue is that on CRAN it was rejected because I use global variables extensively.
The package is built around the use of 8 core functions (which the user interacts with):
1) loaddata: loads in the data, extracts key information, and sets a series of global matrices and other small list objects.
2) modelparameters: sets the model parameters, with an option to plot the prior on the parameter sigma on a Google map. It calculates the hyper-prior at this point and saves a large matrix to the global environment.
3) graphicparameters: sets the graphic parameters of the maps and plots (see code below).
4) createmaps: creates the prior surface on the source location tau and plots the data on a Google map. It keeps a number of global objects saved for repeated plotting of the map.
5) runmcmc: runs the bulk of the analysis using MCMC (a time-intensive step) and creates many global objects.
6) thinandanalsye: thins the posterior samples and constructs the geoprofile (a time-intensive step).
7) plotgp: plots the data and overlays the geoprofile onto a Google map.
8) reporthitscores: optional; if source data has been imported, calculates the hit scores of the potential sources.
Each one is run in turn before the next, and they pass global variables out to be used by one or more of the other functions.
I built it this way for a reason: the user must stop and evaluate the results of each function before rushing ahead to the later ones.
Each of these functions passes not only fixed parameters but also large map objects, lists, and matrices as global objects. I thought this was a nice, simple solution for a smooth workflow (you can check the results in the main working environment before moving on, possibly applying transformations, etc.), and I have given the objects unique and informative names.
How can I get around this and pass the CRAN checks whilst keeping the user-friendly workflow of a series of interacting functions?
I don't want to post a lot of code (the MCMC part is several hundred lines long), but I include one of the simple examples. graphicparameters is one of the simple parameter-setting functions, and it comes with default values set. This is a simple example; there are more complex ones in the package. For example, there is a model parameters function that pulls many of its variables from the existing data loading function.
    graphicparameters <- function(guardrail = 0.05, nring = 20, transp = 0.4,
                                  gridsize = 640, gridsize2 = 300,
                                  maptype = "roadmap", location = getwd(),
                                  pointcol = "black") {
      guardrail <<- guardrail
      nring <<- nring
      transp <<- transp
      gridsize <<- gridsize
      gridsize2 <<- gridsize2
      maptype <<- maptype
      location <<- location
      pointcol <<- pointcol
    }
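For comparison, here is a minimal sketch of how a setter like this could avoid `<<-` altogether: it returns the settings as a named list, and later steps take that list as an argument. This is only an illustration under assumptions, not the package's actual code; `describemap` is a hypothetical stand-in for any downstream function such as createmaps.

```r
# Same defaults as the original, but the values are returned in a named
# list instead of being written to the global environment.
graphicparameters <- function(guardrail = 0.05, nring = 20, transp = 0.4,
                              gridsize = 640, gridsize2 = 300,
                              maptype = "roadmap", location = getwd(),
                              pointcol = "black") {
  list(guardrail = guardrail, nring = nring, transp = transp,
       gridsize = gridsize, gridsize2 = gridsize2, maptype = maptype,
       location = location, pointcol = pointcol)
}

# Hypothetical downstream function: it receives the settings explicitly.
describemap <- function(gp) {
  sprintf("%s map, %s x %s pixels, %s rings",
          gp$maptype, gp$gridsize, gp$gridsize, gp$nring)
}

gp <- graphicparameters(nring = 10)  # the user can stop and inspect gp here
gp$transp <- 0.6                     # ...and adjust it before moving on
describemap(gp)
```

The step-by-step workflow is preserved: the user still holds every intermediate object in the working environment, but each object is created by an ordinary assignment the user writes, not by the package.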
Most of the material I have seen concerning avoiding global objects revolves around a single function doing the work. I want to keep the step-by-step, multi-function approach but lose the global objects.
Any help is appreciated.
I understand this may mean a major reworking of the code (which is several thousand lines currently), but I would love solutions that minimally affect the overall structure of the package.
P.S. I wish I had known about CRAN's displeasure with global objects before I started!
Your problem is amenable to an OOP-style design. You can use reference classes or S4 and export a single global, e.g., a mapanalysis class generator. The idea is that the user creates an instance using
    ma <- new('mapanalysis', option1 = ..., option2 = ..., ...)  # S4
    # or
    ma <- mapanalysis$new(option1 = ..., ...)  # RefClass
and can then call methods with
    ma$loaddata(...)
    ma$setparameters(...)
with the object doing the bookkeeping of options and auxiliary objects internally. It should not be much work to refactor. If you read the page linked at the top of this post, you should see that it's possible to wrap your functions in setRefClass('mapanalysis', fields = list(...), methods = list(...)) with few further modifications. (Although it would help a lot down the road to re-think the architecture in OOP terms.)
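To make that concrete, here is a minimal sketch of such a reference class. Only the names mapanalysis, loaddata, and setparameters come from the text above; the fields (`obs`, `params`) and the method bodies are assumptions chosen for illustration.

```r
library(methods)

# Hypothetical fields and method bodies, for illustration only.
mapanalysis <- setRefClass("mapanalysis",
  fields = list(
    obs    = "matrix",  # assumed: observation coordinates
    params = "list"     # assumed: model and graphic settings
  ),
  methods = list(
    loaddata = function(x) {
      # Inside a RefClass method, <<- assigns to the object's field,
      # not to the global environment.
      obs <<- as.matrix(x)
      invisible(.self)
    },
    setparameters = function(...) {
      # Merge new settings into the ones already stored on the object.
      params <<- utils::modifyList(params, list(...))
      invisible(.self)
    }
  )
)

ma <- mapanalysis$new(params = list(nring = 20, transp = 0.4))
ma$loaddata(cbind(lon = c(0.1, 0.2), lat = c(51.5, 51.6)))
ma$setparameters(nring = 10)
ma$params$nring  # state lives on ma, so it can be inspected between steps
```

Because all state lives in `ma`, the user can still pause after each call and look at `ma$obs` or `ma$params`, and the package itself never writes to the global environment.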