Advanced R - Packages

R Packages Davor Cubranic SCARL, Dept. of Statistics

Warmup questions

• Who here uses packages?
• Which ones?
• How do you know you’re using a package?
  – Had to install it
  – Had to load it
• What happens when you load it?
  – Functions
  – Data
  – Help pages

• So a package is a bundle of “stuff”
• But (next slide)
• It’s a structured bundle of…

What is a package?

• Structured, standardized unit of:
  – R code
  – documentation
  – data
  – external code

Why use packages (talking points)

• Installation & administration is easy
  – Finding, installing, compiling, updating…
• Validation
  – Package checks
• Distribution mechanisms
  – CRAN, Bioconductor, GitHub
• Documentation
  – Bundle examples, demos, tutorials
• Organization
  – Especially useful for programmers:
  – Self-contained (names)
  – Declare and enforce dependencies on other packages

Why use packages

• Installation & administration
• Validation
• Distribution
• Documentation
• Organization

• Knowing how packages work and how to use them effectively will make you a more effective R analyst, even if you don’t develop new packages
• But you should consider developing packages even if you don’t write the next ggplot
• Packages for your own stuff:
  – Analyses you frequently repeat and/or share with others
  – Publication: create a package containing the publication as a vignette, and bundle the code and data with it

Handling packages

• Load with library(name)
• Package-level help:
  – library(help=name)
• Unload with detach(package:name)
  – You shouldn’t have to do this
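A minimal sketch of the three calls above, assuming a fresh R session. The package used here, splines, is only a stand-in chosen because it ships with R and is not attached by default; any installed package would do.

```r
# Attach a package (splines is just a stand-in that ships with R)
library(splines)
"package:splines" %in% search()   # is it on the search path now?

# Package-level help; print(pkg_info) displays the package index
pkg_info <- library(help = "splines")

# Detach it again (rarely needed in everyday work)
detach("package:splines")
"package:splines" %in% search()
```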

Handling packages with RStudio

• See the Packages tab in RStudio
• Checkmark to load
• Some are already loaded!!
• Click on the name for help

What happens when you load a package?

• When you start R you have an empty workspace
• But there is also all this other “basic” R stuff (matrix, plot)
• So it’s more like we have two boxes, your workspace and “core” R
• Actually, it’s more like a whole bunch of boxes: see search()

So what happens when you load?

• New package gets inserted near the front of the list
• It can pull in additional packages (dependencies)
• But notice that each package is its own bundle (box)
• We’ll talk about how you create these bundles next
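One way to watch the list of "boxes" change is to compare search() before and after a library() call. This is a sketch that assumes a fresh session in which the stand-in package (splines, which ships with R) is not yet attached.

```r
before <- search()            # the list of "boxes" before loading
library(splines)              # stand-in package; any installed one works
after <- search()

setdiff(after, before)        # entries added by the library() call
after[1]                      # your workspace stays at the front
match("package:splines", after)  # where the new package landed in the list
```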

Structure

• What makes a package a package is that it follows a prescribed structure of files and directories
• If you tell R to treat this as a package, it will
• You can create it by hand, but we’ll use a shortcut: package.skeleton()

package.skeleton()

• Convenient for turning a set of existing functions and scripts into a package
• Let’s do it with the anova.mlm code that we wrote earlier

• New project:
  – [email protected]:~scdemo/pkg
• source("anova.mlm.R")
• package.skeleton("anovaMlm", ls())
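The same steps can be tried without touching a real project. In this sketch the function est.beta has a made-up body standing in for whatever anova.mlm.R defines, and the skeleton is written to a temporary directory; both are assumptions for illustration only.

```r
# Hypothetical stand-in for the functions that anova.mlm.R would define
code_env <- new.env()
code_env$est.beta <- function(x, y) solve(crossprod(x), crossprod(x, y))

# Write the skeleton into a temporary directory instead of the project folder
parent_dir <- tempfile("pkgdemo")
dir.create(parent_dir)
package.skeleton("anovaMlm", list = ls(code_env), environment = code_env,
                 path = parent_dir)

pkg_dir <- file.path(parent_dir, "anovaMlm")
list.files(pkg_dir)                   # top-level files and directories
list.files(file.path(pkg_dir, "R"))   # R source files created for the objects
```

The listing should show the pieces discussed on the following slides: DESCRIPTION, NAMESPACE, R/ and man/.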

DESCRIPTION

• The only required part of a package
• Name, author, license, etc.

R/

• Directory for R code
• package.skeleton creates one file per function
• This is not a rule: you can put as many functions as you like into a single file

man/

• Help files

NAMESPACE

• Defines which objects are visible to the user and other functions
• Public vs. private to the package
• The default is to make everything visible that starts with a letter
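The "starts with a letter" default is a regular expression, so its effect can be checked directly. The object names below are hypothetical examples, not the actual contents of the package.

```r
# Hypothetical object names that a package might define
objs <- c("anova.mlm", "est.beta", ".internal.helper")

# The default export rule: the name must begin with a letter
exported <- grepl("^[[:alpha:]]", objs)
setNames(exported, objs)

objs[exported]    # names the default pattern would make visible
objs[!exported]   # names that stay private (here, the one starting with a dot)
```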

Command-line tools

• Check
• Install
• Build

INSTALL

• Let’s install our package
• R CMD INSTALL anova.mlm
• Delete the “man” directory
  – (it’s optional and we’ll recreate it later)
• Redo INSTALL
• Restart RStudio
• library("anova.mlm")
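The install-then-load cycle can also be driven from inside R, since install.packages() with repos = NULL installs from a source directory. The sketch below builds a throwaway package by hand rather than using the one from the talk; the name demoPkg, its single function, and every DESCRIPTION value are made up for illustration, and it installs into a temporary library so the real one is untouched.

```r
# Build a tiny throwaway source package by hand (all names are hypothetical)
src <- file.path(tempfile("src"), "demoPkg")
dir.create(file.path(src, "R"), recursive = TRUE)
writeLines(c(
  "Package: demoPkg",
  "Version: 0.0.1",
  "Title: Throwaway Demo Package",
  "Description: Exists only to illustrate installing from source.",
  "Author: Nobody",
  "Maintainer: Nobody <nobody@example.com>",
  "License: GPL-2"
), file.path(src, "DESCRIPTION"))
writeLines("export(hello)", file.path(src, "NAMESPACE"))
writeLines('hello <- function() "hello from demoPkg"',
           file.path(src, "R", "hello.R"))

# Install into a private, temporary library
lib <- tempfile("lib")
dir.create(lib)
install.packages(src, lib = lib, repos = NULL, type = "source", quiet = TRUE)

# Load it from that library and call the exported function
library(demoPkg, lib.loc = lib)
hello()
```

Note that there is no man/ directory here at all, which matches the point that documentation is optional for installation.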

check

• Really important!!!
• Finds common errors, non-standard parts
• CRAN requires no ERRORS or WARNINGS

Optional contents

• man/: documentation
• data/: datasets
• demo/: R code for demo purposes
• inst/: other files to include in the package (PDFs, vignettes)
• tests/: package test files
• src/: C, C++, or Fortran source code
• NEWS: history of changes to the package
• LICENSE or LICENCE: package license

DESCRIPTION

• Depends:
  – packages used by this one
  – and attached to the search path when this one is loaded
  – i.e., visible to the user
• Imports:
  – packages used by this one
  – but not attached to the search path
  – i.e., not visible to the user
• Suggests:
  – used in examples or vignettes
  – non-essential functionality
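To see how these fields look in practice, the sketch below writes a small DESCRIPTION file and reads it back with read.dcf(). Every field value (the version number and the packages listed under Depends, Imports and Suggests) is a hypothetical example, not the real metadata of the package from the talk.

```r
# Write a hypothetical DESCRIPTION file to a temporary location
desc_file <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: anova.mlm",
  "Version: 0.1",
  "Title: Hypothetical Example",
  "Depends: lattice",
  "Imports: MASS",
  "Suggests: knitr"
), desc_file)

# DESCRIPTION uses the Debian control file format, which read.dcf() parses
desc <- read.dcf(desc_file, fields = c("Depends", "Imports", "Suggests"))
desc[1, ]
```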

NAMESPACE

• exportPattern("^[[:alpha:]]")
• export(anova.mlm, est.beta)

• S3method(print, anova.mlm)
• S3method(plot, anova.mlm)

• import(MASS)
• importFrom(MASS, lda)
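The S3method() lines tell R which function to call when a generic such as print() meets an object of a given class. A rough sketch of that dispatch outside a package follows; the object contents and the body of the method are invented for illustration, and in a script the method is found simply because it is defined in the calling environment rather than through NAMESPACE registration.

```r
# A hypothetical object carrying the class that the methods are written for
result <- structure(list(statistic = 2.5), class = "anova.mlm")

# The function that S3method(print, anova.mlm) would register in a package
print.anova.mlm <- function(x, ...) {
  cat("<anova.mlm> statistic:", x$statistic, "\n")
  invisible(x)
}

# Calling the generic dispatches on the class of its first argument
print(result)
```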

Documentation

• Let’s re-generate the documentation files
• promptPackage("anova.mlm")
• prompt(anova.mlm)
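To see what prompt() produces without cluttering a project, the shell can be written to a temporary file. The function est.beta below is a made-up stand-in, not the code from the talk.

```r
# Hypothetical function to document
est.beta <- function(x, y) solve(crossprod(x), crossprod(x, y))

# Write the documentation shell to a temporary .Rd file
rd_file <- file.path(tempdir(), "est.beta.Rd")
prompt(est.beta, filename = rd_file)

# Peek at a few of the sections that need to be filled in by hand
rd <- readLines(rd_file)
grep("^\\\\(name|usage|arguments)", rd, value = TRUE)
```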

anova.mlm.Rd

• Description: Compute a (generalized) analysis of variance table for one or more multivariate linear models.
• Arguments:
  – object: an object of class "mlm"
  – ...: further objects of class "mlm".
  – force.int: Force intercept
• Value: An object of class "anova" inheriting from class "matrix"

Help files for methods

• \usage{anova.mlm(…)}
• For S3 methods:
  – \usage{\method{print}{anova.mlm}(…)}

Vignettes

• .Rnw extension
• Written in Sweave
  – similar to knitr
• LaTeX + R code
• Produces a PDF available in the installed package
• vignette()
• vignette("Sweave")

Help on writing packages

• Lots of tutorials on the Web
  – many of them are not necessarily correct
  – NAMESPACES, Imports, etc.
• Authoritative guide: Writing R Extensions
• R-devel mailing list