Using R, Rmarkdown, RStudio, knitr, plotly, and HTML for the Next Generation of Reproducible Statistical Reports

The Vanderbilt Department of Biostatistics has two policies currently in effect:
1. All statistical reports will be reproducible
2. All reports should include all the code used to produce the report, in some fashion

We have succeeded with the first policy (mainly by using knitr in R) and, to a large extent, with the second. Some biostatisticians have been concerned about interspersing code with the contents of the report. It has also been challenging to copy some PDF report components (e.g., advanced tables) into word processing documents.

Fortunately, R and RStudio have recently added a number of features that allow easy creation of HTML notebooks that can be viewed in any web browser. These notebooks address the problems listed above and add new possibilities, such as interactive graphics in a self-contained HTML file that can be posted on a collaboration web server or sent to a collaborator. Interactive graphics allow the analyst to include more detail (e.g., confidence bands for multiple confidence levels; confidence bands for group differences as well as those for each group individually), with the collaborator able to easily select which details to view.
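As a rough sketch of how such a self-contained HTML report might be produced, the R code below renders a small R Markdown file with the rmarkdown package. The choice of html_document with code_folding = "hide" is an assumption made for illustration (it keeps all code in the report but collapses it by default); it is not necessarily the configuration used in the talk, and the report content is a made-up placeholder. Rendering requires pandoc, which RStudio bundles.

```r
library(rmarkdown)

# Build a tiny R Markdown source file in a temporary location.
# The title and the code chunk are illustrative placeholders only.
fence <- strrep("`", 3)                 # chunk delimiter (three backticks)
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Example report\"",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd)

# Assumed output format: one self-contained HTML file in which
# every code chunk is included but folded away until clicked.
fmt <- html_document(code_folding = "hide", self_contained = TRUE)

# render() returns the path of the HTML file it wrote.
out <- render(rmd, output_format = fmt, quiet = TRUE)
```

The resulting file in out can be opened in any browser or sent as a single attachment, and the folded code is one way to keep the code in the report without interrupting the narrative.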

I have made major revisions to the R Hmisc and rms packages to provide new capabilities that fit into the R/RStudio Rmarkdown HTML notebook framework. Interactive plotly graphics (based on JavaScript and D3) and customized HTML output are the main new ingredients. In this talk the rationale for this approach is discussed, and the new features are demonstrated with two statistical reports. A few miscellaneous topics will also be covered, e.g., how to cite bibliographic references in Rmarkdown and how to interface R to for viewing or extracting bibliographic references.

For more information see
ggplotly: a function that converts any ggplot2 graphic to a plotly interactive graphic:
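As a minimal illustration of that conversion, the sketch below builds an ordinary ggplot2 scatterplot and passes it to ggplotly. The built-in mtcars data set, the axis labels, and the temporary output file are assumptions chosen only for the example; saving with selfcontained = TRUE requires pandoc.

```r
library(ggplot2)
library(plotly)

# An ordinary static ggplot2 graphic, using a built-in data set
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lb)", y = "Miles per gallon", colour = "Cylinders")

# Convert it to an interactive plotly graphic (hover, zoom, legend toggling)
w <- ggplotly(p)

# Optionally save the widget as a single self-contained HTML file
out <- tempfile(fileext = ".html")
htmlwidgets::saveWidget(w, out, selfcontained = TRUE)
```

Inside an R Markdown document rendered to HTML, simply printing w in a code chunk embeds the interactive graphic in the report.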

November 16, 2017