# 2022

## R Workflow

This article outlines an analysis project workflow that I have found to be efficient in making reproducible research reports using R with Quarto. I start by covering the creation of annotated analysis files, discovering missing data patterns, and running descriptive statistics, with the goals of understanding the data and their quality and completeness. Functions in the Hmisc package are used to annotate data frames and data tables with labels and units of measurement and to produce tabular and graphical statistical summaries. Several examples of processing and manipulating data using the data.table package are given. Much attention is paid to the use of minimal-assumption methods for describing relationships with continuous variables, avoiding disasters such as computing mean Y as a function of quintiles of body mass index. Examples of diagramming exclusions of observations from analysis, caching results, doing parallel processing, and running simulations are presented. This article is a synopsis of the [R Workflow electronic book](https://hbiostat.org/rflow).

## Resources for Ordinal Regression Models

This article provides resources to assist researchers in understanding and using ordinal regression models, and provides arguments for their wider use.

## Decision curve analysis for quantifying the additional benefit of a new marker

This article examines the benefits of decision curve analysis for assessing model performance when adding a new marker to an existing model. Decision curve analysis provides a clinically interpretable metric based on the number of events identified and interventions avoided.
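
As a rough illustration of the quantity a decision curve plots, the sketch below computes net benefit at a single risk threshold using the standard definition (true positives per patient, minus false positives per patient weighted by the odds of the threshold). It is written in Python purely for illustration and is not taken from the article; the function names and the toy data are hypothetical.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold.

    Standard definition: TP/n - (FP/n) * threshold / (1 - threshold).
    y_true holds 0/1 outcomes, risk holds predicted probabilities.
    """
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that intervenes on everyone."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data: four patients, two with the event.
outcomes = [1, 1, 0, 0]
risks = [0.9, 0.1, 0.8, 0.2]

nb_model = net_benefit(outcomes, risks, threshold=0.2)
nb_all = net_benefit_treat_all(outcomes, threshold=0.2)
print(nb_model, nb_all)
```

To assess a new marker, the same calculation would be repeated over a range of thresholds for the model with and without the marker, and the resulting curves compared against the treat-all and treat-none (net benefit 0) strategies.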

## Equivalence of Wilcoxon Statistic and Proportional Odds Model

In this article I provide much more extensive simulations showing the near-perfect agreement between the odds ratio (OR) from a proportional odds (PO) model and the Wilcoxon two-sample test statistic. The agreement is studied by degree of violation of the PO assumption and by the sample size. A refinement in the conversion formula between the OR and the Wilcoxon statistic scaled to 0-1 (concordance probability) is provided.
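
To make the "Wilcoxon statistic scaled to 0-1" concrete, here is a minimal Python sketch that computes the concordance probability directly from its definition: the proportion of all cross-group pairs in which the second group's value is larger, counting ties as one half. This is only an illustration of the quantity being compared with the OR; it does not reproduce the article's simulations or its conversion formula, and the function name and data are hypothetical.

```python
def concordance_probability(group_a, group_b):
    """Estimate P(B > A) + 0.5 * P(B = A) over all cross-group pairs.

    This equals the Mann-Whitney U statistic for group_b divided by
    len(group_a) * len(group_b), i.e. the Wilcoxon statistic on a 0-1 scale.
    """
    score = 0.0
    for a in group_a:
        for b in group_b:
            if b > a:
                score += 1.0
            elif b == a:
                score += 0.5  # ties count as half a concordant pair
    return score / (len(group_a) * len(group_b))


# Hypothetical ordinal responses in two groups.
control = [1, 2, 3]
treated = [2, 3, 4]
c = concordance_probability(control, treated)
print(round(c, 4))
```

A value of 0.5 indicates no tendency for one group to have larger values than the other, which corresponds to an OR of 1 in the PO model.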

## Longitudinal Data: Think Serial Correlation First, Random Effects Second

Random effects/mixed effects models shine for multi-level data such as measurements within cities within counties within states. They can also deal with measurements clustered within subjects. There are at least two contexts for the latter: rapidly repeated measurements where elapsed time is not an issue, and serial measurements spaced out over time for which time trends are more likely to be important.

## Assessing the Proportional Odds Assumption and Its Impact

This article demonstrates how the proportional odds (PO) assumption and its impact can be assessed. General robustness to non-PO, on either a main variable of interest or an adjustment covariate, is exemplified. Advantages of a continuous Bayesian blend of PO and non-PO are also discussed.