Data-reduction

Scoring Multiple Variables, Too Many Variables and Too Few Observations: Data Reduction

This post will grow to cover questions about data reduction methods, also known as unsupervised learning methods. These are intended primarily for two purposes: collapsing correlated variables into an overall score so that one does not have to disentangle correlated effects, which is a difficult statistical task reducing the effective number of variables to use in a regression or other predictive model, so that fewer parameters need to be estimated The latter example is the “too many variables too few subjects” problem.