Methods used to obtain unbiased estimates of the future performance of statistical prediction models and classifiers include data splitting and resampling. The two most commonly used resampling methods are cross-validation and bootstrapping. For cross-validation to be as precise as the bootstrap, about 100 repeats of 10-fold cross-validation are required.
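As a rough illustration of the two resampling methods, the following minimal Python sketch estimates the future mean squared error of a model in two ways: by averaging 100 repeats of 10-fold cross-validation, and by a bootstrap. Nothing in it comes from the text above except the "100 repeats of 10-fold" setting. The simulated data, the one-predictor least-squares model, the function names (`fit_line`, `mse`, `repeated_cv_mse`, `bootstrap_mse`), the 300 bootstrap resamples, and the choice of the optimism-corrected form of the bootstrap are all assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    intercept, slope = model
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def repeated_cv_mse(xs, ys, k=10, repeats=100, seed=1):
    """Average of `repeats` independent k-fold cross-validation estimates."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random fold assignment for each repeat
        squared_error = 0.0
        for f in range(k):
            held_out = order[f::k]  # every k-th shuffled index forms one fold
            held_set = set(held_out)
            train = [i for i in order if i not in held_set]
            intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
            squared_error += sum((ys[i] - intercept - slope * xs[i]) ** 2 for i in held_out)
        estimates.append(squared_error / n)
    return sum(estimates) / repeats


def bootstrap_mse(xs, ys, resamples=300, seed=2):
    """Apparent error plus the average bootstrap optimism."""
    rng = random.Random(seed)
    n = len(xs)
    apparent = mse(fit_line(xs, ys), xs, ys)
    optimism = 0.0
    for _ in range(resamples):
        draw = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bx = [xs[i] for i in draw]
        by = [ys[i] for i in draw]
        model = fit_line(bx, by)
        # Optimism: error on the original data minus error on the resample.
        optimism += mse(model, xs, ys) - mse(model, bx, by)
    return apparent + optimism / resamples


# Simulated data (an assumption for illustration): y = 2x + Gaussian noise.
data_rng = random.Random(0)
xs = [data_rng.uniform(0, 1) for _ in range(60)]
ys = [2 * x + data_rng.gauss(0, 1) for x in xs]

apparent_mse = mse(fit_line(xs, ys), xs, ys)
cv_mse = repeated_cv_mse(xs, ys)
boot_mse = bootstrap_mse(xs, ys)
print(f"apparent {apparent_mse:.3f}  repeated CV {cv_mse:.3f}  bootstrap {boot_mse:.3f}")
```

Both procedures refit the model many times on the full sample of 60 observations or nearly all of it, so no observations are permanently set aside. With a model this simple the two estimates would be expected to land close to each other and somewhat above the apparent (in-sample) error.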
As discussed in more detail in Section 5.3 of Regression Modeling Strategies Course Notes and the same section of the RMS book, data splitting is an unstable method for validating models or classifiers, especially when the number of subjects is less than about 20,000 (fewer if the signal:noise ratio is high).
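One way to see this instability is to repeat the split on a single fixed data set and watch how much the validation estimate moves when only the random split changes. The sketch below does that in Python. The simulated data, the sample size of 200, the 70/30 split, the 500 repetitions, and the helper names `fit_line` and `split_sample_mse` are assumptions chosen for illustration, not values from the sources cited above.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def split_sample_mse(xs, ys, test_fraction, rng):
    """Fit on a random training part and return the MSE on the held-out part."""
    order = list(range(len(xs)))
    rng.shuffle(order)
    n_test = int(round(test_fraction * len(xs)))
    test, train = order[:n_test], order[n_test:]
    intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return sum((ys[i] - intercept - slope * xs[i]) ** 2 for i in test) / n_test


# One fixed simulated data set (an assumption): y = 2x + Gaussian noise.
data_rng = random.Random(0)
xs = [data_rng.uniform(0, 1) for _ in range(200)]
ys = [2 * x + data_rng.gauss(0, 1) for x in xs]

# Same data, 500 different random 70/30 splits.
split_rng = random.Random(1)
estimates = sorted(split_sample_mse(xs, ys, 0.3, split_rng) for _ in range(500))
print(f"lowest {estimates[0]:.3f}  median {estimates[250]:.3f}  highest {estimates[-1]:.3f}")
```

An analyst who splits once sees only one of these 500 values, with no indication of where it falls in the range.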
It is important to distinguish prediction from classification. In many decision-making contexts, classification represents a premature decision, because classification combines prediction and decision making and usurps the decision maker in specifying the costs of wrong decisions. The classification rule must be reformulated if costs/utilities or sampling criteria change. Predictions are separate from decisions and can be used by any decision maker. Classification is best used with non-stochastic/deterministic outcomes that occur frequently, and not when two individuals with identical inputs can easily have different outcomes.
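The separation of prediction from decision can be sketched in a few lines of Python. Here a single predicted probability is handed to two hypothetical decision makers, each of whom chooses the action with the lower expected cost under their own costs of error. The function names (`expected_costs`, `best_action`), the predicted risk of 0.3, and the cost values are all illustrative assumptions, and the expected-cost rule is one simple way to formalize the decision step rather than a method prescribed by the text.

```python
def expected_costs(prob_event, cost_false_positive, cost_false_negative):
    """Expected cost of each action given a predicted probability of the event."""
    return {
        # Acting is wasted (a false positive) when the event does not occur.
        "act": (1 - prob_event) * cost_false_positive,
        # Not acting is a miss (a false negative) when the event does occur.
        "do_not_act": prob_event * cost_false_negative,
    }


def best_action(prob_event, cost_false_positive, cost_false_negative):
    """Return the action with the smaller expected cost."""
    costs = expected_costs(prob_event, cost_false_positive, cost_false_negative)
    return min(costs, key=costs.get)


predicted_risk = 0.3  # one prediction, produced once by the model

# Two hypothetical decision makers with different costs for the two errors.
decision_a = best_action(predicted_risk, cost_false_positive=1.0, cost_false_negative=1.0)
decision_b = best_action(predicted_risk, cost_false_positive=1.0, cost_false_negative=5.0)
print(decision_a, decision_b)
```

The same prediction leads to different actions once the costs differ. A classifier with a cutoff fixed in advance would have given both decision makers the same label, and would need to be rebuilt whenever the costs change.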