Classification and clustering are examples of the more general problem of pattern recognition, which is the assignment of some sort of output value to a given input value. Other examples are regression, which assigns a real-valued output to each input; sequence labeling, which assigns a class to each member of a sequence of values (for example, part-of-speech tagging, which assigns a part of speech to each word in an input sentence); and parsing, which assigns a parse tree to an input sentence, describing the syntactic structure of the sentence.

A common subclass of classification is
probabilistic classification. Algorithms of this nature use
statistical inference to find the best class for a given instance. Unlike other algorithms, which simply output a "best" class, probabilistic algorithms output a
probability of the instance being a member of each of the possible classes. The best class is normally then selected as the one with the highest probability. Such an algorithm has numerous advantages over non-probabilistic classifiers:

* It can output a confidence value associated with its choice (in general, a classifier that can do this is known as a confidence-weighted classifier).
* Correspondingly, it can abstain when its confidence in choosing any particular output is too low.
* Because of the probabilities generated, probabilistic classifiers can be incorporated more effectively into larger machine-learning tasks, in a way that partially or completely avoids the problem of error propagation.

==Frequentist procedures==
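The select-the-highest-probability and abstention behaviour of a probabilistic classifier described above can be sketched as follows. This is a hypothetical minimal example, not taken from any particular library; the class labels and the confidence threshold are illustrative assumptions.

```python
def classify(probs, threshold=0.5):
    """Return the class with the highest probability, or None (abstain)
    if even the best class falls below the confidence threshold.

    probs: mapping from class label to probability, assumed to sum to 1.
    """
    best = max(probs, key=probs.get)   # class with highest probability
    if probs[best] < threshold:
        return None                    # abstain: confidence too low
    return best

# Illustrative class labels and probabilities (assumptions):
print(classify({"spam": 0.85, "ham": 0.15}))              # -> spam
print(classify({"spam": 0.40, "ham": 0.35, "other": 0.25}))  # -> None (abstains)
```

The downstream consumer of such a classifier can also use the full probability distribution rather than only the selected class, which is what allows the error-propagation problem mentioned above to be mitigated.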