O-Cluster Algorithm
The O-Cluster (OC) algorithm creates a hierarchical, grid-based clustering model: it recursively creates axis-parallel (orthogonal) partitions in the input attribute space. The resulting hierarchical structure is an irregular grid that tessellates the attribute space into clusters, and each cluster defines a dense area of that space.
The clusters are described by intervals along the attribute axes and the corresponding centroids and histograms. The sensitivity parameter defines a baseline density level. Only areas with peak density above this baseline level can be identified as clusters.
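The recursive, axis-parallel partitioning and the role of a baseline density can be illustrated with a minimal Python sketch. It is not the O-Cluster implementation: the names (Node, find_valley, partition, build_tree), the bins and min_points defaults, and the rule that a valley must hold fewer than half the points of the lower neighboring peak are all assumptions made for illustration. The baseline is passed directly as a per-bin count, because how the sensitivity parameter maps to that level is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]
Interval = Tuple[float, float]


@dataclass
class Node:
    bounds: List[Interval]   # one (low, high) interval per attribute
    points: List[Point]
    children: List["Node"] = field(default_factory=list)


def histogram(values: Sequence[float], bins: int) -> Tuple[List[int], float, float]:
    """Equal-width histogram over the observed range: (counts, low, bin width)."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    if width == 0:                       # constant attribute: everything in one bin
        counts[0] = len(values)
        return counts, low, width
    for v in values:
        counts[min(int((v - low) / width), bins - 1)] += 1
    return counts, low, width


def find_valley(counts: List[int], baseline: float) -> Optional[Tuple[int, int]]:
    """Deepest bin flanked by two peaks above the baseline, as (bin, depth)."""
    best = None
    for i in range(1, len(counts) - 1):
        lower_peak = min(max(counts[:i]), max(counts[i + 1:]))
        # Hypothetical rule: both peaks exceed the baseline and the valley
        # holds fewer than half as many points as the lower peak.
        if lower_peak > baseline and 2 * counts[i] < lower_peak:
            depth = lower_peak - counts[i]
            if best is None or depth > best[1]:
                best = (i, depth)
    return best


def partition(node: Node, baseline: float, bins: int = 10, min_points: int = 20) -> None:
    """Recursively split a node with axis-parallel cuts placed in density valleys."""
    if len(node.points) < min_points:
        return
    best = None                          # (depth, attribute index, cut value)
    for a in range(len(node.bounds)):
        counts, low, width = histogram([p[a] for p in node.points], bins)
        valley = find_valley(counts, baseline)
        if valley is not None and (best is None or valley[1] > best[0]):
            best = (valley[1], a, low + (valley[0] + 0.5) * width)
    if best is None:
        return                           # no valid cut: the node is a cluster
    _, a, cut = best
    low_bound, high_bound = node.bounds[a]
    left_bounds, right_bounds = list(node.bounds), list(node.bounds)
    left_bounds[a] = (low_bound, cut)
    right_bounds[a] = (cut, high_bound)
    left = Node(left_bounds, [p for p in node.points if p[a] < cut])
    right = Node(right_bounds, [p for p in node.points if p[a] >= cut])
    node.children = [left, right]
    for child in node.children:
        partition(child, baseline, bins, min_points)


def build_tree(points: List[Point], baseline: float) -> Node:
    """Root node spans the observed range of every attribute."""
    dims = len(points[0])
    bounds = [(min(p[a] for p in points), max(p[a] for p in points)) for a in range(dims)]
    root = Node(bounds, list(points))
    partition(root, baseline)
    return root


def leaves(node: Node) -> List[Node]:
    """Leaf nodes of the hierarchy, that is, the clusters."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]
```

In this sketch, each leaf's bounds list plays the part of the per-attribute intervals that describe a cluster. Raising baseline makes it harder for a histogram peak to qualify, so fewer regions are separated into their own clusters.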
The clusters discovered by O-Cluster are used to generate a Bayesian probability model that is then used during scoring (model apply) for assigning data points to clusters. The generated probability model is a mixture model where the mixture components are represented by a product of independent normal distributions for numerical attributes and multinomial distributions for categorical attributes.
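As a rough illustration of scoring with such a mixture model, the following Python sketch computes cluster membership probabilities with Bayes' rule. The Component class, its field names, and the small floor probability for unseen categories are hypothetical, and the per-cluster parameters are assumed to be given rather than derived from a real model. Skipping an attribute whose value is None is one simple way to treat a missing value, and is likewise an assumption of the sketch.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union

Value = Optional[Union[float, str]]


@dataclass
class Component:
    prior: float                                # mixture weight of the cluster
    normals: Dict[str, Tuple[float, float]]     # numerical attr -> (mean, std)
    multinomials: Dict[str, Dict[str, float]]   # categorical attr -> {value: prob}


def log_normal_pdf(x: float, mean: float, std: float) -> float:
    """Log density of a univariate normal distribution."""
    var = std * std
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def posterior(row: Dict[str, Value], components: List[Component]) -> List[float]:
    """P(cluster | row) for every component, computed in log space."""
    log_joint = []
    for c in components:
        total = math.log(c.prior)
        for attr, (mean, std) in c.normals.items():
            x = row.get(attr)
            if x is not None:                   # missing value: attribute is skipped
                total += log_normal_pdf(x, mean, std)
        for attr, probs in c.multinomials.items():
            x = row.get(attr)
            if x is not None:
                # Hypothetical floor so an unseen category does not yield log(0).
                total += math.log(probs.get(x, 1e-9))
        log_joint.append(total)
    # Normalize with the max subtracted for numerical stability.
    top = max(log_joint)
    weights = [math.exp(v - top) for v in log_joint]
    norm = sum(weights)
    return [w / norm for w in weights]


def assign(row: Dict[str, Value], components: List[Component]) -> int:
    """Index of the most probable cluster for the row."""
    probs = posterior(row, components)
    return probs.index(max(probs))
```

Because the attributes are modeled as independent within a component, the joint likelihood is a plain product, which becomes a sum in log space. A row with every attribute missing falls back to the priors alone.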
O-Cluster handles missing values naturally, treating them as missing at random. The algorithm does not support nested tables and therefore does not support sparse data.
Note: OC does not support text.