How Iso Cluster works
The Iso Cluster tool uses a modified iterative optimization clustering procedure, also known as the migrating means technique. The algorithm separates all cells into the user-specified number of distinct unimodal groups in the multidimensional space of the input bands. This tool is most often used in preparation for unsupervised classification.
The iso prefix of the isodata clustering algorithm is an abbreviation for the iterative self-organizing way of performing clustering. This type of clustering uses a process in which, during each iteration, all samples are assigned to existing cluster centers and new means are calculated for every class. The optimal number of classes to specify is usually unknown. It is therefore advisable to enter a conservatively high number, analyze the resulting clusters, and rerun the tool with a reduced number of classes.
The iso cluster algorithm is an iterative process that assigns each candidate cell to the cluster whose mean is closest in Euclidean distance. The process starts with arbitrary means assigned by the software, one for each cluster (you specify the number of clusters). Every cell is assigned to the closest of these means in the multidimensional attribute space. After this first iteration, a new mean is calculated for each cluster from the attribute values of the cells that belong to it. The process is then repeated: each cell is assigned to the closest mean in multidimensional attribute space, and new means are calculated for each cluster based on the membership of cells from that iteration. You specify the number of iterations with the Number of iterations parameter. This value should be large enough that, after the specified number of iterations, the migration of cells from one cluster to another is minimal, that is, the clusters have become stable. When you increase the number of clusters, also increase the number of iterations.
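The assign-and-update loop described above can be sketched in a few lines of Python. This is only an illustration of the plain migrating means idea, not the tool's actual implementation: the function names are hypothetical, cells are assumed to be tuples of band values, and the starting means are picked at random from the cells, which is just one possible reading of "arbitrary means".

```python
import math
import random


def nearest_mean(cell, means):
    """Return the index of the mean closest to `cell` (Euclidean distance)."""
    return min(range(len(means)), key=lambda k: math.dist(cell, means[k]))


def migrating_means(cells, number_of_classes, number_of_iterations, seed=0):
    """Hypothetical helper: basic migrating means loop over tuples of band values."""
    rng = random.Random(seed)
    # Arbitrary starting means: randomly chosen cells (an assumption of this sketch).
    means = [tuple(c) for c in rng.sample(cells, number_of_classes)]
    labels = [None] * len(cells)
    for _ in range(number_of_iterations):
        # Assign every cell to the closest current mean.
        new_labels = [nearest_mean(c, means) for c in cells]
        migrated = sum(a != b for a, b in zip(labels, new_labels))
        labels = new_labels
        # Recompute each mean from the cells now assigned to that cluster.
        for k in range(number_of_classes):
            members = [c for c, lab in zip(cells, labels) if lab == k]
            if members:  # a cluster that attracted no cells keeps its old mean
                means[k] = tuple(sum(v) / len(members) for v in zip(*members))
        if migrated == 0:  # no cell changed cluster, so the clusters are stable
            break
    return means, labels
```

The early exit when no cell migrates is a convenience of the sketch; the text above only states that the iteration count should be high enough for migration to become minimal.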
The specified Number of classes value is the maximum number of clusters that can result from the clustering process. However, the number of clusters in the output signature file may be smaller than the specified value. This occurs in the following cases:
- The data values and the initial cluster means are not evenly distributed. In certain ranges of cell values, very few cells may occur. Consequently, some of the initial cluster means may not have a chance to absorb enough cell members.
- Clusters consisting of fewer cells than the specified Minimum class size value will be eliminated at the end of the iterations.
- Clusters merge with neighboring clusters when the statistical values are similar after the clusters become stable. Some clusters may be so close to each other and have such similar statistics that keeping them apart would be an unnecessary division of the data.
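The last two cases can be pictured as a clean-up pass that runs once the iterations finish. The sketch below is a hedged illustration only. The text does not say how similarity between clusters is measured, so the `merge_distance` threshold on the distance between means is an assumption, as are the function name and the representation of a cluster as a `(mean, cell_count)` pair. What happens to the cells of an eliminated cluster is also not stated, so the sketch simply drops the cluster.

```python
import math


def prune_and_merge(clusters, min_class_size, merge_distance):
    """Hypothetical clean-up pass over a list of (mean, cell_count) pairs."""
    # Eliminate clusters with fewer cells than the minimum class size.
    kept = [(tuple(m), n) for m, n in clusters if n >= min_class_size]
    # Merge pairs whose means are closer than merge_distance (assumed criterion),
    # restarting the scan after each merge until no pair qualifies.
    merged = True
    while merged:
        merged = False
        for i in range(len(kept)):
            for j in range(i + 1, len(kept)):
                (mi, ni), (mj, nj) = kept[i], kept[j]
                if math.dist(mi, mj) < merge_distance:
                    total = ni + nj
                    # The merged mean is the cell-count-weighted average.
                    mean = tuple((a * ni + b * nj) / total for a, b in zip(mi, mj))
                    kept[i] = (mean, total)
                    del kept[j]
                    merged = True
                    break
            if merged:
                break
    return kept
```

Either step can leave fewer clusters than the Number of classes value requested.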
Example
The following is a sample signature file created by Iso Cluster. The file begins with a header, which is commented out, showing the values of the parameters used in performing the iso clustering. Note that the header records number_of_classes=6, while the file contains only four classes.
The class names are optional and are entered with a text editor after the file is created. Each class name, if entered, must be a single string of no more than 14 alphanumeric characters.
    # Signatures Produced by Clustering of
    #   Stack redlands
    #   number_of_classes=6   max_iterations=20   min_class_size=20
    #   sampling interval=10
    #   Number of selected grids
    /*          3
    #   Layer-Number   Grid-name
    /*          1      redlands1
    /*          2      redlands2
    /*          3      redlands3

    # Type   Number of Classes   Number of Layers   Number of Parametric Layers
      1             4                   3                      3
    # ===============================================================

    # Class ID   Number of Cells   Class Name
         1             1843
    # Layers          1            2            3
    # Means
                22.8817      60.7656      34.8893
    # Covariance
       1       169.3975     -69.7444     179.0808
       2       -69.7444     714.7072      10.7889
       3       179.0808      10.7889     284.0931
    # ---------------------------------------------------------------

    # Class ID   Number of Cells   Class Name
         2             2495
    # Layers          1            2            3
    # Means
                38.4894     132.9775      61.8104
    # Covariance
       1       414.9621     -19.0732     301.0267
       2       -19.0732     510.8439     102.8931
       3       301.0267     102.8931     376.5450
    # ---------------------------------------------------------------

    # Class ID   Number of Cells   Class Name
         3             2124
    # Layers          1            2            3
    # Means
                70.3983      82.9576      89.2472
    # Covariance
       1       264.2680     100.6966      39.3895
       2       100.6966     523.9096      75.5573
       3        39.3895      75.5573     279.7387
    # ------------------------------------------------------------

    # Class ID   Number of Cells   Class Name
         4             2438
    # Layers          1            2            3
    # Means
               105.8708     137.6645     130.0886
    # Covariance
       1       651.0465     175.1060     391.6028
       2       175.1060     300.8853     143.2443
       3       391.6028     143.2443     647.7345
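To inspect the clusters programmatically, a signature file like the sample can be read with a small parser. The following is a minimal sketch and not an Esri API: `parse_signature` is a hypothetical name, and it assumes the file holds one record per line, that lines beginning with `#` or `/*` carry no class data, and that the first remaining row is the type, class count, layer count, and parametric layer count.

```python
def parse_signature(text):
    """Hypothetical reader for a signature file laid out like the sample."""
    # Keep only data rows: skip blank lines and '#' or '/*' lines (assumed non-data).
    rows = [ln.split() for ln in text.splitlines()
            if ln.strip() and not ln.lstrip().startswith(("#", "/*"))]
    _sig_type, n_classes, n_layers, _n_parametric = (int(v) for v in rows[0])
    classes, pos = [], 1
    for _ in range(n_classes):
        class_row = rows[pos]
        means = [float(v) for v in rows[pos + 1]]
        # Each covariance row starts with its layer number, which is dropped.
        cov = [[float(v) for v in r[1:]]
               for r in rows[pos + 2:pos + 2 + n_layers]]
        classes.append({
            "id": int(class_row[0]),
            "cells": int(class_row[1]),
            "name": class_row[2] if len(class_row) > 2 else "",  # optional name
            "means": means,
            "covariance": cov,
        })
        pos += 2 + n_layers  # class row + means row + covariance rows
    return classes
```

Each returned dictionary holds the class ID, cell count, optional class name, the per-layer means, and the covariance matrix as a list of rows.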
References
Ball, G. H., and D. J. Hall. 1965. A Novel Method of Data Analysis and Pattern Classification. Menlo Park, California: Stanford Research Institute.
Richards, J. A. 1986. Remote Sensing Digital Image Analysis: An Introduction. Berlin: Springer-Verlag.