Class Classifier1D


  • public class Classifier1D
    extends Object
    • Constructor Summary

      Constructors 
      Constructor Description
      Classifier1D()  
    • Method Summary

      Modifier and Type Method Description
      static double[] adjustLimitsKMeans​(double[] data, double[] oldLimits)
      Moves the limits, by assigning data points to the closest class mean value.
      static double[] calcClassMeans​(double[] data, int[] classes, int numClasses)  
      static double calcGVF​(double[] data, double[] limits, double SDAM)
      GVF (goodness of variance fit): see B.D. Dent (1999, p. 148).
      static double calcGVF​(double SDAM, double SDCM)
      GVF (goodness of variance fit): see B.D. Dent (1999, p. 148).
      static double calcSDAM​(double[] data)
      SDAM (squared deviation [from] array mean): see B.D. Dent (1999, p. 148).
      static double calcSDCM​(double[] data, int[] classes, double[] classMeans, int numClasses)
      SDCM (squared deviations [from] class means): see B.D. Dent (1999, p. 148).
      static int[] classifyData​(double[] data, double[] limits)
      Classifies the given data according to the given limits.
      static double[] classifyEqualNumber​(double[] data, int numberClasses)
      calculates class limits with an equal number of items per class, which is equivalent to the "quantiles" method.
      static double[] classifyEqualRange​(double[] data, int numberClasses)
      calculates class limits with equal range
      static double[] classifyKMeansOnExistingBreaks​(double[] data, int numberClasses, int initialLimitAlgorithm)
      calculates class limits using the optimal breaks method (see e.g. T. A. Slocum: "Thematic Cartography and Visualization", 1999, p. 73).
      static double[] classifyMaxBreaks​(double[] data, int numberClasses)
      calculates class limits using the Maximum Breaks method (see e.g. T. A. Slocum: "Thematic Cartography and Visualization", 1999).
      static double[] classifyMeanStandardDeviation​(double[] data, int numberClasses)
      calculates class limits using the mean value and standard deviation.
      static double[] classifyNaturalBreaks​(double[] data, int numberClasses)
      calculates class limits using Jenks's Optimisation Method (Natural Breaks)
      static List getAvailableClassificationMethods()  
      static boolean isInClass​(double val, double lowerBound, double upperBound)
      Checks if value is within limits.
    • Field Detail

      • EQUAL_RANGE

        public static String EQUAL_RANGE
      • EQUAL_NUMBER

        public static String EQUAL_NUMBER
      • MEAN_STDEV

        public static String MEAN_STDEV
      • MAX_BREAKS

        public static String MAX_BREAKS
      • JENKS_BREAKS

        public static String JENKS_BREAKS
      • KMEANS_OPTIMIZE

        public static String KMEANS_OPTIMIZE
    • Constructor Detail

      • Classifier1D

        public Classifier1D()
    • Method Detail

      • getAvailableClassificationMethods

        public static List getAvailableClassificationMethods()
      • classifyEqualRange

        public static double[] classifyEqualRange​(double[] data,
                                                  int numberClasses)
        calculates class limits with equal range
        Parameters:
        data - input data
        numberClasses - number of classes
        Returns:
        break values for classes. E.g. for 4 ranges 3 breaks are returned. Min and Max Values are not returned.
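        As an illustration of the equal-range idea, here is a minimal sketch that is independent of the library. The class and method names (EqualRangeSketch, equalRangeBreaks) are hypothetical, and the sketch assumes non-empty data and at least one class; the actual implementation may differ.

```java
import java.util.Arrays;

// Hypothetical illustration, not the Classifier1D source.
final class EqualRangeSketch {
    /** Returns numberClasses - 1 interior breaks that split [min, max] into equal-width intervals. */
    static double[] equalRangeBreaks(double[] data, int numberClasses) {
        double min = Arrays.stream(data).min().getAsDouble();
        double max = Arrays.stream(data).max().getAsDouble();
        double width = (max - min) / numberClasses;
        double[] breaks = new double[numberClasses - 1];
        for (int i = 0; i < breaks.length; i++) {
            // Min and max themselves are not part of the result.
            breaks[i] = min + (i + 1) * width;
        }
        return breaks;
    }
}
```

        For the data {0, 2, 8, 4} and 4 classes, the interval width is 2 and the three breaks are 2, 4 and 6.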
      • classifyEqualNumber

        public static double[] classifyEqualNumber​(double[] data,
                                                   int numberClasses)
        calculates class limits with an equal number of items per class, which is equivalent to the "quantiles" method. Note that the number of items per class can differ if items have the same value and need to be grouped into the same class.
        Parameters:
        data - input data
        numberClasses - number of classes
        Returns:
        break values for classes. E.g. for 4 ranges 3 breaks are returned. Min and Max Values are not returned.
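        A simplified quantile sketch follows. It is an assumption-laden illustration, not the library code: the names are hypothetical, each break is placed midway between the two neighbouring sorted values (an arbitrary choice), and items with identical values are not regrouped as the note above describes. It assumes data.length >= numberClasses.

```java
import java.util.Arrays;

// Hypothetical illustration, not the Classifier1D source.
final class EqualNumberSketch {
    static double[] equalNumberBreaks(double[] data, int numberClasses) {
        double[] sorted = data.clone();
        Arrays.sort(sorted);
        double[] breaks = new double[numberClasses - 1];
        for (int k = 1; k < numberClasses; k++) {
            // Index of the first item that belongs to class k.
            int idx = k * sorted.length / numberClasses;
            // Assumption: put the break halfway between the neighbouring items.
            breaks[k - 1] = (sorted[idx - 1] + sorted[idx]) / 2.0;
        }
        return breaks;
    }
}
```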
      • classifyMeanStandardDeviation

        public static double[] classifyMeanStandardDeviation​(double[] data,
                                                             int numberClasses)
        calculates class limits using the mean value (m) and standard deviation (std), i.e. for 5 classes: c1: values < m - 2std, c2: m - 2std < values < m - 1std, c3: m - 1std < values < m + 1std, c4: m + 1std < values < m + 2std, c5: values > m + 2std
        Parameters:
        data - input data
        numberClasses - number of classes
        Returns:
        break values for classes. E.g. for 4 ranges 3 breaks are returned. Min and Max Values are not returned.
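        The five-class case described above can be sketched as follows. The names are hypothetical, and the sketch assumes the population standard deviation (division by n); whether the library uses the population or the sample form is not stated here.

```java
// Hypothetical illustration, not the Classifier1D source.
final class MeanStdDevSketch {
    /** Breaks for the 5-class case: m - 2std, m - 1std, m + 1std, m + 2std. */
    static double[] fiveClassBreaks(double[] data) {
        double mean = 0;
        for (double v : data) {
            mean += v;
        }
        mean /= data.length;

        double sumSq = 0;
        for (double v : data) {
            sumSq += (v - mean) * (v - mean);
        }
        // Assumption: population standard deviation (divide by n, not n - 1).
        double std = Math.sqrt(sumSq / data.length);

        return new double[] {mean - 2 * std, mean - std, mean + std, mean + 2 * std};
    }
}
```

        For the data {2, 4, 4, 4, 5, 5, 7, 9} the mean is 5 and the population standard deviation is 2, so the breaks are 1, 3, 7 and 9.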
      • classifyMaxBreaks

        public static double[] classifyMaxBreaks​(double[] data,
                                                 int numberClasses)
        calculates class limits using Maximum Breaks method (see e.g. T. A. Slocum: "Thematic Cartography and Visualization", 1999)
        Parameters:
        data - input data
        numberClasses - number of classes
        Returns:
        break values for classes. E.g. for 4 ranges 3 breaks are returned. Min and Max Values are not returned.
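        The general idea of maximum breaks is to place the limits in the largest gaps between neighbouring sorted values. The sketch below illustrates this; the names are hypothetical, the placement of each break at the midpoint of its gap is an assumption, and the data must contain at least numberClasses values.

```java
import java.util.Arrays;

// Hypothetical illustration, not the Classifier1D source.
final class MaxBreaksSketch {
    static double[] maxBreaks(double[] data, int numberClasses) {
        final double[] sorted = data.clone();
        Arrays.sort(sorted);

        // gapIdx[i] = i refers to the gap between sorted[i] and sorted[i + 1].
        Integer[] gapIdx = new Integer[sorted.length - 1];
        for (int i = 0; i < gapIdx.length; i++) {
            gapIdx[i] = i;
        }
        // Order the gaps from largest to smallest.
        Arrays.sort(gapIdx, (a, b) ->
                Double.compare(sorted[b + 1] - sorted[b], sorted[a + 1] - sorted[a]));

        double[] breaks = new double[numberClasses - 1];
        for (int k = 0; k < breaks.length; k++) {
            int i = gapIdx[k];
            // Assumption: the break sits in the middle of the gap.
            breaks[k] = (sorted[i] + sorted[i + 1]) / 2.0;
        }
        Arrays.sort(breaks);
        return breaks;
    }
}
```

        For the data {1, 2, 3, 10, 11, 20} and 3 classes, the two largest gaps are 11..20 and 3..10, which gives the breaks 6.5 and 15.5.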
      • classifyNaturalBreaks

        public static double[] classifyNaturalBreaks​(double[] data,
                                                     int numberClasses)
        calculates class limits using Jenks's Optimisation Method (Natural Breaks)
        Parameters:
        data - input data
        numberClasses - number of classes
        Returns:
        break values for classes. E.g. for 4 ranges 3 breaks are returned. Min and Max Values are not returned.
      • classifyKMeansOnExistingBreaks

        public static double[] classifyKMeansOnExistingBreaks​(double[] data,
                                                              int numberClasses,
                                                              int initialLimitAlgorithm)
        calculates class limits using the optimal breaks method (see e.g. T. A. Slocum: "Thematic Cartography and Visualization", 1999, p. 73, or B.D. Dent: "Cartography: Thematic Map Design", 1999, p. 146).

        Note: limits should not be equal to data values, since values that are equal to a bound can be classified into two classes.

        Parameters:
        data - input data
        numberClasses - number of classes
        initialLimitAlgorithm - algorithm used to compute the initial limits: 1 = maxBreaks, 2 = equalRange, 3 = quantiles, 4 = mean/standard deviation, 5 = Jenks
        Returns:
        break values for classes. E.g. for 4 ranges 3 breaks are returned. Min and Max Values are not returned.
      • adjustLimitsKMeans

        public static double[] adjustLimitsKMeans​(double[] data,
                                                  double[] oldLimits)
        Moves the limits, by assigning data points to the closest class mean value.

        This approach is equivalent to the k-means procedure (see e.g. Duda, Hart and Stork 2000, p. 526).

        Parameters:
        data - input data, sorted from min to max (e.g. use jmathtools DoubleArray.sort())
        oldLimits - old limits array
        Returns:
        a double array of adjusted limits
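        One k-means style adjustment step could look like the sketch below. It is not the library code: the names are hypothetical, a value equal to a limit is assumed to fall into the lower class, and a limit next to an empty class is simply left unchanged. In one dimension, assigning each point to the closest class mean is the same as moving each limit to the midpoint between the two neighbouring class means.

```java
// Hypothetical illustration, not the Classifier1D source.
final class KMeansLimitSketch {
    static double[] adjustLimits(double[] sortedData, double[] oldLimits) {
        int numClasses = oldLimits.length + 1;
        double[] sums = new double[numClasses];
        int[] counts = new int[numClasses];

        // Data is sorted ascending, so the class index only ever moves upward.
        int cls = 0;
        for (double v : sortedData) {
            while (cls < oldLimits.length && v > oldLimits[cls]) {
                cls++;
            }
            sums[cls] += v;
            counts[cls]++;
        }

        double[] newLimits = new double[oldLimits.length];
        for (int i = 0; i < newLimits.length; i++) {
            if (counts[i] == 0 || counts[i + 1] == 0) {
                // Assumption: keep the old limit when a neighbouring class is empty.
                newLimits[i] = oldLimits[i];
            } else {
                double lowerMean = sums[i] / counts[i];
                double upperMean = sums[i + 1] / counts[i + 1];
                newLimits[i] = (lowerMean + upperMean) / 2.0;
            }
        }
        return newLimits;
    }
}
```

        Starting from the limit 2.5 on the sorted data {1, 2, 3, 10, 11, 12}, one step moves the limit to 5.25 and a second step moves it to 6.5, after which it no longer changes.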
      • classifyData

        public static int[] classifyData​(double[] data,
                                         double[] limits)
        Classifies the given data according to the given limits.
        Parameters:
        data - input data
        limits - the break/decision values between the classes. The highest and lowest values are not included. Such limits are returned, for example, by the Classifier1D.classifyEqualNumber() method.
        Returns:
        array containing a class ID for every item.
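        The mapping from limits to class IDs can be sketched as below. The names are hypothetical, and several details are assumptions rather than documented behaviour: class IDs start at 0, the limits are in ascending order, and a value equal to a limit is put into the lower class.

```java
// Hypothetical illustration, not the Classifier1D source.
final class ClassifyDataSketch {
    static int[] classify(double[] data, double[] limits) {
        int[] classes = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            int c = 0;
            // Move up one class for every limit the value exceeds.
            while (c < limits.length && data[i] > limits[c]) {
                c++;
            }
            classes[i] = c;
        }
        return classes;
    }
}
```

        With the limits {2, 6}, the data {1, 5, 9, 3} is assigned the class IDs {0, 1, 2, 1}.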
      • isInClass

        public static boolean isInClass​(double val,
                                        double lowerBound,
                                        double upperBound)
        Checks if value is within limits.

        Note: values equal to the bound values return "true" (query: lowerBound <= val <= upperBound).

        Parameters:
        val - the value to test
        lowerBound - the lower bound
        upperBound - the upper bound
        Returns:
        true if val is included between lowerBound (included) and upperBound (included)
      • calcSDAM

        public static double calcSDAM​(double[] data)
        SDAM (squared deviation [from] array mean): see B.D. Dent (1999, p. 148); alternatively see T.A. Slocum (1999, p. 73).

        Used for Optimal Breaks Method.

        Parameters:
        data - input data
        Returns:
        the squared deviation from double array mean
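        Read as the sum of squared deviations from the array mean, SDAM can be sketched as follows. The class and method names are hypothetical, and the sketch assumes non-empty data.

```java
// Hypothetical illustration, not the Classifier1D source.
final class SdamSketch {
    static double sdam(double[] data) {
        double mean = 0;
        for (double v : data) {
            mean += v;
        }
        mean /= data.length;

        // Sum of squared deviations from the mean of the whole array.
        double sum = 0;
        for (double v : data) {
            sum += (v - mean) * (v - mean);
        }
        return sum;
    }
}
```

        For the data {2, 4, 4, 4, 5, 5, 7, 9} the mean is 5 and the SDAM is 32.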
      • calcSDCM

        public static double calcSDCM​(double[] data,
                                      int[] classes,
                                      double[] classMeans,
                                      int numClasses)
        SDCM (squared deviations [from] class means): see B.D. Dent (1999, p. 148); alternatively see T.A. Slocum (1999, p. 73).

        Used for Optimal Breaks Method.

        TODO: definition of SDCM (relative to SDAM)
        Parameters:
        data - input data
        classes - the classes for every item of the data array
        classMeans - class means
        numClasses - number of classes
        Returns:
        squared deviations from class means
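        In contrast to SDAM, each deviation here is measured from the mean of the item's own class. A minimal sketch under the assumption that class IDs are 0-based indices into classMeans (the names are hypothetical):

```java
// Hypothetical illustration, not the Classifier1D source.
final class SdcmSketch {
    static double sdcm(double[] data, int[] classes, double[] classMeans) {
        double sum = 0;
        for (int i = 0; i < data.length; i++) {
            // Deviation of item i from the mean of the class it belongs to.
            double dev = data[i] - classMeans[classes[i]];
            sum += dev * dev;
        }
        return sum;
    }
}
```

        For the data {1, 2, 3, 10, 11, 12} with classes {0, 0, 0, 1, 1, 1} and class means {2, 11}, the SDCM is 4.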
      • calcGVF

        public static double calcGVF​(double SDAM,
                                     double SDCM)
        GVF (goodness of variance fit): see B.D. Dent (1999, p. 148); alternatively see T.A. Slocum (1999, p. 73).

        Used for Optimal Breaks Method.
        Parameters:
        SDAM - squared deviation [from] array mean
        SDCM - squared deviation [from] class mean
        Returns:
        the goodness of variance fit for a particular SDAM and SDCM
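        The goodness of variance fit is commonly defined as GVF = (SDAM - SDCM) / SDAM, so a value close to 1 means that the classes explain most of the variation in the data. The sketch below assumes this common definition and uses hypothetical names; the behaviour for SDAM = 0 is not addressed.

```java
// Hypothetical illustration, not the Classifier1D source.
final class GvfSketch {
    /** Assumes the common definition GVF = (SDAM - SDCM) / SDAM. */
    static double gvf(double sdam, double sdcm) {
        return (sdam - sdcm) / sdam;
    }
}
```

        For example, SDAM = 100 and SDCM = 25 give a GVF of 0.75.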
      • calcGVF

        public static double calcGVF​(double[] data,
                                     double[] limits,
                                     double SDAM)
        GVF (goodness of variance fit): see B.D. Dent (1999, p. 148); alternatively see T.A. Slocum (1999, p. 73).

        Used for Optimal Breaks Method.
        Parameters:
        data - input data
        limits - the break/decision values between the classes. The highest and lowest values are not included. Such limits are returned, for example, by the Classifier1D.classifyEqualNumber() method.
        SDAM - squared deviation [from] array mean
        Returns:
        goodness of variance fit
      • calcClassMeans

        public static double[] calcClassMeans​(double[] data,
                                              int[] classes,
                                              int numClasses)
        Parameters:
        data - input data
        classes - the vector containing the information on the class for an item
        numClasses - the number of classes
        Returns:
        class means
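        Computing the class means amounts to a per-class sum and count. The sketch below uses hypothetical names and assumes 0-based class IDs; a class without items ends up with NaN, which is a choice of this sketch and not documented behaviour.

```java
// Hypothetical illustration, not the Classifier1D source.
final class ClassMeansSketch {
    static double[] classMeans(double[] data, int[] classes, int numClasses) {
        double[] sums = new double[numClasses];
        int[] counts = new int[numClasses];
        for (int i = 0; i < data.length; i++) {
            sums[classes[i]] += data[i];
            counts[classes[i]]++;
        }
        double[] means = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            // 0.0 / 0 evaluates to NaN for an empty class.
            means[c] = sums[c] / counts[c];
        }
        return means;
    }
}
```

        For the data {1, 2, 3, 10, 11, 12} with classes {0, 0, 0, 1, 1, 1}, the class means are {2, 11}.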