Histogram Statistics

To provide more information about the distribution of entries in a histogram, we add additional statistical measures in a subclass of the Histogram class. These include

  • Mean value
  • Error on the mean value
  • Standard deviation
  • Skewness
  • Kurtosis

The HistogramStat class calculates the statistics from the individual entries rather than from the bin values, as is done in the Histogram class. It accumulates the power sums (the sums of x, x^2, x^3, and x^4 over all entries) while the histogram is being filled and converts these to central moments when requested by a call to the getDataStats() method shown in the listing below. From the central moments the statistical values above can be calculated (see the references below or standard statistics books for derivations).
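The conversion from power sums to central moments can be sketched on its own as follows. This is a minimal illustration rather than the course code: the class name PowerSumMoments and its method are hypothetical, and the array layout (index k holds the sum of x^k) is assumed to match the fMoments array in the listing below.

```java
/** Hypothetical helper for illustration; not part of the JavaTech classes. */
final class PowerSumMoments {

    /**
     * Convert power sums s[k] = sum of x^k (k = 0..4) into
     * { mean, cm2, cm3, cm4 }, where cmK is the K-th central moment.
     * Assumes s[0] (the number of entries) is greater than zero.
     */
    static double[] centralMoments(double[] s) {
        double n = s[0];
        double mean = s[1] / n;

        // Raw moments about zero.
        double m2 = s[2] / n;
        double m3 = s[3] / n;
        double m4 = s[4] / n;

        // Binomial expansion of the average of (x - mean)^k.
        double mean2 = mean * mean;
        double cm2 = m2 - mean2;
        double cm3 = m3 - 3.0 * mean * m2 + 2.0 * mean * mean2;
        double cm4 = m4 - 4.0 * mean * m3 + 6.0 * mean2 * m2
                        - 3.0 * mean2 * mean2;

        return new double[] { mean, cm2, cm3, cm4 };
    }

    public static void main(String[] args) {
        // Accumulate the power sums for a small made-up data set.
        double[] data = { 1.0, 2.0, 3.0, 4.0 };
        double[] s = new double[5];
        for (double x : data) {
            double x2 = x * x;
            s[0] += 1.0;
            s[1] += x;
            s[2] += x2;
            s[3] += x2 * x;
            s[4] += x2 * x2;
        }
        double[] cm = centralMoments(s);
        System.out.println("mean = " + cm[0] + ", cm2 = " + cm[1]
            + ", cm3 = " + cm[2] + ", cm4 = " + cm[3]);
    }
}
```

The sample variance then follows from cm2 by the factor n/(n-1), as in the listing.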

Note: The equations employed here to calculate the statistical measures using the central moments derived from the power sums are prone to serious round-off errors, because they subtract large, nearly equal numbers whenever the mean is large compared to the spread of the data. This can be overcome with a "two-pass" approach (Press) in which all the input values are saved in an array and the mean value is then computed from this array. The mean is used to calculate the central moments directly in another pass through the array. An approach that doesn't require saving the inputs is given by Bisset. This "robust implementation" calculates new central moments as each new entry is added.
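To make the round-off problem concrete, the following sketch computes the mean and second central moment three ways. The class and method names are hypothetical. The one-pass method uses the widely known Welford-style running update for the mean and second moment only; it is offered as an assumption about the general idea of updating moments as entries arrive, not as a reproduction of the Bisset code.

```java
/** Hypothetical comparison of moment calculations; illustration only. */
final class StableMoments {

    /** Power-sum method, as in HistogramStat: returns { mean, cm2 }. */
    static double[] powerSums(double[] x) {
        double s1 = 0.0, s2 = 0.0;
        for (double v : x) {
            s1 += v;
            s2 += v * v;
        }
        double n = x.length;
        double mean = s1 / n;
        return new double[] { mean, s2 / n - mean * mean };
    }

    /** Two-pass method: first the mean, then the deviations from it. */
    static double[] twoPass(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        double mean = sum / x.length;

        double sumSq = 0.0;
        for (double v : x) {
            double d = v - mean;
            sumSq += d * d;
        }
        return new double[] { mean, sumSq / x.length };
    }

    /** One-pass running update; no need to store the inputs. */
    static double[] onePass(double[] x) {
        double mean = 0.0;
        double sumSq = 0.0; // running sum of squared deviations
        int n = 0;
        for (double v : x) {
            n++;
            double d = v - mean;      // deviation from the old mean
            mean += d / n;            // updated mean
            sumSq += d * (v - mean);  // uses old and new deviation
        }
        return new double[] { mean, sumSq / n };
    }

    public static void main(String[] args) {
        // Small spread on top of a large offset: the hard case
        // for the power-sum method.
        double[] data = { 1.0e9 + 4, 1.0e9 + 7, 1.0e9 + 13, 1.0e9 + 16 };
        System.out.println("power sums: cm2 = " + powerSums(data)[1]);
        System.out.println("two-pass:   cm2 = " + twoPass(data)[1]);
        System.out.println("one-pass:   cm2 = " + onePass(data)[1]);
    }
}
```

For this data set the exact second central moment is 22.5. The two-pass and one-pass methods both reproduce it, while the power-sum result depends on how the large squared terms happen to round.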

Finally, the class offers methods to pack values into the bins and also to pack the errors for each bin. These will be useful later when we fit lines to distributions.

Note that in designing such a class, you must decide how to provide access to the data. Here the getDataStats() method provides a whole set of statistical measures via a double type array. The class provides constants (I_MEAN, I_STD_DEV, and so on) that indicate to which statistic each array element corresponds. One could instead provide a separate getter method for each statistic.
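As a rough sketch of the alternative design, the class below wraps such an array and exposes one getter per statistic. The name StatsSummary and its methods are hypothetical. The index constants are copies of those in the HistogramStat listing, repeated only so that the sketch compiles on its own.

```java
/** Hypothetical wrapper around a statistics array; illustration only. */
final class StatsSummary {

    // Same layout as the I_* constants in HistogramStat.
    static final int I_MEAN       = 0;
    static final int I_STD_DEV    = 1;
    static final int I_MEAN_ERROR = 2;
    static final int I_SKEWNESS   = 3;
    static final int I_KURTOSIS   = 4;
    static final int I_NUMSTATS   = 5;

    private final double[] fStats;

    /** Wrap an array laid out according to the constants above. */
    StatsSummary(double[] stats) {
        if (stats == null || stats.length < I_NUMSTATS) {
            throw new IllegalArgumentException(
                "expected " + I_NUMSTATS + " statistics");
        }
        fStats = stats.clone(); // defensive copy
    }

    double getMean()      { return fStats[I_MEAN]; }
    double getStdDev()    { return fStats[I_STD_DEV]; }
    double getMeanError() { return fStats[I_MEAN_ERROR]; }
    double getSkewness()  { return fStats[I_SKEWNESS]; }
    double getKurtosis()  { return fStats[I_KURTOSIS]; }

    public static void main(String[] args) {
        // Made-up values standing in for the array from getDataStats().
        double[] stats = { 0.02, 0.98, 0.10, -0.05, 0.12 };
        StatsSummary summary = new StatsSummary(stats);
        System.out.println("Mean = " + summary.getMean()
            + " +/- " + summary.getMeanError());
    }
}
```

The array keeps the interface small and lets a caller loop over all the statistics, while the getters make each call site self-describing and remove the chance of using the wrong index.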

Similarly, you must choose how to respond to abnormal conditions. For example, getDataStats() returns null if the histogram is empty or if the statistics keeping has been turned off. The calling method should therefore check the returned value before using it, to avoid a runtime error from dereferencing a null reference.


HistStatsApplet.java - Display an instance of the HistogramStat. Clicking on the Stats button creates a frame that displays the stats for the Gaussian distribution of random values.

+ New class:
HistogramStat.java
- subclass of Histogram adds various statistical information about the contents such as the mean, standard deviation, etc.

+ Previous classes:
Chapter 6:Tech: Histogram.java, HistPanel.java
Chapter 6:Tech: PlotPanel.java, PlotFormat.java

/**
  * This class provides additional statistical
  * measures of the histogram distribution.
**/
public class HistogramStat extends Histogram
{
  protected boolean fDoDataStats = true;
  protected double [] fBinErrors;

  protected double [] fMoments= new double[5];
  protected double fMean;

  // These constants indicate for each element of the
  // array returned from the getStats () method the statistical
  // measure to which it corresponds.
  public final static int I_MEAN       = 0;
  public final static int I_STD_DEV    = 1;
  public final static int I_MEAN_ERROR = 2;
  public final static int I_SKEWNESS   = 3;
  public final static int I_KURTOSIS   = 4;
  public final static int I_NUMSTATS   = 5;

/**
  * Constructor
  * @param num_bins number of bins for the histogram.
  * @param lo lowest value of bin.
  * @param hi highest value of bin.
  */
  public HistogramStat (int num_bins, double lo, double hi) {
    super (num_bins,lo,hi);
  } // ctor

/**
  * Constructor with title and x axis label.
  *
  * @param title title for histogram.
  * @param xLabel label for x axis of histogram.
  * @param num_bins number of bins for the histogram.
  * @param lo lowest value of bin.
  * @param hi highest value of bin.
  */
  public HistogramStat (String title, String xLabel, int num_bins,
                       double lo, double hi) {
    super (title, xLabel, num_bins,lo,hi);
  } // ctor

/**
   * Provide an array of error values for all bins.
   * @return array of double values.
  **/
  public double [] getBinErrors () {
    return fBinErrors;
  }

/**
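   * Return the error for a particular bin.
   * @param bin index from 0 to the number of bins minus one.
   * @return error on the bin, or -1.0 if the index is out of range
   *         or no errors have been set.
  **/
  public double getBinError (int bin) {
    if (fBinErrors != null && bin >= 0 && bin < fBinErrors.length)
        return fBinErrors[bin];
    else
        return -1.0;
  } // getBinError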

/**
   * Calculate the error on each bin according to the
   * sqrt (bin contents) from std. dev. of Poisson distribution.
  **/
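  public void makeBinErrors () {
    // Allocate the error array on first use since the
    // constructors do not create it.
    if (fBinErrors == null || fBinErrors.length != fNumBins) {
      fBinErrors = new double[fNumBins];
    }
    for (int i = 0; i < fNumBins; i++) {
      fBinErrors[i] = Math.sqrt (fBins[i]);
    }
  }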

/**
   * Pack the errors on each bin.
   * @param errors array of error values, one per bin.
   * @return false if array size doesn't match number of bins.
  **/
  public boolean packErrors (double [] errors) {
    if (errors.length != fNumBins) return false;
    if (fBinErrors == null || fBinErrors.length != fNumBins) {
       fBinErrors = new double[errors.length];
    }

    for (int i = 0; i < fNumBins; i++) {
         fBinErrors[i] = errors[i];
    }
    return true;
  } // packErrors

/**
   * Add an entry to the histogram, plus accumulate the
   * power sums  (if flag set) from each entry rather than
   * from the bin values.
   *
   * @param x value to add to the histogram.
  **/
  public void add (double x) {
    // First add entry as usual.
    super.add (x);

    // Then do moments if flag set.
    if (!fDoDataStats)
        return;
    else {
        fMoments[0] += 1;
        fMoments[1] += x;
        double x2 = x * x;
        fMoments[2] += x2;
        fMoments[3] += x2 * x;
        fMoments[4] += x2 * x2;
    }
  } // add

  /** Clear the histogram bins and the moments. **/
  public void clear () {
    super.clear (); // Clear histogram arrays
    int i;
    for (i=0; i < 5; i++) fMoments[i] = 0.0;
    if (fBinErrors != null)
        for (i=0; i < fBinErrors.length; i++) fBinErrors[i]=0.0;
  } // clear

/**
   * Turn on or off the accumulation and calculation of
   * the data statistics.
  **/
  public void setDataStats (boolean flag) {
    fDoDataStats = flag;
  }

  /**
    *  Get the statistical measures of the distribution calculated
    *  from the entry values.
    *  @return  values in double array correspond to
    *  mean, standard deviation, error on the mean, skewness, and
    *  kurtosis. If the number of entries is zero or the statistics
    *  accumulation is turned off  (see setDataStats () method), a null
    *  value is returned.
   **/
  public double [] getDataStats () {
    // If stats turned off or no entries, then give up
    if (!fDoDataStats || fMoments[0] == 0) return null;

    double [] stats = new double[I_NUMSTATS];

    double n = fMoments[0];

    // Average value = 1/n * sum[x]
    fMean = fMoments[1]/n;

    // Use running mean.
    stats[0] = fMean;
    double mean_sq = fMean * fMean;

    // Check on minimum number of entries.
    if (n < 2) return stats;

    // Convert power sums to central moments
    double m2 = fMoments[2]/n;
    double cm2 = m2 - fMean * fMean;
    double m3 = fMoments[3]/n;
    double cm3 = 2.0 * fMean * mean_sq - 3.0 * fMean * m2 + m3;
    double m4 = fMoments[4]/n;
    double cm4 = -3.0 * mean_sq * mean_sq + 6.0 * mean_sq * m2
                     -4.0 * fMean * m3 + m4;

    // variance = N/ (N-1) m2
    double variance = cm2 *  (n/ (n-1.0));

    // Std. Deviation s = sqrt (variance)
    stats[1] = Math.sqrt (variance);

    // Error on mean = s / sqrt (N)
    stats[2] = stats[1]/Math.sqrt (n);

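    // Skewness needs at least 3 entries to avoid division by zero.
    if (n < 3) return stats;

    // Skewness = n^2/((n-1)(n-2)) * cm3/s^3
    stats[3] = (n / ((n-1.0) * (n-2.0))) * n * cm3 / (variance * stats[1]);

    // Kurtosis needs at least 4 entries to avoid division by zero.
    if (n < 4) return stats;

    // Kurtosis = n^2(n+1)/((n-1)(n-2)(n-3)) * cm4/s^4 - 3(n-1)^2/((n-2)(n-3))
    double factor1 =  ( n * (n+1.0))/ (  (n-1.0) * (n-2.0) * (n-3.0) );
    double factor2 =  ( 3.0 * (n-1.0) * (n-1.0) )/ ( (n-2.0) * (n-3.0));
    stats[4] = factor1 * cm4 * n/ (variance*variance) - factor2;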

    return stats;
  } // getDataStats

} // class HistogramStat

import javax.swing.*;
import java.awt.*;
import java.awt.event.*;

/**
  * This program will run as an applet inside
  * an application frame.
  *
  * The applet uses the HistPanel to display contents of
  * an instance of Histogram. HistFormat used by HistPanel to
  * format the scale values.
  *
  * Includes "Go" button to add random values from a Gaussian
  * distribution to the histogram. The number of values is taken
  * from the entry in a JTextField. "Stats" button opens a frame
  * with the statistics and "Clear" button clears the histogram.
  * In standalone mode, the Exit button closes the program.
  *
 **/
public class HistStatsApplet extends JApplet
             implements ActionListener
{
  // Use the HistPanel JPanel subclass here
  HistPanel fOutputPanel;

  HistogramStat fHistogram;
  int fNumDataPoints = 100;


  // A text field for input strings
  JTextField fTextField;

  // Flag for whether the applet is in a browser
  // or running via the main () below.
  boolean fInBrowser = true;

  //Buttons
  JButton fGoButton;
  JButton fStatsButton;
  JButton fClearButton;
  JButton fExitButton;

  /**
    * Create a user interface with a histogram, a Go button
    * to initiate processing, and a Clear button to clear the
    * histogram. In application mode, the Exit button stops the
    * program. Add a Stats button to open a frame window to show
    * statistical measures.
   **/
  public void init () {

    Container content_pane = getContentPane ();

    JPanel panel = new JPanel (new BorderLayout ());

    // Create a histogram with Gaussian distribution.
    makeHist ();

    // JPanel subclass here.
    fOutputPanel = new HistPanel (fHistogram);

    panel.add (fOutputPanel,"Center");

    // Use a textfield for an input parameter.
    fTextField =
      new JTextField (Integer.toString (fNumDataPoints), 10);

    // If return hit after entering text, the
    // actionPerformed will be invoked.
    fTextField.addActionListener (this);

    fGoButton = new JButton ("Go");
    fGoButton.addActionListener (this);

    fStatsButton = new JButton ("Stats");
    fStatsButton.addActionListener (this);

    fClearButton = new JButton ("Clear");
    fClearButton.addActionListener (this);

    fExitButton = new JButton ("Exit");
    fExitButton.addActionListener (this);

    JPanel fControlPanel = new JPanel ();

    fControlPanel.add (fTextField);
    fControlPanel.add (fGoButton);
    fControlPanel.add (fStatsButton);
    fControlPanel.add (fClearButton);
    fControlPanel.add (fExitButton);

    if (fInBrowser) fExitButton.setEnabled (false);

    panel.add (fControlPanel,"South");

    // Add the panel holding the histogram and controls to the contentPane.
    content_pane.add (panel);

  } // init

  public void actionPerformed (ActionEvent e) {
    Object source = e.getSource ();
    if (source == fGoButton || source == fTextField) {
        String strNumDataPoints = fTextField.getText ();
        try {
          fNumDataPoints = Integer.parseInt (strNumDataPoints);
        }
        catch (NumberFormatException ex) {
          // Could open an error dialog here but just
          // display a message on the browser status line.
          showStatus ("Bad input value");
          return;
        }
        makeHist ();
        repaint ();
    } else if (source == fStatsButton) {
        displayStats ();
    } else if (source == fClearButton) {
        fHistogram.clear ();
        repaint ();
    } else if (!fInBrowser)
        System.exit (0);
  } // actionPerformed

  /** Create a frame to display the distribution statistics. **/
  void displayStats () {
    JFrame frame =
        new JFrame ("Histogram Distributions Statistics");

    // Dispose of the frame when its window is closed
    frame.setDefaultCloseOperation (JFrame.DISPOSE_ON_CLOSE);

    JTextArea area = new JTextArea ();

    double [] stats = fHistogram.getDataStats ();
    if (stats != null) {

      area.append ("Number entries = "+fHistogram.getTotal ()+"\n");

      String stat = PlotFormat.getFormatted (
                    stats[HistogramStat.I_MEAN],
                      1000.0,0.001,3);
      area.append ("Mean value = "+ stat +" ");

      stat = PlotFormat.getFormatted (
                    stats[HistogramStat.I_MEAN_ERROR],
                      1000.0,0.001,3);
      area.append (" +/- "+stat+"\n");

      stat = PlotFormat.getFormatted (
                    stats[HistogramStat.I_STD_DEV],
                      1000.0,0.001,3);
      area.append ("Std. Dev. = "+stat+"\n");

      stat = PlotFormat.getFormatted (
                    stats[HistogramStat.I_SKEWNESS],
                      1000.0,0.001,3);
      area.append ("Skewness = "+stat+"\n");

      stat = PlotFormat.getFormatted (
                    stats[HistogramStat.I_KURTOSIS],
                      1000.0,0.001,3);
      area.append ("Kurtosis = "+stat+"\n");
    } else {
      area.setText ("No statistical information available");
    }

    frame.getContentPane ().add (area);
    frame.setSize (200,200);
    frame.setVisible (true);
  } // displayStats

  void makeHist () {
    // Create an instance of the Random class for
    // producing our random values.
    java.util.Random r = new java.util.Random ();

    // The method nextGaussian in the class Random produces a value
    // from a distribution centered at 0.0 with a standard deviation
    // of 1.0.

    // Create an instance of our histogram class if not yet done.
    // Make it wide enough to include most of the
    // Gaussian values.
    if (fHistogram == null)
        fHistogram =
          new HistogramStat ("Gaussian Distribution with Statistics",
                            "random values",
                            20,-3.0,3.0);

    // Fill histogram with Gaussian distribution
    for (int i=0; i < fNumDataPoints; i++) {
        double val = r.nextGaussian ();
        fHistogram.add (val);
    }
  } // makeHist

  public static void main (String[] args) {
    int frame_width=450;
    int frame_height=300;

    //  Create the applet
    HistStatsApplet applet = new HistStatsApplet ();
    applet.fInBrowser = false;
    applet.init ();

    // Exit the program when the frame window is closed
    JFrame f = new JFrame ("Demo");
    f.setDefaultCloseOperation (JFrame.EXIT_ON_CLOSE);

    // Add applet to the frame
    f.getContentPane ().add ( applet);
    f.setSize (new Dimension (frame_width,frame_height));
    f.setVisible (true);
  } // main

} // class HistStatsApplet

 

References & Web Resources

Last update: Feb.15.04


Java is a trademark of Sun Microsystems, Inc.