To provide more information about the distribution of entries in
a histogram, we add additional statistical measures in a subclass
of the Histogram
class. These include
- Mean value
- Error on the mean value
- Standard deviation
- Skewness
- Kurtosis
The HistogramStat
class calculates the statistics from the individual entries rather than
from the bin values, as is done in the Histogram
class. It accumulates the power sums while the histogram is being
filled and converts them to central moments when requested
by a call to the getStats()
method, which overrides the method from Histogram.
From the central moments the statistical values above can be calculated
(see the references below or standard statistics
books for derivations.)
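For reference, the conversion that the getDataStats () listing performs can be written out as follows. The symbols S_k (power sums), m_k (raw moments), and \mu_k (central moments) are notation introduced here for illustration; they correspond to fMoments[k], m2..m4, and cm2..cm4 in the code.

```latex
\begin{align*}
S_k &= \sum_{i=1}^{n} x_i^{k}, \qquad m_k = \frac{S_k}{n}, \qquad \bar{x} = m_1 \\
\mu_2 &= m_2 - \bar{x}^{2} \\
\mu_3 &= m_3 - 3\,\bar{x}\,m_2 + 2\,\bar{x}^{3} \\
\mu_4 &= m_4 - 4\,\bar{x}\,m_3 + 6\,\bar{x}^{2} m_2 - 3\,\bar{x}^{4} \\
s^{2} &= \frac{n}{n-1}\,\mu_2, \qquad \sigma_{\bar{x}} = \frac{s}{\sqrt{n}} \\
\text{skewness} &= \frac{n^{2}}{(n-1)(n-2)}\,\frac{\mu_3}{s^{3}} \\
\text{kurtosis} &= \frac{n^{2}(n+1)}{(n-1)(n-2)(n-3)}\,\frac{\mu_4}{s^{4}}
                 - \frac{3\,(n-1)^{2}}{(n-2)(n-3)}
\end{align*}
```

The skewness needs at least three entries and the kurtosis at least four, since the denominators vanish otherwise.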
Note: The equations employed here
to calculate the statistical measures from the central moments
derived from the power sums are prone to serious round-off errors,
because they subtract nearly equal large numbers whenever the mean is
large compared to the spread of the data.
This can be overcome with a "two-pass" approach (Press)
in which all the input values are saved in an array and the mean
value is computed from this array. The mean is then used to calculate
the central moments directly in a second pass through the array.
An alternative that doesn't require saving the inputs is a
"robust implementation" that updates the central
moments as each new entry is added.
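The difference between these approaches can be sketched for the variance alone. The class and method names below are hypothetical and not part of the tutorial's code, and the one-pass update shown is Welford's algorithm, which is assumed here to be in the spirit of the "robust implementation" mentioned above; it handles only the second moment, not the third and fourth.

```java
// Sketch only: VarianceSketch and its methods are hypothetical names.
final class VarianceSketch {

    /** Sample variance from power sums (the approach prone to round-off). */
    static double powerSumVariance(double[] data) {
        double n = data.length, s1 = 0.0, s2 = 0.0;
        for (double x : data) {
            s1 += x;
            s2 += x * x;
        }
        double mean = s1 / n;
        return (s2 / n - mean * mean) * n / (n - 1.0);
    }

    /** Two-pass: compute the mean first, then sum squared deviations from it. */
    static double twoPassVariance(double[] data) {
        double n = data.length, sum = 0.0;
        for (double x : data) sum += x;
        double mean = sum / n;
        double ss = 0.0;
        for (double x : data) {
            double d = x - mean;
            ss += d * d;
        }
        return ss / (n - 1.0);
    }

    /** Welford's one-pass update: no stored inputs, no large power sums. */
    static double welfordVariance(double[] data) {
        double mean = 0.0, m2 = 0.0;
        int n = 0;
        for (double x : data) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }
        return m2 / (n - 1.0);
    }

    public static void main(String[] args) {
        // Large offset, small spread; the exact sample variance is 30.
        double[] data = {1.0e9 + 4, 1.0e9 + 7, 1.0e9 + 13, 1.0e9 + 16};
        System.out.println("power sums: " + powerSumVariance(data));
        System.out.println("two-pass:   " + twoPassVariance(data));
        System.out.println("Welford:    " + welfordVariance(data));
    }
}
```

With a large offset and a small spread, the power-sum result depends on how the rounding in the sum of squares happens to fall, whereas the other two methods work with small deviations from the mean.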
Finally, the class offers methods to pack values into the bins
and also to pack the errors for each bin. These will be useful later
when we fit lines to distributions.
Note that in designing such a class, you must decide how to provide
access to the data. Here the getStats()
method returns the whole set of statistical measures in an array
of type double. The class provides constants that indicate the statistic to which
each entry corresponds. One could instead provide a separate getter
method for each statistic.
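The two access styles can be compared in a small sketch. StatsAccessSketch and its members are hypothetical names used only for illustration; the index constants merely imitate the I_MEAN style of HistogramStat.

```java
// Sketch only: a hypothetical holder that offers both access styles.
final class StatsAccessSketch {
    // Index constants in the style of HistogramStat (values chosen for this sketch).
    static final int I_MEAN = 0;
    static final int I_STD_DEV = 1;
    static final int I_NUMSTATS = 2;

    private final double[] stats = new double[I_NUMSTATS];

    StatsAccessSketch(double mean, double stdDev) {
        stats[I_MEAN] = mean;
        stats[I_STD_DEV] = stdDev;
    }

    /** Array style: one call returns everything; the caller indexes with constants. */
    double[] getStats() {
        return stats.clone(); // copy, so callers cannot modify internal state
    }

    /** Getter style: one method per statistic, no index constants needed. */
    double getMean() {
        return stats[I_MEAN];
    }

    double getStdDev() {
        return stats[I_STD_DEV];
    }

    public static void main(String[] args) {
        StatsAccessSketch s = new StatsAccessSketch(2.5, 0.5);
        System.out.println(s.getStats()[I_MEAN] + " " + s.getMean());
    }
}
```

The array style keeps the interface small and lets a caller fetch everything at once; the getter style is more self-documenting and can be extended without changing array indices.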
Similarly, you must choose how to respond to abnormal conditions.
For example, getStats()
will return null
if the histogram is empty or if the statistics keeping has been
turned off. The calling method should therefore check the returned value
to avoid a runtime error from attempting to use a null reference.
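A caller-side check might look like the following sketch. The method statsOrNull is a hypothetical stand-in for a statistics method that returns null when there is nothing to report; it is not the tutorial's class.

```java
// Sketch only: NullCheckSketch and its methods are hypothetical names.
final class NullCheckSketch {

    /** Stand-in for a stats method: returns null if there are no entries. */
    static double[] statsOrNull(double[] entries) {
        if (entries.length == 0) return null;
        double sum = 0.0;
        for (double x : entries) sum += x;
        return new double[] { sum / entries.length };
    }

    /** The caller tests for null before indexing into the result. */
    static String describe(double[] entries) {
        double[] stats = statsOrNull(entries);
        if (stats == null) {
            return "No statistical information available";
        }
        return "Mean value = " + stats[0];
    }

    public static void main(String[] args) {
        System.out.println(describe(new double[] {}));
        System.out.println(describe(new double[] {1.0, 2.0, 3.0}));
    }
}
```

Returning null keeps the method signature simple but places the burden on every caller; throwing an exception or returning an empty array are alternative designs.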
The demonstration program displays an instance of HistogramStat. Clicking on
the Stats button creates a frame that displays
the statistics for the Gaussian distribution of random values.
+ New class:
- HistogramStat: subclass of Histogram that adds various statistical
information about the contents, such as the mean, standard
deviation, etc.
+ Previous classes:
/**
 * This class provides additional statistical
 * measures of the histogram distribution.
**/
public class HistogramStat extends Histogram
{
protected boolean fDoDataStats = true;
protected double [] fBinErrors;
protected double [] fMoments= new double[5];
protected double fMean;
// These constants indicate for each element of the
// array returned from the getStats () method the statistical
// measure to which it corresponds.
public final static int I_MEAN       = 0;
public final static int I_STD_DEV    = 1;
public final static int I_MEAN_ERROR = 2;
public final static int I_SKEWNESS   = 3;
public final static int I_KURTOSIS   = 4;
public final static int I_NUMSTATS   = 5;
/**
 * Constructor
 *
 * @param num_bins number of bins for the histogram.
 * @param lo lowest value of bin.
 * @param hi highest value of bin.
**/
public HistogramStat (int num_bins, double lo, double hi) {
  super (num_bins, lo, hi);
} // ctor
/**
 * Constructor with title and x axis label.
 *
 * @param title title for histogram.
 * @param xLabel label for x axis of histogram.
 * @param num_bins number of bins for the histogram.
 * @param lo lowest value of bin.
 * @param hi highest value of bin.
**/
public HistogramStat (String title, String xLabel,
                      int num_bins,
                      double lo, double hi) {
  super (title, xLabel, num_bins, lo, hi);
} // ctor
/**
 * Provide an array of error values for all bins.
 * @return array of double values.
**/
public double [] getBinErrors () {
  return fBinErrors;
} // getBinErrors
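/**
 * Return the error for a particular bin.
 * @param bin index from 0 to the highest bin index.
 * @return the bin error, or -1.0 if the index is out of
 * range or no errors have been set.
**/
public double getBinError (int bin) {
  if (fBinErrors != null && bin >= 0 && bin < fBinErrors.length)
      return fBinErrors[bin];
  return -1.0;
} // getBinError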
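/**
 * Calculate the error on each bin according to
 * sqrt (bin contents), the std. dev. of a Poisson
 * distribution.
**/
public void makeBinErrors () {
  // Allocate the error array if it does not exist yet.
  if (fBinErrors == null || fBinErrors.length != fNumBins)
      fBinErrors = new double[fNumBins];
  for (int i = 0; i < fNumBins; i++)
      fBinErrors[i] = Math.sqrt (fBins[i]);
} // makeBinErrors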
/**
 * Pack the errors on each bin.
 * @param errors array of error values, one per bin.
 * @return false if array size doesn't match number
 * of bins.
**/
public boolean packErrors (double [] errors) {
  if (errors.length != fNumBins) return false;
  if (fBinErrors == null || fBinErrors.length != fNumBins) {
      fBinErrors = new double[errors.length];
  }
  for (int i = 0; i < fNumBins; i++)
      fBinErrors[i] = errors[i];
  return true;
} // packErrors
/**
 * Add an entry to the histogram, plus calculate
 * power sums (if flag set) from each entry rather than
 * from the bin values.
 * @param x value to add to the histogram.
**/
public void add (double x) {
  // First add entry as usual.
  super.add (x);

  // Then do moments if flag set.
  if (!fDoDataStats)
      return;
  else {
      fMoments[0] += 1;
      fMoments[1] += x;
      double x2 = x * x;
      fMoments[2] += x2;
      fMoments[3] += x2 * x;
      fMoments[4] += x2 * x2;
  }
} // add
/** Clear the histogram bins and the moments. **/
public void clear () {
  super.clear (); // Clear histogram
  int i;
  for (i=0; i < 5; i++) fMoments[i] = 0.0;
  if (fBinErrors != null)
      for (i=0; i < fBinErrors.length; i++) fBinErrors[i] = 0.0;
} // clear
/**
 * Turn on or off the accumulation and calculation
 * of the data statistics.
**/
public void setDataStats (boolean flag) {
  fDoDataStats = flag;
} // setDataStats
/**
 * Get the statistical measures of the distribution calculated
 * from the entry values.
 * @return values in double array correspond to
 * mean, standard deviation, error on the mean, skewness, and
 * kurtosis. If the number of entries is zero or the statistics
 * accumulation is turned off (see setDataStats () method), a null
 * value will return.
**/
public double [] getDataStats () {
  // If stats turned off or no entries, then give up
  if (!fDoDataStats || fMoments[0] == 0) return null;
double [] stats = new double[I_NUMSTATS];
double n = fMoments[0];
// Average value = 1/n * sum[x]
fMean = fMoments[1]/n;
// Use running mean.
stats[0] = fMean;
double mean_sq = fMean * fMean;
// Check on minimum number of entries.
if (n < 2) return stats;
// Convert power sums to central moments
double m2 = fMoments[2]/n;
double cm2 = m2 - fMean * fMean;
double m3 = fMoments[3]/n;
double cm3 = 2.0 * fMean * mean_sq
- 3.0 * fMean * m2 + m3;
double m4 = fMoments[4]/n;
double cm4 = -3.0 * mean_sq * mean_sq
+ 6.0 * mean_sq * m2
-4.0 * fMean * m3 + m4;
// variance = N/(N-1) * cm2
double variance = cm2 * (n/(n - 1.0));

// Std. Deviation s = sqrt (variance)
stats[1] = Math.sqrt (variance);

// Error on mean = s / sqrt (N)
stats[2] = stats[1]/Math.sqrt (n);

// Skewness = n^2/((n-1)(n-2)) * cm3/s^3
stats[3] = (n/((n-1) * (n-2))) * n * cm3/(variance * stats[1]);

// Kurtosis = n^2(n+1)/((n-1)(n-2)(n-3)) * cm4/s^4
//            - 3(n-1)^2/((n-2)(n-3))
double factor1 = (n * (n+1.0))/((n-1.0) * (n-2.0) * (n-3.0));
double factor2 = (3.0 * (n-1.0) * (n-1.0))/((n-2.0) * (n-3.0));
stats[4] = factor1 * cm4 * n/(variance*variance) - factor2;
return stats;
} // getDataStats
} // class HistogramStat
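As a quick numerical check of the arithmetic in getDataStats (), the sketch below repeats the same power-sum conversion in a stand-alone class so that it runs without the Histogram classes. MomentCheck and statsFromPowerSums are hypothetical names, and the sketch assumes at least four entries. For the values 1 through 5 the results should be a mean of 3, a standard deviation of sqrt (2.5), a skewness of 0, and a kurtosis of -1.2.

```java
// Sketch only: a hypothetical stand-alone copy of the getDataStats () arithmetic.
final class MomentCheck {

    /** Returns {mean, std. dev., error on mean, skewness, kurtosis}; needs n >= 4. */
    static double[] statsFromPowerSums(double[] data) {
        // Accumulate the power sums: count, sum x, sum x^2, sum x^3, sum x^4.
        double[] s = new double[5];
        for (double x : data) {
            double x2 = x * x;
            s[0] += 1;
            s[1] += x;
            s[2] += x2;
            s[3] += x2 * x;
            s[4] += x2 * x2;
        }
        double n = s[0];
        double mean = s[1] / n;
        double meanSq = mean * mean;

        // Convert power sums to central moments.
        double m2 = s[2] / n, m3 = s[3] / n, m4 = s[4] / n;
        double cm2 = m2 - meanSq;
        double cm3 = 2.0 * mean * meanSq - 3.0 * mean * m2 + m3;
        double cm4 = -3.0 * meanSq * meanSq + 6.0 * meanSq * m2
                   - 4.0 * mean * m3 + m4;

        double variance = cm2 * n / (n - 1.0);
        double sd = Math.sqrt(variance);
        double skew = n * n / ((n - 1.0) * (n - 2.0)) * cm3 / (variance * sd);
        double factor1 = n * (n + 1.0) / ((n - 1.0) * (n - 2.0) * (n - 3.0));
        double factor2 = 3.0 * (n - 1.0) * (n - 1.0) / ((n - 2.0) * (n - 3.0));
        double kurt = factor1 * cm4 * n / (variance * variance) - factor2;
        return new double[] {mean, sd, sd / Math.sqrt(n), skew, kurt};
    }

    public static void main(String[] args) {
        double[] stats = statsFromPowerSums(new double[] {1, 2, 3, 4, 5});
        for (double v : stats) System.out.println(v);
    }
}
```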
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
/**
 * This program will run as an applet inside
 * an application frame.
 *
 * The applet uses the HistPanel to display the contents of
 * an instance of Histogram. HistFormat is used by HistPanel to
 * format the scale values.
 *
 * Includes a "Go" button to add random values from a Gaussian
 * distribution to the histogram. The number of values is taken from
 * the entry in a JTextField. The "Clear" button clears the histogram.
 * In standalone mode, the Exit button closes the program.
**/
public class HistStatsApplet extends JApplet
             implements ActionListener
{
// Use the HistPanel JPanel subclass here
HistPanel fOutputPanel;
HistogramStat fHistogram;
int fNumDataPoints = 100;
// A text field for input strings
JTextField fTextField;
// Flag for whether the applet is in a browser
// or running via the main () below.
boolean fInBrowser = true;
JButton fGoButton;
JButton fStatsButton;
JButton fClearButton;
JButton fExitButton;
/**
 * Create a User Interface with a histogram and a Go button
 * to initiate processing and a Clear button to clear the
 * histogram. In application mode, the Exit button stops the
 * program. Add a Stats button to open a frame window to show
 * statistical measures.
**/
public void init () {
  Container content_pane = getContentPane ();
  JPanel panel = new JPanel (new BorderLayout ());

  // Create a histogram with a Gaussian distribution.
  makeHist ();

  // JPanel subclass here.
  fOutputPanel = new HistPanel (fHistogram);
  panel.add (fOutputPanel,"Center");

  // Use a textfield for an input
  fTextField =
    new JTextField (Integer.toString (fNumDataPoints), 10);

  // If return hit after entering text, the
  // actionPerformed will be invoked.
  fTextField.addActionListener (this);

  fGoButton = new JButton ("Go");
  fGoButton.addActionListener (this);
  fStatsButton = new JButton ("Stats");
  fStatsButton.addActionListener (this);
  fClearButton = new JButton ("Clear");
  fClearButton.addActionListener (this);
  fExitButton = new JButton ("Exit");
  fExitButton.addActionListener (this);

  JPanel fControlPanel = new JPanel ();
  fControlPanel.add (fTextField);
  fControlPanel.add (fGoButton);
  fControlPanel.add (fStatsButton);
  fControlPanel.add (fClearButton);
  fControlPanel.add (fExitButton);

  // The Exit button is only useful in standalone mode.
  if (fInBrowser) fExitButton.setEnabled (false);
  panel.add (fControlPanel,"South");

  // Add the panel to the content pane.
  content_pane.add (panel);
} // init
public void actionPerformed (ActionEvent e)
{
  Object source = e.getSource ();
  if (source == fGoButton || source == fTextField) {
      String strNumDataPoints = fTextField.getText ();
      try {
          fNumDataPoints = Integer.parseInt (strNumDataPoints);
      }
      catch (NumberFormatException ex) {
          // Could open an error dialog here but just
          // display a message on the browser status line.
          showStatus ("Bad input value");
          return;
      }
      // Add the new values to the histogram and redraw.
      makeHist ();
      repaint ();
  } else if (source == fStatsButton) {
      displayStats ();
  } else if (source == fClearButton) {
      fHistogram.clear ();
      repaint ();
  } else if (!fInBrowser) {
      System.exit (0);
  }
} // actionPerformed
/** Create a frame to display the distribution
statistics. **/
void displayStats () {
  JFrame frame =
      new JFrame ("Histogram Distributions Statistics");

  // Dispose of the frame when it is closed.
  frame.setDefaultCloseOperation (JFrame.DISPOSE_ON_CLOSE);

  JTextArea area = new JTextArea ();

  double [] stats = fHistogram.getDataStats ();
if (stats != null) {
area.append ("Number entries = " + fHistogram.getTotal () + "\n");
String stat = PlotFormat.getFormatted
area.append ("Mean value = " + stat + " ");
stat = PlotFormat.getFormatted
area.append (" +/- "+stat+"\n");
stat = PlotFormat.getFormatted
area.append ("Std. Dev. = " + stat + "\n");
stat = PlotFormat.getFormatted
area.append ("Skewness = " + stat + "\n");
stat = PlotFormat.getFormatted
area.append ("Kurtosis = " + stat + "\n");
} else {
    area.setText ("No statistical information available");
}

frame.getContentPane ().add (area);
frame.setSize (200,200);
frame.setVisible (true);
} // displayStats
void makeHist () {
  // Create an instance of the Random class for
  // producing our random values.
  java.util.Random r = new java.util.Random ();

  // The method nextGaussian in the class Random produces a value
  // centered at 0.0 with a standard deviation
  // of 1.0.

  // Create an instance of our basic histogram class.
  // Make it wide enough to include most of the
  // Gaussian values.
if (fHistogram == null)
HistogramStat ("Gaussian Distribution with Statistics",
  // Fill histogram with Gaussian values.
  for (int i=0; i < fNumDataPoints; i++) {
      double val = r.nextGaussian ();
      fHistogram.add (val);
  }
} // makeHist
public static void main (String[] args) {
int frame_width=450;
int frame_height=300;
// Create the applet
HistStatsApplet applet = new HistStatsApplet ();
applet.fInBrowser = false;
applet.init ();
// Close the window and exit the program when the frame is closed.
JFrame f = new JFrame ("Demo");
f.setDefaultCloseOperation (JFrame.EXIT_ON_CLOSE);
// Add applet to the frame
f.getContentPane ().add ( applet);
f.setSize (new Dimension (frame_width,frame_height));
f.setVisible (true);
} // main
} // class HistStatsApplet
