To provide more information about the distribution of entries in
a histogram, we add additional statistical measures in a subclass
of the Histogram class. These include:
- Mean value
- Error on the mean value
- Standard deviation
- Skewness
- Kurtosis
The HistogramStat class calculates the statistics from the individual
entries rather than from the bin values, as is done in the Histogram
class. It does this by accumulating the power sums during the histogram
filling and then converting these to central moments when requested
by a call to the getStats() method, which overrides the method from
Histogram. From the central moments the statistical values above can
be calculated (see the references below or standard statistics
books for derivations).
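For orientation, the conversion can be written out as follows. This is a summary in notation chosen here (S_k for the power sums, m_k for the raw moments, c_k for the central moments); the expressions are meant to mirror the arithmetic in the getDataStats() listing further down rather than to replace the derivations in the references.

```latex
\begin{align*}
S_k &= \sum_{i=1}^{n} x_i^k, \qquad m_k = \frac{S_k}{n}, \qquad \bar{x} = m_1 \\
c_2 &= m_2 - \bar{x}^2 \\
c_3 &= m_3 - 3\bar{x}\,m_2 + 2\bar{x}^3 \\
c_4 &= m_4 - 4\bar{x}\,m_3 + 6\bar{x}^2 m_2 - 3\bar{x}^4 \\
s^2 &= \frac{n}{n-1}\,c_2, \qquad \sigma_{\bar{x}} = \frac{s}{\sqrt{n}} \\
\text{skewness} &= \frac{n^2}{(n-1)(n-2)}\,\frac{c_3}{s^3} \\
\text{kurtosis} &= \frac{n(n+1)}{(n-1)(n-2)(n-3)}\,\frac{n\,c_4}{s^4} - \frac{3(n-1)^2}{(n-2)(n-3)}
\end{align*}
```

The skewness needs at least three entries and the kurtosis at least four, since the denominators vanish otherwise.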
Note: The equations employed here
to calculate the statistical measures using the central moments
derived from the power sums are prone to serious round-off errors,
particularly when the mean is large compared to the spread of the
values, since the central moments are then differences of large,
nearly equal numbers.
This can be overcome with a "two-pass" approach (Press)
in which all the input values are saved in an array and the mean
value is then computed from this array. The mean is used to calculate
the central moments directly in another pass through the array.
An approach that doesn't require saving the inputs is given by
Bisset. This "robust implementation" calculates new central
moments as each new entry is added.
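The difference between these approaches can be tried out with a small sketch like the one below, limited to the variance for brevity. The class and method names are hypothetical, and the running update shown is the well-known Welford recurrence, used here as a stand-in for a one-pass method; it is not necessarily the exact form given by Bisset.

```java
/** Sketch comparing three ways to compute a sample variance. */
final class VarianceSketch {

  /** Power-sum formula: one pass, but subtracts two large, nearly equal numbers. */
  static double powerSumVariance(double[] x) {
    double n = x.length, s1 = 0.0, s2 = 0.0;
    for (double v : x) { s1 += v; s2 += v * v; }
    double mean = s1 / n;
    return (s2 / n - mean * mean) * n / (n - 1.0);
  }

  /** Two-pass approach: first the mean, then the squared deviations from it. */
  static double twoPassVariance(double[] x) {
    double n = x.length, sum = 0.0;
    for (double v : x) sum += v;
    double mean = sum / n;
    double ss = 0.0;
    for (double v : x) { double d = v - mean; ss += d * d; }
    return ss / (n - 1.0);
  }

  /** One-pass running update (Welford recurrence); the inputs are not stored. */
  static double runningVariance(double[] x) {
    double mean = 0.0, m2 = 0.0;
    long n = 0;
    for (double v : x) {
      n++;
      double delta = v - mean;
      mean += delta / n;          // updated running mean
      m2 += delta * (v - mean);   // running sum of squared deviations
    }
    return m2 / (n - 1.0);
  }

  public static void main(String[] args) {
    // Large mean, small spread: the hard case for the power sums.
    double[] data = {1.0e9 + 4, 1.0e9 + 7, 1.0e9 + 13, 1.0e9 + 16};
    System.out.println("power sums: " + powerSumVariance(data));
    System.out.println("two-pass:   " + twoPassVariance(data));
    System.out.println("running:    " + runningVariance(data));
  }
}
```

The exact sample variance of this data set is 30. The two-pass and running versions should reproduce it, while the power-sum version can be visibly off because the sum of squares is of order 10^18, beyond the precision of a double.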
Finally, the class offers methods to pack values into the bins
and also to pack the errors for each bin. These will be useful later
when we fit lines to distributions.
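As a rough illustration of what the per-bin errors look like, the sketch below computes the Poisson estimate, the square root of the bin contents, for an array of bin counts. The class name is hypothetical and the bins are assumed to be an int array; it only mirrors the idea behind makeBinErrors() in the listing.

```java
/** Sketch: Poisson error estimate for each bin of a histogram. */
final class BinErrorSketch {

  /** The error on a bin holding N entries is estimated as sqrt(N). */
  static double[] poissonErrors(int[] bins) {
    double[] errors = new double[bins.length];
    for (int i = 0; i < bins.length; i++) {
      errors[i] = Math.sqrt(bins[i]);
    }
    return errors;
  }
}
```

An array produced this way could then be handed to a method such as packErrors(), which copies it only if its length matches the number of bins.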
Note that in designing such a class, you must decide how to provide
access to the data. Here the getStats()
method provides a whole set of statistical measures via an array of
type double. The class provides constants that indicate the statistic
to which each entry corresponds. One could instead provide a separate
getter method for each statistic.
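The alternative design with one getter per statistic might look like the sketch below. StatsSummary is a hypothetical class, not part of the course code; it wraps a five-element array whose index layout is assumed to follow the constants in the listing (0 mean, 1 standard deviation, 2 error on the mean, 3 skewness, 4 kurtosis).

```java
/** Hypothetical read-only view offering one getter per statistic. */
final class StatsSummary {
  private final double[] fStats;

  /** @param stats five values in the index order assumed above. */
  StatsSummary(double[] stats) {
    if (stats == null || stats.length < 5) {
      throw new IllegalArgumentException("expected five statistics");
    }
    fStats = stats.clone(); // defensive copy so later changes do not leak in
  }

  double getMean()      { return fStats[0]; }
  double getStdDev()    { return fStats[1]; }
  double getMeanError() { return fStats[2]; }
  double getSkewness()  { return fStats[3]; }
  double getKurtosis()  { return fStats[4]; }
}
```

The getters make call sites easier to read, at the cost of a larger interface. The array form keeps the interface small and lets a caller loop over all the measures.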
Similarly, you must choose how to respond to abnormal conditions.
For example, getStats()
will return null
if the histogram is empty or if the statistics keeping has been
turned off. The calling method should therefore check the returned
value before using it, to avoid a runtime error (a NullPointerException)
from dereferencing a null reference.
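A caller-side check could be sketched as follows. StatsReport and its methods are hypothetical names; the index layout (0 for the mean, 2 for the error on the mean) is assumed to follow the constants in the listing. The second method shows one possible alternative, wrapping the result in java.util.Optional so that the empty case is explicit in the return type.

```java
/** Sketch of handling a statistics array that may be null. */
final class StatsReport {

  /** Checks for null before touching the array. */
  static String describe(double[] stats) {
    if (stats == null) {
      return "No statistical information available";
    }
    return "Mean value = " + stats[0] + " +/- " + stats[2];
  }

  /** Alternative: make "no statistics" explicit instead of returning null. */
  static java.util.Optional<double[]> asOptional(double[] stats) {
    return java.util.Optional.ofNullable(stats);
  }
}
```

Returning null keeps the class simple but relies on every caller remembering the check. Throwing an exception or returning an Optional moves that burden into the method's signature.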
HistStatsApplet.java - Displays an instance of HistogramStat. Clicking on
the Stats button creates a frame that displays
the stats for the Gaussian distribution of random values.
+ New class:
HistogramStat.java - subclass of Histogram that adds various statistical
information about the contents, such as the mean, standard
deviation, etc.
+ Previous classes:
Chapter 6:Tech: Histogram.java, HistPanel.java
Chapter 6:Tech: PlotPanel.java, PlotFormat.java
/**
 * This class provides additional statistical
 * measures of the histogram distribution.
 **/
public class HistogramStat extends Histogram
{
  // Flag to turn the accumulation of the data statistics on or off.
  protected boolean fDoDataStats = true;
  // Error value for each bin.
  protected double [] fBinErrors;
  // Power sums of the entries: fMoments[k] holds the sum of x^k, k = 0..4.
  protected double [] fMoments = new double[5];
  protected double fMean;

  // These constants indicate for each element of the
  // array returned from the getDataStats () method the statistical
  // measure to which it corresponds.
  public final static int I_MEAN       = 0;
  public final static int I_STD_DEV    = 1;
  public final static int I_MEAN_ERROR = 2;
  public final static int I_SKEWNESS   = 3;
  public final static int I_KURTOSIS   = 4;
  public final static int I_NUMSTATS   = 5;

  /**
   * Constructor
   * @param num_bins number of bins for the histogram.
   * @param lo lowest value of bin.
   * @param hi highest value of bin.
   */
  public HistogramStat (int num_bins, double lo, double hi) {
    super (num_bins,lo,hi);
  } // ctor

  /**
   * Constructor with title and x axis label.
   *
   * @param title title for histogram
   * @param xLabel label for x axis of histogram.
   * @param num_bins number of bins for the histogram.
   * @param lo lowest value of bin.
   * @param hi highest value of bin.
   */
  public HistogramStat (String title, String xLabel,
                        int num_bins, double lo, double hi) {
    super (title, xLabel, num_bins,lo,hi);
  } // ctor
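
  /**
   * Provide an array of error values for all bins.
   * @return array of double values, or null if no errors have been set.
   **/
  public double [] getBinErrors () {
    return fBinErrors;
  }

  /**
   * Return the error for a particular bin.
   * @param bin from 0 to highest bin value
   * @return error for the bin, or -1.0 if the bin is out of range
   *         or no errors have been set yet.
   **/
  public double getBinError (int bin) {
    // Guard against a missing error array as well as a bad index.
    if (fBinErrors != null && bin >= 0 && bin < fBinErrors.length)
      return fBinErrors[bin];
    else
      return -1.0;
  } // getBinError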
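
  /**
   * Calculate the error on each bin according to the
   * sqrt (bin contents) from std. dev. of Poisson distribution.
   **/
  public void makeBinErrors () {
    // Allocate the error array on first use; it is not created elsewhere.
    if (fBinErrors == null || fBinErrors.length != fNumBins) {
      fBinErrors = new double[fNumBins];
    }
    for (int i = 0; i < fNumBins; i++) {
      fBinErrors[i] = Math.sqrt (fBins[i]);
    }
  }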

  /**
   * Pack the errors on each bin.
   * @param errors array of error values, one per bin.
   * @return false if array size doesn't match number of bins.
   **/
  public boolean packErrors (double [] errors) {
    if (errors == null || errors.length != fNumBins) return false;
    if (fBinErrors == null || fBinErrors.length != fNumBins) {
      fBinErrors = new double[errors.length];
    }
    for (int i = 0; i < fNumBins; i++) {
      fBinErrors[i] = errors[i];
    }
    return true;
  } // packErrors

  /**
   * Add an entry to the histogram, plus calculate the
   * power sums (if flag set) for each entry rather than
   * from the bin values.
   *
   * @param x value of the entry to add.
   **/
  public void add (double x) {
    // First add entry as usual.
    super.add (x);

    // Then do moments if flag set.
    if (!fDoDataStats) return;

    fMoments[0] += 1;
    fMoments[1] += x;
    double x2 = x * x;
    fMoments[2] += x2;
    fMoments[3] += x2 * x;
    fMoments[4] += x2 * x2;
  } // add

  /** Clear the histogram bins and the moments. **/
  public void clear () {
    super.clear (); // Clear histogram arrays
    for (int i = 0; i < fMoments.length; i++) fMoments[i] = 0.0;
    if (fBinErrors != null)
      for (int i = 0; i < fBinErrors.length; i++) fBinErrors[i] = 0.0;
  } // clear

  /**
   * Turn on or off the accumulation and calculation of
   * the data statistics.
   **/
  public void setDataStats (boolean flag) {
    fDoDataStats = flag;
  }

  /**
   * Get the statistical measures of the distribution calculated
   * from the entry values.
   * @return values in double array correspond to
   * mean, standard deviation, error on the mean, skewness, and
   * kurtosis. If the number of entries is zero or the statistics
   * accumulation is turned off (see setDataStats () method), a null
   * value will return. Measures that need more entries than are
   * available (2 for the standard deviation, 3 for the skewness,
   * 4 for the kurtosis) are left at 0.
   **/
  public double [] getDataStats () {
    // If stats turned off or no entries, then give up
    if (!fDoDataStats || fMoments[0] == 0) return null;

    double [] stats = new double[I_NUMSTATS];
    double n = fMoments[0];

    // Average value = 1/n * sum[x]
    fMean = fMoments[1]/n;
    stats[I_MEAN] = fMean;
    double mean_sq = fMean * fMean;

    // Check on minimum number of entries.
    if (n < 2) return stats;

    // Convert power sums to central moments
    double m2 = fMoments[2]/n;
    double cm2 = m2 - mean_sq;
    double m3 = fMoments[3]/n;
    double cm3 = 2.0 * fMean * mean_sq - 3.0 * fMean * m2 + m3;
    double m4 = fMoments[4]/n;
    double cm4 = -3.0 * mean_sq * mean_sq + 6.0 * mean_sq * m2
                 - 4.0 * fMean * m3 + m4;

    // variance = N/ (N-1) cm2
    double variance = cm2 * (n/ (n-1.0));
    // Std. Deviation s = sqrt (variance)
    stats[I_STD_DEV] = Math.sqrt (variance);
    // Error on mean = s / sqrt (N)
    stats[I_MEAN_ERROR] = stats[I_STD_DEV]/Math.sqrt (n);

    // Skewness needs at least 3 entries and a non-zero variance.
    if (n < 3 || variance <= 0.0) return stats;
    // Skewness = n^2/ (n-1) (n-2) * cm3/s^3
    stats[I_SKEWNESS] = (n/ ( (n-1.0) * (n-2.0))) * n * cm3
                        /(variance * stats[I_STD_DEV]);

    // Kurtosis needs at least 4 entries.
    if (n < 4) return stats;
    // Kurtosis = n(n+1)/(n-1)(n-2)(n-3) * n * cm4/s^4 - 3(n-1)^2/(n-2)(n-3)
    double factor1 = ( n * (n+1.0))/ ( (n-1.0) * (n-2.0) * (n-3.0) );
    double factor2 = ( 3.0 * (n-1.0) * (n-1.0) )/ ( (n-2.0) * (n-3.0));
    stats[I_KURTOSIS] = factor1 * cm4 * n/ (variance*variance) - factor2;

    return stats;
  } // getDataStats
} // class HistogramStat
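Since HistogramStat depends on the Histogram base class, which is not reproduced here, the moment arithmetic can be tried in isolation with a sketch like the following. PowerSumStats is a hypothetical stand-alone class that repeats the same power-sum bookkeeping and conversion formulas without any binning; it is meant as a way to check the numbers by hand, not as a replacement for the class above.

```java
/** Stand-alone sketch of the power-sum statistics, without the Histogram base class. */
final class PowerSumStats {
  private final double[] fSums = new double[5]; // sums of x^0 .. x^4

  /** Accumulate the power sums for one entry. */
  void add(double x) {
    double x2 = x * x;
    fSums[0] += 1;
    fSums[1] += x;
    fSums[2] += x2;
    fSums[3] += x2 * x;
    fSums[4] += x2 * x2;
  }

  /** Returns {mean, std dev, error on mean, skewness, kurtosis}; needs n >= 4. */
  double[] stats() {
    double n = fSums[0];
    if (n < 4) throw new IllegalStateException("need at least four entries");
    double mean = fSums[1] / n, m2 = fSums[2] / n, m3 = fSums[3] / n, m4 = fSums[4] / n;
    double meanSq = mean * mean;
    // Central moments from the raw moments.
    double cm2 = m2 - meanSq;
    double cm3 = m3 - 3.0 * mean * m2 + 2.0 * mean * meanSq;
    double cm4 = m4 - 4.0 * mean * m3 + 6.0 * meanSq * m2 - 3.0 * meanSq * meanSq;
    double variance = cm2 * n / (n - 1.0);
    double s = Math.sqrt(variance);
    double skew = n * n / ((n - 1.0) * (n - 2.0)) * cm3 / (variance * s);
    double kurt = n * (n + 1.0) / ((n - 1.0) * (n - 2.0) * (n - 3.0))
                    * n * cm4 / (variance * variance)
                - 3.0 * (n - 1.0) * (n - 1.0) / ((n - 2.0) * (n - 3.0));
    return new double[] {mean, s, s / Math.sqrt(n), skew, kurt};
  }
}
```

For the five entries 1, 2, 3, 4, 5 the formulas give a mean of 3, a variance of 2.5, a skewness of 0 and a kurtosis of -1.2, which is easy to confirm by hand from the deviations -2, -1, 0, 1, 2.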
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;

/**
 * This program will run as an applet inside
 * an application frame.
 *
 * The applet uses the HistPanel to display contents of
 * an instance of Histogram. HistFormat is used by HistPanel to
 * format the scale values.
 *
 * Includes "Go" button to add random values from a Gaussian
 * distribution to the histogram. The number of values is taken from
 * the entry in a JTextField. "Stats" button opens a frame with the
 * statistical measures. "Clear" button clears the histogram.
 * In standalone mode, the Exit button closes the program.
 *
 **/
public class HistStatsApplet extends JApplet
                             implements ActionListener
{
  // Use the HistPanel JPanel subclass here
  HistPanel fOutputPanel;

  HistogramStat fHistogram;
  int fNumDataPoints = 100;

  // A text field for input strings
  JTextField fTextField;

  // Flag for whether the applet is in a browser
  // or running via the main () below.
  boolean fInBrowser = true;

  // Buttons
  JButton fGoButton;
  JButton fStatsButton;
  JButton fClearButton;
  JButton fExitButton;

  /**
   * Create a User Interface with a histogram and a Go button
   * to initiate processing and a Clear button to clear the
   * histogram. In application mode, the Exit button stops the
   * program. Add a stats button to open a frame window to show
   * statistical measures.
   **/
  public void init () {
    Container content_pane = getContentPane ();

    JPanel panel = new JPanel (new BorderLayout ());

    // Create a histogram with Gaussian distribution.
    makeHist ();

    // JPanel subclass here.
    fOutputPanel = new HistPanel (fHistogram);
    panel.add (fOutputPanel,"Center");

    // Use a textfield for an input parameter.
    fTextField =
      new JTextField (Integer.toString (fNumDataPoints), 10);

    // If return hit after entering text, the
    // actionPerformed will be invoked.
    fTextField.addActionListener (this);

    fGoButton = new JButton ("Go");
    fGoButton.addActionListener (this);

    fStatsButton = new JButton ("Stats");
    fStatsButton.addActionListener (this);

    fClearButton = new JButton ("Clear");
    fClearButton.addActionListener (this);

    fExitButton = new JButton ("Exit");
    fExitButton.addActionListener (this);

    JPanel fControlPanel = new JPanel ();
    fControlPanel.add (fTextField);
    fControlPanel.add (fGoButton);
    fControlPanel.add (fStatsButton);
    fControlPanel.add (fClearButton);
    fControlPanel.add (fExitButton);

    if (fInBrowser) fExitButton.setEnabled (false);

    panel.add (fControlPanel,"South");

    // Add the panel with the histogram and controls to the contentPane.
    content_pane.add (panel);
  } // init

  public void actionPerformed (ActionEvent e) {
    Object source = e.getSource ();
    if (source == fGoButton || source == fTextField) {
      String strNumDataPoints = fTextField.getText ();
      try {
        fNumDataPoints = Integer.parseInt (strNumDataPoints);
      }
      catch (NumberFormatException ex) {
        // Could open an error dialog here but just
        // display a message on the browser status line.
        // Outside a browser there is no applet context, so
        // write the message to the console instead.
        if (fInBrowser)
          showStatus ("Bad input value");
        else
          System.err.println ("Bad input value");
        return;
      }
      makeHist ();
      repaint ();
    } else if (source == fStatsButton) {
      displayStats ();
    } else if (source == fClearButton) {
      fHistogram.clear ();
      repaint ();
    } else if (!fInBrowser)
      System.exit (0);
  } // actionPerformed

  /** Create a frame to display the distribution statistics. **/
  void displayStats () {
    JFrame frame =
      new JFrame ("Histogram Distributions Statistics");

    // Dispose of the frame when its window is closed
    frame.setDefaultCloseOperation (JFrame.DISPOSE_ON_CLOSE);

    JTextArea area = new JTextArea ();

    double [] stats = fHistogram.getDataStats ();
    if (stats != null) {
      area.append ("Number entries = "+fHistogram.getTotal ()+"\n");

      String stat = PlotFormat.getFormatted (
                      stats[HistogramStat.I_MEAN],
                      1000.0,0.001,3);
      area.append ("Mean value = "+ stat +" ");

      stat = PlotFormat.getFormatted (
               stats[HistogramStat.I_MEAN_ERROR],
               1000.0,0.001,3);
      area.append (" +/- "+stat+"\n");

      stat = PlotFormat.getFormatted (
               stats[HistogramStat.I_STD_DEV],
               1000.0,0.001,3);
      area.append ("Std. Dev. = "+stat+"\n");

      stat = PlotFormat.getFormatted (
               stats[HistogramStat.I_SKEWNESS],
               1000.0,0.001,3);
      area.append ("Skewness = "+stat+"\n");

      stat = PlotFormat.getFormatted (
               stats[HistogramStat.I_KURTOSIS],
               1000.0,0.001,3);
      area.append ("Kurtosis = "+stat+"\n");
    } else {
      area.setText ("No statistical information available");
    }

    frame.getContentPane ().add (area);
    frame.setSize (200,200);
    frame.setVisible (true);
  } // displayStats

  void makeHist () {
    // Create an instance of the Random class for
    // producing our random values.
    java.util.Random r = new java.util.Random ();

    // The method nextGaussian in the class Random produces a value
    // from a distribution centered at 0.0 with a standard deviation
    // of 1.0.

    // Create an instance of our histogram class.
    // Make it wide enough to include most of the
    // gaussian values.
    if (fHistogram == null)
      fHistogram =
        new HistogramStat ("Gaussian Distribution with Statistics",
                           "random values",
                           20,-3.0,3.0);

    // Fill histogram with Gaussian distribution
    for (int i=0; i < fNumDataPoints; i++) {
      double val = r.nextGaussian ();
      fHistogram.add (val);
    }
  } // makeHist

  public static void main (String[] args) {
    int frame_width=450;
    int frame_height=300;

    // Create the applet
    HistStatsApplet applet = new HistStatsApplet ();
    applet.fInBrowser = false;
    applet.init ();

    // Exit the program when the frame window is closed
    JFrame f = new JFrame ("Demo");
    f.setDefaultCloseOperation (JFrame.EXIT_ON_CLOSE);

    // Add applet to the frame
    f.getContentPane ().add ( applet);
    f.setSize (new Dimension (frame_width,frame_height));
    f.setVisible (true);
  } // main
} // class HistStatsApplet
References & Web Resources
Last update: Feb.15.04