Statistical Demonstrations and Calculators

These are small files written in Microsoft Excel 97, PowerPoint, and Word. They demonstrate common statistical concepts or help you perform certain kinds of calculations. To copy any of them, click the appropriate item below. You will then get a chance to save the compressed file(s) and to specify where on your hard drive to save them. Be sure to put them somewhere you can find them later!

Sample Size Estimation

Sample size estimation is a common problem when designing a study. The pages in this file will help you with studies that compare counts, means, or proportions. You need an idea of what constitutes a clinically important difference in outcomes. For example, if the main outcome variable of your study is systolic blood pressure, is it important to doctors and journal readers to be able to detect an improvement of 1, 2, 5, 10, or 20 Torr? You also need some idea of the variability you are likely to observe in the data you will eventually collect. This is most often obtained from previous studies like yours. For our blood pressure example, you should search for and obtain 3-5 papers like this one, which reports means and standard deviations of systolic blood pressure in its results section [Pesola GR, Pesola HR, Nelson MJ, Westfal RE (January 2001). "The normal difference in bilateral indirect blood pressure recordings in normotensive individuals". American Journal of Emergency Medicine 19 (1): 43–5.] Always obtain statistical consultation on sample size before you begin collecting data.
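As a rough illustration of how the clinically important difference and the expected variability drive the sample size, here is a minimal Python sketch for comparing two means. It uses the common normal-approximation formula n = 2((z_alpha + z_beta) * SD / difference)^2 per group, which is not necessarily the method used in the spreadsheet. The function name `n_per_group`, the standard deviation of 15 Torr, the two-sided alpha of 0.05, and the power of 0.80 are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to compare two means.

    Normal approximation, two-sided test, equal group sizes and equal
    standard deviations assumed. Illustrative only.
    """
    z = NormalDist()                        # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)      # critical value for the test
    z_beta = z.inv_cdf(power)               # value corresponding to the power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


# Hypothetical SD of 15 Torr; in practice take it from published studies.
for difference in (1, 2, 5, 10, 20):
    print(difference, "Torr:", n_per_group(difference, sd=15), "per group")
```

Halving the difference you want to detect roughly quadruples the required sample size, which is why the choice of a clinically important difference matters so much.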

Sample Size Estimation

Statistical Frequency Distributions

Frequency distributions are one of the most important tools used in the analysis of experimental data. Many statistical tests are carried out by choosing a frequency distribution that closely matches the distribution of your observations. The mathematical properties of the matched frequency distribution are then used to calculate the probability of observing data like yours just by chance. Frequency distributions are also used to estimate confidence intervals. Choosing the most appropriate distribution to use to represent your data often involves getting expert help from an experienced statistician.

The frequency distributions most commonly used in biostatistics include the Student’s t, normal, chi-square, binomial, gamma, and F distributions. Many others are available to model certain types of observations. When a continuous frequency distribution is scaled so that the total area under its curve equals one, it is called a "probability density function." (For a discrete distribution such as the binomial, the probabilities of the individual outcomes sum to one instead.) These curves are mathematically complicated, and their values are usually obtained from a table or calculated with a computer.
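The "area equals one" property can be checked numerically. The sketch below uses the standard normal distribution from Python's standard library and a simple trapezoidal rule; the helper name `area_under_pdf` and the integration limits of -8 to 8 are assumptions made for this illustration.

```python
from statistics import NormalDist


def area_under_pdf(dist, lo, hi, steps=10_000):
    """Trapezoidal-rule estimate of the area under dist.pdf from lo to hi."""
    width = (hi - lo) / steps
    total = 0.5 * (dist.pdf(lo) + dist.pdf(hi))
    for i in range(1, steps):
        total += dist.pdf(lo + i * width)
    return total * width


std_normal = NormalDist(mu=0.0, sigma=1.0)

# Nearly all of the area lies between -8 and 8, so this is very close to 1.
print(area_under_pdf(std_normal, -8.0, 8.0))

# A computed value in place of a printed table: P(Z <= 1.96).
print(std_normal.cdf(1.96))
```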

Each distribution has one or more parameters that are used to set its center, shape, degree of asymmetry, and other properties. For example, the parameters of the normal distribution are the mean and the standard deviation. These two parameters uniquely specify its center and spread. (The standard normal distribution is the special case with a mean of 0 and a standard deviation of 1.)
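To see how the two parameters of a normal distribution act, the following sketch compares three normal curves. The means and standard deviations are made-up values, loosely in the range of systolic blood pressure, and the variable names are chosen only for this example.

```python
from statistics import NormalDist

# Hypothetical parameter choices for illustration only.
baseline = NormalDist(mu=120.0, sigma=15.0)
narrower = NormalDist(mu=120.0, sigma=5.0)    # same center, less spread
shifted = NormalDist(mu=140.0, sigma=15.0)    # same spread, new center

for label, dist in (("baseline", baseline),
                    ("narrower", narrower),
                    ("shifted", shifted)):
    peak_height = dist.pdf(dist.mean)         # the curve is highest at its mean
    # Probability of falling within 10 units of the mean.
    within_10 = dist.cdf(dist.mean + 10) - dist.cdf(dist.mean - 10)
    print(f"{label}: center={dist.mean}, peak={peak_height:.4f}, "
          f"P(within 10 of mean)={within_10:.3f}")
```

Changing the mean slides the curve along the axis without changing its shape, while a smaller standard deviation makes the curve taller and narrower.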

This spreadsheet demonstrates several distributions that are commonly used in biostatistics. The parameters are adjustable with ‘scrollbars’, and the graphs of the distributions are redrawn so that you can see the effect of different combinations of parameters. You can also use the spreadsheet in place of a table of the distribution. The values generated are accurate to about 10 decimal places.

Statistical Frequency Distributions

Confidence Intervals

Confidence intervals and hypothesis tests are ways of describing your uncertainty about your findings. Understanding the standard error of the mean and the central limit theorem will help you better understand these concepts.
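The link between the standard error of the mean and a confidence interval can be sketched in a few lines of Python. This example uses a normal (z) critical value, which the central limit theorem justifies for reasonably large samples; for small samples a Student's t critical value gives a somewhat wider interval. The function name `mean_ci` and the readings are invented for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(sample, confidence=0.95):
    """Approximate confidence interval for a mean (normal approximation)."""
    n = len(sample)
    center = mean(sample)
    sem = stdev(sample) / math.sqrt(n)            # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * sem, center + z * sem


# Made-up systolic blood pressure readings, not real data.
readings = [118, 125, 131, 122, 127, 119, 135, 124, 129, 121, 126, 130]
low, high = mean_ci(readings)
print(f"mean = {mean(readings):.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```

Because the standard error shrinks with the square root of the sample size, quadrupling the number of observations roughly halves the width of the interval.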

Confidence Intervals

Multiple Comparisons

Comparing the means or proportions from more than two groups means that there is more than one possible two-way comparison. As the number of groups increases, the number of possible two-way comparisons increases rapidly. Your statistical procedures should be adjusted to take this into account; otherwise you will underestimate your experiment-wise Type I error rate, that is, the chance of declaring at least one difference significant when none exists. Statistical consultation should always be obtained to avoid making this type of mistake in the statistical interpretation of your data.
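A short Python sketch shows how quickly the problem grows. With k groups there are k(k-1)/2 two-way comparisons, and if each is tested at a significance level of 0.05, the chance of at least one false positive is 1 - (1 - 0.05)^m for m comparisons, assuming the tests are independent (a simplification, since pairwise comparisons share data). The Bonferroni correction shown here is one simple adjustment among several; the function names are chosen for this example only.

```python
from math import comb


def pairwise_comparisons(groups):
    """Number of distinct two-way comparisons among the groups."""
    return comb(groups, 2)


def familywise_error(comparisons, alpha=0.05):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** comparisons


def bonferroni_alpha(comparisons, alpha=0.05):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / comparisons


for groups in (2, 3, 5, 10):
    m = pairwise_comparisons(groups)
    print(f"{groups} groups: {m} comparisons, "
          f"unadjusted error rate {familywise_error(m):.3f}, "
          f"Bonferroni per-test alpha {bonferroni_alpha(m):.4f}")
```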

Multiple Comparisons