Day 11

Math 216: Statistical Thinking

Bastola

Introduction to Continuous Probability Distributions

Continuous random variables are the backbone of modeling real-world phenomena where outcomes can take any value within a range. Unlike discrete variables, which deal with countable outcomes, continuous variables are described by smooth curves called Probability Density Functions (PDFs).

Key Features of PDFs

  • Area as Probability: The probability that \(X\) falls between \(a\) and \(b\) is the area under the PDF curve between these points.
  • Non-negativity: \(f(x) \geq 0\) for all \(x\).
  • Total Probability: The integral of \(f(x)\) over its entire range equals 1, representing certainty.
  • Zero Probability at a Point: For continuous variables, \(P(X = a) = 0\) for any specific value \(a\). Probabilities are only meaningful over intervals.

Visualizing Continuous Distributions

  • Interpreting the Curve:
    The shape of the PDF reflects how probabilities are distributed. A tall, narrow peak indicates values are concentrated around a specific point, while a flat curve suggests more variability.

  • Calculating Probabilities:
    To find \(P(a \leq X \leq b)\), compute the area under the curve between \(a\) and \(b\). This is the essence of continuous probability!

The Uniform Distribution: Simplicity with Power

Definition

The uniform distribution models scenarios where every outcome in a range is equally likely. It’s the simplest continuous distribution, yet it’s incredibly useful in practice.

  • PDF: \(f(x) = \frac{1}{b-a}\) for \(x \in [a, b]\).
  • Graphical Representation: A flat rectangle, reflecting equal probability density across the range.

Visualizing and Calculating Probabilities

Why Use the Uniform Distribution?

  • Equal Likelihood:
    It’s ideal for modeling fair processes, such as random number generation or selecting a random time within a fixed interval.

  • Easy Calculations:
    Probabilities are straightforward to compute. For example, \(P(a \leq X \leq b) = (b-a) \times \frac{1}{d-c}\).

Statistical Measures of \(Unif(a,b)\)

  • Mean and Median:
    Both are located at the center of the interval: \(\mu = \frac{a+b}{2}\).

  • Variance:
    The spread of the distribution is \(\sigma^2 = \frac{(b-a)^2}{12}\). A wider range leads to greater variability.

The Normal Distribution

The Normal Distribution is a continuous probability distribution that is symmetrical around its mean, represented by \(\mu\). This distribution is crucial in statistics and is often used to represent real-world variables.

\[ f(x)=\frac{1}{\sigma \sqrt{2 \pi}} \exp \left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right) \]

  • \(\mu=\) Mean (average, expected value)
  • \(\sigma=\) Standard deviation
  • \(\pi \approx 3.141592653589\)
  • \(e \approx 2.71828\)

Applications of the Normal Distribution

Many real-world phenomena are well approximated by the normal distribution. For example:

  • Measurement errors, e.g., blood pressure readings.
  • Yearly rainfall amounts in certain regions.

These applications underscore the normal distribution’s role in statistical inference and various practical scenarios.

Applications of the Normal Distribution

To determine if data approximates a normal distribution, one can compare the sample’s frequency distribution against the theoretical normal curve:

  • Graphical methods: Q-Q plots, histograms.
  • Statistical tests: Shapiro-Wilk test, Anderson-Darling test.

The Standard Normal Distribution

A special case of the normal distribution is the standard normal distribution, where \(\mu=0\) and \(\sigma=1\). It is used extensively to simplify problems in statistics.

\[ f(z)=\frac{1}{\sqrt{2 \pi}} \exp \left(-\frac{1}{2} z^2\right) \]

  • Denoted as \(Z\) for a standard normal variable.

The Standard Normal Distribution

The probability that a normal random variable falls between two values is the area under the curve between those values. This area can be computed using:

  • Z-tables for the standard normal distribution.
  • Software functions, e.g., pnorm in R