Day 11

Math 216: Statistical Thinking

Bastola

Introduction to Continuous Probability Distributions

Continuous random variables model real-world quantities that can take any value within a range. Unlike discrete variables, whose outcomes are countable, continuous variables are described by curves called Probability Density Functions (PDFs).

Key Features of PDFs

Area as Probability: The probability that \(X\) falls between \(a\) and \(b\) is the area under the PDF curve between these points.
Non-negativity: \(f(x) \geq 0\) for all \(x\).
Total Probability: The integral of \(f(x)\) over its entire range equals 1, representing certainty.
Zero Probability at a Point: For continuous variables, \(P(X = a) = 0\) for any specific value \(a\). Probabilities are only meaningful over intervals.

Visualizing Continuous Distributions

Interpreting the Curve:
The shape of the PDF shows how probability is distributed. A tall, narrow peak indicates that values are concentrated around a specific point, while a low, flat curve indicates that values are spread over a wider range, i.e., more variability.
Calculating Probabilities:
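To find \(P(a \leq X \leq b)\), compute the area under the curve between \(a\) and \(b\): \(P(a \leq X \leq b) = \int_a^b f(x)\,dx\). Because a single point has zero probability, the result is the same whether or not the endpoints are included.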
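The area-as-probability idea can be checked numerically. The following is a minimal sketch, assuming a hypothetical example density \(f(x) = 2x\) on \([0, 1]\) (not from the lecture) and a simple midpoint rule; the names `area_under_pdf` and `triangular_pdf` are illustrative only.

```python
def area_under_pdf(pdf, a, b, n=100_000):
    """Approximate the area under pdf between a and b with the midpoint rule."""
    width = (b - a) / n
    return sum(pdf(a + (i + 0.5) * width) for i in range(n)) * width


def triangular_pdf(x):
    """Hypothetical example density: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0


total = area_under_pdf(triangular_pdf, 0, 1)      # total probability, close to 1
prob = area_under_pdf(triangular_pdf, 0.5, 1)     # exact value is 1 - 0.25 = 0.75
point = area_under_pdf(triangular_pdf, 0.5, 0.5)  # zero-width interval, so area 0
print(total, prob, point)
```

The three results illustrate the properties listed earlier: the total area is 1, an interval's probability is its area, and a single point has probability 0.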

The Uniform Distribution: Simplicity with Power

Definition

The uniform distribution models scenarios where every outcome in a range is equally likely. It is the simplest continuous distribution, and it is widely used in practice.

PDF: \(f(x) = \frac{1}{b-a}\) for \(x \in [a, b]\), and \(f(x) = 0\) otherwise.
Graphical Representation: A flat rectangle, reflecting equal probability density across the range.

Visualizing and Calculating Probabilities

Why Use the Uniform Distribution?

Equal Likelihood:
It’s ideal for modeling fair processes, such as random number generation or selecting a random time within a fixed interval.
Easy Calculations:
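Probabilities are straightforward to compute. For example, if \(X \sim Unif(a,b)\) and \(a \leq c \leq d \leq b\), then \(P(c \leq X \leq d) = (d-c) \times \frac{1}{b-a}\): the width of the interval times the height of the rectangle.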
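As a rough illustration of the rectangle calculation, the sketch below computes the probability of a subinterval as its length times the constant height of the density. The function name, parameter names, and the bus example are hypothetical; clipping the subinterval to the range is an added convenience, not something stated in the lecture.

```python
def uniform_prob(sub_lo, sub_hi, range_lo, range_hi):
    """P(sub_lo <= X <= sub_hi) for X uniform on [range_lo, range_hi]."""
    if range_hi <= range_lo:
        raise ValueError("the range must satisfy range_lo < range_hi")
    # Only the part of the subinterval inside the range carries probability.
    lo = max(sub_lo, range_lo)
    hi = min(sub_hi, range_hi)
    overlap = max(hi - lo, 0.0)
    # Area of a rectangle: width (overlap) times height 1 / (range length).
    return overlap / (range_hi - range_lo)


# Hypothetical example: a bus arrives at a uniformly random minute in [0, 30].
p_wait = uniform_prob(10, 25, 0, 30)  # 15 / 30 = 0.5
print(p_wait)
```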

Statistical Measures of \(Unif(a,b)\)

Mean and Median:
Both are located at the center of the interval: \(\mu = \frac{a+b}{2}\).
Variance:
The spread of the distribution is \(\sigma^2 = \frac{(b-a)^2}{12}\). A wider range leads to greater variability.
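One way to see where these formulas come from is to compare them with the defining integrals \(\mu = \int x f(x)\,dx\) and \(\sigma^2 = \int (x-\mu)^2 f(x)\,dx\). The sketch below does this numerically for an arbitrarily chosen interval \([2, 8]\); the helper names are illustrative, and the midpoint rule is just one convenient approximation.

```python
def uniform_mean(a, b):
    """Mean (and median) of Unif(a, b): the midpoint of the interval."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Variance of Unif(a, b): squared range divided by 12."""
    return (b - a) ** 2 / 12


def midpoint_integral(g, a, b, n=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return sum(g(a + (i + 0.5) * width) for i in range(n)) * width


a, b = 2.0, 8.0          # arbitrary example interval
height = 1 / (b - a)     # constant density on [a, b]

mu = uniform_mean(a, b)       # 5.0
var = uniform_variance(a, b)  # 36 / 12 = 3.0

# The same quantities from their integral definitions.
mu_numeric = midpoint_integral(lambda x: x * height, a, b)
var_numeric = midpoint_integral(lambda x: (x - mu) ** 2 * height, a, b)
print(mu, mu_numeric, var, var_numeric)
```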

The Normal Distribution

The Normal Distribution is a continuous probability distribution that is symmetrical around its mean, represented by \(\mu\). This distribution is crucial in statistics and is often used to represent real-world variables.

\[ f(x)=\frac{1}{\sigma \sqrt{2 \pi}} \exp \left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right) \]

\(\mu=\) Mean (average, expected value)
\(\sigma=\) Standard deviation
\(\pi \approx 3.141592653589\)
\(e \approx 2.71828\)
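The density formula translates almost directly into code. The sketch below is one possible transcription, compared against `statistics.NormalDist` from the Python standard library; the values \(\mu = 100\) and \(\sigma = 15\) are made up for illustration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Evaluate the normal density f(x) for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 100.0, 15.0

peak = normal_pdf(mu, mu, sigma)      # the curve is highest at the mean
left = normal_pdf(90.0, mu, sigma)    # 10 units below the mean
right = normal_pdf(110.0, mu, sigma)  # 10 units above: same height, by symmetry
library_value = NormalDist(mu, sigma).pdf(110.0)
print(peak, left, right, library_value)
```

Note that these are density values (heights of the curve), not probabilities; probabilities come from areas under the curve.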

Applications of the Normal Distribution

Many real-world phenomena are well approximated by the normal distribution. For example:

Measurement errors, e.g., blood pressure readings.
Yearly rainfall amounts in certain regions.

These applications underscore the normal distribution’s role in statistical inference and various practical scenarios.

Checking for Normality

To determine if data approximates a normal distribution, one can compare the sample’s frequency distribution against the theoretical normal curve:

Graphical methods: Q-Q plots, histograms.
Statistical tests: Shapiro-Wilk test, Anderson-Darling test.
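To make the Q-Q idea concrete, the sketch below computes the points of a normal Q-Q plot using only the Python standard library and summarizes how straight they are with a correlation coefficient. This is an informal illustration, not the Shapiro-Wilk or Anderson-Darling test; the plotting positions \((i + 0.5)/n\), the simulated samples, and all function names are assumptions made for the example.

```python
import random
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(sample)
    observed = sorted(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    # Plotting positions (i + 0.5) / n are one common convention; others exist.
    theoretical = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


def pearson_r(pairs):
    """Correlation of the Q-Q points; values near 1 mean a nearly straight line."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(216)  # fixed seed so the simulated data are reproducible
normal_sample = [rng.gauss(50, 10) for _ in range(500)]     # simulated normal data
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]  # simulated skewed data

r_normal = pearson_r(qq_points(normal_sample))
r_skewed = pearson_r(qq_points(skewed_sample))
print(r_normal, r_skewed)
```

For approximately normal data the Q-Q points fall close to a straight line, so the correlation is near 1; strongly skewed data bend away from the line and give a lower value.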

The Standard Normal Distribution

A special case of the normal distribution is the standard normal distribution, where \(\mu=0\) and \(\sigma=1\). It is used extensively to simplify problems in statistics.

\[ f(z)=\frac{1}{\sqrt{2 \pi}} \exp \left(-\frac{1}{2} z^2\right) \]

Denoted as \(Z\) for a standard normal variable.

The probability that a normal random variable falls between two values is the area under the curve between those values. This area can be computed using:

Z-tables for the standard normal distribution, after standardizing each value with \(z = \frac{x-\mu}{\sigma}\).
Software functions, e.g., pnorm in R.
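As a sketch of the software route, the example below uses `NormalDist.cdf` from the Python standard library, which plays a role similar to `pnorm` in R by returning the area to the left of a value. It computes an interval probability two ways: directly, and by first standardizing to \(Z\). The parameters \(\mu = 100\), \(\sigma = 15\) and the function names are hypothetical.

```python
from statistics import NormalDist


def normal_interval_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X normal with mean mu and standard deviation sigma."""
    dist = NormalDist(mu, sigma)
    # Area to the left of b minus area to the left of a.
    return dist.cdf(b) - dist.cdf(a)


def normal_interval_prob_via_z(a, b, mu, sigma):
    """The same probability, computed by standardizing to Z first."""
    standard = NormalDist()  # mean 0, standard deviation 1
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return standard.cdf(z_b) - standard.cdf(z_a)


# Hypothetical example: probability of falling within one sigma of the mean.
p_direct = normal_interval_prob(85, 115, mu=100, sigma=15)
p_via_z = normal_interval_prob_via_z(85, 115, mu=100, sigma=15)
print(p_direct, p_via_z)  # both roughly 0.68
```

In R, the analogous direct calculation would be `pnorm(115, 100, 15) - pnorm(85, 100, 15)`.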