Continuous random variables: probability density functions, expectation, variance

Continuous Random Variables – Probability & Statistics 2 (S2)

1. Quick Review of Prerequisite Material

1.1 Representation of Data (Syllabus 5.1)

  • Stem‑and‑leaf plot: splits each observation into a “stem” (all but the last digit) and a “leaf” (the last digit). Example for the data 12, 13, 14, 21, 22:
    1 | 2 3 4
    2 | 1 2

  • Box‑plot (five‑number summary):
    • Minimum, Q₁, Median, Q₃, Maximum.
    • Shows centre (median) and spread (IQR = Q₃ − Q₁).
  • Histogram:
    • Class intervals on the horizontal axis, frequency density (frequency ÷ class width) on the vertical axis.
    • Area of each bar is then proportional to the number of observations in that class.
  • Cumulative frequency / ogive: plots cumulative proportion against the upper class boundary; useful for estimating percentiles.

1.2 Permutations & Combinations (Syllabus 5.2)

Counting formulas that appear in discrete‑distribution questions:

  • Combinations (unordered selections): \(\displaystyle \binom{n}{r}= \frac{n!}{r!(n-r)!}\).
  • Permutations (ordered selections): \(\displaystyle {}^{n}P_{r}= \frac{n!}{(n-r)!}\).
  • Example: Number of ways to choose a committee of 3 from 8 students is \(\binom{8}{3}=56\) (verified in the sketch below).
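
These counting formulas are easy to check directly; the sketch below uses only Python's standard `math` module (`math.comb` and `math.perm` are available from Python 3.8).

```python
import math

# Combinations: unordered selections of r objects from n.
print(math.comb(8, 3))   # 56 ways to choose a committee of 3 from 8

# Permutations: ordered selections of r objects from n.
print(math.perm(8, 3))   # 336 = 8 * 7 * 6 ordered selections

# The two are related by a factor of r!:
assert math.perm(8, 3) == math.comb(8, 3) * math.factorial(3)
```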

1.3 Probability Refresher (Syllabus 5.3)

For any events \(A,B\) with \(P(B)>0\):

  • Union (inclusion–exclusion): \(P(A\cup B)=P(A)+P(B)-P(A\cap B).\)
  • Conditional probability: \(\displaystyle P(A\mid B)=\frac{P(A\cap B)}{P(B)}.\)
  • Independence: \(A\) and \(B\) independent \(\iff P(A\cap B)=P(A)P(B).\)
  • Bayes’ theorem (useful for “reverse” conditioning questions; a numeric sketch follows this list): \[ P(A_i\mid B)=\frac{P(B\mid A_i)P(A_i)}{\sum_j P(B\mid A_j)P(A_j)}. \]
  • Tree diagrams are a quick way to organise multi‑stage experiments.
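
To make Bayes’ theorem concrete, here is a minimal sketch with made‑up numbers: a hypothetical diagnostic test with sensitivity 0.99 and specificity 0.95, applied where the condition has prevalence 0.01.

```python
# Hypothetical numbers, chosen only for illustration.
p_disease = 0.01
p_pos_given_disease = 0.99          # P(B | A1): true positive rate
p_pos_given_healthy = 1 - 0.95      # P(B | A2): false positive rate

# Denominator of Bayes' theorem: total probability of a positive test.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior probability of disease given a positive result.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 4))   # ≈ 0.1667
```

Note how a positive result from a fairly accurate test still leaves only about a 17 % posterior probability, because the prior \(P(A_1)\) is small.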

1.4 Discrete Random Variables (Syllabus 5.4 & 6.1)

  • Binomial distribution \(B(n,p)\): \[ P(X=k)=\binom{n}{k}p^{k}(1-p)^{\,n-k},\qquad k=0,\dots ,n. \] Mean \(E[X]=np\), variance \(\operatorname{Var}(X)=np(1-p)\).
  • Geometric distribution (number of trials until the first success): \[ P(X=k)=(1-p)^{k-1}p,\qquad k=1,2,\dots \] Mean \(E[X]=\dfrac1p\), variance \(\operatorname{Var}(X)=\dfrac{1-p}{p^{2}}\). (Both the binomial and geometric formulas are checked numerically after this list.)
  • Poisson distribution (a discrete count of events occurring in continuous time) – see Section 2.6.
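
A quick numerical check of the binomial and geometric formulas, using only the standard library (the values \(n=10\), \(p=0.3\) are arbitrary):

```python
import math

n, p = 10, 0.3

# Binomial B(n, p): pmf, mean and variance by direct summation.
pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
mean = sum(k * pmf[k] for k in range(n + 1))
var = sum(k**2 * pmf[k] for k in range(n + 1)) - mean**2
print(mean, n * p)             # both ≈ 3.0
print(var, n * p * (1 - p))    # both ≈ 2.1

# Geometric: E[X] = 1/p, approximated by truncating the infinite sum.
geo_mean = sum(k * (1 - p)**(k - 1) * p for k in range(1, 1000))
print(geo_mean, 1 / p)         # both ≈ 3.3333
```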

2. Continuous Random Variables

2.1 Definition

A random variable \(X\) is continuous when its possible values form an interval (or a union of intervals) on the real line and

\[ P(X=x)=0\qquad\text{for every single value }x. \]

2.2 Probability Density Function (pdf)

The pdf \(f(x)\) satisfies

\[ P(a\le X\le b)=\int_{a}^{b} f(x)\,dx . \]
  • \(f(x)\ge 0\) for all \(x\).
  • \(\displaystyle\int_{-\infty}^{\infty} f(x)\,dx =1\) (total area = 1).
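
Both conditions are easy to verify for a concrete pdf. The sketch below checks them for the exponential density \(f(x)=2e^{-2x}\) from Section 2.4, assuming SciPy is available:

```python
from scipy.integrate import quad
import math

f = lambda x: 2 * math.exp(-2 * x)     # non-negative on [0, inf)

# Total area under the pdf should be 1.
total_area, _ = quad(f, 0, math.inf)
print(total_area)                      # ≈ 1.0

# P(a <= X <= b) is the area under f between a and b.
prob, _ = quad(f, 0.5, 1.5)
print(prob)                            # ≈ e^{-1} - e^{-3} ≈ 0.318
```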

2.3 Cumulative Distribution Function (CDF)

\[ F(x)=P(X\le x)=\int_{-\infty}^{x} f(t)\,dt . \]
  • Derivative relationship: \(f(x)=\dfrac{d}{dx}F(x)\) wherever the derivative exists (illustrated in the sketch below).
  • Boundary values: \(F(-\infty)=0,\;F(\infty)=1\).
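
A minimal sketch of the pdf/CDF relationship for \(\operatorname{Exp}(2)\), whose CDF is \(F(x)=1-e^{-2x}\) for \(x\ge0\) (plain Python, no extra libraries):

```python
import math

lam = 2.0
F = lambda x: 1 - math.exp(-lam * x)        # CDF on x >= 0
f = lambda x: lam * math.exp(-lam * x)      # pdf on x >= 0

# Numerical derivative of F at x = 0.7 recovers f(0.7).
h = 1e-6
print((F(0.7 + h) - F(0.7 - h)) / (2 * h))  # ≈ 0.4932
print(f(0.7))                               # ≈ 0.4932

# P(a <= X <= b) = F(b) - F(a), matching the integral of f.
print(F(1.5) - F(0.5))                      # ≈ 0.318
```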

2.4 Common Continuous Distributions

| Distribution | PDF \(f(x)\) | Support | Mean \(E[X]\) | Variance \(\operatorname{Var}(X)\) |
|---|---|---|---|---|
| Uniform \(U(a,b)\) | \(\dfrac{1}{b-a}\) for \(a\le x\le b\) | \([a,b]\) | \(\dfrac{a+b}{2}\) | \(\dfrac{(b-a)^{2}}{12}\) |
| Exponential \(\operatorname{Exp}(\lambda)\) | \(\lambda e^{-\lambda x}\) for \(x\ge 0\) | \([0,\infty)\) | \(\dfrac{1}{\lambda}\) | \(\dfrac{1}{\lambda^{2}}\) |
| Normal \(N(\mu,\sigma^{2})\) | \(\dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}\) | \((-\infty,\infty)\) | \(\mu\) | \(\sigma^{2}\) |

The inter‑arrival time between events of a Poisson process with rate \(\lambda\) is exponential, \(\operatorname{Exp}(\lambda)\), so its mean and variance are those of the exponential row (see Section 2.6).

2.5 The Normal Distribution – Recap (Syllabus 5.5)

  • Standard normal \(Z\sim N(0,1)\) with pdf \(\displaystyle \phi(z)=\frac{1}{\sqrt{2\pi}}e^{-z^{2}/2}\).
  • Standardisation: \[ Z=\frac{X-\mu}{\sigma}\quad\Longrightarrow\quad Z\sim N(0,1) \] for any \(X\sim N(\mu,\sigma^{2})\).
  • Reading a Z‑table (excerpt of \(\Phi(z)=P(Z\le z)\)):

| \(z\) | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.5000 | 0.5040 | 0.5080 | 0.5120 | 0.5160 | 0.5199 | 0.5239 | 0.5279 | 0.5319 | 0.5359 |
| 0.1 | 0.5398 | 0.5438 | 0.5478 | 0.5517 | 0.5557 | 0.5596 | 0.5636 | 0.5675 | 0.5714 | 0.5753 |
| 0.2 | 0.5793 | 0.5832 | 0.5871 | 0.5910 | 0.5948 | 0.5987 | 0.6026 | 0.6064 | 0.6103 | 0.6141 |

    To find \(P(Z\le 1.23)\): look up row 1.2, column 0.03 → 0.8907.

  • Continuity correction when a normal distribution approximates a discrete count: \[ P(X\le k)\approx P\!\left(Z\le\frac{k+0.5-\mu}{\sigma}\right),\qquad P(X\ge k)\approx P\!\left(Z\ge\frac{k-0.5-\mu}{\sigma}\right). \]
  • Normal approximation to binomial (when \(np\ge5\) and \(n(1-p)\ge5\)): \[ X\sim B(n,p)\;\approx\;N\bigl(np,\;np(1-p)\bigr). \]
    Normal approximation to Poisson (when \(\lambda\ge10\)): \[ X\sim\text{Poisson}(\lambda)\;\approx\;N\bigl(\lambda,\;\lambda\bigr). \]
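
The table value and both approximations can be checked with SciPy's standard distribution objects (`norm`, `binom`, `poisson`); the parameter choices below are arbitrary illustrations that satisfy the stated conditions:

```python
from scipy.stats import norm, binom, poisson

# P(Z <= 1.23): matches the table value 0.8907.
print(norm.cdf(1.23))                              # ≈ 0.8907

# Normal approximation to B(n=40, p=0.3) with continuity correction.
n, p = 40, 0.3
mu, sigma = n * p, (n * p * (1 - p)) ** 0.5
print(binom.cdf(15, n, p))                         # exact P(X <= 15)
print(norm.cdf((15 + 0.5 - mu) / sigma))           # approximation

# Normal approximation to Poisson(lambda=20).
lam = 20
print(poisson.cdf(24, lam))                        # exact P(X <= 24)
print(norm.cdf((24 + 0.5 - lam) / lam ** 0.5))     # approximation
```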

2.6 The Poisson Distribution (Counts in Continuous Time) – Syllabus 6.1

If events occur independently at a constant average rate \(\lambda\) per unit time, the number of events \(X\) in an interval of unit length follows a Poisson distribution:

\[ P(X=k)=\frac{e^{-\lambda}\lambda^{k}}{k!},\qquad k=0,1,2,\dots \]
  • Mean and variance are equal: \(\displaystyle E[X]=\operatorname{Var}(X)=\lambda.\)
  • The time between successive events has an exponential distribution \(\operatorname{Exp}(\lambda)\) (memoryless property).
  • Derivation as the limit of a binomial: \[ \lim_{n\to\infty}\! \binom{n}{k}\Bigl(\frac{\lambda}{n}\Bigr)^{k} \Bigl(1-\frac{\lambda}{n}\Bigr)^{n-k}= \frac{e^{-\lambda}\lambda^{k}}{k!}. \]
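
This limit can be seen numerically. The sketch below compares \(B(n,\lambda/n)\) probabilities with the Poisson value for the arbitrary choices \(\lambda=3\), \(k=2\):

```python
import math

lam, k = 3.0, 2
poisson_pk = math.exp(-lam) * lam**k / math.factorial(k)

# Binomial probabilities with p = lambda/n approach the Poisson value.
for n in (10, 100, 1000, 10000):
    p = lam / n
    binom_pk = math.comb(n, k) * p**k * (1 - p)**(n - k)
    print(n, binom_pk)

print("limit:", poisson_pk)   # ≈ 0.2240
```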

2.7 Linear Combinations of Random Variables – Syllabus 6.2

  • For any random variables \(X\) and \(Y\), \(E[aX+bY]=aE[X]+bE[Y]\); if \(X\) and \(Y\) are also independent, \[ \operatorname{Var}(aX+bY)=a^{2}\operatorname{Var}(X)+b^{2}\operatorname{Var}(Y). \]
  • For normal variables, any linear combination is also normal: \[ X\sim N(\mu_X,\sigma_X^{2}),\;Y\sim N(\mu_Y,\sigma_Y^{2})\;\Longrightarrow\; aX+bY\sim N\bigl(a\mu_X+b\mu_Y,\;a^{2}\sigma_X^{2}+b^{2}\sigma_Y^{2}\bigr). \]

3. Expectation and Variance for Continuous Variables

3.1 Expectation (Mean)

\[ E[X]=\int_{-\infty}^{\infty} x\,f(x)\,dx \qquad\text{(provided the integral converges).} \]
  • Linearity: \(E[aX+b]=aE[X]+b\) for constants \(a,b\).
  • If \(X\) and \(Y\) are independent, \(E[XY]=E[X]E[Y]\).

3.2 Variance

\[ \operatorname{Var}(X)=E\bigl[(X-E[X])^{2}\bigr] =\int_{-\infty}^{\infty} (x-\mu)^{2}f(x)\,dx, \qquad \mu=E[X]. \]

Often calculated via the shortcut

\[ \operatorname{Var}(X)=E[X^{2}]-(E[X])^{2},\qquad E[X^{2}]=\int_{-\infty}^{\infty} x^{2}f(x)\,dx. \]
  • Scaling: \(\operatorname{Var}(aX+b)=a^{2}\operatorname{Var}(X).\)
  • Sum of independent variables: \(\operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y).\)

3.3 Worked Examples (Continuous)

Example 1 – Uniform Distribution \(U(2,5)\)

PDF: \(f(x)=\dfrac{1}{3}\) for \(2\le x\le5\).

\[ E[X]=\int_{2}^{5}x\frac{1}{3}\,dx=\frac{1}{3}\Bigl[\frac{x^{2}}{2}\Bigr]_{2}^{5}=3.5, \] \[ E[X^{2}]=\int_{2}^{5}x^{2}\frac{1}{3}\,dx=\frac{1}{3}\Bigl[\frac{x^{3}}{3}\Bigr]_{2}^{5}=13, \] \[ \operatorname{Var}(X)=13-(3.5)^{2}=0.75=\frac{(5-2)^{2}}{12}. \]
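
A symbolic check of this example, assuming SymPy is available:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Rational(1, 3)                      # pdf of U(2, 5)

EX = sp.integrate(x * f, (x, 2, 5))        # 7/2
EX2 = sp.integrate(x**2 * f, (x, 2, 5))    # 13
print(EX, EX2, EX2 - EX**2)                # 7/2, 13, 3/4
```
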
Example 2 – Exponential Distribution \(\operatorname{Exp}(\lambda=2)\)

PDF: \(f(x)=2e^{-2x},\;x\ge0\).

\[ E[X]=\int_{0}^{\infty}x\,2e^{-2x}\,dx=\frac{1}{2}, \qquad E[X^{2}]=\int_{0}^{\infty}x^{2}\,2e^{-2x}\,dx=\frac{1}{2}, \] \[ \operatorname{Var}(X)=\frac{1}{2}-\Bigl(\frac{1}{2}\Bigr)^{2}= \frac{1}{4}. \]
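
The same moments checked numerically with SciPy:

```python
from scipy.integrate import quad
import math

f = lambda x: 2 * math.exp(-2 * x)         # pdf of Exp(2)

EX, _ = quad(lambda x: x * f(x), 0, math.inf)
EX2, _ = quad(lambda x: x**2 * f(x), 0, math.inf)
print(EX, EX2, EX2 - EX**2)                # ≈ 0.5, 0.5, 0.25
```
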
Example 3 – Standard Normal Distribution \(N(0,1)\)

By symmetry, \(E[X]=0\). The second moment is known:

\[ E[X^{2}]=\int_{-\infty}^{\infty}x^{2}\phi(x)\,dx=1, \qquad \operatorname{Var}(X)=1. \]
Example 4 – Normal Approximation to a Poisson Count

Let \(X\sim\text{Poisson}(\lambda=12)\). Approximate \(P(X\ge 15)\).

  • Mean = variance = 12, so use \(N(12,12)\).
  • Continuity correction: \(P(X\ge15)\approx P\!\left(Z\ge\frac{14.5-12}{\sqrt{12}}\right) =P(Z\ge0.72)=0.2358.\)
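
Checking this approximation against the exact Poisson tail, assuming SciPy:

```python
from scipy.stats import poisson, norm

lam = 12
exact = poisson.sf(14, lam)                # P(X >= 15) = P(X > 14)
approx = norm.sf((14.5 - lam) / lam ** 0.5)
print(exact, approx)                       # exact ≈ 0.228, approximation ≈ 0.235
```
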
Example 5 – Linear Combination of Independent Normals

If \(X\sim N(2,1)\) and \(Y\sim N(5,4)\) are independent, find the distribution of \(Z=3X-2Y\).

  • Mean: \(E[Z]=3(2)-2(5)=6-10=-4.\)
  • Variance: \(\operatorname{Var}(Z)=3^{2}(1)+(-2)^{2}(4)=9+16=25.\)
  • Hence \(Z\sim N(-4,25).\)
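
A Monte Carlo sketch of this result (assuming NumPy); note that \(N(5,4)\) has standard deviation 2, not 4:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(2, 1, size=1_000_000)   # X ~ N(2, 1)
Y = rng.normal(5, 2, size=1_000_000)   # Y ~ N(5, 4), so sd = 2
Z = 3 * X - 2 * Y

print(Z.mean(), Z.var())               # ≈ -4 and ≈ 25
```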

4. Sampling, Estimation and Hypothesis Testing (Syllabus 5.6 & 5.7)

4.1 Sampling Distributions

  • If a sample of size \(n\) is taken from a population with mean \(\mu\) and known standard deviation \(\sigma\), the sampling distribution of the sample mean \(\bar{X}\) is \[ \bar{X}\sim N\!\left(\mu,\;\frac{\sigma^{2}}{n}\right) \] (Central Limit Theorem) when \(n\) is large (or the population is normal); a simulation illustrating this follows the list.
  • For a proportion \(p\) based on a binomial sample, \(\hat{p}\) is approximately normal with \[ \hat{p}\sim N\!\left(p,\;\frac{p(1-p)}{n}\right) \] provided \(np\ge5\) and \(n(1-p)\ge5\).
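
A simulation sketch of the CLT statement above (assuming NumPy), using an exponential population with \(\mu=\sigma=1\), which is far from normal:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 50, 100_000
samples = rng.exponential(scale=1.0, size=(reps, n))
means = samples.mean(axis=1)           # 100,000 sample means

print(means.mean())                    # ≈ mu = 1
print(means.std())                     # ≈ sigma / sqrt(n) ≈ 0.141
```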

4.2 Point & Interval Estimates

  • Unbiased estimator: an estimator \(\hat\theta\) with \(E[\hat\theta]=\theta\). Example: \(\bar{X}\) is an unbiased estimator of the population mean \(\mu\).
  • Confidence interval for a mean (σ known): \[ \bar{x}\;\pm\;z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}, \] where \(z_{\alpha/2}\) is the critical value from the standard normal table.
  • Example (95 % CI): \(\bar{x}=12,\; \sigma=3,\; n=36\). \[ 12\pm1.96\frac{3}{6}=12\pm0.98\;\Rightarrow\;(11.02,\;12.98). \]
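
The interval can be computed directly:

```python
# 95% confidence interval for a mean with known sigma.
z = 1.96                                      # z_{alpha/2} for 95% confidence
xbar, sigma, n = 12, 3, 36

half_width = z * sigma / n ** 0.5             # 1.96 * 3 / 6 = 0.98
print(xbar - half_width, xbar + half_width)   # (11.02, 12.98)
```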

4.3 Hypothesis Testing (Syllabus 5.7)

  1. State hypotheses:
    • Null hypothesis \(H_{0}\): the statement to be tested (e.g. \(\mu=10\)).
    • Alternative hypothesis \(H_{1}\): what we would conclude if \(H_{0}\) is rejected (e.g. \(\mu\neq10\)).
  2. Select significance level \(\alpha\) (commonly 0.05 or 0.01).
  3. Choose test statistic:
    • For a mean with known \(\sigma\): \(Z=\dfrac{\bar{X}-\mu_{0}}{\sigma/\sqrt{n}}\).
    • For a mean with unknown \(\sigma\): \(t=\dfrac{\bar{X}-\mu_{0}}{s/\sqrt{n}}\) (use \(t\)-distribution with \(n-1\) df).
    • For a proportion: \(Z=\dfrac{\hat{p}-p_{0}}{\sqrt{p_{0}(1-p_{0})/n}}\).
  4. Determine critical region (using Z‑ or t‑tables) or compute the p‑value.
  5. Decision:
    • If test statistic lies in the critical region (or p‑value < α), reject \(H_{0}\).
    • Otherwise, do not reject \(H_{0}\).
  6. Interpretation in the context of the problem.
Decision flowchart for a hypothesis test:

State \(H_{0}\) and \(H_{1}\) → choose \(\alpha\) and the test statistic → compute the statistic (or p‑value) → decide: reject \(H_{0}\) or not → interpret the result in context.
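
A sketch of the six steps as code, for a two‑sided z‑test of \(H_{0}:\mu=10\) with made‑up data (assuming SciPy for the p‑value):

```python
from scipy.stats import norm

# Step 1-2: hypotheses H0: mu = 10 vs H1: mu != 10, at alpha = 0.05.
mu0, sigma = 10, 2          # hypothesised mean, known population sd
xbar, n = 10.6, 40          # hypothetical sample mean and size
alpha = 0.05

# Step 3: test statistic for a mean with known sigma.
z = (xbar - mu0) / (sigma / n ** 0.5)

# Step 4: two-sided p-value from the standard normal.
p_value = 2 * norm.sf(abs(z))
print(z, p_value)           # ≈ 1.90, ≈ 0.058

# Steps 5-6: p-value > alpha, so do not reject H0 at the 5% level;
# the data do not give significant evidence that mu differs from 10.
print("reject H0" if p_value < alpha else "do not reject H0")
```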

5. Summary Checklist (Paper 6 – S2)

  1. Identify the type of random variable (discrete vs continuous) and write down its support.
  2. For a continuous variable, write the pdf \(f(x)\) and verify \(\displaystyle\int f(x)\,dx=1\).
  3. Compute the mean: \[ E[X]=\int x\,f(x)\,dx. \]
  4. Find the second moment \(E[X^{2}]\) (or use a known formula) and obtain the variance: \[ \operatorname{Var}(X)=E[X^{2}]-(E[X])^{2}. \]
  5. Apply linearity, scaling, and independence rules for transformed variables or sums.
  6. For normal‑related questions:
    • Standardise to a \(Z\)‑score.
    • Read the required area from a Z‑table (or use a calculator).
    • Apply continuity correction when approximating a discrete count.
  7. Recall the binomial, geometric and Poisson distributions, their means and variances, and when normal approximation is appropriate.
  8. For sampling problems, use the appropriate sampling distribution (often normal by the CLT).
  9. When required, construct a point estimate, a confidence interval, or carry out a hypothesis test using the steps in Section 4.3.
  10. For sums or linear combinations of independent variables, add means and add variances (or use the normal‑sum rule).
[Figure: sketches of three common pdfs – Uniform \(U(a,b)\), Exponential \(\operatorname{Exp}(\lambda)\) and Normal \(N(\mu,\sigma^{2})\).]
