Statistics #6 | Models, Statistical Inference and Learning


6.1 Introduction

  • Statistical Inference, or “learning” as it is called in computer science, is the process of using data to infer the distribution that generated the data.


6.2 Parametric and Nonparametric Models

  • A statistical model $\Im$ is a set of distributions (or densities or regression functions)

  • Parametric model : a set $\Im$ that can be parameterized by a finite number of parameters.

  • If we assume that the data come from a Normal distribution, then it is a two-parameter model:

$$ \Im = \left\{ f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2 \pi}} \exp\left( -\frac{1}{2 \sigma^2}(x-\mu)^2 \right) : \mu \in \mathbb{R},\ \sigma > 0 \right\} $$
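A minimal sketch of one member of this family, using only the Python standard library; the function name `normal_density` is illustrative, not from the text:

```python
import math

def normal_density(x, mu, sigma):
    """Density f(x; mu, sigma) of the N(mu, sigma^2) member of the model."""
    return (1.0 / (sigma * math.sqrt(2 * math.pi))) * math.exp(
        -0.5 * ((x - mu) / sigma) ** 2
    )

# At x = mu the density attains its maximum, 1 / (sigma * sqrt(2*pi)).
peak = normal_density(0.0, 0.0, 1.0)
print(peak)  # ≈ 0.3989
```

Each choice of $(\mu, \sigma)$ picks out one density in $\Im$; the model is the whole family.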

  • In general, a parametric model takes the form

$$ \Im = \{ f(x; \theta) : \theta \in \Theta \} $$

where $\Theta$ is the parameter space.

  • Non-Parametric model : a set $\Im$ that cannot be parameterized by a finite number of parameters.
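A standard nonparametric estimate is the empirical CDF, which makes no finite-parameter assumption about the data. A stdlib-only sketch (the name `ecdf` is illustrative):

```python
def ecdf(data):
    """Empirical CDF F_n(x) = #{X_i <= x} / n -- a nonparametric CDF estimate."""
    xs = sorted(data)
    n = len(xs)

    def F(x):
        # Fraction of observations at or below x.
        return sum(1 for v in xs if v <= x) / n

    return F

F = ecdf([3.0, 1.0, 2.0, 4.0])
print(F(2.5))  # 0.5  (two of the four points are <= 2.5)
```

No $\theta \in \Theta$ indexes this estimate; it is determined by the data themselves, which is what "nonparametric" means here.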

  • Frequentists and Bayesians : The two dominant approaches to statistical inference are called frequentist inference and Bayesian inference.



6.3 Fundamental Concepts in Inference

  • Many inferential problems can be identified as one of three types : estimation, confidence sets, or hypothesis testing.

  • Point Estimation : providing a single “best guess” of some quantity of interest (for example, a parameter, a CDF, or a prediction of a future value).
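For instance, under the Normal model above, the sample mean and sample standard deviation are natural point estimates of $\mu$ and $\sigma$. A stdlib-only sketch (the helper name `point_estimates` is illustrative):

```python
import math

def point_estimates(sample):
    """Plug-in point estimates of (mu, sigma) for a Normal model."""
    n = len(sample)
    mu_hat = sum(sample) / n
    # Maximum-likelihood variance estimate (divides by n, not n - 1).
    var_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, math.sqrt(var_hat)

mu_hat, sigma_hat = point_estimates([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mu_hat, sigma_hat)  # 5.0 2.0
```

The hat notation ($\hat{\mu}$, $\hat{\sigma}$) marks these as estimates computed from the data, as opposed to the fixed unknown parameters.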

  • Confidence Sets : A $1-\alpha$ confidence interval for a parameter $\theta$ is an interval $C_n = (a,b)$, where $a = a(X_1, \dots , X_n)$ and $b = b(X_1, \dots , X_n)$ are functions of the data, such that $P_\theta(\theta \in C_n) \geq 1-\alpha$ for all $\theta \in \Theta$; that is, $C_n$ traps $\theta$ with probability at least $1-\alpha$.

  • Hypothesis Testing : In hypothesis testing, we start with some default theory, called the null hypothesis, and we ask whether the data provide sufficient evidence to reject it.
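As a concrete sketch, here is a two-sided z-test of $H_0 : \mu = \mu_0$ for a Normal model with known $\sigma$, using only the standard library (`math.erf` gives the standard Normal CDF). The function name `z_test` is illustrative:

```python
import math

def z_test(sample, mu0, sigma):
    """Two-sided z-test of H0: mu = mu0 with sigma known.
    Returns the test statistic z and its p-value."""
    n = len(sample)
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Standard Normal CDF: Phi(t) = (1 + erf(t / sqrt(2))) / 2.
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - phi)
    return z, p_value

z, p = z_test([0.5] * 25, mu0=0.0, sigma=1.0)
print(z, p)  # z = 2.5, p ≈ 0.0124
```

A small p-value means the observed data would be unlikely under the null hypothesis, which is taken as evidence against it; a large p-value means the data are consistent with $H_0$.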