6.1 Introduction
- Statistical inference, or “learning” as it is called in computer science, is the process of using data to infer the distribution that generated the data.
6.2 Parametric and Nonparametric Models
- A statistical model $\Im$ is a set of distributions (or densities or regression functions).
- Parametric model: a set $\Im$ that can be parameterized by a finite number of parameters.
- If we assume that the data come from a Normal distribution, this is a two-parameter model:
$$ \Im = \left\{ f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2 \pi}} \exp\left\{ -\frac{1}{2 \sigma^2}(x-\mu)^2 \right\} : \mu \in \mathbb{R},\ \sigma > 0 \right\} $$
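Fitting this two-parameter model amounts to estimating $\mu$ and $\sigma$ from the data. A minimal sketch, using the sample mean and the maximum-likelihood estimate of $\sigma$ on synthetic data (the true values $\mu = 5$, $\sigma = 2$ below are assumptions for illustration):

```python
import math
import random

random.seed(0)
# Hypothetical sample assumed to come from N(mu=5, sigma=2)
data = [random.gauss(5, 2) for _ in range(10_000)]

n = len(data)
mu_hat = sum(data) / n  # sample mean estimates mu
# MLE of sigma: root mean squared deviation about mu_hat (divides by n, not n-1)
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)

print(mu_hat, sigma_hat)  # should be close to the true values 5 and 2
```

With a large sample, both estimates land near the true parameters; this is the parametric route, where estimating the whole distribution reduces to estimating two numbers.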
- In general, a parametric model takes the form
$$ \Im = \{ f(x; \theta) : \theta \in \Theta \} $$
where $\theta$ is an unknown parameter (or vector of parameters) taking values in the parameter space $\Theta$.
- Nonparametric model: a set $\Im$ that cannot be parameterized by a finite number of parameters.
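A standard nonparametric estimator is the empirical CDF, which estimates the distribution function directly from the data without assuming any finite-parameter family. A minimal sketch (the underlying $N(0,1)$ sample is an assumption for illustration):

```python
import random

random.seed(1)
# Hypothetical sample; the estimator below makes no use of its parametric form
data = [random.gauss(0, 1) for _ in range(1000)]

def ecdf(x, sample=data):
    """Empirical CDF: the fraction of observations <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)

print(ecdf(0.0))  # near 0.5 for a sample roughly symmetric about 0
```

The "parameter" here is effectively the whole function, which cannot be summarized by finitely many numbers.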
- Frequentists and Bayesians: the two dominant approaches to statistical inference are called frequentist inference and Bayesian inference.
6.3 Fundamental Concepts in Inference
- Many inferential problems can be identified as being one of three types: estimation, confidence sets, or hypothesis testing.
- Point estimation: refers to providing a single “best guess” of some quantity of interest, such as a parameter $\theta$, a CDF $F$, or a density $f$.
- Confidence sets: a $1-\alpha$ confidence interval for a parameter $\theta$ is an interval $C_n = (a,b)$, where $a = a(X_1, \dots, X_n)$ and $b = b(X_1, \dots, X_n)$ are functions of the data such that $\mathbb{P}(\theta \in C_n) \geq 1 - \alpha$.
- Hypothesis testing: we start with some default theory, called the null hypothesis, and ask whether the data provide sufficient evidence to reject that theory.
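A minimal sketch of this idea as a z-test of $H_0: \mu = 0$ with known $\sigma = 1$. The data source (a sample whose true mean is actually $0.5$) and the known-variance assumption are illustrative assumptions, not part of the original notes:

```python
import math
import random

random.seed(3)
# Null hypothesis H0: mu = 0. The data are actually drawn from N(0.5, 1),
# so a good test should reject H0 here.
data = [random.gauss(0.5, 1) for _ in range(400)]

n = len(data)
mu_hat = sum(data) / n
se = 1 / math.sqrt(n)      # standard error under the assumed known sigma = 1
z = (mu_hat - 0) / se      # test statistic: how many SEs the estimate is from H0
reject = abs(z) > 1.96     # reject H0 at the 5% level (two-sided)

print(z, reject)
```

Because the true mean is far from the null value relative to the standard error, the statistic is large and the evidence against $H_0$ is deemed sufficient.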