6.1 Introduction
- Statistical inference, or “learning” as it is called in computer science, is the process of using data to infer the distribution that generated the data.
6.2 Parametric and Nonparametric Models
- A statistical model $\Im$ is a set of distributions (or densities or regression functions).
- Parametric model: a set $\Im$ that can be parameterized by a finite number of parameters.
- If we assume that the data come from a Normal distribution, this is a two-parameter model:
$$ \Im = \left\{ f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2 \pi}} \exp\left\{ -\frac{1}{2 \sigma^2}(x-\mu)^2 \right\} : \mu \in \mathbb{R},\ \sigma > 0 \right\} $$
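Fitting this two-parameter model amounts to estimating $\mu$ and $\sigma$ from the data. A minimal sketch, using the sample mean and the maximum-likelihood estimate of $\sigma$ on synthetic data (the true values $\mu = 5$, $\sigma = 2$ below are assumptions for illustration):

```python
import math
import random

random.seed(0)
# Hypothetical sample assumed to come from N(mu=5, sigma=2)
data = [random.gauss(5, 2) for _ in range(10_000)]

n = len(data)
mu_hat = sum(data) / n  # sample mean estimates mu
# MLE of sigma: root mean squared deviation about mu_hat (divides by n, not n-1)
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)

print(mu_hat, sigma_hat)  # should be close to the true values 5 and 2
```

With a large sample, both estimates land near the true parameters; this is the parametric route, where estimating the whole distribution reduces to estimating two numbers.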
- In general, a parametric model takes the form
$$ \Im = \{ f(x; \theta) : \theta \in \Theta \} $$
where $\theta$ is an unknown parameter (or vector of parameters) taking values in the parameter space $\Theta$.
- Nonparametric model: a set $\Im$ that cannot be parameterized by a finite number of parameters.
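A standard nonparametric estimator is the empirical CDF, which estimates the distribution function directly from the data without assuming any finite-parameter family. A minimal sketch (the underlying $N(0,1)$ sample is an assumption for illustration):

```python
import random

random.seed(1)
# Hypothetical sample; the estimator below makes no use of its parametric form
data = [random.gauss(0, 1) for _ in range(1000)]

def ecdf(x, sample=data):
    """Empirical CDF: the fraction of observations <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)

print(ecdf(0.0))  # near 0.5 for a sample roughly symmetric about 0
```

The "parameter" here is effectively the whole function, which cannot be summarized by finitely many numbers.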
- Frequentists and Bayesians: the two dominant approaches to statistical inference are called frequentist inference and Bayesian inference.
6.3 Fundamental Concepts in Inference
- Many inferential problems can be identified as being one of three types: estimation, confidence sets, or hypothesis testing.
- Point estimation: refers to providing a single “best guess” of some quantity of interest, such as a parameter $\theta$, a CDF $F$, or a density $f$.
- Confidence sets: a $1-\alpha$ confidence interval for a parameter $\theta$ is an interval $C_n = (a,b)$, where $a = a(X_1, \dots, X_n)$ and $b = b(X_1, \dots, X_n)$ are functions of the data such that $\mathbb{P}(\theta \in C_n) \geq 1 - \alpha$.
- Hypothesis testing: we start with some default theory, called the null hypothesis, and ask whether the data provide sufficient evidence to reject that theory.
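A minimal sketch of this idea as a z-test of $H_0: \mu = 0$ with known $\sigma = 1$. The data source (a sample whose true mean is actually $0.5$) and the known-variance assumption are illustrative assumptions, not part of the original notes:

```python
import math
import random

random.seed(3)
# Null hypothesis H0: mu = 0. The data are actually drawn from N(0.5, 1),
# so a good test should reject H0 here.
data = [random.gauss(0.5, 1) for _ in range(400)]

n = len(data)
mu_hat = sum(data) / n
se = 1 / math.sqrt(n)      # standard error under the assumed known sigma = 1
z = (mu_hat - 0) / se      # test statistic: how many SEs the estimate is from H0
reject = abs(z) > 1.96     # reject H0 at the 5% level (two-sided)

print(z, reject)
```

Because the true mean is far from the null value relative to the standard error, the statistic is large and the evidence against $H_0$ is deemed sufficient.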