Lecture

2023-08-30

Today

PDFs and CDFs

Joint, marginal, and conditional distributions

Example: linear regression

Example: negative binomial as a mixture

Wrapup

If \(F_X\) is the *cumulative distribution function* (CDF) of \(X\) and \(f_X\) is the *probability density function* (PDF) of \(X\), then: \[
F_X ( x ) = \int_{-\infty}^x f_X(u) \, du,
\] and (if \(f_X\) is continuous at \(x\), which it typically will be) \[
f_{X}(x)={\frac {d}{dx}}F_{X}(x).
\] A useful property is \[
\Pr[a\leq X\leq b]=\int _{a}^{b}f_{X}(x)\,dx
\]

**Important**

For a continuous random variable, we can only talk about the probability that \(y\) lies in some interval \([a, b]\), which is given by the integral of the PDF over that interval. The probability that \(y\) takes on any single value \(y^*\), written \(p(y=y^*)\), is zero.

Simple example to illustrate that \[ F_X(2) = \int_{-\infty}^2 f_X(u) \, du \]

We will use a standard Normal distribution as an example.

- `Normal()` has mean 0 and standard deviation 1 by default
- `pdf(d, x)` evaluates the probability density function of distribution `d` at `x`
- `quad_trap` is a trapezoidal approximation of the integral with arguments: function, lower bound, upper bound, and number of points

Comparing the trapezoidal estimate with the exact CDF value gives

`(0.9771562639858903, 0.9772498680518208)`
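The check described above can be sketched as follows. The original definition of `quad_trap` is not shown in these notes, so the implementation here is an assumption that only matches the described argument order (function, lower bound, upper bound, number of points). The finite lower bound `-10.0` and the number of points are also illustrative choices, so the estimate will not reproduce the printed value digit for digit.

```julia
using Distributions

# Trapezoidal rule: approximate the integral of f over [a, b]
# using n evenly spaced points (assumed implementation, not the original).
function quad_trap(f, a, b, n)
    x = range(a, b; length=n)
    h = step(x)
    # every interior point has weight h, the two end points have weight h / 2
    return h * (sum(f, x) - (f(a) + f(b)) / 2)
end

d = Normal()        # standard Normal: mean 0, standard deviation 1
lower = -10.0       # assumption: finite stand-in for -Inf; the mass below -10 is negligible

approx = quad_trap(x -> pdf(d, x), lower, 2.0, 1_001)
exact = cdf(d, 2.0)

(approx, exact)
```

The two numbers should agree to several decimal places, which is the point of the illustration: integrating the PDF up to 2 recovers \(F_X(2)\).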

- Discrete distributions (like the Poisson) have a *probability mass function* (PMF) instead of a PDF
- For PMFs, \(p(y=y^*)\) is the probability that \(y\) takes on the value \(y^*\), and is defined
- In the `Distributions` package, both PDFs and PMFs are called `pdf`
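As a small sketch of the last point, `pdf` applied to a Poisson distribution returns a probability mass. The rate `3.0` and the value `2` are arbitrary illustrative choices, not values from the lecture.

```julia
using Distributions

λ = 3.0                    # illustrative rate (assumption)
d = Poisson(λ)

p2 = pdf(d, 2)             # Pr(y = 2): a genuine probability, not a density

# The Poisson PMF written out by hand, for comparison.
manual = exp(-λ) * λ^2 / factorial(2)

# PMF values over the support sum to 1 (truncated at 100, where the tail is negligible).
total = sum(pdf(d, k) for k in 0:100)

(p2, manual, total)
```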

Joint, marginal, and conditional distributions

The joint density factors as \[ p(\theta, y) = p(\theta) p(y | \theta) \] and thus \[ p(\theta | y) = \frac{p(\theta, y)}{p(y)} = \frac{p(\theta) p(y | \theta)}{p(y)} \] Because \(p(y)\) does not depend on \(\theta\), we will generally write \[ p(\theta | y) \propto p(\theta) p(y | \theta) \]

Probability of event \(A\): \(\Pr(A)\)

We will write the marginal probability density function as \[ p(\theta) \quad \text{or} \quad p(y) \]

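Probability of events \(A\) and \(B\): \(\Pr(A \& B)\)

We will write the joint probability density function as \[ p(\theta, y) \]

Probability of event \(A\) given event \(B\): \(\Pr(A | B)\)

We will write the conditional probability density function as \[ p(\theta | y) \quad \text{or} \quad p(y | \theta) \]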

A gambler presents you with an even-money wager. You will roll two dice, and if the highest number showing is one, two, three or four, then you win. If the highest number on either die is five or six, then she wins. Should you take the bet?
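One way to check the wager is to enumerate the outcomes directly. This sketch assumes two fair six-sided dice, which the problem statement implies but does not state.

```julia
# All 36 equally likely outcomes of two fair six-sided dice (fairness is an assumption).
outcomes = [(i, j) for i in 1:6, j in 1:6]

# You win when the highest number showing is four or less.
n_win = count(o -> max(o[1], o[2]) <= 4, outcomes)

# Exact probability as a rational number.
p_win = n_win // length(outcomes)   # 16//36, which reduces to 4//9
```

Under that assumption 16 of the 36 outcomes are wins, so the probability of winning is \(4/9 < 1/2\), and an even-money bet is unfavorable to you.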

Example: linear regression

Consider the standard linear regression model. For simplicity, assume \(x \in \mathbb{R}\) (one predictor): \[ y_i = ax_i + b + \epsilon_i \] where \(\epsilon_i \sim \mathcal{N}(0, \sigma^2)\).

The conditional probability density of \(y_i\) given \(x_i\) is \[ p(y_i | x_i, a, b, \sigma) = \mathcal{N}(ax_i + b, \sigma^2) \] which is a shorthand for writing out the full equation for the Normal PDF. We can (and often will) write this as \[ y_i \sim \mathcal{N}(ax_i + b, \sigma^2) \] Finally, we will sometimes write \(p(y_i | x_i)\) as a shorthand for \(p(y_i | x_i, a, b, \sigma)\). This shorthand is fine in many circumstances, but we should take care to be clear about which parameters we are conditioning on.
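To make the shorthand concrete, the sketch below evaluates the conditional density two ways: through `pdf` and by writing out the Normal PDF in full. The values of `a`, `b`, `σ`, `x_i`, and `y_i` are hypothetical and chosen only for illustration.

```julia
using Distributions

# Hypothetical parameter values and data point (not from the lecture).
a, b, σ = 2.0, 1.0, 0.5
x_i, y_i = 2.0, 5.3

# Note: Normal takes the standard deviation σ as its second argument, not the variance σ^2.
p_pkg = pdf(Normal(a * x_i + b, σ), y_i)

# The Normal PDF written out in full.
p_manual = exp(-(y_i - (a * x_i + b))^2 / (2 * σ^2)) / (σ * sqrt(2π))

(p_pkg, p_manual)
```

The two values should agree up to floating-point error.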

The marginal probability density of \(y_i\) is \[ p(y_i | a, b, \sigma) = \int p(y_i | x_i, a, b, \sigma) p(x_i) \, dx_i \] where \(p(x_i)\) is the probability density of \(x_i\).
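The integral can be approximated numerically once a density for \(x_i\) is chosen. This sketch assumes \(x_i \sim \mathcal{N}(0, 1)\) and hypothetical values of `a`, `b`, and `σ`. In that special case the marginal is also Normal, with mean \(b\) and variance \(a^2 + \sigma^2\), which gives a closed form to compare against.

```julia
using Distributions

a, b, σ = 2.0, 1.0, 0.5      # hypothetical parameter values
p_x = Normal(0.0, 1.0)       # assumed distribution of x
y = 1.7                      # point at which to evaluate the marginal density

# Integrand: p(y | x, a, b, σ) * p(x)
integrand(x) = pdf(Normal(a * x + b, σ), y) * pdf(p_x, x)

# Trapezoidal rule on a wide finite grid standing in for the whole real line.
xs = range(-10.0, 10.0; length=20_001)
h = step(xs)
p_marginal = h * (sum(integrand, xs) - (integrand(first(xs)) + integrand(last(xs))) / 2)

# Closed form for this special case: y ~ Normal(b, sqrt(a^2 + σ^2)).
p_closed = pdf(Normal(b, sqrt(a^2 + σ^2)), y)

(p_marginal, p_closed)
```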

The joint probability density of \(y_i\) and \(x_i\) is \[ p(y_i, x_i | a, b, \sigma) = p(y_i | x_i, a, b, \sigma) p(x_i) \] where \(p(x_i)\) is the probability density of \(x_i\).

If \(x=2\), we can simulate from the conditional distribution of \(y\):
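A minimal sketch of that simulation, with hypothetical values for `a`, `b`, and `σ` and an arbitrary seed and sample size:

```julia
using Distributions
using Random
using Statistics

Random.seed!(1)              # fixed seed so the sketch is reproducible
a, b, σ = 2.0, 1.0, 0.5      # hypothetical parameter values
x = 2.0

# Draw from y | x ~ Normal(a * x + b, σ); the second argument is the standard deviation.
y = rand(Normal(a * x + b, σ), 10_000)

(mean(y), std(y))
```

With these values the sample mean should be close to \(a x + b = 5\) and the sample standard deviation close to \(\sigma = 0.5\).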

If \(x \sim \mathcal{N}(0, 1)\), then we can simulate from the joint distribution of \(x\) and \(y\):
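A minimal sketch of the joint simulation: first draw \(x\) from its marginal distribution, then draw \(y\) from its conditional distribution given each \(x\). The parameter values, seed, and sample size are again hypothetical.

```julia
using Distributions
using Random
using Statistics

Random.seed!(1)              # fixed seed so the sketch is reproducible
a, b, σ = 2.0, 1.0, 0.5      # hypothetical parameter values
n = 10_000

# Step 1: draw x from p(x).
x = rand(Normal(0.0, 1.0), n)

# Step 2: draw each y from p(y | x, a, b, σ).
y = [rand(Normal(a * xi + b, σ)) for xi in x]

(mean(y), var(y), cor(x, y))
```

Each pair `(x[i], y[i])` is a draw from the joint distribution. The `y` values on their own are draws from the marginal distribution of \(y\), whose mean is \(b\) and whose variance is \(a^2 + \sigma^2\) under these assumptions.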