Lecture
2023-08-30
Today
PDFs and CDFs
Joint, marginal, and conditional distributions
Example: linear regression
Example: negative binomial as a mixture
Wrapup
If \(F_X\) is the cumulative distribution function (CDF) of \(X\) and \(f_X\) is the probability density function (PDF) of \(X\), then: \[ F_X(x) = \int_{-\infty}^x f_X(u) \, du, \] and (if \(f_X\) is continuous at \(x\), which it typically will be) \[ f_X(x) = \frac{d}{dx} F_X(x). \] A useful property is \[ \Pr[a \leq X \leq b] = \int_a^b f_X(x) \, dx \]
Important
We can only talk about the probability that \(X\) falls in some interval \([a, b]\), which is given by the integral of the PDF over that interval. The probability that \(X\) takes on any particular value \(x^*\), written \(\Pr[X = x^*]\), is zero.
Simple example to illustrate that \[ F_X(2) = \int_{-\infty}^2 f_X(u) \, du \]
We will use a standard Normal distribution as an example
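A minimal sketch of this check, assuming a hand-rolled quad_trap helper with the signature described below (function, lower bound, upper bound, number of points); the lecture's actual helper and its exact bounds and resolution may differ, so this sketch will give similar but not bit-identical digits. The pair printed below is (trapezoidal approximation, exact CDF value):

```julia
using Distributions

# Trapezoidal approximation of ∫_a^b f(x) dx using n equally spaced points.
function quad_trap(f, a, b, n)
    xs = range(a, b; length=n)
    h = step(xs)
    return h * (sum(f, xs) - (f(a) + f(b)) / 2)
end

d = Normal(0, 1)                                  # standard Normal
approx = quad_trap(x -> pdf(d, x), -10, 2, 1000)  # ∫_{-10}^{2} f_X(u) du (bounds assumed)
exact  = cdf(d, 2)                                # F_X(2), closed form
(approx, exact)
```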
(0.9771562639858903, 0.9772498680518208)
pdf(d, x) tells us the probability density function of distribution d evaluated at x.
quad_trap is a trapezoidal approximation of the integral, with arguments: function, lower bound, upper bound, and number of points.
In the Distributions package, both PDFs and PMFs are called pdf.
Joint, marginal, and conditional distributions
The joint density factors as \[ p(\theta, y) = p(\theta) p(y | \theta) \] and thus \[ p(\theta | y) = \frac{p(\theta, y)}{p(y)} = \frac{p(\theta) p(y | \theta)}{p(y)} \] More generally, since \(p(y)\) does not depend on \(\theta\): \[ p(\theta | y) \propto p(\theta) p(y | \theta) \]
Each notion of probability for events has a density analogue:
Probability of event \(A\): \(\Pr(A)\); the marginal probability density function: \(p(\theta)\) or \(p(y)\)
Probability of events \(A\) and \(B\): \(\Pr(A \& B)\); the joint probability density function: \(p(\theta, y)\)
Probability of event \(A\) given event \(B\): \(\Pr(A | B)\); the conditional probability density function: \(p(\theta | y)\) or \(p(y | \theta)\)
A gambler presents you with an even-money wager. You will roll two dice, and if the highest number showing is one, two, three or four, then you win. If the highest number on either die is five or six, then she wins. Should you take the bet?
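The two dice are independent, so \(\Pr[\max \le 4] = (4/6)^2 = 4/9 \approx 0.444 < 1/2\): you should decline. A quick simulation sketch to confirm (the sample size and seed are arbitrary):

```julia
using Random

Random.seed!(1)                                     # arbitrary seed for reproducibility
n = 10^6
wins = count(_ -> maximum(rand(1:6, 2)) <= 4, 1:n)  # you win iff the highest die is ≤ 4
wins / n                                            # ≈ 4/9 ≈ 0.444: an unfavorable bet
```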
Example: linear regression
Standard linear regression model; for simplicity, assume \(x \in \mathbb{R}\) (one predictor): \[ y_i = ax_i + b + \epsilon_i \] where \(\epsilon_i \sim N(0, \sigma^2)\).
The conditional probability density of \(y_i\) given \(x_i\) is \[ p(y_i | x_i, a, b, \sigma) = N(ax_i + b, \sigma^2) \] which is shorthand for writing out the full equation for the Normal PDF. We can (and often will) write this as \[ y_i \sim N(ax_i + b, \sigma^2) \] Finally, we will sometimes write \(p(y_i | x_i)\) as shorthand for \(p(y_i | x_i, a, b, \sigma)\). This is fine in many circumstances, but we should take care to be extremely clear about which parameters we are conditioning on.
The marginal probability density of \(y_i\) is \[ p(y_i | a, b, \sigma) = \int p(y_i | x_i, a, b, \sigma) p(x_i) \, dx_i \] where \(p(x_i)\) is the probability density of \(x_i\).
The joint probability density of \(y_i\) and \(x_i\) is \[ p(y_i, x_i | a, b, \sigma) = p(y_i | x_i, a, b, \sigma) p(x_i) \] where \(p(x_i)\) is the probability density of \(x_i\).
If \(x=2\), we can simulate from the conditional distribution of \(y\):
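A minimal sketch of that conditional simulation (the values of a, b, and σ are illustrative assumptions, not from the lecture):

```julia
using Distributions

a, b, σ = 2.0, 1.0, 0.5                # assumed parameter values for illustration
x = 2.0
y = rand(Normal(a * x + b, σ), 1_000)  # draws from p(y | x = 2, a, b, σ)
```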
If \(x \sim N(0, 1)\), then we can simulate from the joint distribution of \(x\) and \(y\):
x = rand(Normal(0, 1), 1_000)       # draw x from its marginal distribution
y = rand.(Normal.(a .* x .+ b, σ))  # then y given x; the pairs (x, y) are joint draws
This broadcast syntax is more compact than writing an explicit loop, but it is easy to read, and the results are the same. Finally, assuming the same distribution for \(x\), we can simulate from the marginal distribution of \(y\):
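A sketch (reusing a, b, and σ from above): marginal draws of \(y\) are just the \(y\)-components of joint draws, with \(x\) discarded. The simulation itself performs the integral \(\int p(y | x)\, p(x)\, dx\).

```julia
x = rand(Normal(0, 1), 10_000)      # x_i ~ N(0, 1)
y = rand.(Normal.(a .* x .+ b, σ))  # keep only y: draws from p(y | a, b, σ)
```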
Example: negative binomial as a mixture
The Negative Binomial distribution (see last lecture) can be interpreted as a Gamma-Poisson mixture:
\[ \begin{align} y &\sim \textrm{Poisson}(\lambda) \\ \lambda &\sim \textrm{Gamma}\left(r, \frac{p}{1-p} \right) \end{align} \]
We can show mathematically that \(y \sim \textrm{Negative Binomial}(r, p)\) is equivalent to the mixture model \(y \sim \textrm{Poisson}(\lambda)\) with \(\lambda \sim \textrm{Gamma}(r, p/(1-p))\): \[ \begin{align} & \int_0^{\infty} f_{\text{Poisson}(\lambda)}(y) \times f_{\operatorname{Gamma}\left(r, \frac{p}{1-p}\right)}(\lambda) \,\mathrm{d}\lambda \\ &= \int_0^{\infty} \frac{\lambda^y}{y!} e^{-\lambda} \times \frac{1}{\Gamma(r)}\left(\frac{p}{1-p} \lambda\right)^{r-1} e^{-\frac{p}{1-p} \lambda}\left(\frac{p}{1-p} \,\mathrm{d}\lambda\right) \\ &\;\;\vdots \\ &= f_{\text{Negative Binomial}(r, p)}(y) \end{align} \] For all the steps, see Wikipedia.
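We can also check the identity numerically for a particular \(y\) (a sketch; the values of r, p, and y, and the truncation of the integral at \(\lambda = 100\), are assumptions), reusing the quad_trap helper from earlier. Note that Distributions.jl's Gamma takes a scale parameter, so the rate \(p/(1-p)\) becomes the scale \((1-p)/p\):

```julia
r, p, y = 5.0, 0.4, 3
integrand(λ) = pdf(Poisson(λ), y) * pdf(Gamma(r, (1 - p) / p), λ)  # Gamma scale = 1/rate
lhs = quad_trap(integrand, 0, 100, 100_000)  # ∫₀^∞ ... dλ, truncated at λ = 100
rhs = pdf(NegativeBinomial(r, p), y)
(lhs, rhs)                                   # the two should agree closely
```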
We can see this with simulation. First we define a function to simulate from the Gamma-Poisson mixture:
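The function body isn't reproduced in these notes; here is a sketch consistent with the mixture above (again converting the rate \(p/(1-p)\) to the scale \((1-p)/p\) for Distributions.jl):

```julia
using Distributions

# n draws from the Gamma-Poisson mixture:
# λ ~ Gamma(r, rate = p / (1 - p)), then y | λ ~ Poisson(λ)
function gamma_poisson(r, p, n)
    λ = rand(Gamma(r, (1 - p) / p), n)  # scale = 1 / rate
    return rand.(Poisson.(λ))
end
```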
gamma_poisson (generic function with 1 method)
Then we can simulate from the mixture and compare to the Negative Binomial distribution:
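A sketch of the comparison (r, p, and the sample size are assumptions): the mixture draws and direct Negative Binomial draws should match in distribution, for example in their means and variances up to Monte Carlo error:

```julia
using Statistics

r, p = 5.0, 0.4                          # illustrative values (assumed)
n = 100_000
y_mix = gamma_poisson(r, p, n)
y_nb  = rand(NegativeBinomial(r, p), n)

# Both means should be ≈ r(1 - p)/p and both variances ≈ r(1 - p)/p².
(mean(y_mix), mean(y_nb)), (var(y_mix), var(y_nb))
```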
I don’t need you to know all the details of this particular mixture model. What I do want you to understand is the general idea: a marginal distribution (here, the Negative Binomial) can arise by integrating a conditional distribution (the Poisson) over a distribution for its parameter (the Gamma).
Wrapup