Package 'distributional'

Title: Vectorised Probability Distributions
Description: Vectorised distribution objects with tools for manipulating, visualising, and using probability distributions. Designed to allow model prediction outputs to return distributions rather than their parameters, allowing users to directly interact with predictive distributions in a data-oriented workflow. In addition to providing generic replacements for p/d/q/r functions, other useful statistics can be computed including means, variances, intervals, and highest density regions.
Authors: Mitchell O'Hara-Wild [aut, cre] (ORCID: <https://orcid.org/0000-0001-6729-7695>), Matthew Kay [aut] (ORCID: <https://orcid.org/0000-0001-9446-0419>), Alex Hayes [aut] (ORCID: <https://orcid.org/0000-0002-4985-5160>), Rob Hyndman [aut] (ORCID: <https://orcid.org/0000-0002-2140-5352>), Earo Wang [ctb] (ORCID: <https://orcid.org/0000-0001-6448-5260>), Vencislav Popov [ctb] (ORCID: <https://orcid.org/0000-0002-8073-4199>)
Maintainer: Mitchell O'Hara-Wild <[email protected]>
License: GPL-3
Version: 0.7.0.9000
Built: 2026-05-27 14:38:33 UTC
Source: https://github.com/mitchelloharawild/distributional


The cumulative distribution function

Description

[Stable]

Usage

cdf(x, q, ..., log = FALSE)

## S3 method for class 'distribution'
cdf(x, q, ...)

Arguments

x

The distribution(s).

q

The quantile at which the cdf is calculated.

...

Additional arguments passed to methods.

log

If TRUE, probabilities will be given as log probabilities.


Covariance

Description

[Stable]

A generic function for computing the covariance of an object.

Usage

covariance(x, ...)

Arguments

x

An object.

...

Additional arguments used by methods.

See Also

covariance.distribution(), variance()


Covariance of a probability distribution

Description

[Stable]

Returns the empirical covariance of the probability distribution. If the method does not exist, the covariance of a random sample will be returned.

Usage

## S3 method for class 'distribution'
covariance(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.


The probability density/mass function

Description

[Stable]

Computes the probability density function for a continuous distribution, or the probability mass function for a discrete distribution.

Usage

## S3 method for class 'distribution'
density(x, at, ..., log = FALSE)

Arguments

x

The distribution(s).

at

The point at which to compute the density/mass.

...

Additional arguments passed to methods.

log

If TRUE, probabilities will be given as log probabilities.


The Bernoulli distribution

Description

[Stable]

Bernoulli distributions are used to represent events like coin flips when there is a single trial that is either successful or unsuccessful. The Bernoulli distribution is a special case of the Binomial() distribution with n = 1.

Usage

dist_bernoulli(prob)

Arguments

prob

The probability of success on each trial. prob can be any value in [0, 1].

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_bernoulli.html

In the following, let X be a Bernoulli random variable with parameter prob = p. Some textbooks also define q = 1 - p, or use \pi instead of p.

The Bernoulli probability distribution is widely used to model binary variables, such as 'failure' and 'success'. The most typical example is the flip of a coin, where p is thought of as the probability of flipping a head, and q = 1 - p is the probability of flipping a tail.

Support: \{0, 1\}

Mean: p

Variance: p \cdot (1 - p) = p \cdot q

Probability mass function (p.m.f):

P(X = x) = p^x (1 - p)^{1-x} = p^x q^{1-x}

Cumulative distribution function (c.d.f):

P(X \le x) = \left \{ \begin{array}{ll} 0 & x < 0 \\ 1 - p & 0 \leq x < 1 \\ 1 & x \geq 1 \end{array} \right.

Moment generating function (m.g.f):

E(e^{tX}) = (1 - p) + p e^t

Skewness:

\frac{1 - 2p}{\sqrt{p(1-p)}} = \frac{q - p}{\sqrt{pq}}

Excess Kurtosis:

\frac{1 - 6p(1-p)}{p(1-p)} = \frac{1 - 6pq}{pq}
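The moments above can be checked numerically. The following is a minimal sketch that uses only base R (stats::dbinom() and stats::pbinom() with size = 1) rather than this package's methods; the variable names are illustrative.

```r
# Bernoulli(p) treated as a Binomial with a single trial (base R only).
p <- 0.3
x <- c(0, 1)

pmf         <- dbinom(x, size = 1, prob = p)   # P(X = 0), P(X = 1)
pmf_formula <- p^x * (1 - p)^(1 - x)           # p^x (1 - p)^(1 - x)

mu  <- sum(x * pmf)                            # mean: p
v   <- sum((x - mu)^2 * pmf)                   # variance: p (1 - p)
skw <- sum((x - mu)^3 * pmf) / v^1.5           # (1 - 2p) / sqrt(p (1 - p))
exk <- sum((x - mu)^4 * pmf) / v^2 - 3         # (1 - 6p(1 - p)) / (p (1 - p))

# The c.d.f. is a step function: 1 - p anywhere in [0, 1).
cdf_half <- pbinom(0.5, size = 1, prob = p)
```

Each computed quantity should agree with the corresponding closed form to floating point precision.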

See Also

stats::Binomial

Examples

dist <- dist_bernoulli(prob = c(0.05, 0.5, 0.3, 0.9, 0.1))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Beta distribution

Description

[Stable]

The Beta distribution is a continuous probability distribution defined on the interval [0, 1], commonly used to model probabilities and proportions.

Usage

dist_beta(shape1, shape2)

Arguments

shape1, shape2

The non-negative shape parameters of the Beta distribution.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_beta.html

In the following, let X be a Beta random variable with parameters shape1 = \alpha and shape2 = \beta.

Support: x \in [0, 1]

Mean: \frac{\alpha}{\alpha + \beta}

Variance: \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}

Probability density function (p.d.f):

f(x) = \frac{x^{\alpha - 1}(1-x)^{\beta - 1}}{B(\alpha, \beta)} = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha - 1}(1-x)^{\beta - 1}

where B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)} is the Beta function.

Cumulative distribution function (c.d.f):

F(x) = I_x(\alpha, \beta) = \frac{B(x; \alpha, \beta)}{B(\alpha, \beta)}

where I_x(\alpha, \beta) is the regularized incomplete beta function and B(x; \alpha, \beta) = \int_0^x t^{\alpha-1}(1-t)^{\beta-1} dt.

Moment generating function (m.g.f):

The moment generating function does not have a simple closed form, but the moments can be calculated as:

E(X^k) = \prod_{r=0}^{k-1} \frac{\alpha + r}{\alpha + \beta + r}
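The product formula for the raw moments can be compared against numerical integration of the density. This is a sketch in base R; beta_moment() and numeric_moment() are hypothetical helper names, not functions from this package.

```r
# Raw moments of Beta(a, b): product formula vs. numerical integration.
a <- 2
b <- 5

beta_moment <- function(k, a, b) {
  r <- seq_len(k) - 1                      # r = 0, ..., k - 1
  prod((a + r) / (a + b + r))
}

numeric_moment <- function(k, a, b) {
  integrate(function(x) x^k * dbeta(x, a, b), lower = 0, upper = 1)$value
}

m1 <- beta_moment(1, a, b)                 # mean: a / (a + b)
m2 <- beta_moment(2, a, b)
v  <- m2 - m1^2                            # variance from the first two moments
```

The variance obtained from the first two moments should match the closed-form variance given earlier.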

See Also

stats::Beta

Examples

dist <- dist_beta(shape1 = c(0.5, 5, 1, 2, 2), shape2 = c(0.5, 1, 3, 2, 5))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Binomial distribution

Description

[Stable]

Binomial distributions are used to represent situations that can be thought of as the result of n Bernoulli experiments (here n is defined as the size of the experiment). The classical example is n independent coin flips, where each coin flip has probability p of success. In this case, the individual probability of flipping heads or tails is given by the Bernoulli(p) distribution, and the probability of having x equal results (x heads, for example) in n trials is given by the Binomial(n, p) distribution. The equation of the Binomial distribution is directly derived from the equation of the Bernoulli distribution.

Usage

dist_binomial(size, prob)

Arguments

size

The number of trials. Must be an integer greater than or equal to one. When size = 1L, the Binomial distribution reduces to the Bernoulli distribution. Often called n in textbooks.

prob

The probability of success on each trial. prob can be any value in [0, 1].

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_binomial.html

The Binomial distribution comes up when you are interested in the proportion of people who do a thing. The Binomial distribution also comes up in the sign test, sometimes called the Binomial test (see stats::binom.test()), where you may need the Binomial C.D.F. to compute p-values.

In the following, let X be a Binomial random variable with parameters size = n and prob = p. Some textbooks define q = 1 - p, or use \pi instead of p.

Support: \{0, 1, 2, ..., n\}

Mean: np

Variance: np \cdot (1 - p) = np \cdot q

Probability mass function (p.m.f):

P(X = k) = {n \choose k} p^k (1 - p)^{n-k}

Cumulative distribution function (c.d.f):

P(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} {n \choose i} p^i (1 - p)^{n-i}

Moment generating function (m.g.f):

E(e^{tX}) = (1 - p + p e^t)^n

Skewness:

\frac{1 - 2p}{\sqrt{np(1-p)}}

Excess kurtosis:

\frac{1 - 6p(1-p)}{np(1-p)}
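The c.d.f. sum, including the floor applied to a non-integer k, can be reproduced directly. This is a small sketch in base R with illustrative values, compared against stats::pbinom().

```r
# Binomial c.d.f. as a finite sum of p.m.f. terms up to floor(k).
n <- 10
p <- 0.3
k <- 4.7                                   # non-integer k is floored

i <- 0:floor(k)
cdf_sum <- sum(choose(n, i) * p^i * (1 - p)^(n - i))
cdf_ref <- pbinom(k, size = n, prob = p)   # base R reference value

# With size = 1 the p.m.f. reduces to the Bernoulli case: (1 - p, p).
bern <- dbinom(c(0, 1), size = 1, prob = p)
```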

See Also

stats::Binomial

Examples

dist <- dist_binomial(size = 1:5, prob = c(0.05, 0.5, 0.3, 0.9, 0.1))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Burr distribution

Description

[Stable]

The Burr distribution (Type XII) is a flexible continuous probability distribution often used for modeling income distributions, reliability data, and failure times.

Usage

dist_burr(shape1, shape2, rate = 1, scale = 1/rate)

Arguments

shape1, shape2, scale

parameters. Must be strictly positive.

rate

an alternative way to specify the scale.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_burr.html

In the following, let X be a Burr random variable with parameters shape1 = \alpha, shape2 = \gamma, and rate = \lambda.

Support: x \in (0, \infty)

Mean: \frac{\lambda^{-1/\alpha} \gamma B(\gamma - 1/\alpha, 1 + 1/\alpha)}{\gamma} (for \alpha \gamma > 1)

Variance: \frac{\lambda^{-2/\alpha} \gamma B(\gamma - 2/\alpha, 1 + 2/\alpha)}{\gamma} - \mu^2 (for \alpha \gamma > 2)

Probability density function (p.d.f):

f(x) = \alpha \gamma \lambda x^{\alpha - 1} (1 + \lambda x^\alpha)^{-\gamma - 1}

Cumulative distribution function (c.d.f):

F(x) = 1 - (1 + \lambda x^\alpha)^{-\gamma}

Quantile function:

F^{-1}(p) = \lambda^{-1/\alpha} ((1 - p)^{-1/\gamma} - 1)^{1/\alpha}
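The three formulas above are mutually consistent: the quantile function inverts the c.d.f., and the p.d.f. integrates to the c.d.f. The sketch below transcribes them as written into hypothetical helper functions (burr_pdf(), burr_cdf(), burr_quantile()); it is not the actuar or distributional implementation, whose rate and scale conventions may differ from the \lambda used here.

```r
# Burr (Type XII) formulas transcribed literally; lambda is used exactly
# as it appears in the formulas above (an assumption of this sketch).
burr_pdf <- function(x, shape1, shape2, lambda = 1) {
  shape1 * shape2 * lambda * x^(shape1 - 1) *
    (1 + lambda * x^shape1)^(-shape2 - 1)
}
burr_cdf <- function(x, shape1, shape2, lambda = 1) {
  1 - (1 + lambda * x^shape1)^(-shape2)
}
burr_quantile <- function(p, shape1, shape2, lambda = 1) {
  lambda^(-1 / shape1) * ((1 - p)^(-1 / shape2) - 1)^(1 / shape1)
}

p <- c(0.1, 0.5, 0.9)
round_trip <- burr_cdf(burr_quantile(p, 2, 3, 1.5), 2, 3, 1.5)

# Integrating the density from 0 to 1 should reproduce F(1).
area <- integrate(function(x) burr_pdf(x, 2, 3, 1.5), lower = 0, upper = 1)$value
```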

Moment generating function (m.g.f):

Does not exist in closed form.

See Also

actuar::Burr

Examples

dist <- dist_burr(shape1 = c(1,1,1,2,3,0.5), shape2 = c(1,2,3,1,1,2))
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Categorical distribution

Description

[Stable]

Categorical distributions are used to represent events with multiple outcomes, such as what number appears on the roll of a die. This is also referred to as the 'generalised Bernoulli' or 'multinoulli' distribution. The Categorical distribution is a special case of the Multinomial() distribution with n = 1.

Usage

dist_categorical(prob, outcomes = NULL)

Arguments

prob

A list of probabilities of observing each outcome category.

outcomes

The list of vectors where each value represents each outcome.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_categorical.html

In the following, let X be a Categorical random variable with probability parameters prob = \{p_1, p_2, \ldots, p_k\}.

The Categorical probability distribution is widely used to model the occurrence of multiple events. A simple example is the roll of a die, where p = \{1/6, 1/6, 1/6, 1/6, 1/6, 1/6\} gives an equal chance of observing each number on a six-sided die.

Support: \{1, \ldots, k\}

Mean: Not defined for unordered categories. For ordered categories with integer outcomes \{1, 2, \ldots, k\}, the mean is:

E(X) = \sum_{i=1}^{k} i \cdot p_i

Variance: Not defined for unordered categories. For ordered categories with integer outcomes \{1, 2, \ldots, k\}, the variance is:

\text{Var}(X) = \sum_{i=1}^{k} i^2 \cdot p_i - \left(\sum_{i=1}^{k} i \cdot p_i\right)^2

Probability mass function (p.m.f):

P(X = i) = p_i

Cumulative distribution function (c.d.f):

The c.d.f is undefined for unordered categories. For ordered categories with outcomes x_1 < x_2 < \ldots < x_k, the c.d.f is:

P(X \le x_j) = \sum_{i=1}^{j} p_i

Moment generating function (m.g.f):

E(e^{tX}) = \sum_{i=1}^{k} e^{tx_i} \cdot p_i

Skewness: Approximated numerically for ordered categories.

Kurtosis: Approximated numerically for ordered categories.
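For ordered integer outcomes 1, ..., k, the mean, variance, and c.d.f. are simple weighted sums. The sketch below works them out in base R for one of the probability vectors used in the Examples; the names are illustrative and nothing here calls this package.

```r
# Categorical distribution on the integer outcomes 1..k (base R only).
p <- c(0.05, 0.5, 0.15, 0.2, 0.1)
i <- seq_along(p)

mu  <- sum(i * p)                          # E(X) = sum of i * p_i
v   <- sum(i^2 * p) - mu^2                 # Var(X) = E(X^2) - E(X)^2
cdf <- cumsum(p)                           # P(X <= j) for j = 1..k

# m.g.f. for integer outcomes; its derivative at t = 0 is the mean.
mgf <- function(t) sum(exp(t * i) * p)
h   <- 1e-5
mgf_slope <- (mgf(h) - mgf(-h)) / (2 * h)  # central difference at t = 0
```

Here mu is 2.8 and v is 1.26, and the final entry of cdf is 1 because the probabilities sum to one.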

See Also

stats::Multinomial

Examples

dist <- dist_categorical(prob = list(c(0.05, 0.5, 0.15, 0.2, 0.1), c(0.3, 0.1, 0.6)))

dist

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

# The outcomes aren't ordered, so many statistics are not applicable.
cdf(dist, 0.6)
quantile(dist, 0.7)
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

# Some of these statistics are meaningful for ordered outcomes
dist <- dist_categorical(list(rpois(26, 3)), list(ordered(letters)))
dist
cdf(dist, "m")
quantile(dist, 0.5)

dist <- dist_categorical(
  prob = list(c(0.05, 0.5, 0.15, 0.2, 0.1), c(0.3, 0.1, 0.6)),
  outcomes = list(letters[1:5], letters[24:26])
)

generate(dist, 10)

density(dist, "a")
density(dist, "z", log = TRUE)

The Cauchy distribution

Description

[Stable]

The Cauchy distribution is the Student's t distribution with one degree of freedom. The Cauchy distribution does not have a well defined mean or variance. Cauchy distributions often appear as priors in Bayesian contexts due to their heavy tails.

Usage

dist_cauchy(location, scale)

Arguments

location, scale

location and scale parameters.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_cauchy.html

In the following, let X be a Cauchy variable with location = x_0 and scale = \gamma.

Support: R, the set of all real numbers

Mean: Undefined.

Variance: Undefined.

Probability density function (p.d.f):

f(x) = \frac{1}{\pi \gamma \left[1 + \left(\frac{x - x_0}{\gamma} \right)^2 \right]}

Cumulative distribution function (c.d.f):

F(t) = \frac{1}{\pi} \arctan \left( \frac{t - x_0}{\gamma} \right) + \frac{1}{2}

Moment generating function (m.g.f):

Does not exist.
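The density and c.d.f. above can be transcribed and compared with base R's stats::dcauchy() and stats::pcauchy(). In this sketch the helper names are hypothetical, and the quantile function is obtained by inverting the c.d.f. rather than taken from the text above.

```r
# Cauchy(location = x0, scale = g): closed forms vs. base R.
x0 <- -2
g  <- 1.5

cauchy_pdf <- function(x) 1 / (pi * g * (1 + ((x - x0) / g)^2))
cauchy_cdf <- function(t) atan((t - x0) / g) / pi + 0.5
# Inverting the c.d.f. gives the quantile function (derived here).
cauchy_quantile <- function(p) x0 + g * tan(pi * (p - 0.5))

x <- c(-5, -2, 0, 3)
p <- c(0.1, 0.5, 0.7)
```

The median equals the location x0 even though the mean is undefined.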

See Also

stats::Cauchy

Examples

dist <- dist_cauchy(location = c(0, 0, 0, -2), scale = c(0.5, 1, 2, 1))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The (non-central) Chi-Squared Distribution

Description

[Stable]

Chi-square distributions show up often in frequentist settings as the sampling distribution of test statistics, especially in maximum likelihood estimation settings.

Usage

dist_chisq(df, ncp = 0)

Arguments

df

Degrees of freedom. Can be any positive real number.

ncp

Non-centrality parameter. Can be any non-negative real number. Defaults to 0 (central chi-squared distribution).

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_chisq.html

In the following, let X be a \chi^2 random variable with df = k and ncp = \lambda.

Support: R^+, the set of positive real numbers

Mean: k + \lambda

Variance: 2(k + 2\lambda)

Probability density function (p.d.f):

For the central chi-squared distribution (\lambda = 0):

f(x) = \frac{1}{2^{k/2} \Gamma(k/2)} x^{k/2 - 1} e^{-x/2}

For the non-central chi-squared distribution (\lambda > 0):

f(x) = \frac{1}{2} e^{-(x+\lambda)/2} \left(\frac{x}{\lambda}\right)^{k/4-1/2} I_{k/2-1}\left(\sqrt{\lambda x}\right)

where I_\nu(z) is the modified Bessel function of the first kind.

Cumulative distribution function (c.d.f):

For the central chi-squared distribution (\lambda = 0):

F(x) = \frac{\gamma(k/2, x/2)}{\Gamma(k/2)} = P(k/2, x/2)

where \gamma(s, x) is the lower incomplete gamma function and P(s, x) is the regularized gamma function.

For the non-central chi-squared distribution (\lambda > 0):

F(x) = \sum_{j=0}^{\infty} \frac{e^{-\lambda/2} (\lambda/2)^j}{j!} P(k/2 + j, x/2)

This is approximated numerically.

Moment generating function (m.g.f):

For the central chi-squared distribution (\lambda = 0):

E(e^{tX}) = (1 - 2t)^{-k/2}, \quad t < 1/2

For the non-central chi-squared distribution (\lambda > 0):

E(e^{tX}) = \frac{e^{\lambda t / (1 - 2t)}}{(1 - 2t)^{k/2}}, \quad t < 1/2

Skewness:

\gamma_1 = \frac{2^{3/2}(k + 3\lambda)}{(k + 2\lambda)^{3/2}}

For the central case (\lambda = 0), this simplifies to \sqrt{8/k}.

Excess Kurtosis:

\gamma_2 = \frac{12(k + 4\lambda)}{(k + 2\lambda)^2}

For the central case (\lambda = 0), this simplifies to 12/k.
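The non-central c.d.f. is a Poisson-weighted mixture of central c.d.f. terms, which can be illustrated by truncating the series. This sketch uses base R only: stats::dpois() supplies the weights and stats::pgamma() the regularized gamma function P(s, x). The truncation at 100 terms is an arbitrary choice for this example, not how the package or R evaluates it.

```r
# Non-central chi-squared c.d.f. as a truncated Poisson mixture.
k      <- 3                                # degrees of freedom
lambda <- 2                                # non-centrality parameter
x      <- 4

j <- 0:100                                 # truncation point (assumption)
w <- dpois(j, lambda / 2)                  # e^{-lambda/2} (lambda/2)^j / j!
cdf_series <- sum(w * pgamma(x / 2, shape = k / 2 + j))  # P(k/2 + j, x/2)
cdf_ref    <- pchisq(x, df = k, ncp = lambda)

# The mean k + lambda, checked by integrating x * f(x).
mean_num <- integrate(function(x) x * dchisq(x, df = k, ncp = lambda),
                      lower = 0, upper = Inf)$value
```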

See Also

stats::Chisquare

Examples

dist <- dist_chisq(df = c(1,2,3,4,6,9))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The degenerate distribution

Description

[Stable]

The degenerate distribution takes a single value which is certain to be observed. It takes a single parameter, which is the value that is observed by the distribution.

Usage

dist_degenerate(x)

Arguments

x

The value of the distribution (location parameter). Can be any real number.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_degenerate.html

In the following, let X be a degenerate random variable with value x = k_0.

Support: \{k_0\}, a single point

Mean: \mu = k_0

Variance: \sigma^2 = 0

Probability density function (p.d.f):

f(x) = 1 \textrm{ for } x = k_0

f(x) = 0 \textrm{ for } x \neq k_0

Cumulative distribution function (c.d.f):

F(t) = 0 \textrm{ for } t < k_0

F(t) = 1 \textrm{ for } t \ge k_0

Moment generating function (m.g.f):

E(e^{tX}) = e^{k_0 t}

Skewness: Undefined (NA)

Excess Kurtosis: Undefined (NA)

See Also

stats::Distributions

Examples

dist <- dist_degenerate(x = 1:5)

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Dirichlet distribution

Description

[Stable]

The Dirichlet distribution is a multivariate generalisation of the Beta distribution. It is the conjugate prior of the Categorical and Multinomial distributions, and describes a probability distribution over the (k-1)-simplex, the set of k-dimensional vectors whose components are non-negative and sum to one.

Usage

dist_dirichlet(alpha)

Arguments

alpha

A list of positive numeric concentration vectors.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_dirichlet.html

In the following, let \mathbf{X} = (X_1, \ldots, X_k) be a Dirichlet random variable with concentration parameter alpha = \boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_k), where each \alpha_i > 0.

Support: \mathbf{x} on the (k-1)-simplex, i.e. x_i \geq 0 and \sum_{i=1}^k x_i = 1.

Mean: E(X_i) = \frac{\alpha_i}{\alpha_0} where \alpha_0 = \sum_{i=1}^k \alpha_i.

Variance:

\mathrm{Var}(X_i) = \frac{\alpha_i(\alpha_0 - \alpha_i)}{\alpha_0^2(\alpha_0 + 1)}

Covariance:

\mathrm{Cov}(X_i, X_j) = \frac{-\alpha_i \alpha_j}{\alpha_0^2(\alpha_0 + 1)}, \quad i \neq j

Probability density function (p.d.f):

f(\mathbf{x}) = \frac{1}{B(\boldsymbol{\alpha})} \prod_{i=1}^k x_i^{\alpha_i - 1}

where B(\boldsymbol{\alpha}) = \frac{\prod_{i=1}^k \Gamma(\alpha_i)}{\Gamma(\alpha_0)} is the multivariate Beta function.
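The variance and covariance formulas combine into a single covariance matrix whose rows sum to zero, because the components always sum to one. The sketch below builds that matrix and the density in base R; dirichlet_density() is a hypothetical helper, not the package's or LaplacesDemon's implementation.

```r
# Dirichlet moments and density from the concentration vector (base R only).
alpha <- c(2, 5, 3)
a0    <- sum(alpha)

mu <- alpha / a0                           # E(X_i) = alpha_i / alpha_0
# Diagonal: alpha_i (a0 - alpha_i); off-diagonal: -alpha_i alpha_j.
Sigma <- (diag(alpha) * a0 - outer(alpha, alpha)) / (a0^2 * (a0 + 1))

dirichlet_density <- function(x, alpha) {
  # log B(alpha) = sum(lgamma(alpha_i)) - lgamma(alpha_0)
  log_B <- sum(lgamma(alpha)) - lgamma(sum(alpha))
  exp(sum((alpha - 1) * log(x)) - log_B)
}

# With alpha = (1, 1, 1) the density is flat on the simplex.
flat <- dirichlet_density(c(0.2, 0.5, 0.3), c(1, 1, 1))
```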

See Also

LaplacesDemon::ddirichlet(), LaplacesDemon::rdirichlet()

Examples

dist <- dist_dirichlet(alpha = list(c(2, 5, 3)))
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, cbind(0.2, 0.5, 0.3))
density(dist, cbind(0.2, 0.5, 0.3), log = TRUE)

The Exponential Distribution

Description

[Stable]

Exponential distributions are frequently used to model waiting times and the time between events in a Poisson process.

Usage

dist_exponential(rate)

Arguments

rate

vector of rates.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_exponential.html

In the following, let X be an Exponential random variable with parameter rate = \lambda.

Support: x \in [0, \infty)

Mean: \frac{1}{\lambda}

Variance: \frac{1}{\lambda^2}

Probability density function (p.d.f):

f(x) = \lambda e^{-\lambda x}

Cumulative distribution function (c.d.f):

F(x) = 1 - e^{-\lambda x}

Moment generating function (m.g.f):

E(e^{tX}) = \frac{\lambda}{\lambda - t}, \quad t < \lambda
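The m.g.f. is only finite for t below the rate, and inside that range it can be confirmed by integrating e^{tx} against the density. This is a minimal sketch in base R with illustrative values.

```r
# Exponential(rate): closed-form c.d.f. and m.g.f. vs. base R.
rate <- 2 / 3
t    <- 0.25                               # must satisfy t < rate

x <- c(0.5, 1, 4)
cdf_formula <- 1 - exp(-rate * x)
cdf_ref     <- pexp(x, rate = rate)

mgf_formula <- rate / (rate - t)
mgf_num <- integrate(function(x) exp(t * x) * dexp(x, rate = rate),
                     lower = 0, upper = Inf)$value
```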

See Also

stats::Exponential

Examples

dist <- dist_exponential(rate = c(2, 1, 2/3))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The F Distribution

Description

[Stable]

The F distribution is commonly used in statistical inference, particularly in the analysis of variance (ANOVA), testing the equality of variances, and in regression analysis. It arises as the ratio of two independent chi-squared random variables, each divided by its degrees of freedom.

Usage

dist_f(df1, df2, ncp = NULL)

Arguments

df1

Degrees of freedom for the numerator. Can be any positive number.

df2

Degrees of freedom for the denominator. Can be any positive number.

ncp

Non-centrality parameter. If NULL (default), the central F distribution is used. If specified, must be non-negative.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_f.html

In the following, let X be an F random variable with numerator degrees of freedom df1 = d_1 and denominator degrees of freedom df2 = d_2.

Support: x \in (0, \infty)

Mean:

For the central F distribution (ncp = NULL):

E(X) = \frac{d_2}{d_2 - 2}

for d_2 > 2, otherwise undefined.

For the non-central F distribution with non-centrality parameter ncp = \lambda:

E(X) = \frac{d_2 (d_1 + \lambda)}{d_1 (d_2 - 2)}

for d_2 > 2, otherwise undefined.

Variance:

For the central F distribution (ncp = NULL):

\text{Var}(X) = \frac{2 d_2^2 (d_1 + d_2 - 2)}{d_1 (d_2 - 2)^2 (d_2 - 4)}

for d_2 > 4, otherwise undefined.

For the non-central F distribution with non-centrality parameter ncp = \lambda:

\text{Var}(X) = \frac{2 d_2^2}{d_1^2} \cdot \frac{(d_1 + \lambda)^2 + (d_1 + 2\lambda)(d_2 - 2)}{(d_2 - 2)^2 (d_2 - 4)}

for d_2 > 4, otherwise undefined.

Skewness:

For the central F distribution (ncp = NULL):

\text{Skew}(X) = \frac{(2 d_1 + d_2 - 2) \sqrt{8 (d_2 - 4)}}{(d_2 - 6) \sqrt{d_1 (d_1 + d_2 - 2)}}

for d_2 > 6, otherwise undefined.

For the non-central F distribution, skewness has no simple closed form and is not computed.

Excess Kurtosis:

For the central F distribution (ncp = NULL):

\text{Kurt}(X) = \frac{12[d_1 (5 d_2 - 22)(d_1 + d_2 - 2) + (d_2 - 4)(d_2 - 2)^2]}{d_1 (d_2 - 6)(d_2 - 8)(d_1 + d_2 - 2)}

for d_2 > 8, otherwise undefined.

For the non-central F distribution, kurtosis has no simple closed form and is not computed.

Probability density function (p.d.f):

For the central F distribution (ncp = NULL):

f(x) = \frac{\sqrt{\frac{(d_1 x)^{d_1} d_2^{d_2}}{(d_1 x + d_2)^{d_1 + d_2}}}}{x \, B(d_1/2, d_2/2)}

where B(\cdot, \cdot) is the beta function.

For the non-central F distribution, the density involves an infinite series and is approximated numerically.

Cumulative distribution function (c.d.f):

The c.d.f. does not have a simple closed form expression and is approximated numerically using regularized incomplete beta functions and related special functions.

Moment generating function (m.g.f):

The moment generating function for the F distribution does not exist in general (it diverges for t > 0).
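The central density and the first two moments can be checked against base R. In this sketch f_pdf() is a hypothetical transcription of the central p.d.f. above, compared with stats::df(); the moments are obtained by numerical integration, so they only match the closed forms to integration accuracy.

```r
# Central F(d1, d2): density formula and moments vs. base R.
d1 <- 5
d2 <- 10

f_pdf <- function(x) {
  sqrt((d1 * x)^d1 * d2^d2 / (d1 * x + d2)^(d1 + d2)) /
    (x * beta(d1 / 2, d2 / 2))
}

x <- c(0.5, 1, 2, 4)

mean_formula <- d2 / (d2 - 2)              # requires d2 > 2
var_formula  <- 2 * d2^2 * (d1 + d2 - 2) /
  (d1 * (d2 - 2)^2 * (d2 - 4))             # requires d2 > 4

mean_num <- integrate(function(x) x * df(x, d1, d2), 0, Inf)$value
m2_num   <- integrate(function(x) x^2 * df(x, d1, d2), 0, Inf)$value
var_num  <- m2_num - mean_num^2
```

For d2 at or below the stated thresholds these integrals diverge, which is why the moments are reported as undefined there.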

See Also

stats::FDist

Examples

dist <- dist_f(df1 = c(1,2,5,10,100), df2 = c(1,1,2,1,100))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Gamma distribution

Description

[Stable]

Several important distributions are special cases of the Gamma distribution. When the shape parameter is 1, the Gamma is an exponential distribution with rate \beta (mean 1/\beta). When shape = n/2 and rate = 1/2, the Gamma is equivalent to a chi-squared distribution with n degrees of freedom. Moreover, if X_1 is Gamma(\alpha_1, \beta) and X_2 is Gamma(\alpha_2, \beta), and the two are independent, then \frac{X_1}{X_1 + X_2} is Beta(\alpha_1, \alpha_2). This last property frequently appears in other distributions, and it has been used extensively in multivariate methods. More about the Gamma distribution will be added soon.

Usage

dist_gamma(shape, rate = 1/scale, scale = 1/rate)

Arguments

shape, scale

shape and scale parameters. Must be positive, scale strictly.

rate

an alternative way to specify the scale.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_gamma.html

In the following, let X be a Gamma random variable with parameters shape = \alpha and rate = \beta.

Support: x \in (0, \infty)

Mean: \frac{\alpha}{\beta}

Variance: \frac{\alpha}{\beta^2}

Probability density function (p.d.f):

f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}

Cumulative distribution function (c.d.f):

F(x) = \frac{\gamma(\alpha, \beta x)}{\Gamma(\alpha)}

where \gamma(s, x) is the lower incomplete gamma function.

Moment generating function (m.g.f):

E(e^{tX}) = \Big(\frac{\beta}{ \beta - t}\Big)^{\alpha}, \thinspace t < \beta
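Two special cases of the Gamma can be confirmed directly with base R densities: shape 1 gives the Exponential with the same rate, and shape n/2 with rate 1/2 gives the chi-squared with n degrees of freedom. The sketch also checks the m.g.f. by numerical integration; all values are illustrative.

```r
# Gamma special cases and m.g.f. (base R only).
x <- c(0.5, 1, 2, 4)
b <- 0.5                                   # rate
n <- 6                                     # chi-squared degrees of freedom

exp_case   <- dgamma(x, shape = 1, rate = b)        # Exponential(rate = b)
chisq_case <- dgamma(x, shape = n / 2, rate = 1 / 2) # chi-squared(df = n)

a <- 3                                     # shape
t <- 0.2                                   # must satisfy t < b
mgf_formula <- (b / (b - t))^a
mgf_num <- integrate(function(x) exp(t * x) * dgamma(x, shape = a, rate = b),
                     lower = 0, upper = Inf)$value
```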

See Also

stats::GammaDist

Examples

dist <- dist_gamma(shape = c(1,2,3,5,9,7.5,0.5), rate = c(0.5,0.5,0.5,1,2,1,1))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Geometric Distribution

Description

[Stable]

The Geometric distribution can be thought of as a generalization of the dist_bernoulli() distribution where we ask: "if I keep flipping a coin with probability p of heads, what is the probability I need k flips before I get my first heads?" The Geometric distribution is a special case of the Negative Binomial distribution.

Usage

dist_geometric(prob)

Arguments

prob

probability of success in each trial. 0 < prob <= 1.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_geometric.html

In the following, let X be a Geometric random variable with success probability prob = p. Note that there are multiple parameterizations of the Geometric distribution.

Support: \{0, 1, 2, 3, ...\}

Mean: \frac{1-p}{p}

Variance: \frac{1-p}{p^2}

Probability mass function (p.m.f):

P(X = k) = p(1-p)^k

Cumulative distribution function (c.d.f):

P(X \le k) = 1 - (1-p)^{k+1}

Moment generating function (m.g.f):

E(e^{tX}) = \frac{p}{1 - (1-p)e^t}, \quad t < -\log(1-p)

Skewness:

\frac{2 - p}{\sqrt{1 - p}}

Excess Kurtosis:

6 + \frac{p^2}{1 - p}
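With the support starting at zero, X counts the failures before the first success. The sketch below checks the p.m.f., c.d.f., mean, and variance against base R's stats::dgeom() and stats::pgeom(), which use the same convention; the truncation of the infinite sums at 2000 terms is an assumption of this example.

```r
# Geometric(p) counting failures before the first success (base R only).
p <- 0.2
k <- 0:4

pmf_formula <- p * (1 - p)^k
cdf_formula <- 1 - (1 - p)^(k + 1)

# Moments from a long truncated sum; the tail beyond 2000 is negligible.
kk <- 0:2000
mu <- sum(kk * dgeom(kk, prob = p))            # (1 - p) / p
v  <- sum((kk - mu)^2 * dgeom(kk, prob = p))   # (1 - p) / p^2
```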

See Also

stats::Geometric

Examples

dist <- dist_geometric(prob = c(0.2, 0.5, 0.8))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Generalized Extreme Value Distribution

Description

[Stable]

The GEV distribution is widely used in extreme value theory to model the distribution of maxima (or minima) of samples. The parametric form encompasses the Gumbel, Frechet, and reverse Weibull distributions.

Usage

dist_gev(location, scale, shape)

Arguments

location

the location parameter \mu of the GEV distribution.

scale

the scale parameter \sigma of the GEV distribution. Must be strictly positive.

shape

the shape parameter \xi of the GEV distribution. Determines the tail behavior: \xi = 0 gives Gumbel, \xi > 0 gives Frechet, \xi < 0 gives reverse Weibull.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_gev.html

In the following, let X be a GEV random variable with parameters location = \mu, scale = \sigma, and shape = \xi.

Support:

  • x \in \mathbb{R} (all real numbers) if \xi = 0

  • x \geq \mu - \sigma/\xi if \xi > 0

  • x \leq \mu - \sigma/\xi if \xi < 0

Mean:

E(X) = \begin{cases} \mu + \sigma \gamma & \text{if } \xi = 0 \\ \mu + \sigma \frac{\Gamma(1-\xi) - 1}{\xi} & \text{if } \xi \neq 0, \xi < 1 \\ \infty & \text{if } \xi \geq 1 \end{cases}

where \gamma \approx 0.5772 is the Euler-Mascheroni constant and \Gamma(\cdot) is the gamma function.

Median:

\text{Median}(X) = \begin{cases} \mu - \sigma \log(\log 2) & \text{if } \xi = 0 \\ \mu + \sigma \frac{(\log 2)^{-\xi} - 1}{\xi} & \text{if } \xi \neq 0 \end{cases}

Variance:

\text{Var}(X) = \begin{cases} \frac{\pi^2 \sigma^2}{6} & \text{if } \xi = 0 \\ \frac{\sigma^2}{\xi^2} [\Gamma(1-2\xi) - \Gamma(1-\xi)^2] & \text{if } \xi \neq 0, \xi < 0.5 \\ \infty & \text{if } \xi \geq 0.5 \end{cases}

Probability density function (p.d.f):

For \xi = 0 (Gumbel):

f(x) = \frac{1}{\sigma} \exp\left(-\frac{x-\mu}{\sigma}\right) \exp\left[-\exp\left(-\frac{x-\mu}{\sigma}\right)\right]

For \xi \neq 0:

f(x) = \frac{1}{\sigma} \left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi-1} \exp\left\{-\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}

where 1 + \xi(x-\mu)/\sigma > 0.

Cumulative distribution function (c.d.f):

For \xi = 0 (Gumbel):

F(x) = \exp\left[-\exp\left(-\frac{x-\mu}{\sigma}\right)\right]

For \xi \neq 0:

F(x) = \exp\left\{-\left[1+\xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\}

where 1 + \xi(x-\mu)/\sigma > 0.

Quantile function:

For \xi = 0 (Gumbel):

Q(p) = \mu - \sigma \log(-\log p)

For \xi \neq 0:

Q(p) = \mu + \frac{\sigma}{\xi}\left[(-\log p)^{-\xi} - 1\right]
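The c.d.f. and quantile function are inverses of each other in all three shape regimes. The sketch below transcribes both into hypothetical helpers (gev_cdf(), gev_quantile()) in base R; it is not the evd or distributional implementation. Clamping the bracketed term at zero is a choice made here so that points outside the support return 0 or 1.

```r
# GEV c.d.f. and quantile function transcribed from the formulas above.
gev_cdf <- function(x, mu, sigma, xi) {
  z <- (x - mu) / sigma
  if (xi == 0) return(exp(-exp(-z)))       # Gumbel case
  # pmax() clamps points outside the support (assumption of this sketch).
  exp(-pmax(1 + xi * z, 0)^(-1 / xi))
}

gev_quantile <- function(p, mu, sigma, xi) {
  if (xi == 0) return(mu - sigma * log(-log(p)))
  mu + sigma / xi * ((-log(p))^(-xi) - 1)
}

p      <- c(0.05, 0.5, 0.95)
shapes <- c(0, 0.3, -0.2)                  # Gumbel, Frechet, reverse Weibull

# F(Q(p)) should return p for every shape.
round_trips <- lapply(shapes, function(xi) {
  gev_cdf(gev_quantile(p, 0, 1, xi), 0, 1, xi)
})
```

Evaluating gev_quantile(0.5, ...) reproduces the median formula, since -\log(0.5) = \log 2.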

References

Jenkinson, A. F. (1955) The frequency distribution of the annual maximum (or minimum) of meteorological elements. Quart. J. R. Met. Soc., 81, 158–171.

See Also

evd::dgev()

Examples

# Create GEV distributions with different shape parameters

# Gumbel distribution (shape = 0)
gumbel <- dist_gev(location = 0, scale = 1, shape = 0)

# Frechet distribution (shape > 0, heavy-tailed)
frechet <- dist_gev(location = 0, scale = 1, shape = 0.3)

# Reverse Weibull distribution (shape < 0, bounded above)
weibull <- dist_gev(location = 0, scale = 1, shape = -0.2)

dist <- c(gumbel, frechet, weibull)
dist

# Statistical properties
mean(dist)
median(dist)
variance(dist)

# Generate random samples
generate(dist, 10)

# Evaluate density
density(dist, 2)
density(dist, 2, log = TRUE)

# Evaluate cumulative distribution
cdf(dist, 4)

# Calculate quantiles
quantile(dist, 0.95)

The generalised g-and-h Distribution

Description

[Stable]

The generalised g-and-h distribution is a flexible distribution used to model univariate data, similar to the g-k distribution. It is known for its ability to handle skewness and heavy-tailed behavior.

Usage

dist_gh(A, B, g, h, c = 0.8)

Arguments

A

Vector of A (location) parameters.

B

Vector of B (scale) parameters. Must be positive.

g

Vector of g parameters.

h

Vector of h parameters. Must be non-negative.

c

Vector of c parameters (used for generalised g-and-h). Often fixed at 0.8 which is the default.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_gh.html

In the following, let $X$ be a g-and-h random variable with parameters A = $A$, B = $B$, g = $g$, h = $h$, and c = $c$.

Support: $(-\infty, \infty)$

Mean: Does not have a closed-form expression. Approximated numerically.

Variance: Does not have a closed-form expression. Approximated numerically.

Probability density function (p.d.f):

The g-and-h distribution does not have a closed-form expression for its density. The density is approximated numerically from the quantile function. The distribution is defined through its quantile function:

$$Q(u) = A + B \left( 1 + c \frac{1 - \exp(-gz(u))}{1 + \exp(-gz(u))} \right) \exp(h z(u)^2/2) z(u)$$

where $z(u) = \Phi^{-1}(u)$ is the standard normal quantile function.

Cumulative distribution function (c.d.f):

Does not have a closed-form expression. The cumulative distribution function is approximated numerically by inverting the quantile function.

Quantile function:

$$Q(p) = A + B \left( 1 + c \frac{1 - \exp(-g\Phi^{-1}(p))}{1 + \exp(-g\Phi^{-1}(p))} \right) \exp(h (\Phi^{-1}(p))^2/2) \Phi^{-1}(p)$$

where $\Phi^{-1}(p)$ is the standard normal quantile function.
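
One possible way to invert the quantile function numerically is root-finding with stats::uniroot(). The sketch below is only an illustration in base R: gh_quantile() and gh_cdf() are hypothetical names, the bracketing interval is an assumption, and the package may use a different scheme internally.

```r
# Hypothetical helper: the g-and-h quantile function written out in base R
gh_quantile <- function(p, A, B, g, h, c = 0.8) {
  z <- qnorm(p)
  skew <- (1 - exp(-g * z)) / (1 + exp(-g * z))
  A + B * (1 + c * skew) * exp(h * z^2 / 2) * z
}

# Hypothetical helper: c.d.f. obtained by solving Q(p) = x for p
gh_cdf <- function(x, A, B, g, h, c = 0.8) {
  uniroot(function(p) gh_quantile(p, A, B, g, h, c) - x,
          interval = c(1e-10, 1 - 1e-10), tol = 1e-10)$root
}

gh_quantile(0.5, A = 0, B = 1, g = 0, h = 0.5)  # z = 0, so the median is A
p <- gh_cdf(1, A = 0, B = 1, g = 0, h = 0.5)
gh_quantile(p, A = 0, B = 1, g = 0, h = 0.5)     # approximately 1
```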

See Also

gk::dgh(), gk::pgh(), gk::qgh(), gk::rgh(), dist_gk()

Examples

dist <- dist_gh(A = 0, B = 1, g = 0, h = 0.5)
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The g-and-k Distribution

Description

[Stable]

The g-and-k distribution is a flexible distribution often used to model univariate data. It is particularly known for its ability to handle skewness and heavy-tailed behavior.

Usage

dist_gk(A, B, g, k, c = 0.8)

Arguments

A

Vector of A (location) parameters.

B

Vector of B (scale) parameters. Must be positive.

g

Vector of g parameters.

k

Vector of k parameters. Must be at least -0.5.

c

Vector of c parameters. Often fixed at 0.8 which is the default.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_gk.html

In the following, let $X$ be a g-k random variable with parameters A, B, g, k, and c.

Support: $(-\infty, \infty)$

Mean: Not available in closed form.

Variance: Not available in closed form.

Probability density function (p.d.f):

The g-k distribution does not have a closed-form expression for its density. Instead, it is defined through its quantile function:

$$Q(u) = A + B \left( 1 + c \frac{1 - \exp(-gz(u))}{1 + \exp(-gz(u))} \right) (1 + z(u)^2)^k z(u)$$

where $z(u) = \Phi^{-1}(u)$, the standard normal quantile of u.

Cumulative distribution function (c.d.f):

The cumulative distribution function is typically evaluated numerically due to the lack of a closed-form expression.

See Also

gk::dgk(), dist_gh()

Examples

dist <- dist_gk(A = 0, B = 1, g = 0, k = 0.5)
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Generalized Pareto Distribution

Description

The generalized Pareto distribution (GPD) is commonly used to model the tails of distributions, particularly in extreme value theory.

The Pickands–Balkema–De Haan theorem states that for a large class of distributions, the tail (above some threshold) can be approximated by a GPD.

Usage

dist_gpd(location, scale, shape)

Arguments

location

the location parameter $a$ of the GPD.

scale

the scale parameter $b$ of the GPD.

shape

the shape parameter $s$ of the GPD.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_gpd.html

In the following, let $X$ be a Generalized Pareto random variable with parameters location = $a$, scale = $b > 0$, and shape = $s$.

Support: $x \ge a$ if $s \ge 0$, $a \le x \le a - b/s$ if $s < 0$

Mean:

$$E(X) = a + \frac{b}{1 - s} \quad \textrm{for } s < 1$$

$E(X) = \infty$ for $s \ge 1$

Variance:

$$\textrm{Var}(X) = \frac{b^2}{(1-s)^2(1-2s)} \quad \textrm{for } s < 0.5$$

$\textrm{Var}(X) = \infty$ for $s \ge 0.5$

Probability density function (p.d.f):

For $s = 0$:

$$f(x) = \frac{1}{b}\exp\left(-\frac{x-a}{b}\right) \quad \textrm{for } x \ge a$$

For $s \ne 0$:

$$f(x) = \frac{1}{b}\left(1 + s\frac{x-a}{b}\right)^{-1/s - 1}$$

where $1 + s(x-a)/b > 0$

Cumulative distribution function (c.d.f):

For $s = 0$:

$$F(x) = 1 - \exp\left(-\frac{x-a}{b}\right) \quad \textrm{for } x \ge a$$

For $s \ne 0$:

$$F(x) = 1 - \left(1 + s\frac{x-a}{b}\right)^{-1/s}$$

where $1 + s(x-a)/b > 0$

Quantile function:

For $s = 0$:

$$Q(p) = a - b\log(1-p)$$

For $s \ne 0$:

$$Q(p) = a + \frac{b}{s}\left[(1-p)^{-s} - 1\right]$$

Median:

For $s = 0$:

$$\textrm{Median}(X) = a + b\log(2)$$

For $s \ne 0$:

$$\textrm{Median}(X) = a + \frac{b}{s}\left(2^s - 1\right)$$
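
The median expressions follow from evaluating the quantile function at $p = 0.5$. A minimal base-R sketch of that relationship is shown below; gpd_quantile() and gpd_cdf() are hypothetical helper names, not package functions.

```r
# Hypothetical helpers implementing the GPD quantile and c.d.f. formulas
gpd_quantile <- function(p, a, b, s) {
  if (s == 0) return(a - b * log(1 - p))
  a + b / s * ((1 - p)^(-s) - 1)
}

gpd_cdf <- function(x, a, b, s) {
  z <- pmax((x - a) / b, 0)          # no mass below the location a
  if (s == 0) return(1 - exp(-z))
  1 - pmax(1 + s * z, 0)^(-1 / s)    # clamp above the upper bound when s < 0
}

# Q(0.5) agrees with the closed-form medians
all.equal(gpd_quantile(0.5, a = 0, b = 1, s = 0), log(2))
all.equal(gpd_quantile(0.5, a = 1, b = 2, s = 0.25), 1 + 2 / 0.25 * (2^0.25 - 1))

# Round trip for a bounded case (s < 0)
all.equal(gpd_cdf(gpd_quantile(0.9, 0, 1, -0.5), 0, 1, -0.5), 0.9)
```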

Skewness and Kurtosis: No closed-form expressions; approximated numerically.

See Also

evd::dgpd()

Examples

dist <- dist_gpd(location = 0, scale = 1, shape = 0)

dist
mean(dist)
variance(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Gumbel distribution

Description

[Stable]

The Gumbel distribution is a special case of the Generalized Extreme Value distribution, obtained when the GEV shape parameter $\xi$ is equal to 0. It may be referred to as a type I extreme value distribution.

Usage

dist_gumbel(alpha, scale)

Arguments

alpha

location parameter.

scale

scale parameter. Must be strictly positive.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_gumbel.html

In the following, let $X$ be a Gumbel random variable with location parameter alpha = $\alpha$ and scale parameter scale = $\sigma$.

Support: $R$, the set of all real numbers.

Mean:

$$E(X) = \alpha + \sigma\gamma$$

where $\gamma$ is the Euler-Mascheroni constant, approximately equal to 0.5772157.

Variance:

$$\textrm{Var}(X) = \frac{\pi^2 \sigma^2}{6}$$

Skewness:

$$\textrm{Skew}(X) = \frac{12\sqrt{6}\zeta(3)}{\pi^3} \approx 1.1395$$

where $\zeta(3)$ is Apery's constant, approximately equal to 1.2020569. Note that skewness is independent of the distribution parameters.

Kurtosis (excess):

$$\textrm{Kurt}(X) = \frac{12}{5} = 2.4$$

Note that excess kurtosis is independent of the distribution parameters.

Median:

$$\textrm{Median}(X) = \alpha - \sigma\ln(\ln 2)$$

Probability density function (p.d.f):

$$f(x) = \frac{1}{\sigma} \exp\left[-\frac{x - \alpha}{\sigma}\right] \exp\left\{-\exp\left[-\frac{x - \alpha}{\sigma}\right]\right\}$$

for $x$ in $R$, the set of all real numbers.

Cumulative distribution function (c.d.f):

$$F(x) = \exp\left\{-\exp\left[-\frac{x - \alpha}{\sigma}\right]\right\}$$

for $x$ in $R$, the set of all real numbers.

Quantile function (inverse c.d.f):

$$F^{-1}(p) = \alpha - \sigma \ln(-\ln p)$$

for $p$ in (0, 1).

Moment generating function (m.g.f):

$$E(e^{tX}) = \Gamma(1 - \sigma t) e^{\alpha t}$$

for $\sigma t < 1$, where $\Gamma$ is the gamma function.
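
As a sanity check on the moment formulas, the mean and variance can be recovered by numerically integrating the density. This is a base-R sketch under assumed parameter values; gumbel_pdf() is a hypothetical helper, not a package function.

```r
alpha <- 0.5
sigma <- 2

# Hypothetical helper: Gumbel density written on the log scale for stability
gumbel_pdf <- function(x) {
  z <- (x - alpha) / sigma
  exp(-z - exp(-z)) / sigma
}

euler_gamma <- -digamma(1)  # Euler-Mascheroni constant

m1 <- integrate(function(x) x * gumbel_pdf(x), -Inf, Inf)$value
m2 <- integrate(function(x) x^2 * gumbel_pdf(x), -Inf, Inf)$value

c(numeric = m1, closed_form = alpha + sigma * euler_gamma)  # mean
c(numeric = m2 - m1^2, closed_form = pi^2 * sigma^2 / 6)    # variance
```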

See Also

actuar::Gumbel, actuar::dgumbel(), actuar::pgumbel(), actuar::qgumbel(), actuar::rgumbel(), actuar::mgumbel()

Examples

dist <- dist_gumbel(alpha = c(0.5, 1, 1.5, 3), scale = c(2, 2, 3, 4))
dist


mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Horseshoe distribution

Description

[Stable]

The horseshoe distribution (Carvalho et al., 2008) is a heavy-tailed continuous distribution defined as a scale mixture of normals. It is primarily used as a shrinkage prior in sparse Bayesian regression, where it concentrates mass near zero while retaining heavy tails that leave large signals unshrunk.

Usage

dist_horseshoe(lambda, tau)

Arguments

lambda

A positive numeric vector of local scale parameters $\lambda > 0$ (one per observation).

tau

A positive scalar global scale parameter $\tau > 0$.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_horseshoe.html

In the following, let $X$ be a horseshoe random variable with local scale parameter lambda = $\lambda > 0$ and global scale parameter tau = $\tau > 0$.

Support: $x \in \mathbb{R}$, the set of all real numbers.

Mean: $E(X)$ is not available in closed form.

Variance: $\mathrm{Var}(X)$ is not available in closed form.

Probability density function (p.d.f):

The horseshoe density does not have a simple closed form but can be expressed as a scale mixture:

$$X \mid \lambda, \tau \sim \mathcal{N}(0,\, \lambda^2 \tau^2)$$

where the half-Cauchy hyperprior $\lambda \sim C^+(0, 1)$ induces the characteristic horseshoe shrinkage behaviour.
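
The scale-mixture representation suggests a simple two-stage sampler: draw a half-Cauchy local scale, then draw a normal variate with that scale. The base-R sketch below illustrates the idea only; the sample size and seed are arbitrary assumptions, and it is not necessarily how generate() is implemented for dist_horseshoe().

```r
set.seed(1)
n <- 1e4
tau <- 1

lambda <- abs(rcauchy(n))                   # half-Cauchy C+(0, 1) local scales
x <- rnorm(n, mean = 0, sd = lambda * tau)  # X | lambda, tau ~ N(0, lambda^2 tau^2)

# Most draws sit near zero, while a few are very large
quantile(abs(x), c(0.5, 0.99))
```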

References

Carvalho, C.M., Polson, N.G., and Scott, J.G. (2008). "The Horseshoe Estimator for Sparse Signals". Discussion Paper 2008-31. Duke University Department of Statistical Science.

Carvalho, C.M., Polson, N.G., and Scott, J.G. (2009). "Handling Sparsity via the Horseshoe". Journal of Machine Learning Research, 5, p. 73–80.

See Also

LaplacesDemon::dhs(), LaplacesDemon::rhs()

Examples

dist <- dist_horseshoe(lambda = c(0.5, 1, 2), tau = 1)
dist


support(dist)
generate(dist, 10)

density(dist, 0)
density(dist, 0, log = TRUE)

The Hypergeometric distribution

Description

[Stable]

To understand the HyperGeometric distribution, consider a set of $r$ objects, of which $m$ are of type I and $n$ are of type II, so that $r = m + n$. A sample of size $k$ ($k < r$) is drawn at random without replacement. The number of type I elements observed in this sample is our random variable $X$.

Usage

dist_hypergeometric(m, n, k)

Arguments

m

The number of type I elements available.

n

The number of type II elements available.

k

The size of the sample taken.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_hypergeometric.html

In the following, let $X$ be a HyperGeometric random variable with success probability p = $p = m/(m+n)$.

Support: $x \in \{\max(0, k-n), \dots, \min(k,m)\}$

Mean: $\frac{km}{m+n} = kp$

Variance: $\frac{kmn(m+n-k)}{(m+n)^2 (m+n-1)} = kp(1-p)\left(1 - \frac{k-1}{m+n-1}\right)$

Probability mass function (p.m.f):

$$P(X = x) = \frac{{m \choose x}{n \choose k-x}}{{m+n \choose k}}$$

Cumulative distribution function (c.d.f):

$$P(X \le x) = \sum_{i = \max(0, k-n)}^{\lfloor x \rfloor} \frac{{m \choose i}{n \choose k-i}}{{m+n \choose k}}$$
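
The p.m.f., mean, variance and c.d.f. expressions above can be checked against base R's stats::dhyper() and stats::phyper() for a small case. The parameter values here are arbitrary assumptions chosen so the support is easy to enumerate.

```r
m <- 7; n <- 3; k <- 4
x <- max(0, k - n):min(k, m)  # support: 1, 2, 3, 4

# p.m.f. from the binomial-coefficient formula
pmf <- choose(m, x) * choose(n, k - x) / choose(m + n, k)
all.equal(pmf, dhyper(x, m, n, k))

p <- m / (m + n)
all.equal(sum(x * pmf), k * p)                              # mean
all.equal(sum(x^2 * pmf) - sum(x * pmf)^2,
          k * p * (1 - p) * (1 - (k - 1) / (m + n - 1)))    # variance
all.equal(cumsum(pmf), phyper(x, m, n, k))                  # c.d.f.
```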

Moment generating function (m.g.f):

$$E(e^{tX}) = \frac{{n \choose k}}{{m+n \choose k}}{}_2F_1(-k, -m; n-k+1; e^t)$$

where $_2F_1$ is the hypergeometric function.

Skewness:

$$\frac{(m+n-2k)(m+n-1)^{1/2}(m+n-2m)}{[kmn(m+n-k)]^{1/2}(m+n-2)}$$

See Also

stats::Hypergeometric

Examples

dist <- dist_hypergeometric(m = rep(500, 3), n = c(50, 60, 70), k = c(100, 200, 300))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

Inflate a value of a probability distribution

Description

[Stable]

Inflated distributions add extra probability mass at a specific value, most commonly zero (zero-inflation). These distributions are useful for modeling data with excess observations at a particular value compared to what the base distribution would predict. Common applications include zero-inflated Poisson or negative binomial models for count data with many zeros.

Usage

dist_inflated(dist, prob, x = 0)

Arguments

dist

The distribution(s) to inflate.

prob

The added probability of observing x.

x

The value to inflate. The default of x = 0 is for zero-inflation.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_inflated.html

In the following, let $Y$ be an inflated random variable based on a base distribution $X$, with inflation value x = $c$ and inflation probability prob = $p$.

Support: Same as the base distribution, but with additional probability mass at $c$

Mean: (when x is numeric)

$$E(Y) = p \cdot c + (1-p) \cdot E(X)$$

Variance: (when x = 0)

$$\text{Var}(Y) = (1-p) \cdot \text{Var}(X) + p(1-p) \cdot [E(X)]^2$$

For non-zero inflation values, the variance is not computed in closed form.

Probability mass/density function (p.m.f/p.d.f):

For discrete distributions:

$$f_Y(y) = \begin{cases} p + (1-p) \cdot f_X(c) & \text{if } y = c \\ (1-p) \cdot f_X(y) & \text{if } y \neq c \end{cases}$$

For continuous distributions:

$$f_Y(y) = \begin{cases} p & \text{if } y = c \\ (1-p) \cdot f_X(y) & \text{if } y \neq c \end{cases}$$

Cumulative distribution function (c.d.f):

$$F_Y(q) = \begin{cases} (1-p) \cdot F_X(q) & \text{if } q < c \\ p + (1-p) \cdot F_X(q) & \text{if } q \geq c \end{cases}$$

Quantile function:

The quantile function is computed numerically by inverting the inflated CDF, accounting for the jump in probability at the inflation point.
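
For a zero-inflated Poisson, the mean and variance formulas above can be checked directly from the inflated p.m.f. The following base-R sketch truncates the support at 100, which is an assumption that is harmless here because the remaining tail mass is negligible.

```r
p <- 0.3       # inflation probability
lambda <- 2    # Poisson rate of the base distribution
y <- 0:100     # truncated support (assumption)

pmf <- (1 - p) * dpois(y, lambda)
pmf[y == 0] <- pmf[y == 0] + p   # extra mass at the inflation value c = 0

mean_y <- sum(y * pmf)
var_y <- sum(y^2 * pmf) - mean_y^2

all.equal(mean_y, (1 - p) * lambda)                          # p*c + (1-p)*E(X) with c = 0
all.equal(var_y, (1 - p) * lambda + p * (1 - p) * lambda^2)  # Var(X) = E(X) = lambda
all.equal(cumsum(pmf)[y == 2], p + (1 - p) * ppois(2, lambda))  # c.d.f. at q = 2
```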

Examples

# Zero-inflated Poisson
dist <- dist_inflated(dist_poisson(lambda = 2), prob = 0.3, x = 0)

dist
mean(dist)
variance(dist)

generate(dist, 10)

density(dist, 0)
density(dist, 1)

cdf(dist, 2)

quantile(dist, 0.5)

The Inverse Exponential distribution

Description

[Stable]

The Inverse Exponential distribution is used to model the reciprocal of exponentially distributed variables.

Usage

dist_inverse_exponential(rate)

Arguments

rate

an alternative way to specify the scale.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_inverse_exponential.html

In the following, let $X$ be an Inverse Exponential random variable with parameter rate = $\lambda$.

Support: $x > 0$

Mean: Does not exist, returns NA

Variance: Does not exist, returns NA

Probability density function (p.d.f):

$$f(x) = \frac{\lambda}{x^2} e^{-\lambda/x}$$

Cumulative distribution function (c.d.f):

$$F(x) = e^{-\lambda/x}$$

Quantile function (inverse c.d.f):

$$F^{-1}(p) = -\frac{\lambda}{\log(p)}$$

Moment generating function (m.g.f):

Does not exist (divergent integral).

See Also

actuar::InverseExponential

Examples

dist <- dist_inverse_exponential(rate = 1:5)
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Inverse Gamma distribution

Description

[Stable]

The Inverse Gamma distribution is commonly used as a prior distribution in Bayesian statistics, particularly for variance parameters.

Usage

dist_inverse_gamma(shape, rate = 1/scale, scale)

Arguments

shape, scale

parameters. Must be strictly positive.

rate

an alternative way to specify the scale.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_inverse_gamma.html

In the following, let $X$ be an Inverse Gamma random variable with shape parameter shape = $\alpha$ and rate parameter rate = $\beta$ (equivalently, scale = $1/\beta$).

Support: $x \in (0, \infty)$

Mean: $\frac{\beta}{\alpha - 1}$ for $\alpha > 1$, otherwise undefined

Variance: $\frac{\beta^2}{(\alpha - 1)^2 (\alpha - 2)}$ for $\alpha > 2$, otherwise undefined

Probability density function (p.d.f):

$$f(x) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{-\alpha - 1} e^{-\beta/x}$$

Cumulative distribution function (c.d.f):

$$F(x) = \frac{\Gamma(\alpha, \beta/x)}{\Gamma(\alpha)} = Q(\alpha, \beta/x)$$

where $\Gamma(\alpha, z)$ is the upper incomplete gamma function and $Q$ is the regularized incomplete gamma function.

Moment generating function (m.g.f):

$$M_X(t) = \frac{2 (-\beta t)^{\alpha/2}}{\Gamma(\alpha)} K_\alpha\left(\sqrt{-4\beta t}\right)$$

for $t < 0$, where $K_\alpha$ is the modified Bessel function of the second kind. The MGF does not exist for $t \ge 0$.

See Also

actuar::InverseGamma

Examples

dist <- dist_inverse_gamma(shape = c(1,2,3,3), rate = c(1,1,1,2))
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Inverse Gaussian distribution

Description

[Stable]

Usage

dist_inverse_gaussian(mean, shape)

Arguments

mean, shape

parameters. Must be strictly positive. Infinite values are supported.

Details

The inverse Gaussian distribution (also known as the Wald distribution) is commonly used to model positive-valued data, particularly in contexts involving first passage times and reliability analysis.

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_inverse_gaussian.html

In the following, let $X$ be an Inverse Gaussian random variable with parameters mean = $\mu$ and shape = $\lambda$.

Support: $(0, \infty)$

Mean: $\mu$

Variance: $\frac{\mu^3}{\lambda}$

Probability density function (p.d.f):

$$f(x) = \sqrt{\frac{\lambda}{2\pi x^3}} \exp\left(-\frac{\lambda(x - \mu)^2}{2\mu^2 x}\right)$$

Cumulative distribution function (c.d.f):

$$F(x) = \Phi\left(\sqrt{\frac{\lambda}{x}} \left(\frac{x}{\mu} - 1\right)\right) + \exp\left(\frac{2\lambda}{\mu}\right) \Phi\left(-\sqrt{\frac{\lambda}{x}} \left(\frac{x}{\mu} + 1\right)\right)$$

where $\Phi$ is the standard normal c.d.f.

Moment generating function (m.g.f):

$$E(e^{tX}) = \exp\left(\frac{\lambda}{\mu} \left(1 - \sqrt{1 - \frac{2\mu^2 t}{\lambda}}\right)\right)$$

for $t < \frac{\lambda}{2\mu^2}$.

Skewness: $3\sqrt{\frac{\mu}{\lambda}}$

Excess Kurtosis: $\frac{15\mu}{\lambda}$

Quantiles: No closed-form expression, approximated numerically.
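
One way to approximate quantiles is to solve $F(x) = p$ with stats::uniroot() using the closed-form c.d.f. The sketch below is illustrative only: ig_cdf(), ig_pdf() and ig_quantile() are hypothetical names, the search interval is an assumption, and the package may use a different numerical method.

```r
# Hypothetical helpers implementing the inverse Gaussian c.d.f. and p.d.f.
ig_cdf <- function(x, mu, lambda) {
  r <- sqrt(lambda / x)
  pnorm(r * (x / mu - 1)) + exp(2 * lambda / mu) * pnorm(-r * (x / mu + 1))
}

ig_pdf <- function(x, mu, lambda) {
  sqrt(lambda / (2 * pi * x^3)) * exp(-lambda * (x - mu)^2 / (2 * mu^2 * x))
}

# Hypothetical helper: quantile by root-finding on the c.d.f.
ig_quantile <- function(p, mu, lambda) {
  uniroot(function(x) ig_cdf(x, mu, lambda) - p,
          interval = c(1e-8, 1e4), tol = 1e-10)$root
}

q <- ig_quantile(0.7, mu = 1, lambda = 3)
ig_cdf(q, mu = 1, lambda = 3)                      # approximately 0.7
integrate(ig_pdf, 0, q, mu = 1, lambda = 3)$value  # density integrates to about 0.7
```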

See Also

actuar::InverseGaussian

Examples

dist <- dist_inverse_gaussian(mean = c(1,1,1,3,3), shape = c(0.2, 1, 3, 0.2, 1))
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Laplace distribution

Description

[Stable]

The Laplace distribution, also known as the double exponential distribution, is a continuous probability distribution that is symmetric around its location parameter.

Usage

dist_laplace(mu, sigma)

Arguments

mu

The location parameter (mean) of the Laplace distribution.

sigma

The positive scale parameter of the Laplace distribution.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_laplace.html

In the following, let $X$ be a Laplace random variable with location parameter mu = $\mu$ and scale parameter sigma = $\sigma$.

Support: $R$, the set of all real numbers

Mean: $\mu$

Variance: $2\sigma^2$

Probability density function (p.d.f):

$$f(x) = \frac{1}{2\sigma} \exp\left(-\frac{|x - \mu|}{\sigma}\right)$$

Cumulative distribution function (c.d.f):

$$F(x) = \begin{cases} \frac{1}{2} \exp\left(\frac{x - \mu}{\sigma}\right) & \text{if } x < \mu \\ 1 - \frac{1}{2} \exp\left(-\frac{x - \mu}{\sigma}\right) & \text{if } x \geq \mu \end{cases}$$

Moment generating function (m.g.f):

$$E(e^{tX}) = \frac{\exp(\mu t)}{1 - \sigma^2 t^2} \text{ for } |t| < \frac{1}{\sigma}$$

See Also

extraDistr::Laplace

Examples

dist <- dist_laplace(mu = c(0, 2, -1), sigma = c(1, 2, 0.5))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 0)
density(dist, 0, log = TRUE)

cdf(dist, 1)

quantile(dist, 0.7)

The Logarithmic distribution

Description

[Stable]

The Logarithmic distribution is a discrete probability distribution derived from the logarithmic series. It is useful in modeling the abundance of species and other phenomena where the frequency of an event follows a logarithmic pattern.

Usage

dist_logarithmic(prob)

Arguments

prob

parameter. 0 <= prob < 1.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_logarithmic.html

In the following, let $X$ be a Logarithmic random variable with parameter prob = $p$.

Support: $\{1, 2, 3, ...\}$

Mean: $\frac{-1}{\log(1-p)} \cdot \frac{p}{1-p}$

Variance: $\frac{-(p^2 + p\log(1-p))}{[(1-p)\log(1-p)]^2}$

Probability mass function (p.m.f):

$$P(X = k) = \frac{-1}{\log(1-p)} \cdot \frac{p^k}{k}$$

for $k = 1, 2, 3, \ldots$

Cumulative distribution function (c.d.f):

The c.d.f. does not have a simple closed form. It is computed using the recurrence relationship $P(X = k+1) = \frac{p \cdot k}{k+1} \cdot P(X = k)$ starting from $P(X = 1) = \frac{-p}{\log(1-p)}$.

Moment generating function (m.g.f):

$$E(e^{tX}) = \frac{\log(1 - pe^t)}{\log(1-p)}$$

for $pe^t < 1$
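
The recurrence for the c.d.f. can be sketched in a few lines of base R. The function name log_series_cdf() is hypothetical; the result is compared with a direct sum of the p.m.f.

```r
# Hypothetical helper: c.d.f. of the Logarithmic distribution via the recurrence
log_series_cdf <- function(q, p) {
  k_max <- floor(q)
  if (k_max < 1) return(0)
  pmf <- numeric(k_max)
  pmf[1] <- -p / log(1 - p)                 # P(X = 1)
  for (k in seq_len(k_max - 1)) {
    pmf[k + 1] <- p * k / (k + 1) * pmf[k]  # P(X = k + 1) from P(X = k)
  }
  sum(pmf)
}

p <- 0.66
log_series_cdf(4, p)
# Direct sum of the p.m.f. for comparison
sum(-1 / log(1 - p) * p^(1:4) / (1:4))
```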

See Also

actuar::Logarithmic

Examples

dist <- dist_logarithmic(prob = c(0.33, 0.66, 0.99))
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Logistic distribution

Description

[Stable]

A continuous distribution on the real line. For binary outcomes, the model given by $P(Y = 1 | X) = F(X \beta)$, where $F$ is the Logistic cdf(), is called logistic regression.

Usage

dist_logistic(location, scale)

Arguments

location, scale

location and scale parameters.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_logistic.html

In the following, let $X$ be a Logistic random variable with location = $\mu$ and scale = $s$.

Support: $R$, the set of all real numbers

Mean: $\mu$

Variance: $s^2 \pi^2 / 3$

Probability density function (p.d.f):

$$f(x) = \frac{e^{-\frac{x - \mu}{s}}}{s \left[1 + e^{-\frac{x - \mu}{s}}\right]^2}$$

Cumulative distribution function (c.d.f):

$$F(x) = \frac{1}{1 + e^{-\frac{x - \mu}{s}}}$$

Moment generating function (m.g.f):

$$E(e^{tX}) = e^{\mu t} B(1 - st, 1 + st)$$

for $-1 < st < 1$, where $B(a, b)$ is the Beta function.

See Also

stats::Logistic

Examples

dist <- dist_logistic(location = c(5,9,9,6,2), scale = c(2,3,4,2,1))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The log-normal distribution

Description

[Stable]

The log-normal distribution is a commonly used transformation of the Normal distribution. If $X$ follows a log-normal distribution, then $\ln{X}$ would be characterised by a Normal distribution.

Usage

dist_lognormal(mu = 0, sigma = 1)

Arguments

mu

The mean (location parameter) of the distribution, which is the mean of the associated Normal distribution. Can be any real number.

sigma

The standard deviation (scale parameter) of the distribution. Can be any positive number.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_lognormal.html

In the following, let $X$ be a log-normal random variable with mu = $\mu$ and sigma = $\sigma$.

Support: $R^+$, the set of positive real numbers.

Mean: $e^{\mu + \sigma^2/2}$

Variance: $(e^{\sigma^2} - 1) e^{2\mu + \sigma^2}$

Skewness: $(e^{\sigma^2} + 2) \sqrt{e^{\sigma^2} - 1}$

Excess Kurtosis: $e^{4\sigma^2} + 2 e^{3\sigma^2} + 3 e^{2\sigma^2} - 6$

Probability density function (p.d.f):

$$f(x) = \frac{1}{x\sqrt{2 \pi \sigma^2}} e^{-(\ln{x} - \mu)^2 / (2 \sigma^2)}$$

Cumulative distribution function (c.d.f):

$$F(x) = \Phi\left(\frac{\ln{x} - \mu}{\sigma}\right)$$

where $\Phi$ is the c.d.f. of the standard Normal distribution.
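
The link to the Normal distribution can be illustrated with base R's stats::plnorm(), stats::dlnorm() and stats::pnorm(). The parameter values below are arbitrary assumptions.

```r
mu <- 1
sigma <- 0.5
x <- c(0.5, 2, 4)

# c.d.f.: F(x) = Phi((log(x) - mu) / sigma)
all.equal(plnorm(x, mu, sigma), pnorm((log(x) - mu) / sigma))

# p.d.f. written out from the formula
all.equal(dlnorm(x, mu, sigma),
          exp(-(log(x) - mu)^2 / (2 * sigma^2)) / (x * sqrt(2 * pi * sigma^2)))

# Mean formula compared with numerical integration of x * f(x)
num_mean <- integrate(function(t) t * dlnorm(t, mu, sigma), 0, Inf)$value
c(numeric = num_mean, closed_form = exp(mu + sigma^2 / 2))
```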

Moment generating function (m.g.f):

Does not exist in closed form.

See Also

stats::Lognormal

Examples

dist <- dist_lognormal(mu = 1:5, sigma = 0.1)

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

# A log-normal distribution X is exp(Y), where Y is a Normal distribution of
# the same parameters. So log(X) will produce the Normal distribution Y.
log(dist)

Missing distribution

Description

[Maturing]

A placeholder distribution for handling missing values in a vector of distributions.

Usage

dist_missing(length = 1)

Arguments

length

The number of missing distributions

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_missing.html

The missing distribution represents the absence of distributional information. It is used as a placeholder when distribution values are not available or not applicable, similar to how NA is used for missing scalar values.

Support: Undefined

Mean: $\text{NA}$

Variance: $\text{NA}$

Skewness: $\text{NA}$

Kurtosis: $\text{NA}$

Probability density function (p.d.f): Undefined

$$f(x) = \text{NA}$$

Cumulative distribution function (c.d.f): Undefined

$$F(t) = \text{NA}$$

Quantile function: Undefined

$$Q(p) = \text{NA}$$

Moment generating function (m.g.f): Undefined

$$E(e^{tX}) = \text{NA}$$

All statistical operations on missing distributions return NA values of appropriate length, propagating the missingness through calculations.

See Also

base::NA

Examples

dist <- dist_missing(3L)

dist
mean(dist)
variance(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

Create a mixture of distributions

Description

[Maturing]

A mixture distribution combines multiple component distributions with specified weights. The resulting distribution can model complex, multimodal data by representing it as a weighted sum of simpler distributions.

Usage

dist_mixture(..., weights = numeric())

Arguments

...

Distributions to be used in the mixture. Can be any distributional objects.

weights

A numeric vector of non-negative weights that sum to 1. The length must match the number of distributions passed to .... Each weight $w_i$ represents the probability that a random draw comes from the $i$-th component distribution.

Details

In the following, let $X$ be a mixture random variable composed of $K$ component distributions $F_1, F_2, \ldots, F_K$ with corresponding weights $w_1, w_2, \ldots, w_K$ where $\sum_{i=1}^K w_i = 1$ and $w_i \geq 0$ for all $i$.

Support: The union of the supports of all component distributions

Mean:

For univariate mixtures:

$$E(X) = \sum_{i=1}^K w_i \mu_i$$

where $\mu_i$ is the mean of the $i$-th component distribution.

For multivariate mixtures:

$$E(\mathbf{X}) = \sum_{i=1}^K w_i \boldsymbol{\mu}_i$$

where $\boldsymbol{\mu}_i$ is the mean vector of the $i$-th component distribution.

Variance:

For univariate mixtures:

$$\text{Var}(X) = \sum_{i=1}^K w_i (\mu_i^2 + \sigma_i^2) - \left(\sum_{i=1}^K w_i \mu_i\right)^2$$

where $\sigma_i^2$ is the variance of the $i$-th component distribution.

Covariance:

For multivariate mixtures:

$$\text{Cov}(\mathbf{X}) = \sum_{i=1}^K w_i \left[ (\boldsymbol{\mu}_i - \bar{\boldsymbol{\mu}})(\boldsymbol{\mu}_i - \bar{\boldsymbol{\mu}})^T + \boldsymbol{\Sigma}_i \right]$$

where $\bar{\boldsymbol{\mu}} = \sum_{i=1}^K w_i \boldsymbol{\mu}_i$ is the overall mean vector and $\boldsymbol{\Sigma}_i$ is the covariance matrix of the $i$-th component distribution.

Probability density/mass function (p.d.f/p.m.f):

$$f(x) = \sum_{i=1}^K w_i f_i(x)$$

where $f_i(x)$ is the density or mass function of the $i$-th component distribution.

Cumulative distribution function (c.d.f):

For univariate mixtures:

$$F(x) = \sum_{i=1}^K w_i F_i(x)$$

where $F_i(x)$ is the c.d.f. of the $i$-th component distribution.

For multivariate mixtures, the c.d.f. is approximated numerically.

Quantile function:

For univariate mixtures, the quantile function has no closed form and is computed numerically by inverting the c.d.f. using root-finding (stats::uniroot()).

For multivariate mixtures, quantiles are not yet implemented.
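
The univariate formulas can be illustrated in base R for a two-component normal mixture. This is a sketch only: mix_cdf() and mix_quantile() are hypothetical names, and the bracketing interval passed to stats::uniroot() is an assumption rather than the package's actual choice.

```r
w <- c(0.3, 0.7)     # mixture weights
mu <- c(0, 5)        # component means
s2 <- c(1, 4)        # component variances (standard deviations 1 and 2)

# Hypothetical helper: mixture c.d.f. as a weighted sum of component c.d.f.s
mix_cdf <- function(x) w[1] * pnorm(x, 0, 1) + w[2] * pnorm(x, 5, 2)

# Hypothetical helper: quantile by root-finding on the mixture c.d.f.
mix_quantile <- function(p) {
  uniroot(function(x) mix_cdf(x) - p, interval = c(-20, 30), tol = 1e-10)$root
}

mix_mean <- sum(w * mu)                        # 3.5
mix_var <- sum(w * (mu^2 + s2)) - mix_mean^2   # 8.35

q <- mix_quantile(0.5)
mix_cdf(q)  # approximately 0.5
```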

See Also

stats::uniroot(), vctrs::vec_unique_count()

Examples

# Univariate mixture of two normal distributions
dist <- dist_mixture(dist_normal(0, 1), dist_normal(5, 2), weights = c(0.3, 0.7))
dist

mean(dist)
variance(dist)

density(dist, 2)
cdf(dist, 2)
quantile(dist, 0.5)

generate(dist, 10)

The Multinomial distribution

Description

[Stable]

The multinomial distribution is a generalization of the binomial distribution to multiple categories. It is perhaps easiest to think that we first extend a dist_bernoulli() distribution to include more than two categories, resulting in a dist_categorical() distribution. We then repeat the Categorical experiment several ($n$) times.

Usage

dist_multinomial(size, prob)

Arguments

size

The number of draws from the Categorical distribution.

prob

The probability of an event occurring from each draw.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_multinomial.html

In the following, let $X = (X_1, ..., X_k)$ be a Multinomial random variable with success probability prob = $p$. Note that $p$ is a vector with $k$ elements that sum to one. Assume that we repeat the Categorical experiment size = $n$ times.

Support: Each $X_i$ is in $\{0, 1, 2, ..., n\}$.

Mean: The mean of $X_i$ is $n p_i$.

Variance: The variance of $X_i$ is $n p_i (1 - p_i)$. For $i \neq j$, the covariance of $X_i$ and $X_j$ is $-n p_i p_j$.

Probability mass function (p.m.f):

$$P(X_1 = x_1, ..., X_k = x_k) = \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot \ldots \cdot p_k^{x_k}$$

where $\sum_{i=1}^k x_i = n$ and $\sum_{i=1}^k p_i = 1$.

Cumulative distribution function (c.d.f):

$$P(X_1 \le q_1, ..., X_k \le q_k) = \sum_{\substack{x_1, \ldots, x_k \ge 0 \\ x_i \le q_i \text{ for all } i \\ \sum_{i=1}^k x_i = n}} \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdot \ldots \cdot p_k^{x_k}$$

The c.d.f. is computed as a finite sum of the p.m.f. over all integer vectors in the support that satisfy the componentwise inequalities.
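
The finite sum above can be sketched in base R for a small case. This is an illustrative enumeration under assumed values ($n = 4$, three categories); the helper names pmf and multinom_cdf are hypothetical and the package may compute the sum differently.

```r
n <- 4
p <- c(0.3, 0.5, 0.2)

# p.m.f. from the formula: n! / (x_1! ... x_k!) * prod(p_i^x_i)
pmf <- function(x) factorial(sum(x)) / prod(factorial(x)) * prod(p^x)

# Enumerate the support: non-negative integer vectors that sum to n
grid    <- expand.grid(x1 = 0:n, x2 = 0:n, x3 = 0:n)
support <- as.matrix(grid[rowSums(grid) == n, ])

# c.d.f. at q: sum the p.m.f. over support points with x_i <= q_i for all i
multinom_cdf <- function(q) {
  keep <- apply(support, 1, function(x) all(x <= q))
  sum(apply(support[keep, , drop = FALSE], 1, pmf))
}

pmf(c(1, 2, 1))           # agrees with stats::dmultinom(c(1, 2, 1), prob = p)
multinom_cdf(c(1, 2, 1))  # only (1, 2, 1) satisfies the bounds, so this equals the p.m.f.
```

Full enumeration grows quickly with $n$ and $k$, so this is only practical for small problems.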

Moment generating function (m.g.f):

$$E(e^{t'X}) = \left(\sum_{i=1}^k p_i e^{t_i}\right)^n$$

where $t = (t_1, ..., t_k)$ is a vector of the same dimension as $X$.

Skewness: The skewness of $X_i$ is

$$\frac{1 - 2p_i}{\sqrt{n p_i (1 - p_i)}}$$

Excess Kurtosis: The excess kurtosis of $X_i$ is

$$\frac{1 - 6p_i(1 - p_i)}{n p_i (1 - p_i)}$$

See Also

stats::dmultinom(), stats::rmultinom()

Examples

dist <- dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))

dist
mean(dist)
variance(dist)

generate(dist, 10)

density(dist, list(d = rbind(cbind(1,2,1), cbind(0,2,1))))
density(dist, list(d = rbind(cbind(1,2,1), cbind(0,2,1))), log = TRUE)

cdf(dist, cbind(1,2,1))

The multivariate normal distribution

Description

[Stable]

The multivariate normal distribution is a generalization of the univariate normal distribution to higher dimensions. It is widely used in multivariate statistics and describes the joint distribution of multiple correlated continuous random variables.

Usage

dist_multivariate_normal(mu = 0, sigma = list(diag(1)))

Arguments

mu

A list of numeric vectors for the distribution's mean.

sigma

A list of matrices for the distribution's variance-covariance matrix.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_multivariate_normal.html

In the following, let $\mathbf{X}$ be a $k$-dimensional multivariate normal random variable with mean vector mu = $\boldsymbol{\mu}$ and variance-covariance matrix sigma = $\boldsymbol{\Sigma}$.

Support: $\mathbf{x} \in \mathbb{R}^k$

Mean: $\boldsymbol{\mu}$

Variance-covariance matrix: $\boldsymbol{\Sigma}$

Probability density function (p.d.f):

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu})\right)$$

where $|\boldsymbol{\Sigma}|$ is the determinant of $\boldsymbol{\Sigma}$.

Cumulative distribution function (c.d.f):

$$P(\mathbf{X} \le \mathbf{q}) = P(X_1 \le q_1, \ldots, X_k \le q_k)$$

The c.d.f. does not have a closed-form expression and is computed numerically.
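
Unlike the c.d.f., the p.d.f. given above can be evaluated directly. The following base-R sketch transcribes the formula; the function name mvn_density is an assumption for this example and is not part of the package.

```r
mu    <- c(1, 2)
Sigma <- matrix(c(4, 2, 2, 3), ncol = 2)

mvn_density <- function(x, mu, Sigma) {
  k <- length(mu)
  d <- x - mu
  # Quadratic form (x - mu)' Sigma^{-1} (x - mu), using solve() instead of an explicit inverse
  quad <- drop(crossprod(d, solve(Sigma, d)))
  exp(-0.5 * quad) / ((2 * pi)^(k / 2) * sqrt(det(Sigma)))
}

mvn_density(c(2, 1), mu, Sigma)
```

With a diagonal covariance matrix the result factors into a product of univariate Normal densities, which gives a quick sanity check.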

Moment generating function (m.g.f):

$$M(\mathbf{t}) = E(e^{\mathbf{t}^T \mathbf{X}}) = \exp\left(\mathbf{t}^T \boldsymbol{\mu} + \frac{1}{2}\mathbf{t}^T \boldsymbol{\Sigma} \mathbf{t}\right)$$

See Also

mvtnorm::dmvnorm(), mvtnorm::pmvnorm(), mvtnorm::qmvnorm(), mvtnorm::rmvnorm()

Examples

dist <- dist_multivariate_normal(mu = list(c(1,2)), sigma = list(matrix(c(4,2,2,3), ncol=2)))
dimnames(dist) <- c("x", "y")
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, cbind(2, 1))
density(dist, cbind(2, 1), log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7, kind = "equicoordinate")
quantile(dist, 0.7, kind = "marginal")

The multivariate t-distribution

Description

[Stable]

The multivariate t-distribution is a generalization of the univariate Student's t-distribution to multiple dimensions. It is commonly used for modeling heavy-tailed multivariate data and in robust statistics.

Usage

dist_multivariate_t(df = 1, mu = 0, sigma = diag(1))

Arguments

df

A numeric vector of degrees of freedom (must be positive).

mu

A list of numeric vectors for the distribution location parameter.

sigma

A list of matrices for the distribution scale matrix.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_multivariate_t.html

In the following, let $\mathbf{X}$ be a multivariate t random vector with degrees of freedom df = $\nu$, location parameter mu = $\boldsymbol{\mu}$, and scale matrix sigma = $\boldsymbol{\Sigma}$.

Support: $\mathbf{x} \in \mathbb{R}^k$, where $k$ is the dimension of the distribution

Mean: $\boldsymbol{\mu}$ for $\nu > 1$, undefined otherwise

Covariance matrix:

$$\text{Cov}(\mathbf{X}) = \frac{\nu}{\nu - 2} \boldsymbol{\Sigma}$$

for $\nu > 2$, undefined otherwise

Probability density function (p.d.f):

$$f(\mathbf{x}) = \frac{\Gamma\left(\frac{\nu + k}{2}\right)} {\Gamma\left(\frac{\nu}{2}\right) \nu^{k/2} \pi^{k/2} |\boldsymbol{\Sigma}|^{1/2}} \left[1 + \frac{1}{\nu}(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right]^{-\frac{\nu + k}{2}}$$

where $k$ is the dimension of the distribution and $\Gamma(\cdot)$ is the gamma function.

Cumulative distribution function (c.d.f):

$$F(\mathbf{t}) = \int_{-\infty}^{t_1} \cdots \int_{-\infty}^{t_k} f(\mathbf{x}) \, d\mathbf{x}$$

This integral does not have a closed form solution and is approximated numerically.

Quantile function:

The equicoordinate quantile function finds $q$ such that:

$$P(X_1 \leq q, \ldots, X_k \leq q) = p$$

This does not have a closed form solution and is approximated numerically.

The marginal quantile function for each dimension $i$ is:

$$Q_i(p) = \mu_i + \sqrt{\Sigma_{ii}} \cdot t_{\nu}^{-1}(p)$$

where $t_{\nu}^{-1}(p)$ is the quantile function of the univariate Student's t-distribution with $\nu$ degrees of freedom, and $\Sigma_{ii}$ is the $i$-th diagonal element of sigma.
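
The marginal quantile and covariance formulas above translate directly into base R. This is a minimal sketch using the parameter values from the Examples section; marginal_quantile is a hypothetical helper name, not a package function.

```r
nu    <- 5
mu    <- c(1, 2)
Sigma <- matrix(c(4, 2, 2, 3), ncol = 2)

# Marginal quantiles: Q_i(p) = mu_i + sqrt(Sigma_ii) * qt(p, nu)
marginal_quantile <- function(p) mu + sqrt(diag(Sigma)) * stats::qt(p, df = nu)

marginal_quantile(0.7)

# Covariance matrix nu / (nu - 2) * Sigma, defined only for nu > 2
cov_X <- nu / (nu - 2) * Sigma
cov_X
```

At $p = 0.5$ the univariate t quantile is zero, so the marginal medians coincide with the location vector.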

See Also

mvtnorm::dmvt, mvtnorm::pmvt, mvtnorm::qmvt, mvtnorm::rmvt

Examples

dist <- dist_multivariate_t(
  df = 5,
  mu = list(c(1, 2)),
  sigma = list(matrix(c(4, 2, 2, 3), ncol = 2))
)
dimnames(dist) <- c("x", "y")
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, cbind(2, 1))
density(dist, cbind(2, 1), log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)
quantile(dist, 0.7, kind = "marginal")

The Negative Binomial distribution

Description

[Stable]

A generalization of the geometric distribution. It is the number of failures in a sequence of i.i.d. Bernoulli trials before a specified number of successes (size) occur. The probability of success in each trial is given by prob.

Usage

dist_negative_binomial(size, prob)

Arguments

size

The number of successful trials (target number of successes). Must be a positive number. Also called the dispersion parameter.

prob

The probability of success in each trial. Must be between 0 and 1.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_negative_binomial.html

In the following, let $X$ be a Negative Binomial random variable with success probability prob = $p$ and the number of successes size = $r$.

Support: $\{0, 1, 2, 3, ...\}$

Mean: $\frac{r(1-p)}{p}$

Variance: $\frac{r(1-p)}{p^2}$

Probability mass function (p.m.f):

$$P(X = k) = \binom{k + r - 1}{k} p^r (1-p)^k$$

Cumulative distribution function (c.d.f):

$$F(k) = \sum_{i=0}^{\lfloor k \rfloor} \binom{i + r - 1}{i} p^r (1-p)^i$$

This can also be expressed in terms of the regularized incomplete beta function, and is computed numerically.
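
As a quick check of the parameterisation, the sketch below evaluates the p.m.f. with prob as the success probability (the convention of stats::dnbinom()) and compares the summed c.d.f. with the regularized incomplete beta function $I_p(r, \lfloor k \rfloor + 1)$. The variable names are assumptions for this example.

```r
r <- 10
p <- 0.5
k <- 0:4

# p.m.f. with p as the success probability: choose(k + r - 1, k) * p^r * (1 - p)^k
pmf <- choose(k + r - 1, k) * p^r * (1 - p)^k

# c.d.f. at 4 as a running sum of the p.m.f.
cdf_sum <- sum(pmf)

# The same value through the regularized incomplete beta function I_p(r, k + 1)
cdf_beta <- stats::pbeta(p, r, max(k) + 1)

c(cdf_sum, cdf_beta)
```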

Moment generating function (m.g.f):

$$E(e^{tX}) = \left(\frac{p}{1-(1-p)e^t}\right)^r, \quad t < -\log (1-p)$$

Skewness:

$$\gamma_1 = \frac{2-p}{\sqrt{r(1-p)}}$$

Excess Kurtosis:

$$\gamma_2 = \frac{6}{r} + \frac{p^2}{r(1-p)}$$

See Also

stats::NegBinomial

Examples

dist <- dist_negative_binomial(size = 10, prob = 0.5)

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
support(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Normal distribution

Description

[Stable]

The Normal distribution is ubiquitous in statistics, partly because of the central limit theorem, which states that suitably standardized sums of i.i.d. random variables with finite variance converge in distribution to a Normal. Linear transformations of Normal random variables result in new random variables that are also Normal. If you are taking an intro stats course, you'll likely use the Normal distribution for Z-tests and in simple linear regression. Under regularity conditions, maximum likelihood estimators are asymptotically Normal. The Normal distribution is also called the Gaussian distribution.

Usage

dist_normal(mu = 0, sigma = 1, mean = mu, sd = sigma)

Arguments

mu, mean

The mean (location parameter) of the distribution. Can be any real number.

sigma, sd

The standard deviation (scale parameter) of the distribution. Can be any positive number. If you would like a Normal distribution with variance $\sigma^2$, be sure to take the square root, as this is a common source of errors.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_normal.html

In the following, let $X$ be a Normal random variable with mean mu = $\mu$ and standard deviation sigma = $\sigma$.

Support: $R$, the set of all real numbers

Mean: $\mu$

Variance: $\sigma^2$

Probability density function (p.d.f):

$$f(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-(x - \mu)^2 / (2 \sigma^2)}$$

Cumulative distribution function (c.d.f):

$$F(t) = \int_{-\infty}^t \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-(x - \mu)^2 / (2 \sigma^2)} dx$$

This integral does not have a closed form solution and is approximated numerically. The notation $\Phi(t)$ stands for the c.d.f. of a standard Normal evaluated at $t$; it is closely related to the "error function" through $\Phi(t) = \frac{1}{2}\left[1 + \mathrm{erf}(t / \sqrt{2})\right]$. Z-tables list the value of $\Phi(t)$ for various $t$.
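
To make the numerical approximation concrete, the sketch below integrates the density with stats::integrate() and compares the result with the standard Normal c.d.f. after standardizing. The parameter values and variable names are assumptions chosen for illustration.

```r
mu    <- 1
sigma <- 3
t     <- 4

# Density from the formula above
normal_pdf <- function(x) exp(-(x - mu)^2 / (2 * sigma^2)) / sqrt(2 * pi * sigma^2)

# c.d.f. by numerical integration from -Inf to t
cdf_numeric <- stats::integrate(normal_pdf, lower = -Inf, upper = t, rel.tol = 1e-10)$value

# Same probability via Phi((t - mu) / sigma)
cdf_builtin <- stats::pnorm((t - mu) / sigma)

c(cdf_numeric, cdf_builtin)
```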

Moment generating function (m.g.f):

$$E(e^{tX}) = e^{\mu t + \sigma^2 t^2 / 2}$$

See Also

stats::Normal

Examples

dist <- dist_normal(mu = 1:5, sigma = 3)

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Pareto Distribution

Description

[Stable]

The Pareto distribution is a power-law probability distribution commonly used in actuarial science to model loss severity and in economics to model income distributions and firm sizes.

Usage

dist_pareto(shape, scale)

Arguments

shape, scale

parameters. Must be strictly positive.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_pareto.html

In the following, let $X$ be a Pareto random variable with parameters shape = $\alpha$ and scale = $\theta$.

Support: $(0, \infty)$

Mean: $\frac{\theta}{\alpha - 1}$ for $\alpha > 1$, undefined otherwise

Variance: $\frac{\alpha\theta^2}{(\alpha - 1)^2(\alpha - 2)}$ for $\alpha > 2$, undefined otherwise

Probability density function (p.d.f):

$$f(x) = \frac{\alpha\theta^\alpha}{(x + \theta)^{\alpha + 1}}$$

for $x > 0$, $\alpha > 0$ and $\theta > 0$.

Cumulative distribution function (c.d.f):

$$F(x) = 1 - \left(\frac{\theta}{x + \theta}\right)^\alpha$$

for $x > 0$.
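
Because the c.d.f. above has a simple form, it can be inverted by hand. The base-R sketch below does so and checks the mean formula by numerical integration; the closed-form quantile is derived here for illustration rather than quoted from the package, and all function names are hypothetical.

```r
alpha <- 3
theta <- 1

pareto_pdf <- function(x) alpha * theta^alpha / (x + theta)^(alpha + 1)
pareto_cdf <- function(x) 1 - (theta / (x + theta))^alpha

# Solving F(x) = p for x gives x = theta * ((1 - p)^(-1 / alpha) - 1)
pareto_quantile <- function(p) theta * ((1 - p)^(-1 / alpha) - 1)

# Mean theta / (alpha - 1), checked by integrating x * f(x) over the support
mean_numeric <- stats::integrate(function(x) x * pareto_pdf(x), 0, Inf)$value

c(mean_numeric, theta / (alpha - 1))
```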

Moment generating function (m.g.f):

Does not exist in closed form, but the $k$th raw moment $E[X^k]$ exists for $-1 < k < \alpha$.

Note

There are many different definitions of the Pareto distribution in the literature; see Arnold (2015) or Kleiber and Kotz (2003). This implementation uses the Pareto distribution without a location parameter as described in actuar::Pareto.

References

Kleiber, C. and Kotz, S. (2003), Statistical Size Distributions in Economics and Actuarial Sciences, Wiley.

Klugman, S. A., Panjer, H. H. and Willmot, G. E. (2012), Loss Models, From Data to Decisions, Fourth Edition, Wiley.

See Also

actuar::Pareto

Examples

dist <- dist_pareto(shape = c(10, 3, 2, 1), scale = rep(1, 4))
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

Percentile distribution

Description

[Stable]

The Percentile distribution is a non-parametric distribution defined by a set of quantiles at specified percentile values. This distribution is useful for representing empirical distributions or elicited expert knowledge when only percentile information is available. The distribution uses linear interpolation between percentiles and can be used to approximate complex distributions that may not have simple parametric forms.

Usage

dist_percentile(x, percentile)

Arguments

x

A list of values

percentile

A list of percentiles

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_percentile.html

In the following, let $X$ be a Percentile random variable defined by values $x_1, x_2, \ldots, x_n$ at percentiles $p_1, p_2, \ldots, p_n$ where $0 \le p_i \le 100$.

Support: $[\min(x_i), \max(x_i)]$ if $\min(p_i) > 0$ or $\max(p_i) < 100$, otherwise support is approximated from the specified percentiles.

Mean: Approximated numerically using spline interpolation and numerical integration:

$$E(X) \approx \int_0^1 Q(u) du$$

where $Q(u)$ is a spline function interpolating the percentile values.

Variance: Approximated numerically.

Probability density function (p.d.f): Approximated numerically using kernel density estimation from generated samples.

Cumulative distribution function (c.d.f): Defined by linear interpolation:

$$F(t) = \begin{cases} p_1/100 & \text{if } t < x_1 \\ p_i/100 + \frac{(t - x_i)(p_{i+1} - p_i)}{100(x_{i+1} - x_i)} & \text{if } x_i \le t < x_{i+1} \\ p_n/100 & \text{if } t \ge x_n \end{cases}$$

Quantile function: Defined by linear interpolation:

$$Q(u) = x_i + \frac{(100u - p_i)(x_{i+1} - x_i)}{p_{i+1} - p_i}$$

for $p_i/100 \le u \le p_{i+1}/100$.
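
The two interpolation formulas above can be reproduced with stats::approx(). The values and percentiles below are made up for illustration, and perc_cdf and perc_quantile are hypothetical helper names rather than package functions.

```r
x   <- c(-1.5, 0, 2)   # values (hypothetical)
pct <- c(10, 50, 90)   # corresponding percentiles

# c.d.f.: interpolate percentile / 100 against x; rule = 2 holds the end values
# constant outside the range, matching the first and last cases of F(t)
perc_cdf <- function(t) stats::approx(x, pct / 100, xout = t, rule = 2)$y

# Quantile function: the same interpolation with the roles of x and percentile swapped
perc_quantile <- function(u) stats::approx(pct / 100, x, xout = u)$y

perc_cdf(1)         # 0.5 + (1 - 0) * (90 - 50) / (100 * (2 - 0)) = 0.7
perc_quantile(0.7)  # 0 + (70 - 50) * (2 - 0) / (90 - 50) = 1
```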

Examples

dist <- dist_normal()
percentiles <- seq(0.01, 0.99, by = 0.01)
x <- vapply(percentiles, quantile, double(1L), x = dist)
dist_percentile(list(x), list(percentiles*100))

The Poisson Distribution

Description

[Stable]

Poisson distributions are frequently used to model counts. The Poisson distribution is commonly used to model the number of events occurring in a fixed interval of time or space when these events occur with a known constant mean rate and independently of the time since the last event. Examples include the number of emails received per hour, the number of decay events per second from a radioactive source, or the number of customers arriving at a store per day.

Usage

dist_poisson(lambda)

Arguments

lambda

The rate parameter (mean and variance) of the distribution. Can be any positive number. This represents the expected number of events in the given interval.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_poisson.html

In the following, let $X$ be a Poisson random variable with parameter lambda = $\lambda$.

Support: $\{0, 1, 2, 3, ...\}$

Mean: $\lambda$

Variance: $\lambda$

Probability mass function (p.m.f):

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

Cumulative distribution function (c.d.f):

$$P(X \le k) = e^{-\lambda} \sum_{i = 0}^{\lfloor k \rfloor} \frac{\lambda^i}{i!}$$
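
The floor in the upper summation limit means the c.d.f. is a step function. A short base-R sketch with assumed values shows the sum evaluated at a non-integer point:

```r
lambda <- 4
k      <- 4.7   # non-integer on purpose; the sum runs up to floor(k) = 4

i <- 0:floor(k)
cdf_sum <- exp(-lambda) * sum(lambda^i / factorial(i))

# Compare with the built-in c.d.f.
c(cdf_sum, stats::ppois(k, lambda))
```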

Moment generating function (m.g.f):

$$E(e^{tX}) = e^{\lambda (e^t - 1)}$$

Skewness:

$$\gamma_1 = \frac{1}{\sqrt{\lambda}}$$

Excess kurtosis:

$$\gamma_2 = \frac{1}{\lambda}$$

See Also

stats::Poisson

Examples

dist <- dist_poisson(lambda = c(1, 4, 10))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Poisson-Inverse Gaussian distribution

Description

[Stable]

The Poisson-Inverse Gaussian distribution is a compound Poisson distribution where the rate parameter follows an Inverse Gaussian distribution. It is useful for modeling overdispersed count data.

Usage

dist_poisson_inverse_gaussian(mean, shape)

Arguments

mean, shape

parameters. Must be strictly positive. Infinite values are supported.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_poisson_inverse_gaussian.html

In the following, let $X$ be a Poisson-Inverse Gaussian random variable with parameters mean = $\mu$ and shape = $\phi$.

Support: $\{0, 1, 2, 3, ...\}$

Mean: $\mu$

Variance: $\frac{\mu}{\phi}(\mu^2 + \phi)$

Probability mass function (p.m.f):

$$P(X = x) = \frac{e^{\phi}}{\sqrt{2\pi}} \left(\frac{\phi}{\mu^2}\right)^{x/2} \frac{1}{x!} \int_0^\infty u^{x-1/2} \exp\left(-\frac{\phi u}{2} - \frac{\phi}{2\mu^2 u}\right) du$$

for $x = 0, 1, 2, \ldots$

Cumulative distribution function (c.d.f):

$$P(X \le x) = \sum_{k=0}^{\lfloor x \rfloor} P(X = k)$$

The c.d.f. does not have a closed form and is approximated numerically.

Moment generating function (m.g.f):

$$E(e^{tX}) = \exp\left\{\phi\left[1 - \sqrt{1 - \frac{2\mu^2}{\phi}(e^t - 1)}\right]\right\}$$

for $t < -\log(1 + \phi/(2\mu^2))$

See Also

actuar::PoissonInverseGaussian, actuar::dpoisinvgauss(), actuar::ppoisinvgauss(), actuar::qpoisinvgauss(), actuar::rpoisinvgauss()

Examples

dist <- dist_poisson_inverse_gaussian(mean = rep(0.1, 3), shape = c(0.4, 0.8, 1))
dist


mean(dist)
variance(dist)
support(dist)
generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

Sampling distribution

Description

[Stable]

The sampling distribution represents an empirical distribution based on observed samples. It is useful for bootstrapping, representing posterior distributions from Markov Chain Monte Carlo (MCMC) algorithms, or working with any empirical data where the parametric form is unknown. Unlike parametric distributions, the sampling distribution makes no assumptions about the underlying data-generating process and instead uses the sample itself to estimate distributional properties. The distribution can handle both univariate and multivariate samples.

Usage

dist_sample(x)

Arguments

x

A list of sampled values. For univariate distributions, each element should be a numeric vector. For multivariate distributions, each element should be a matrix where columns represent variables and rows represent observations.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_sample.html

In the following, let $X$ be a random variable with sample $x_1, x_2, \ldots, x_n$ of size $n$.

Support: The observed range of the sample

Mean (univariate):

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Mean (multivariate): Computed independently for each variable.

Variance (univariate):

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Covariance (multivariate): The sample covariance matrix.

Skewness (univariate):

$$g_1 = \frac{\sqrt{n} \sum_{i=1}^{n} (x_i - \bar{x})^3}{\left(\sum_{i=1}^{n} (x_i - \bar{x})^2\right)^{3/2}} \left(1 - \frac{1}{n}\right)^{3/2}$$

Probability density function: Approximated numerically using kernel density estimation.

Cumulative distribution function (univariate):

$$F(q) = \frac{1}{n} \sum_{i=1}^{n} I(x_i \leq q)$$

where $I(\cdot)$ is the indicator function.

Cumulative distribution function (multivariate):

$$F(\mathbf{q}) = \frac{1}{n} \sum_{i=1}^{n} I(\mathbf{x}_i \leq \mathbf{q})$$

where the inequality is applied element-wise.

Quantile function (univariate): The sample quantile, computed using the specified quantile type (see stats::quantile()).

Quantile function (multivariate): Marginal quantiles are computed independently for each variable.

Random generation: Bootstrap sampling with replacement from the empirical sample.
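
The univariate formulas above are straightforward to reproduce in base R. This is a sketch on a simulated sample, with variable names chosen for illustration; it is not the package's implementation.

```r
set.seed(1)
x <- rnorm(100)
n <- length(x)

# Empirical c.d.f.: proportion of sample values <= q
ecdf_at <- function(q) mean(x <= q)

# Skewness as written above
d  <- x - mean(x)
g1 <- sqrt(n) * sum(d^3) / sum(d^2)^(3 / 2) * (1 - 1 / n)^(3 / 2)

# Random generation: bootstrap resampling with replacement
draws <- sample(x, size = 10, replace = TRUE)
```

Algebraically, this skewness estimate equals the mean cubed deviation divided by the cube of the sample standard deviation (the one with the $n - 1$ denominator).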

See Also

stats::density(), stats::quantile(), stats::cov()

Examples

# Univariate numeric samples
dist <- dist_sample(x = list(rnorm(100), rnorm(100, 10)))

dist
mean(dist)
variance(dist)
skewness(dist)
generate(dist, 10)

density(dist, 1)

# Multivariate numeric samples
dist <- dist_sample(x = list(cbind(rnorm(100), rnorm(100, 10))))
dimnames(dist) <- c("x", "y")

dist
mean(dist)
variance(dist)
generate(dist, 10)
quantile(dist, 0.4) # Returns the marginal quantiles
cdf(dist, matrix(c(0.3,9), nrow = 1))

The (non-central) location-scale Student t Distribution

Description

[Stable]

The Student's T distribution is closely related to the Normal distribution (see dist_normal()), but has heavier tails. As $\nu$ increases to $\infty$, the Student's T converges to a Normal. The T distribution appears repeatedly throughout classic frequentist hypothesis testing when comparing group means.

Usage

dist_student_t(df, mu = 0, sigma = 1, ncp = NULL)

Arguments

df

degrees of freedom ($> 0$, maybe non-integer). df = Inf is allowed.

mu

The location parameter of the distribution. If ncp == 0 (or NULL), this is the median.

sigma

The scale parameter of the distribution.

ncp

non-centrality parameter $\delta$; currently, except for rt(), accurate only for abs(ncp) <= 37.62. If omitted, use the central t distribution.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_student_t.html

In the following, let $X$ be a location-scale Student's T random variable with df = $\nu$, mu = $\mu$, sigma = $\sigma$, and ncp = $\delta$ (non-centrality parameter).

If $Z$ follows a standard Student's T distribution (with df = $\nu$ and ncp = $\delta$), then $X = \mu + \sigma Z$.

Support: $R$, the set of all real numbers

Mean:

For the central distribution (ncp = 0 or NULL):

$$E(X) = \mu$$

for $\nu > 1$, and undefined otherwise.

For the non-central distribution (ncp $\neq$ 0):

$$E(X) = \mu + \delta \sqrt{\frac{\nu}{2}} \frac{\Gamma((\nu-1)/2)}{\Gamma(\nu/2)} \sigma$$

for $\nu > 1$, and undefined otherwise.

Variance:

For the central distribution (ncp = 0 or NULL):

$$\mathrm{Var}(X) = \frac{\nu}{\nu - 2} \sigma^2$$

for $\nu > 2$. Undefined if $\nu \le 1$, infinite when $1 < \nu \le 2$.

For the non-central distribution (ncp $\neq$ 0):

$$\mathrm{Var}(X) = \left[\frac{\nu(1+\delta^2)}{\nu-2} - \left(\delta \sqrt{\frac{\nu}{2}} \frac{\Gamma((\nu-1)/2)}{\Gamma(\nu/2)}\right)^2\right] \sigma^2$$

for $\nu > 2$. Undefined if $\nu \le 1$, infinite when $1 < \nu \le 2$.

Probability density function (p.d.f):

For the central distribution (ncp = 0 or NULL), the standard t distribution with df = $\nu$ has density:

$$f_Z(z) = \frac{\Gamma((\nu + 1)/2)}{\sqrt{\pi \nu} \Gamma(\nu/2)} \left(1 + \frac{z^2}{\nu} \right)^{- (\nu + 1)/2}$$

The location-scale version with mu = $\mu$ and sigma = $\sigma$ has density:

$$f(x) = \frac{1}{\sigma} f_Z\left(\frac{x - \mu}{\sigma}\right)$$

For the non-central distribution (ncp $\neq$ 0), the density is computed numerically via stats::dt().

Cumulative distribution function (c.d.f):

For the central distribution (ncp = 0 or NULL), the cumulative distribution function is computed numerically via stats::pt(), which uses the relationship to the incomplete beta function:

$$F_\nu(t) = \frac{1}{2} I_x\left(\frac{\nu}{2}, \frac{1}{2}\right)$$

for $t \le 0$, where $x = \nu/(\nu + t^2)$ and $I_x(a,b)$ is the incomplete beta function (stats::pbeta()). For $t \ge 0$:

$$F_\nu(t) = 1 - \frac{1}{2} I_x\left(\frac{\nu}{2}, \frac{1}{2}\right)$$

The location-scale version is: $F(x) = F_\nu((x - \mu)/\sigma)$.
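
The incomplete beta relationship and the location-scale step can be checked with a few lines of base R. The helper names t_cdf and ls_cdf and the parameter values are assumptions for this sketch.

```r
nu    <- 5
mu    <- 2
sigma <- 3

# Central t c.d.f. through the regularized incomplete beta function
t_cdf <- function(t, nu) {
  x    <- nu / (nu + t^2)
  half <- 0.5 * stats::pbeta(x, nu / 2, 1 / 2)
  ifelse(t <= 0, half, 1 - half)
}

# Location-scale version: standardize, then apply the standard c.d.f.
ls_cdf <- function(x) t_cdf((x - mu) / sigma, nu)

c(ls_cdf(4), stats::pt((4 - mu) / sigma, df = nu))
```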

For the non-central distribution (ncp $\neq$ 0), the cumulative distribution function is computed numerically via stats::pt().

Moment generating function (m.g.f):

Does not exist in closed form. Moments are computed using the formulas for mean and variance above where available.

See Also

stats::TDist

Examples

dist <- dist_student_t(df = c(1,2,5), mu = c(0,1,2), sigma = c(1,2,3))

dist
mean(dist)
variance(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Studentized Range distribution

Description

[Stable]

Tukey's studentized range distribution, used for Tukey's honestly significant differences test in ANOVA.

Usage

dist_studentized_range(nmeans, df, nranges)

Arguments

nmeans

sample size for range (same for each group).

df

degrees of freedom for $s$ (see below).

nranges

number of groups whose maximum range is considered.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_studentized_range.html

In the following, let $Q$ be a Studentized Range random variable with parameters nmeans = $k$ (number of groups), df = $\nu$ (degrees of freedom), and nranges = $n$ (number of ranges).

Support: $R^+$, the set of positive real numbers.

Mean: Approximated numerically.

Variance: Approximated numerically.

Probability density function (p.d.f): The density does not have a closed-form expression and is computed numerically.

Cumulative distribution function (c.d.f): The c.d.f. does not have a simple closed-form expression. For $n = 1$ (single range), it involves integration over the joint distribution of the sample range and an independent chi-square variable. The general form is computed numerically using algorithms described in the references for stats::ptukey().

Moment generating function (m.g.f): Does not exist in closed form.

See Also

stats::Tukey

Examples

dist <- dist_studentized_range(nmeans = c(6, 2), df = c(5, 4), nranges = c(1, 1))

dist

cdf(dist, 4)

quantile(dist, 0.7)

Modify a distribution with a transformation

Description

[Maturing]

A transformed distribution applies a monotonic transformation to an existing distribution. This is useful for creating derived distributions such as log-normal (exponential transformation of normal), or other custom transformations of base distributions.

The density(), mean(), and variance() methods are approximate as they are based on numerical derivatives.

Usage

dist_transformed(dist, transform, inverse)

Arguments

dist

A univariate distribution vector.

transform

A function used to transform the distribution. This transformation should be monotonic over the appropriate domain.

inverse

The inverse of the transform function.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_transformed.html

Let $Y = g(X)$ where $X$ is the base distribution with transformation function transform = $g$ and inverse = $g^{-1}$. The transformation $g$ must be monotonic over the support of $X$.

Support: $g(S_X)$ where $S_X$ is the support of $X$

Mean: Approximated numerically using a second-order Taylor expansion:

$$E(Y) \approx g(\mu_X) + \frac{1}{2}g''(\mu_X)\sigma_X^2$$

where $\mu_X$ and $\sigma_X^2$ are the mean and variance of the base distribution $X$, and $g''$ is the second derivative of the transformation. The derivative is computed numerically using numDeriv::hessian().

Variance: Approximated numerically using the delta method:

$$\mathrm{Var}(Y) \approx [g'(\mu_X)]^2\sigma_X^2 + \frac{1}{2}[g''(\mu_X)\sigma_X^2]^2$$

where $g'$ is the first derivative (Jacobian) computed numerically using numDeriv::jacobian().

Probability density function (p.d.f): Using the change of variables formula:

$$f_Y(y) = f_X(g^{-1}(y)) \left|\frac{d}{dy}g^{-1}(y)\right|$$

where $f_X$ is the p.d.f. of the base distribution and the Jacobian $|d/dy \, g^{-1}(y)|$ is computed numerically using numDeriv::jacobian().

Cumulative distribution function (c.d.f):

For monotonically increasing $g$:

$$F_Y(y) = F_X(g^{-1}(y))$$

For monotonically decreasing $g$:

$$F_Y(y) = 1 - F_X(g^{-1}(y))$$

where $F_X$ is the c.d.f. of the base distribution.

Quantile function: The inverse of the c.d.f.

For monotonically increasing $g$:

$$Q_Y(p) = g(Q_X(p))$$

For monotonically decreasing $g$:

$$Q_Y(p) = g(Q_X(1-p))$$

where $Q_X$ is the quantile function of the base distribution.
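
For the log-normal case (base Normal, $g = \exp$), the formulas above can be verified against the built-in log-normal functions. In this sketch the derivatives are written analytically rather than computed with numDeriv, which is a simplification for illustration; the function names are hypothetical.

```r
mu_x <- 0
sd_x <- 0.5
g     <- exp
g_inv <- log

# p.d.f. by change of variables; d/dy log(y) = 1 / y is supplied analytically here
f_y <- function(y) dnorm(g_inv(y), mu_x, sd_x) * abs(1 / y)

# c.d.f. and quantile function for a monotonically increasing transformation
F_y <- function(y) pnorm(g_inv(y), mu_x, sd_x)
Q_y <- function(p) g(qnorm(p, mu_x, sd_x))

# Second-order Taylor approximation of the mean; for g = exp, g'' = exp
mean_taylor <- g(mu_x) + 0.5 * exp(mu_x) * sd_x^2   # 1.125
mean_exact  <- exp(mu_x + sd_x^2 / 2)               # exact log-normal mean, about 1.133
```

The small gap between mean_taylor and mean_exact illustrates why the mean() and variance() methods are described as approximate.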

See Also

numDeriv::jacobian(), numDeriv::hessian()

Examples

# Create a log normal distribution
dist <- dist_transformed(dist_normal(0, 0.5), exp, log)
density(dist, 1) # dlnorm(1, 0, 0.5)
cdf(dist, 4) # plnorm(4, 0, 0.5)
quantile(dist, 0.1) # qlnorm(0.1, 0, 0.5)
generate(dist, 10) # rlnorm(10, 0, 0.5)

Truncate a distribution

Description

[Stable]

Note that the samples are generated using inverse transform sampling, and the means and variances are estimated from samples.

Usage

dist_truncated(dist, lower = -Inf, upper = Inf)

Arguments

dist

The distribution(s) to truncate.

lower, upper

The range of values to keep from a distribution.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_truncated.html

In the following, let $X$ be a truncated random variable with underlying distribution $Y$, truncation bounds lower = $a$ and upper = $b$, where $F_Y(x)$ is the c.d.f. of $Y$ and $f_Y(x)$ is the p.d.f. of $Y$.

Support: $[a, b]$

Mean: For the general case, the mean is approximated numerically. For a truncated Normal distribution with underlying mean $\mu$ and standard deviation $\sigma$, the mean is:

$$E(X) = \mu + \frac{\phi(\alpha) - \phi(\beta)}{\Phi(\beta) - \Phi(\alpha)} \sigma$$

where $\alpha = (a - \mu)/\sigma$, $\beta = (b - \mu)/\sigma$, $\phi$ is the standard Normal p.d.f., and $\Phi$ is the standard Normal c.d.f.

Variance: Approximated numerically for all distributions.

Probability density function (p.d.f):

$$f(x) = \begin{cases} \frac{f_Y(x)}{F_Y(b) - F_Y(a)} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

Cumulative distribution function (c.d.f):

$$F(x) = \begin{cases} 0 & \text{if } x < a \\ \frac{F_Y(x) - F_Y(a)}{F_Y(b) - F_Y(a)} & \text{if } a \le x \le b \\ 1 & \text{if } x > b \end{cases}$$

Quantile function:

$$Q(p) = F_Y^{-1}(F_Y(a) + p(F_Y(b) - F_Y(a)))$$

clamped to the interval $[a, b]$.
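
The truncated Normal formulas above can be sketched in base R as follows. The parameter values mirror the Examples section (a Normal with mean 2 and standard deviation 1, truncated below at 0); the function names are assumptions for this example.

```r
mu <- 2; sigma <- 1
a  <- 0; b <- Inf

Fa <- pnorm(a, mu, sigma)
Fb <- pnorm(b, mu, sigma)

# Rescaled density, zero outside [a, b]
trunc_pdf <- function(x) ifelse(x >= a & x <= b, dnorm(x, mu, sigma) / (Fb - Fa), 0)

# Rescaled c.d.f., clamped to [0, 1]
trunc_cdf <- function(x) pmin(pmax((pnorm(x, mu, sigma) - Fa) / (Fb - Fa), 0), 1)

# Quantile function, clamped to [a, b]
trunc_quantile <- function(p) pmin(pmax(qnorm(Fa + p * (Fb - Fa), mu, sigma), a), b)

# Closed-form mean of the truncated Normal
alpha <- (a - mu) / sigma
beta  <- (b - mu) / sigma
trunc_mean <- mu + (dnorm(alpha) - dnorm(beta)) / (pnorm(beta) - pnorm(alpha)) * sigma
```

Because the quantile function maps uniform draws to the truncated support, passing runif() output through trunc_quantile() is one way to realise inverse transform sampling.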

Examples

dist <- dist_truncated(dist_normal(2,1), lower = 0)

dist
mean(dist)
variance(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

if(requireNamespace("ggdist")) {
library(ggplot2)
ggplot() +
  ggdist::stat_dist_halfeye(
    aes(y = c("Normal", "Truncated"),
        dist = c(dist_normal(2,1), dist_truncated(dist_normal(2,1), lower = 0)))
  )
}

The Uniform distribution

Description

[Stable]

A distribution with constant density on an interval.

Usage

dist_uniform(min, max)

Arguments

min, max

lower and upper limits of the distribution. Must be finite.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_uniform.html

In the following, let $X$ be a Uniform random variable with parameters min = $a$ and max = $b$.

Support: $[a, b]$

Mean: $\frac{a + b}{2}$

Variance: $\frac{(b - a)^2}{12}$

Probability density function (p.d.f):

$$f(x) = \frac{1}{b - a}$$

for $x \in [a, b]$, and $f(x) = 0$ otherwise.

Cumulative distribution function (c.d.f):

$$F(x) = \frac{x - a}{b - a}$$

for $x \in [a, b]$, with $F(x) = 0$ for $x < a$ and $F(x) = 1$ for $x > b$.

Moment generating function (m.g.f):

$$E(e^{tX}) = \frac{e^{tb} - e^{ta}}{t(b - a)}$$

for $t \neq 0$, and $E(e^{tX}) = 1$ for $t = 0$.

Skewness: $0$

Excess Kurtosis: $-\frac{6}{5}$

See Also

stats::Uniform
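
The closed-form moments listed in the details can be cross-checked with base R alone. The sketch below uses min = 3 and max = 5 and compares the formulas with numerical integration of `stats::dunif()`; the helper `central_moment()` is a hypothetical name introduced for illustration.

```r
a <- 3; b <- 5  # min and max of the Uniform distribution

# Closed-form mean and variance
unif_mean <- (a + b) / 2
unif_var  <- (b - a)^2 / 12

# k-th central moment by numerical integration of the density (illustrative helper)
central_moment <- function(k) {
  integrate(function(x) (x - unif_mean)^k * dunif(x, a, b), lower = a, upper = b)$value
}

num_var    <- central_moment(2)
num_skew   <- central_moment(3) / num_var^1.5
num_exkurt <- central_moment(4) / num_var^2 - 3

# Moment generating function at tt = 0.5: closed form versus integration
tt <- 0.5
mgf_closed  <- (exp(tt * b) - exp(tt * a)) / (tt * (b - a))
mgf_numeric <- integrate(function(x) exp(tt * x) * dunif(x, a, b),
                         lower = a, upper = b)$value
```

The integrated variance, skewness, and excess kurtosis should match the closed-form values up to numerical error.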

Examples

dist <- dist_uniform(min = c(3, -2), max = c(5, 4))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

The Weibull distribution

Description

[Stable]

Generalization of the exponential distribution (a shape of 1 gives the exponential). Often used in survival and time-to-event analyses.

Usage

dist_weibull(shape, scale)

Arguments

shape, scale

Shape and scale parameters.

Details

We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_weibull.html

In the following, let $X$ be a Weibull random variable with shape parameter shape = $k$ and scale parameter scale = $\lambda$.

Support: $[0, \infty)$

Mean:

$$E(X) = \lambda \Gamma\left(1 + \frac{1}{k}\right)$$

where $\Gamma$ is the gamma function.

Variance:

$$\text{Var}(X) = \lambda^2 \left[\Gamma\left(1 + \frac{2}{k}\right) - \left(\Gamma\left(1 + \frac{1}{k}\right)\right)^2\right]$$

Probability density function (p.d.f):

$$f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^k}, \quad x \ge 0$$

Cumulative distribution function (c.d.f):

$$F(x) = 1 - e^{-(x/\lambda)^k}, \quad x \ge 0$$

Moment generating function (m.g.f):

$$E(e^{tX}) = \sum_{n=0}^\infty \frac{t^n\lambda^n}{n!} \Gamma\left(1+\frac{n}{k}\right)$$

Skewness:

$$\gamma_1 = \frac{\mu_3 - 3\mu\sigma^2 - \mu^3}{\sigma^3}$$

where $\mu = E(X)$, $\sigma^2 = \text{Var}(X)$, and the third raw moment $\mu_3 = E(X^3)$ is

$$\mu_3 = \lambda^3 \Gamma\left(1 + \frac{3}{k}\right)$$

Excess Kurtosis:

$$\gamma_2 = \frac{\mu_4 - 4\gamma_1\mu\sigma^3 - 6\mu^2\sigma^2 - \mu^4}{\sigma^4} - 3$$

where the fourth raw moment $\mu_4 = E(X^4)$ is

$$\mu_4 = \lambda^4 \Gamma\left(1 + \frac{4}{k}\right)$$

See Also

stats::Weibull
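
The skewness and excess kurtosis expressions in the details are built from the raw moments E(X^n) = lambda^n * Gamma(1 + n/k). The base R sketch below evaluates them for shape 1.5 and scale 1 and cross-checks the result against numerical integration of `stats::dweibull()`. The helpers `raw_moment()` and `central_moment()` are hypothetical names used only for this illustration.

```r
k <- 1.5; lambda <- 1  # shape and scale

# n-th raw moment E(X^n) of the Weibull distribution (illustrative helper)
raw_moment <- function(n) lambda^n * gamma(1 + n / k)

mu     <- raw_moment(1)
sigma2 <- raw_moment(2) - mu^2
sigma  <- sqrt(sigma2)

# Skewness and excess kurtosis from the raw moments
skew   <- (raw_moment(3) - 3 * mu * sigma2 - mu^3) / sigma^3
exkurt <- (raw_moment(4) - 4 * skew * mu * sigma^3 - 6 * mu^2 * sigma2 - mu^4) / sigma^4 - 3

# Cross-check: central moments by numerical integration of the density
central_moment <- function(n) {
  integrate(function(x) (x - mu)^n * dweibull(x, shape = k, scale = lambda),
            lower = 0, upper = Inf)$value
}
skew_numeric   <- central_moment(3) / sigma^3
exkurt_numeric <- central_moment(4) / sigma^4 - 3
```

Setting `k <- 1` reduces the Weibull to an exponential distribution, for which the same formulas give a skewness of 2 and an excess kurtosis of 6.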

Examples

dist <- dist_weibull(shape = c(0.5, 1, 1.5, 5), scale = rep(1, 4))

dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

cdf(dist, 4)

quantile(dist, 0.7)

Create a distribution from p/d/q/r style functions

Description

[Maturing]

If a distribution is not yet supported, you can vectorise p/d/q/r functions using this function. dist_wrap() stores the distribution's parameters, and provides wrappers which call the appropriate p/d/q/r functions.

Using this function to wrap a distribution should only be done if the distribution is not yet available in this package. If you need a distribution which isn't in the package yet, consider making a request at https://github.com/mitchelloharawild/distributional/issues.

Usage

dist_wrap(dist, ..., package = NULL)

Arguments

dist

The name of the distribution used in the functions (the name that is prefixed by p/d/q/r).

...

Named arguments used to parameterise the distribution.

package

The package from which the distribution is provided. If NULL, the calling environment's search path is used to find the distribution functions. Alternatively, an arbitrary environment can be provided here.

Details

The dist_wrap() function provides a generic interface to create distribution objects from any set of p/d/q/r style functions. The statistical properties depend on the specific distribution being wrapped.
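
To make the p/d/q/r naming convention concrete, the base R sketch below looks up a function by its prefix and distribution name and calls it with a stored parameter list. The helper `call_pdqr()` is hypothetical and only illustrates the idea; it is not how dist_wrap() is implemented.

```r
# Hypothetical helper, not the package's implementation: find the p/d/q/r
# function for a distribution name and call it with stored parameters.
call_pdqr <- function(prefix, dist, x, params, env = asNamespace("stats")) {
  fn <- get(paste0(prefix, dist), envir = env, mode = "function")
  do.call(fn, c(list(x), params))
}

params <- list(mean = 1, sd = 3)

dens <- call_pdqr("d", "norm", 1, params)      # same as dnorm(1, mean = 1, sd = 3)
prob <- call_pdqr("p", "norm", 4, params)      # same as pnorm(4, mean = 1, sd = 3)
quan <- call_pdqr("q", "norm", 0.975, params)  # same as qnorm(0.975, mean = 1, sd = 3)
```

Passing a different environment as `env` mirrors the idea of looking the functions up in another package's namespace.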

Examples

dist <- dist_wrap("norm", mean = 1:3, sd = c(3, 9, 2))

density(dist, 1) # dnorm()
cdf(dist, 4) # pnorm()
quantile(dist, 0.975) # qnorm()
generate(dist, 10) # rnorm()

library(actuar)
dist <- dist_wrap("invparalogis", package = "actuar", shape = 2, rate = 2)
density(dist, 1) # actuar::dinvparalogis()
cdf(dist, 4) # actuar::pinvparalogis()
quantile(dist, 0.975) # actuar::qinvparalogis()
generate(dist, 10) # actuar::rinvparalogis()

Extract the name of the distribution family

Description

[Experimental]

Usage

## S3 method for class 'distribution'
family(object, ...)

Arguments

object

The distribution(s).

...

Additional arguments used by methods.

Examples

dist <- c(
  dist_normal(1:2),
  dist_poisson(3),
  dist_multinomial(
    size = c(4, 3),
    prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4))
  )
)
family(dist)

Randomly sample values from a distribution

Description

[Stable]

Generate random samples from probability distributions.

Usage

## S3 method for class 'distribution'
generate(x, times, ...)

Arguments

x

The distribution(s).

times

The number of samples.

...

Additional arguments used by methods.


Check if a distribution is symmetric

Description

[Experimental]

Determines whether a probability distribution is symmetric around its center.

Usage

has_symmetry(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.

Value

A logical value indicating whether the distribution is symmetric.

Examples

# Normal distribution is symmetric
has_symmetry(dist_normal(mu = 0, sigma = 1))
has_symmetry(dist_normal(mu = 5, sigma = 2))

# Beta distribution symmetry depends on parameters
has_symmetry(dist_beta(shape1 = 2, shape2 = 2))  # symmetric
has_symmetry(dist_beta(shape1 = 2, shape2 = 5))  # not symmetric

Compute highest density regions

Description

Used to extract the highest density region of a specified size from a distribution.

Usage

hdr(x, ...)

Arguments

x

Object to create hdr from.

...

Additional arguments used by methods.


Highest density regions of probability distributions

Description

[Maturing]

This function is highly experimental and will change in the future. In particular, improved functionality for object classes and visualisation tools will be added in a future release.

Computes minimally sized probability intervals (highest density regions).

Usage

## S3 method for class 'distribution'
hdr(x, size = 95, n = 512, ...)

Arguments

x

The distribution(s).

size

The size of the interval (between 0 and 100).

n

The resolution used to estimate the distribution's density.

...

Additional arguments used by methods.
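
For intuition, a highest density region can be approximated on a grid: evaluate the density at n points, keep the points with the highest density until they hold the requested share of probability mass, and merge neighbouring points into intervals. The base R sketch below does this for a two-component Normal mixture chosen for illustration. It is a simplified sketch under these assumptions, not the algorithm used by the package, and the grid range is picked by hand.

```r
# Illustrative bimodal density: 0.5 * N(-2, 1) + 0.5 * N(3, 0.5)
dens <- function(x) 0.5 * dnorm(x, -2, 1) + 0.5 * dnorm(x, 3, 0.5)

size <- 95   # coverage in percent
n <- 512     # grid resolution
grid <- seq(-7, 6, length.out = n)
f <- dens(grid)
step <- grid[2] - grid[1]

# Keep the highest-density grid points until they hold `size`% of the mass
ord <- order(f, decreasing = TRUE)
cum_mass <- cumsum(f[ord]) * step
keep <- ord[seq_len(which(cum_mass >= size / 100)[1])]
inside <- seq_len(n) %in% keep

# Collapse the retained grid points into contiguous intervals
runs <- rle(inside)
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
lower <- grid[starts[runs$values]]
upper <- grid[ends[runs$values]]
```

Because the density has two well-separated modes, the region consists of two disjoint intervals, which is why the bounds are vectors rather than single numbers.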


Compute intervals

Description

[Stable]

Used to extract a specified prediction interval at a particular confidence level from a distribution.

The numeric lower and upper bounds can be extracted from the interval using <hilo>$lower and <hilo>$upper as shown in the examples below.

Usage

hilo(x, ...)

Arguments

x

Object to create hilo from.

...

Additional arguments used by methods.

Examples

# 95% interval from a standard normal distribution
interval <- hilo(dist_normal(0, 1), 95)
interval

# Extract the individual quantities with `$lower`, `$upper`, and `$level`
interval$lower
interval$upper
interval$level

Probability intervals of a probability distribution

Description

[Stable]

Returns a hilo central probability interval with probability coverage of size. By default, the distribution's quantile() method is used to compute the lower and upper bounds of a centered interval.

Usage

## S3 method for class 'distribution'
hilo(x, size = 95, ...)

Arguments

x

The distribution(s).

size

The size of the interval (between 0 and 100).

...

Additional arguments used by methods.

See Also

hdr.distribution()
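
A central interval of a given size leaves equal probability in each tail, so its bounds are two quantiles. The base R sketch below computes a 95% central interval for a standard Normal with `stats::qnorm()`; it illustrates the calculation only and does not create a hilo object.

```r
size <- 95                        # coverage in percent
tail_prob <- (1 - size / 100) / 2 # probability left in each tail

# Central interval for a standard Normal from its quantile function
lower <- qnorm(tail_prob)
upper <- qnorm(1 - tail_prob)

# Probability mass between the bounds
coverage <- pnorm(upper) - pnorm(lower)
```

For a symmetric, unimodal distribution such as the Normal, this central interval coincides with the highest density region of the same size.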


Test if the object is a distribution

Description

[Stable]

This function returns TRUE for distributions and FALSE for all other objects.

Usage

is_distribution(x)

Arguments

x

An object.

Value

TRUE if the object inherits from the distribution class.

Examples

dist <- dist_normal()
is_distribution(dist)
is_distribution("distributional")

Is the object a hdr

Description

Is the object a hdr

Usage

is_hdr(x)

Arguments

x

An object.


Is the object a hilo

Description

Is the object a hilo

Usage

is_hilo(x)

Arguments

x

An object.


Kurtosis of a probability distribution

Description

[Stable]

Usage

kurtosis(x, ...)

## S3 method for class 'distribution'
kurtosis(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.


The (log) likelihood of a sample matching a distribution

Description

[Stable]

Usage

likelihood(x, ...)

## S3 method for class 'distribution'
likelihood(x, sample, ..., log = FALSE)

log_likelihood(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.

sample

A list of sampled values to compare to distribution(s).

log

If TRUE, the log-likelihood will be computed.
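
Assuming the usual definition for independent observations, the likelihood of a sample is the product of the densities at each observed value, and the log-likelihood is the sum of the log densities. The base R sketch below shows the relationship for a Normal distribution; it is an illustration of the definition rather than the package's code path.

```r
set.seed(42)
obs <- rnorm(20, mean = 1, sd = 2)  # simulated sample

# Likelihood: product of densities under N(1, 2)
lik <- prod(dnorm(obs, mean = 1, sd = 2))

# Log-likelihood: sum of log densities, numerically safer for large samples
log_lik <- sum(dnorm(obs, mean = 1, sd = 2, log = TRUE))
```

For long samples the product can underflow to zero, which is the main reason to work on the log scale.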


Mean of a probability distribution

Description

[Stable]

Returns the empirical mean of the probability distribution. If the method does not exist, the mean of a random sample will be returned.

Usage

## S3 method for class 'distribution'
mean(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.


Median of a probability distribution

Description

[Stable]

Returns the median (50th percentile) of a probability distribution. This is equivalent to quantile(x, p=0.5).

Usage

## S3 method for class 'distribution'
median(x, na.rm = FALSE, ...)

Arguments

x

The distribution(s).

na.rm

Unused, included for consistency with the generic function.

...

Additional arguments used by methods.


Construct distributions

Description

[Maturing]

Allows extension package developers to define a new distribution class compatible with the distributional package.

Usage

new_dist(..., class = NULL, dimnames = NULL)

Arguments

...

Parameters of the distribution (named).

class

The class of the distribution for S3 dispatch.

dimnames

The names of the variables in the distribution (optional).


Construct hdr intervals

Description

Construct hdr intervals

Usage

new_hdr(
  lower = list_of(.ptype = double()),
  upper = list_of(.ptype = double()),
  size = double()
)

Arguments

lower, upper

A list of numeric vectors specifying the region's lower and upper bounds.

size

A numeric vector specifying the coverage size of the region.

Value

A "hdr" vector

Author(s)

Mitchell O'Hara-Wild

Examples

new_hdr(lower = list(1, c(3,6)), upper = list(10, c(5, 8)), size = c(80, 95))

Construct hilo intervals

Description

[Stable]

Class constructor function to help with manually creating hilo interval objects.

Usage

new_hilo(lower = double(), upper = double(), size = double())

Arguments

lower, upper

A numeric vector of values for lower and upper limits.

size

Size of the interval (between 0 and 100).

Value

A "hilo" vector

Author(s)

Earo Wang & Mitchell O'Hara-Wild

Examples

new_hilo(lower = rnorm(10), upper = rnorm(10) + 5, size = 95)

Construct support regions

Description

Construct support regions

Usage

new_support_region(x = numeric(), limits = list(), closed = list())

Arguments

x

A list of prototype vectors defining the distribution type.

limits

A list of value limits for the distribution.

closed

A list of logical(2L) indicating whether the limits are closed.


Extract the parameters of a distribution

Description

[Experimental]

Usage

parameters(x, ...)

## S3 method for class 'distribution'
parameters(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.

Examples

dist <- c(
  dist_normal(1:2),
  dist_poisson(3),
  dist_multinomial(
    size = c(4, 3),
    prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4))
  )
)
parameters(dist)

Distribution Quantiles

Description

[Stable]

Computes the quantiles of a distribution.

Usage

## S3 method for class 'distribution'
quantile(x, p, ..., log = FALSE)

Arguments

x

The distribution(s).

p

The probability of the quantile.

...

Additional arguments passed to methods.

log

If TRUE, probabilities will be given as log probabilities.


Skewness of a probability distribution

Description

[Stable]

Usage

skewness(x, ...)

## S3 method for class 'distribution'
skewness(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.


Region of support of a distribution

Description

[Experimental]

Usage

support(x, ...)

## S3 method for class 'distribution'
support(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.


Variance

Description

[Stable]

A generic function for computing the variance of an object.

Usage

variance(x, ...)

## S3 method for class 'numeric'
variance(x, ...)

## S3 method for class 'matrix'
variance(x, ...)

## S3 method for class 'numeric'
covariance(x, ...)

Arguments

x

An object.

...

Additional arguments used by methods.

Details

The implementation of variance() for numeric variables coerces the input to a vector then uses stats::var() to compute the variance. This means that, unlike stats::var(), if variance() is passed a matrix or a 2-dimensional array, it will still return the variance (stats::var() returns the covariance matrix in that case).
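
The difference described above can be seen with base R alone. The sketch below contrasts `stats::var()` applied to a matrix with `stats::var()` applied after coercing the same values to a vector, which is the coercion step the paragraph describes. It does not call variance() itself, so it makes no claim about how the package dispatches on matrix input.

```r
m <- matrix(c(1, 2, 3, 4, 5, 7), nrow = 3)

# stats::var() on a matrix returns the 2 x 2 covariance matrix of its columns
cov_matrix <- stats::var(m)

# Coercing to a vector first gives a single variance over all six values
pooled_var <- stats::var(as.vector(m))
```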

See Also

variance.distribution(), covariance()


Variance of a probability distribution

Description

[Stable]

Returns the empirical variance of the probability distribution. If the method does not exist, the variance of a random sample will be returned.

Usage

## S3 method for class 'distribution'
variance(x, ...)

Arguments

x

The distribution(s).

...

Additional arguments used by methods.