Title: | Vectorised Probability Distributions |
---|---|
Description: | Vectorised distribution objects with tools for manipulating, visualising, and using probability distributions. Designed to allow model prediction outputs to return distributions rather than their parameters, allowing users to directly interact with predictive distributions in a data-oriented workflow. In addition to providing generic replacements for p/d/q/r functions, other useful statistics can be computed including means, variances, intervals, and highest density regions. |
Authors: | Mitchell O'Hara-Wild [aut, cre] , Matthew Kay [aut] , Alex Hayes [aut] , Rob Hyndman [aut] , Earo Wang [ctb] , Vencislav Popov [ctb] |
Maintainer: | Mitchell O'Hara-Wild <[email protected]> |
License: | GPL-3 |
Version: | 0.5.0.9000 |
Built: | 2024-11-14 12:30:11 UTC |
Source: | https://github.com/mitchelloharawild/distributional |
cdf(x, q, ..., log = FALSE)

## S3 method for class 'distribution'
cdf(x, q, ...)
x |
The distribution(s). |
q |
The quantile at which the cdf is calculated. |
... |
Additional arguments passed to methods. |
log |
If TRUE, probabilities will be given as log probabilities. |
A generic function for computing the covariance of an object.
covariance(x, ...)
x |
An object. |
... |
Additional arguments used by methods. |
covariance.distribution(), variance()
Returns the empirical covariance of the probability distribution. If the method does not exist, the covariance of a random sample will be returned.
## S3 method for class 'distribution'
covariance(x, ...)
x |
The distribution(s). |
... |
Additional arguments used by methods. |
Computes the probability density function for a continuous distribution, or the probability mass function for a discrete distribution.
## S3 method for class 'distribution'
density(x, at, ..., log = FALSE)
x |
The distribution(s). |
at |
The point at which to compute the density/mass. |
... |
Additional arguments passed to methods. |
log |
If TRUE, probabilities will be given as log probabilities. |
Bernoulli distributions are used to represent events like coin flips
when there is a single trial that is either successful or unsuccessful.
The Bernoulli distribution is a special case of the Binomial()
distribution with n = 1.
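The special-case relationship can be checked numerically. The following is a minimal sketch using base R's `dbinom()` rather than this package; the value `p = 0.3` is an arbitrary choice for illustration.

```r
# Bernoulli pmf written out by hand: P(X = x) = p^x (1 - p)^(1 - x) for x in {0, 1}
p <- 0.3
x <- c(0, 1)
bernoulli_pmf <- p^x * (1 - p)^(1 - x)

# Binomial pmf with a single trial (n = 1)
binomial_pmf <- dbinom(x, size = 1, prob = p)

# Both vectors should agree up to floating point error
all.equal(bernoulli_pmf, binomial_pmf)
```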
dist_bernoulli(prob)
prob |
The probability of success on each trial, which can be any value in [0, 1]. |
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
In the following, let $X$ be a Bernoulli random variable with parameter
prob = $p$. Some textbooks also define $q = 1 - p$, or use $\pi$ instead of $p$.
The Bernoulli probability distribution is widely used to model
binary variables, such as 'failure' and 'success'. The most
typical example is the flip of a coin, where $p$ is thought of as the
probability of flipping a head, and $q = 1 - p$ is the
probability of flipping a tail.
Support: $\{0, 1\}$
Mean: $p$
Variance: $p \cdot (1 - p) = p \cdot q$
Probability mass function (p.m.f): $P(X = x) = p^x (1 - p)^{1 - x} = p^x q^{1 - x}$
Cumulative distribution function (c.d.f): $P(X \le x) = 0$ for $x < 0$, $1 - p$ for $0 \le x < 1$, and $1$ for $x \ge 1$
Moment generating function (m.g.f): $E(e^{tX}) = (1 - p) + p e^t$
dist <- dist_bernoulli(prob = c(0.05, 0.5, 0.3, 0.9, 0.1))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_beta(shape1, shape2)
shape1 , shape2
|
The non-negative shape parameters of the Beta distribution. |
dist <- dist_beta(shape1 = c(0.5, 5, 1, 2, 2), shape2 = c(0.5, 1, 3, 2, 5))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
Binomial distributions are used to represent situations that can
be thought of as the result of $n$ Bernoulli experiments (here $n$
is defined as the size of the experiment). The classical
example is $n$ independent coin flips, where each coin flip has
probability p of success. In this case, the individual probability of
flipping heads or tails is given by the Bernoulli(p) distribution,
and the probability of having $x$ equal results ($x$ heads,
for example) in $n$ trials is given by the Binomial(n, p) distribution.
The equation of the Binomial distribution is directly derived from
the equation of the Bernoulli distribution.
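One way to see this derivation is to build the distribution of a sum of independent Bernoulli trials by repeated convolution and compare it with the Binomial p.m.f. The sketch below uses only base R; `n = 5` and `p = 0.3` are arbitrary illustrative values.

```r
n <- 5
p <- 0.3
bernoulli_pmf <- c(1 - p, p)   # P(X = 0), P(X = 1)

# Start from the pmf of the constant 0 and add one Bernoulli trial at a time.
sum_pmf <- 1
for (i in seq_len(n)) {
  # A failure keeps the success count unchanged; a success shifts it up by one.
  sum_pmf <- c(sum_pmf * bernoulli_pmf[1], 0) + c(0, sum_pmf * bernoulli_pmf[2])
}

# sum_pmf[k + 1] is P(k successes in n trials); compare with dbinom()
all.equal(sum_pmf, dbinom(0:n, size = n, prob = p))
```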
dist_binomial(size, prob)
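size |
The number of trials. Must be an integer greater than or equal
to one. When size = 1, the Binomial distribution reduces to the Bernoulli distribution. |
prob |
The probability of success on each trial, which can be any value in [0, 1]. |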
The Binomial distribution comes up when you are interested in the proportion
of people who do a thing. The Binomial distribution
also comes up in the sign test, sometimes called the Binomial test
(see stats::binom.test()), where you may need the Binomial C.D.F. to
compute p-values.
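As a rough illustration of how the Binomial c.d.f. yields a sign-test p-value, the sketch below uses base R's `pbinom()` and compares the result with `stats::binom.test()`. The counts (7 positive signs out of 10) are hypothetical data invented for the example.

```r
# Hypothetical data: 7 of 10 paired differences are positive.
n_pos <- 7
n <- 10

# Under the null hypothesis each sign is positive with probability 0.5.
lower_tail <- pbinom(n_pos, size = n, prob = 0.5)          # P(X <= 7)
upper_tail <- 1 - pbinom(n_pos - 1, size = n, prob = 0.5)  # P(X >= 7)

# With prob = 0.5 the distribution is symmetric, so the two-sided p-value
# is twice the smaller tail probability (capped at 1).
p_value <- min(1, 2 * min(lower_tail, upper_tail))

all.equal(p_value, binom.test(n_pos, n, p = 0.5)$p.value)
```

The doubling shortcut relies on the symmetry at `prob = 0.5`; for other null probabilities `binom.test()` uses a different two-sided rule.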
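In the following, let $X$ be a Binomial random variable with parameter
size = $n$ and prob = $p$. Some textbooks define $q = 1 - p$,
or use $\pi$ instead of $p$.
Support: $\{0, 1, 2, \ldots, n\}$
Mean: $np$
Variance: $np \cdot (1 - p) = np \cdot q$
Probability mass function (p.m.f): $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$
Cumulative distribution function (c.d.f): $P(X \le k) = \sum_{i = 0}^{\lfloor k \rfloor} \binom{n}{i} p^i (1 - p)^{n - i}$
Moment generating function (m.g.f): $E(e^{tX}) = (1 - p + p e^t)^n$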
dist <- dist_binomial(size = 1:5, prob = c(0.05, 0.5, 0.3, 0.9, 0.1))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_burr(shape1, shape2, rate = 1, scale = 1/rate)
shape1 , shape2 , scale
|
parameters. Must be strictly positive. |
rate |
an alternative way to specify the scale. |
dist <- dist_burr(shape1 = c(1,1,1,2,3,0.5), shape2 = c(1,2,3,1,1,2))
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
Categorical distributions are used to represent events with multiple
outcomes, such as what number appears on the roll of a die. This is also
referred to as the 'generalised Bernoulli' or 'multinoulli' distribution.
The Categorical distribution is a special case of the Multinomial()
distribution with n = 1.
dist_categorical(prob, outcomes = NULL)
prob |
A list of probabilities of observing each outcome category. |
outcomes |
The values used to represent each outcome. |
In the following, let $X$ be a Categorical random variable with
probability parameters prob = $\{p_1, p_2, \ldots, p_k\}$.
The Categorical probability distribution is widely used to model the
occurrence of multiple events. A simple example is the roll of a die, where
$p = \{1/6, 1/6, 1/6, 1/6, 1/6, 1/6\}$ gives an equal chance of observing
each number on a 6-sided die.
Support: $\{1, \ldots, k\}$
Mean: $p_i$ for the indicator of category $i$
Variance: $p_i \cdot (1 - p_i)$ for the indicator of category $i$
Probability mass function (p.m.f): $P(X = i) = p_i$
Cumulative distribution function (c.d.f):
The cdf() of a categorical distribution is undefined as the outcome categories aren't ordered.
dist <- dist_categorical(prob = list(c(0.05, 0.5, 0.15, 0.2, 0.1), c(0.3, 0.1, 0.6)))
dist
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)

# The outcomes aren't ordered, so many statistics are not applicable.
cdf(dist, 4)
quantile(dist, 0.7)
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

dist <- dist_categorical(
  prob = list(c(0.05, 0.5, 0.15, 0.2, 0.1), c(0.3, 0.1, 0.6)),
  outcomes = list(letters[1:5], letters[24:26])
)
generate(dist, 10)
density(dist, "a")
density(dist, "z", log = TRUE)
The Cauchy distribution is the Student's t distribution with one degree of freedom. The Cauchy distribution does not have a well-defined mean or variance. Cauchy distributions often appear as priors in Bayesian contexts due to their heavy tails.
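Both claims can be illustrated with base R's density and distribution functions. This is a small sketch, not package code; the evaluation grid and the cut-off of -10 are arbitrary.

```r
x <- seq(-5, 5, by = 0.5)

# The standard Cauchy density coincides with the t density on one degree of freedom.
all.equal(dcauchy(x, location = 0, scale = 1), dt(x, df = 1))

# Heavy tails: far more probability lies below -10 than under a standard Normal.
cauchy_tail <- pcauchy(-10)
normal_tail <- pnorm(-10)
cauchy_tail > normal_tail
```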
dist_cauchy(location, scale)
location , scale
|
location and scale parameters. |
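In the following, let $X$ be a Cauchy variable with
location = $x_0$ and scale = $\gamma$.
Support: $R$, the set of all real numbers
Mean: Undefined.
Variance: Undefined.
Probability density function (p.d.f): $f(x) = \frac{1}{\pi \gamma \left[1 + \left(\frac{x - x_0}{\gamma}\right)^2\right]}$
Cumulative distribution function (c.d.f): $F(t) = \frac{1}{\pi} \arctan\left(\frac{t - x_0}{\gamma}\right) + \frac{1}{2}$
Moment generating function (m.g.f):
Does not exist.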
dist <- dist_cauchy(location = c(0, 0, 0, -2), scale = c(0.5, 1, 2, 1))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
Chi-square distributions show up often in frequentist settings as the sampling distribution of test statistics, especially in maximum likelihood estimation settings.
dist_chisq(df, ncp = 0)
df |
degrees of freedom (non-negative, but can be non-integer). |
ncp |
non-centrality parameter (non-negative). |
In the following, let $X$ be a $\chi^2$ random variable with
df = $k$ (central case, ncp = 0).
Support: $R^+$, the set of positive real numbers
Mean: $k$
Variance: $2k$
Probability density function (p.d.f): $f(x) = \frac{1}{2^{k/2} \Gamma(k/2)} x^{k/2 - 1} e^{-x/2}$
Cumulative distribution function (c.d.f):
The cumulative distribution function has the form
$F(t) = \frac{\gamma(k/2, t/2)}{\Gamma(k/2)}$, where $\gamma(s, z) = \int_0^z u^{s - 1} e^{-u} du$ is the lower incomplete gamma function.
For general $k$ this integral does not have a closed form solution and must be
approximated numerically.
Moment generating function (m.g.f): $E(e^{tX}) = (1 - 2t)^{-k/2}$ for $t < 1/2$
dist <- dist_chisq(df = c(1,2,3,4,6,9))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
The degenerate distribution takes a single value which is certain to be observed. It takes a single parameter, which is the value that is observed by the distribution.
dist_degenerate(x)
x |
The value of the distribution. |
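In the following, let $X$ be a degenerate random variable with value
x = $k_0$.
Support: $R$, the set of all real numbers
Mean: $k_0$
Variance: $0$
Probability density function (p.d.f): $f(x) = 1$ for $x = k_0$ and $f(x) = 0$ for $x \neq k_0$
Cumulative distribution function (c.d.f):
The cumulative distribution function has the form
$F(x) = 0$ for $x < k_0$ and $F(x) = 1$ for $x \ge k_0$
Moment generating function (m.g.f): $E(e^{tX}) = e^{k_0 t}$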
dist_degenerate(x = 1:5)
dist_exponential(rate)
rate |
vector of rates. |
dist <- dist_exponential(rate = c(2, 1, 2/3))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_f(df1, df2, ncp = NULL)
df1 , df2
|
degrees of freedom. |
ncp |
non-centrality parameter. If omitted the central F is assumed. |
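In the following, let $X$ be an F random variable
with parameters df1 = $d_1$ and df2 = $d_2$ (central case, ncp omitted).
Support: $x \in (0, \infty)$
Mean: $\frac{d_2}{d_2 - 2}$ for $d_2 > 2$
Variance: $\frac{2 d_2^2 (d_1 + d_2 - 2)}{d_1 (d_2 - 2)^2 (d_2 - 4)}$ for $d_2 > 4$
Probability density function (p.d.f): $f(x) = \frac{1}{x B(d_1/2, d_2/2)} \sqrt{\frac{(d_1 x)^{d_1} d_2^{d_2}}{(d_1 x + d_2)^{d_1 + d_2}}}$, where $B$ is the Beta function
Cumulative distribution function (c.d.f): $F(x) = I_{d_1 x / (d_1 x + d_2)}(d_1/2, d_2/2)$, where $I$ is the regularized incomplete Beta function
Moment generating function (m.g.f): Does not exist.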
dist <- dist_f(df1 = c(1,2,5,10,100), df2 = c(1,1,2,1,100))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
Several important distributions are special cases of the Gamma
distribution. When the shape parameter is 1, the Gamma is an
exponential distribution with rate $\beta$ (mean $1/\beta$). When
shape = $n/2$ and rate = $1/2$, the Gamma is equivalent to
a chi squared distribution with n degrees of freedom. Moreover, if
$X_1$ is $Gamma(\alpha_1, \beta)$ and $X_2$ is $Gamma(\alpha_2, \beta)$
and the two are independent, the function of these two variables
of the form $\frac{X_1}{X_1 + X_2}$ is distributed as $Beta(\alpha_1, \alpha_2)$.
This last property frequently appears in other distributions, and it
has been used extensively in multivariate methods. More about the Gamma
distribution will be added soon.
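The first two special cases can be checked with base R density functions. This is a minimal sketch; the evaluation points, the rate `beta = 2`, and `n = 4` degrees of freedom are arbitrary illustrative values.

```r
x <- c(0.5, 1, 2, 5)

# shape = 1: Gamma(1, rate = beta) is the exponential distribution with rate beta
beta <- 2
all.equal(dgamma(x, shape = 1, rate = beta), dexp(x, rate = beta))

# shape = n / 2 and rate = 1 / 2: chi-squared with n degrees of freedom
n <- 4
all.equal(dgamma(x, shape = n / 2, rate = 1 / 2), dchisq(x, df = n))
```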
dist_gamma(shape, rate, scale = 1/rate)
shape , scale
|
shape and scale parameters. Must be positive, scale strictly. |
rate |
an alternative way to specify the scale. |
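In the following, let $X$ be a Gamma random variable
with parameters shape = $\alpha$ and rate = $\beta$.
Support: $x \in (0, \infty)$
Mean: $\frac{\alpha}{\beta}$
Variance: $\frac{\alpha}{\beta^2}$
Probability density function (p.d.f): $f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}$
Cumulative distribution function (c.d.f): $F(x) = \frac{\gamma(\alpha, \beta x)}{\Gamma(\alpha)}$, where $\gamma$ is the lower incomplete gamma function
Moment generating function (m.g.f): $E(e^{tX}) = \left(\frac{\beta}{\beta - t}\right)^{\alpha}$ for $t < \beta$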
dist <- dist_gamma(shape = c(1,2,3,5,9,7.5,0.5), rate = c(0.5,0.5,0.5,1,2,1,1))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
The Geometric distribution can be thought of as a generalization
of the dist_bernoulli() distribution where we ask: "if I keep flipping a
coin with probability p of heads, what is the probability I need
$k$ flips before I get my first heads?" The Geometric
distribution is a special case of the Negative Binomial distribution.
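A short base R sketch of the special-case relationship follows. It assumes the parameterization used by `stats::dgeom()`, which counts the number of failures before the first success; `p = 0.2` is an arbitrary value.

```r
p <- 0.2
k <- 0:10   # number of tails seen before the first heads

# Geometric pmf written out by hand: k failures followed by one success
geometric_pmf <- p * (1 - p)^k
all.equal(dgeom(k, prob = p), geometric_pmf)

# Negative Binomial waiting for a single success (size = 1)
all.equal(dgeom(k, prob = p), dnbinom(k, size = 1, prob = p))
```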
dist_geometric(prob)
prob |
probability of success in each trial. |
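In the following, let $X$ be a Geometric random variable with
success probability prob = $p$. Note that there are multiple
parameterizations of the Geometric distribution; here $X$ counts the failures before the first success.
Support: 0 < p < 1, $x = 0, 1, \ldots$
Mean: $\frac{1 - p}{p}$
Variance: $\frac{1 - p}{p^2}$
Probability mass function (p.m.f): $P(X = x) = p (1 - p)^x$
Cumulative distribution function (c.d.f): $P(X \le x) = 1 - (1 - p)^{x + 1}$
Moment generating function (m.g.f): $E(e^{tX}) = \frac{p}{1 - (1 - p) e^t}$ for $t < -\ln(1 - p)$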
dist <- dist_geometric(prob = c(0.2, 0.5, 0.8))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
The GEV distribution function with parameters location = $a$,
scale = $b$ and shape = $s$ is
$F(x) = \exp\left[-\{1 + s(x - a)/b\}^{-1/s}\right]$
dist_gev(location, scale, shape)
location |
the location parameter |
scale |
the scale parameter |
shape |
the shape parameter |
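The distribution function is defined for $1 + s(x - a)/b > 0$, where
$b > 0$. If $s = 0$ the distribution
is defined by continuity, giving $F(x) = \exp[-\exp\{-(x - a)/b\}]$.
The support of the distribution is the real line if $s = 0$,
$x \ge a - b/s$ if $s > 0$, and $x \le a - b/s$ if $s < 0$.
The parametric form of the GEV encompasses that of the Gumbel, Frechet and
reverse Weibull distributions, which are obtained for $s = 0$,
$s > 0$ and $s < 0$ respectively. It was first introduced by
Jenkinson (1955).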
Jenkinson, A. F. (1955) The frequency distribution of the annual maximum (or minimum) of meteorological elements. Quart. J. R. Met. Soc., 81, 158–171.
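The GEV distribution function $F(x) = \exp[-\{1 + s(x - a)/b\}^{-1/s}]$ and its Gumbel limit at $s = 0$ can be sketched in a few lines of base R. `pgev_sketch()` is a hypothetical helper written for this illustration; it is not a function of this package, and it takes a scalar shape `s`.

```r
# Hypothetical helper: GEV distribution function with location a, scale b, shape s.
pgev_sketch <- function(x, a = 0, b = 1, s = 0) {
  z <- (x - a) / b
  if (s == 0) {
    return(exp(-exp(-z)))   # Gumbel case, defined by continuity
  }
  # Outside the support 1 + s * z <= 0; clamping to 0 gives F = 0 (s > 0) or F = 1 (s < 0).
  t <- pmax(1 + s * z, 0)
  exp(-t^(-1 / s))
}

x <- c(-1, 0, 1, 3)

# A tiny shape parameter should be close to the s = 0 (Gumbel) limit.
all.equal(pgev_sketch(x, s = 1e-6), pgev_sketch(x, s = 0), tolerance = 1e-5)

# For s = 1 (Frechet type) the support starts at a - b / s = -1, so F(-2) = 0.
pgev_sketch(-2, s = 1)
```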
dist <- dist_gev(location = 0, scale = 1, shape = 0)
The generalised g-and-h distribution is a flexible distribution used to model univariate data, similar to the g-k distribution. It is known for its ability to handle skewness and heavy-tailed behavior.
dist_gh(A, B, g, h, c = 0.8)
A |
Vector of A (location) parameters. |
B |
Vector of B (scale) parameters. Must be positive. |
g |
Vector of g parameters. |
h |
Vector of h parameters. Must be non-negative. |
c |
Vector of c parameters (used for generalised g-and-h). Often fixed at 0.8 which is the default. |
In the following, let $X$ be a g-and-h random variable with parameters
A, B, g, h, and c.
Support: $(-\infty, \infty)$
Mean: Not available in closed form.
Variance: Not available in closed form.
Probability density function (p.d.f):
The g-and-h distribution does not have a closed-form expression for its density. Instead, it is defined through its quantile function:
$Q(u) = A + B \left(1 + c \frac{1 - \exp(-g z(u))}{1 + \exp(-g z(u))}\right) \exp(h z(u)^2 / 2) z(u)$
where $z(u) = \Phi^{-1}(u)$, the standard normal quantile of $u$.
Cumulative distribution function (c.d.f):
The cumulative distribution function is typically evaluated numerically due to the lack of a closed-form expression.
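Because the distribution is defined through its quantile function, evaluating and sampling it is straightforward. The sketch below assumes the generalised g-and-h form $Q(u) = A + B (1 + c \tanh(g z / 2)) z \exp(h z^2 / 2)$ with $z = \Phi^{-1}(u)$, using the identity $\tanh(gz/2) = (1 - e^{-gz})/(1 + e^{-gz})$. `qgh_sketch()` is a hypothetical helper, not this package's implementation.

```r
# Hypothetical helper: quantile function of the generalised g-and-h distribution.
qgh_sketch <- function(u, A = 0, B = 1, g = 0, h = 0, c = 0.8) {
  z <- qnorm(u)   # standard normal quantile of u
  A + B * (1 + c * tanh(g * z / 2)) * z * exp(h * z^2 / 2)
}

u <- c(0.1, 0.5, 0.9)

# With g = 0 and h = 0 the transformation is linear, so Q(u) = A + B * z(u),
# which is the quantile function of a Normal(A, B) distribution.
all.equal(qgh_sketch(u, A = 2, B = 3), qnorm(u, mean = 2, sd = 3))

# Inverse-transform sampling: push uniform draws through the quantile function.
set.seed(1)
draws <- qgh_sketch(runif(5), g = 0.5, h = 0.2)
```

Evaluating the density or c.d.f. would require numerically inverting this function, which is why the documentation describes those as computed numerically.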
dist <- dist_gh(A = 0, B = 1, g = 0, h = 0.5)
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
The g-and-k distribution is a flexible distribution often used to model univariate data. It is particularly known for its ability to handle skewness and heavy-tailed behavior.
dist_gk(A, B, g, k, c = 0.8)
A |
Vector of A (location) parameters. |
B |
Vector of B (scale) parameters. Must be positive. |
g |
Vector of g parameters. |
k |
Vector of k parameters. Must be at least -0.5. |
c |
Vector of c parameters. Often fixed at 0.8 which is the default. |
In the following, let $X$ be a g-k random variable with parameters
A, B, g, k, and c.
Support: $(-\infty, \infty)$
Mean: Not available in closed form.
Variance: Not available in closed form.
Probability density function (p.d.f):
The g-k distribution does not have a closed-form expression for its density. Instead, it is defined through its quantile function:
$Q(u) = A + B \left(1 + c \frac{1 - \exp(-g z(u))}{1 + \exp(-g z(u))}\right) (1 + z(u)^2)^k z(u)$
where $z(u) = \Phi^{-1}(u)$, the standard normal quantile of u.
Cumulative distribution function (c.d.f):
The cumulative distribution function is typically evaluated numerically due to the lack of a closed-form expression.
dist <- dist_gk(A = 0, B = 1, g = 0, k = 0.5)
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
The GPD distribution function with parameters location = $a$,
scale = $b$ and shape = $s$ is
$F(x) = 1 - \{1 + s(x - a)/b\}^{-1/s}$
dist_gpd(location, scale, shape)
location |
the location parameter |
scale |
the scale parameter |
shape |
the shape parameter |
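The distribution function is defined for $1 + s(x - a)/b > 0$, where
$b > 0$. If $s = 0$ the distribution
is defined by continuity, giving $F(x) = 1 - \exp\{-(x - a)/b\}$.
The support of the distribution is $x \ge a$ if
$s \ge 0$, and $a \le x \le a - b/s$ if $s < 0$.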
The Pickands–Balkema–De Haan theorem states that for a large class of distributions, the tail (above some threshold) can be approximated by a GPD.
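The GPD distribution function $F(x) = 1 - \{1 + s(x - a)/b\}^{-1/s}$ can be sketched in base R as follows. `pgpd_sketch()` is a hypothetical helper written for illustration (scalar shape `s` only); at $s = 0$ it reduces to a shifted exponential distribution with rate $1/b$.

```r
# Hypothetical helper: GPD distribution function with location a, scale b, shape s.
pgpd_sketch <- function(x, a = 0, b = 1, s = 0) {
  z <- pmax((x - a) / b, 0)   # F(x) = 0 below the location a
  if (s == 0) {
    return(1 - exp(-z))       # exponential case, defined by continuity
  }
  # For s < 0 the support ends at a - b / s; beyond it 1 + s * z <= 0 and F = 1.
  t <- pmax(1 + s * z, 0)
  1 - t^(-1 / s)
}

x <- c(0.5, 1, 2, 4)

# s = 0 with a = 0 and b = 2 is the exponential distribution with rate 1 / 2.
all.equal(pgpd_sketch(x, a = 0, b = 2, s = 0), pexp(x, rate = 1 / 2))

# With s = -0.5 the upper endpoint is a - b / s = 2, so F(5) = 1.
pgpd_sketch(5, a = 0, b = 1, s = -0.5)
```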
dist <- dist_gpd(location = 0, scale = 1, shape = 0)
The Gumbel distribution is a special case of the Generalized Extreme Value
distribution, obtained when the GEV shape parameter is equal to 0.
It may be referred to as a type I extreme value distribution.
dist_gumbel(alpha, scale)
alpha |
location parameter. |
scale |
parameter. Must be strictly positive. |
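In the following, let $X$ be a Gumbel random variable with location
parameter alpha = $\mu$ and scale parameter scale = $\sigma$.
Support: $R$, the set of all real numbers.
Mean: $\mu + \sigma\gamma$, where $\gamma$ is Euler's
constant, approximately equal to 0.57722.
Median: $\mu - \sigma \ln(\ln 2)$.
Variance: $\sigma^2 \pi^2 / 6$.
Probability density function (p.d.f): $f(x) = \sigma^{-1} \exp[-(x - \mu)/\sigma] \exp\{-\exp[-(x - \mu)/\sigma]\}$
for $x$ in $R$, the set of all real numbers.
Cumulative distribution function (c.d.f):
In the $\xi = 0$ (Gumbel) special case of the GEV distribution, $F(x) = \exp\{-\exp[-(x - \mu)/\sigma]\}$
for $x$ in $R$, the set of all real numbers.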
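The Gumbel median $\mu - \sigma \ln(\ln 2)$ and mean $\mu + \sigma\gamma$ can be checked against the distribution function $F(x) = \exp\{-\exp[-(x - \mu)/\sigma]\}$ and its density. This is a base R sketch with hypothetical helper functions and arbitrary parameter values; the mean is verified by numerical integration, so only approximate agreement is expected.

```r
mu <- 1      # location (illustrative value)
sigma <- 2   # scale (illustrative value)

# Hypothetical helpers for the Gumbel cdf and pdf.
pgumbel_sketch <- function(x) exp(-exp(-(x - mu) / sigma))
dgumbel_sketch <- function(x) {
  z <- (x - mu) / sigma
  exp(-z - exp(-z)) / sigma   # combined exponent avoids Inf * 0 in the far left tail
}

# The cdf evaluated at the median formula should be one half.
median_formula <- mu - sigma * log(log(2))
all.equal(pgumbel_sketch(median_formula), 0.5)

# Euler's constant is -digamma(1); compare the mean formula with numerical integration.
euler_gamma <- -digamma(1)
mean_numeric <- integrate(function(x) x * dgumbel_sketch(x), -Inf, Inf)$value
all.equal(mean_numeric, mu + sigma * euler_gamma, tolerance = 1e-4)
```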
dist <- dist_gumbel(alpha = c(0.5, 1, 1.5, 3), scale = c(2, 2, 3, 4))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
To understand the HyperGeometric distribution, consider a set of
$r$ objects, of which $m$ are of type I and
$n$ are of type II. A sample of size $k$ ($k < r$)
is randomly chosen without replacement. The number of
type I elements observed in this sample is set to be our random
variable $X$.
dist_hypergeometric(m, n, k)
m |
The number of type I elements available. |
n |
The number of type II elements available. |
k |
The size of the sample taken. |
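In the following, let $X$ be a HyperGeometric random variable with
success probability p = $p = m/(m + n)$.
Support: $x \in \{\max(0, k - n), \ldots, \min(k, m)\}$
Mean: $\frac{km}{m + n} = kp$
Variance: $\frac{k m n (m + n - k)}{(m + n)^2 (m + n - 1)} = kp(1 - p)\left(1 - \frac{k - 1}{m + n - 1}\right)$
Probability mass function (p.m.f): $P(X = x) = \frac{\binom{m}{x} \binom{n}{k - x}}{\binom{m + n}{k}}$
Cumulative distribution function (c.d.f): $P(X \le x) = \sum_{i = \max(0, k - n)}^{\lfloor x \rfloor} \frac{\binom{m}{i} \binom{n}{k - i}}{\binom{m + n}{k}}$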
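The hypergeometric p.m.f. $\binom{m}{x}\binom{n}{k-x} / \binom{m+n}{k}$ and the mean $km/(m+n)$ can be checked with base R's `choose()` and `dhyper()`. The values of `m`, `n` and `k` in this sketch are arbitrary.

```r
m <- 10   # type I elements available
n <- 7    # type II elements available
k <- 8    # sample size

# Support: at least k - n type I elements must be drawn, and at most min(k, m).
x <- max(0, k - n):min(k, m)

# pmf from the counting argument: choose x of type I and k - x of type II.
pmf_formula <- choose(m, x) * choose(n, k - x) / choose(m + n, k)
all.equal(pmf_formula, dhyper(x, m, n, k))

# The mean computed from the pmf matches k * m / (m + n).
all.equal(sum(x * pmf_formula), k * m / (m + n))
```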
dist <- dist_hypergeometric(m = rep(500, 3), n = c(50, 60, 70), k = c(100, 200, 300))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_inflated(dist, prob, x = 0)
dist |
The distribution(s) to inflate. |
prob |
The added probability of observing x. |
x |
The value to inflate. The default of x = 0 gives zero-inflation. |
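To make the role of `prob` and `x` concrete, the sketch below writes out the usual inflation formula in base R: with probability `prob` the inflated value is observed outright, otherwise a value is drawn from the base distribution. This mixture form is an assumption about what inflation means here, and the Poisson base distribution and helper name `dinflated_sketch()` are hypothetical choices for illustration.

```r
prob <- 0.2    # added probability of observing x0
x0 <- 0        # inflated value (zero-inflation)
lambda <- 3    # hypothetical Poisson base distribution

# Assumed form: P(X = at) = prob * 1[at == x0] + (1 - prob) * f(at)
dinflated_sketch <- function(at) {
  prob * (at == x0) + (1 - prob) * dpois(at, lambda)
}

# The inflated pmf still sums to one (the Poisson tail beyond 100 is negligible).
total_mass <- sum(dinflated_sketch(0:100))
all.equal(total_mass, 1)

# Zero is more likely than under the base distribution alone.
dinflated_sketch(0) > dpois(0, lambda)
```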
dist_inverse_exponential(rate)
rate |
an alternative way to specify the scale. |
dist <- dist_inverse_exponential(rate = 1:5)
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_inverse_gamma(shape, rate = 1/scale, scale)
shape , scale
|
parameters. Must be strictly positive. |
rate |
an alternative way to specify the scale. |
dist <- dist_inverse_gamma(shape = c(1,2,3,3), rate = c(1,1,1,2))
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_inverse_gaussian(mean, shape)
mean , shape
|
parameters. Must be strictly positive. Infinite values are supported. |
dist <- dist_inverse_gaussian(mean = c(1,1,1,3,3), shape = c(0.2, 1, 3, 0.2, 1))
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_logarithmic(prob)
prob |
parameter. |
dist <- dist_logarithmic(prob = c(0.33, 0.66, 0.99))
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
A continuous distribution on the real line. For binary outcomes
the model given by $P(Y = 1 | X) = F(X \beta)$, where
$F$ is the Logistic cdf(), is called logistic regression.
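The link between the Logistic c.d.f. and logistic regression, $P(Y = 1 | X) = F(X\beta)$, can be sketched with base R's `plogis()`. The design matrix and coefficients below are hypothetical values chosen only to illustrate the calculation.

```r
# Hypothetical design matrix (intercept column plus one predictor) and coefficients.
X <- cbind(1, c(-2, 0, 1.5))
beta <- c(0.5, 1.2)

# Linear predictor X %*% beta, one value per observation.
eta <- drop(X %*% beta)

# P(Y = 1 | X) is the standard Logistic cdf evaluated at the linear predictor.
prob_y1 <- plogis(eta)

# plogis() is the closed form 1 / (1 + exp(-eta)).
all.equal(prob_y1, 1 / (1 + exp(-eta)))
```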
dist_logistic(location, scale)
location , scale
|
location and scale parameters. |
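In the following, let $X$ be a Logistic random variable with
location = $\mu$ and scale = $s$.
Support: $R$, the set of all real numbers
Mean: $\mu$
Variance: $s^2 \pi^2 / 3$
Probability density function (p.d.f): $f(x) = \frac{e^{-(x - \mu)/s}}{s \left[1 + e^{-(x - \mu)/s}\right]^2}$
Cumulative distribution function (c.d.f): $F(t) = \frac{1}{1 + e^{-(t - \mu)/s}}$
Moment generating function (m.g.f): $E(e^{tX}) = e^{\mu t} \beta(1 - st, 1 + st)$ for $|st| < 1$,
where $\beta(x, y)$ is the Beta function.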
dist <- dist_logistic(location = c(5,9,9,6,2), scale = c(2,3,4,2,1))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
The log-normal distribution is a commonly used transformation of the Normal
distribution. If $X$ follows a log-normal distribution, then
$\ln X$ would be characterised by a Normal distribution.
dist_lognormal(mu = 0, sigma = 1)
mu |
The location parameter of the distribution, which is the mean of the associated Normal distribution (the mean of log(X), not of X itself). Can be any real number. |
sigma |
The scale parameter of the distribution, which is the standard deviation of the associated Normal distribution. Can be any positive number. |
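In the following, let $Y$ be a Normal random variable with mean
mu = $\mu$ and standard deviation sigma = $\sigma$. The
log-normal distribution $X = \exp(Y)$ is characterised by:
Support: $R^+$, the set of all real numbers greater than 0.
Mean: $e^{\mu + \sigma^2 / 2}$
Variance: $(e^{\sigma^2} - 1) e^{2\mu + \sigma^2}$
Probability density function (p.d.f): $f(x) = \frac{1}{x \sqrt{2 \pi \sigma^2}} e^{-(\ln x - \mu)^2 / (2 \sigma^2)}$
Cumulative distribution function (c.d.f):
The cumulative distribution function has the form
$F(x) = \Phi\left(\frac{\ln x - \mu}{\sigma}\right)$
where $\Phi$ is the CDF of a standard Normal distribution, N(0,1).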
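The relationship between the log-normal c.d.f. and the standard Normal c.d.f., $F(x) = \Phi((\ln x - \mu)/\sigma)$, and the mean $e^{\mu + \sigma^2/2}$ can be checked in base R. The parameter values in this sketch are arbitrary, and the mean is verified by numerical integration, so only approximate agreement is expected.

```r
mu <- 1
sigma <- 0.5
x <- c(0.5, 1, 3, 10)

# The log-normal cdf is the standard Normal cdf applied to the standardised log.
all.equal(plnorm(x, meanlog = mu, sdlog = sigma), pnorm((log(x) - mu) / sigma))

# Mean of X = exp(Y): closed form versus numerical integration of x * f(x).
mean_formula <- exp(mu + sigma^2 / 2)
mean_numeric <- integrate(function(x) x * dlnorm(x, mu, sigma), 0, Inf)$value
all.equal(mean_numeric, mean_formula, tolerance = 1e-4)
```

Note that the mean of the log-normal is larger than $e^{\mu}$, which is its median.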
dist <- dist_lognormal(mu = 1:5, sigma = 0.1)
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)

# A log-normal distribution X is exp(Y), where Y is a Normal distribution of
# the same parameters. So log(X) will produce the Normal distribution Y.
log(dist)
A placeholder distribution for handling missing values in a vector of distributions.
dist_missing(length = 1)
length |
The number of missing distributions |
dist <- dist_missing(3L)
dist
mean(dist)
variance(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_mixture(..., weights = numeric())
... |
Distributions to be used in the mixture. |
weights |
The weight of each distribution passed to the ... argument. |
dist_mixture(dist_normal(0, 1), dist_normal(5, 2), weights = c(0.3, 0.7))
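For intuition, the density of this two-component mixture is the weighted sum of the component densities. The base R sketch below writes that out by hand, assuming the second argument of `dist_normal()` is a standard deviation; `dmix_sketch()` is a hypothetical helper, not part of the package.

```r
w <- c(0.3, 0.7)   # mixture weights, summing to one

# Mixture density: weighted sum of the Normal(0, 1) and Normal(5, 2) densities.
dmix_sketch <- function(x) {
  w[1] * dnorm(x, mean = 0, sd = 1) + w[2] * dnorm(x, mean = 5, sd = 2)
}

# A valid density integrates to one (numerical integration, so approximately).
total_mass <- integrate(dmix_sketch, -Inf, Inf)$value
all.equal(total_mass, 1, tolerance = 1e-4)

# The mixture mean is the weighted average of the component means.
mix_mean <- sum(w * c(0, 5))
```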
The multinomial distribution is a generalization of the binomial
distribution to multiple categories. It is perhaps easiest to think
that we first extend a dist_bernoulli() distribution to include more
than two categories, resulting in a dist_categorical() distribution.
We then repeat the Categorical experiment several ($n$) times.
dist_multinomial(size, prob)
size |
The number of draws from the Categorical distribution. |
prob |
The probability of an event occurring from each draw. |
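In the following, let $X = (X_1, \ldots, X_k)$ be a Multinomial
random variable with success probability prob = $p$. Note that
$p$ is a vector with $k$ elements that sum to one. Assume
that we repeat the Categorical experiment size = $n$ times.
Support: Each $X_i$ is in $\{0, 1, 2, \ldots, n\}$.
Mean: The mean of $X_i$ is $n p_i$.
Variance: The variance of $X_i$ is $n p_i (1 - p_i)$.
For $i \neq j$, the covariance of $X_i$ and $X_j$ is $-n p_i p_j$.
Probability mass function (p.m.f): $P(X_1 = x_1, \ldots, X_k = x_k) = \frac{n!}{x_1! x_2! \cdots x_k!} p_1^{x_1} \cdot p_2^{x_2} \cdots p_k^{x_k}$
Cumulative distribution function (c.d.f):
Omitted for multivariate random variables for the time being.
Moment generating function (m.g.f): $E(e^{tX}) = \left(\sum_{i = 1}^k p_i e^{t_i}\right)^n$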
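The multinomial p.m.f. $\frac{n!}{x_1! \cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}$ can be evaluated directly and compared with base R's `dmultinom()`. The counts and probabilities in this sketch are arbitrary illustrative values.

```r
n <- 4
p <- c(0.3, 0.5, 0.2)   # category probabilities, summing to one
x <- c(1, 2, 1)         # observed counts, summing to n

# Multinomial coefficient times the product of p_i^x_i
pmf_formula <- factorial(n) / prod(factorial(x)) * prod(p^x)

all.equal(pmf_formula, dmultinom(x, size = n, prob = p))
```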
dist <- dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))
dist
mean(dist)
variance(dist)
generate(dist, 10)

# TODO: Needs fixing to support multiple inputs
# density(dist, 2)
# density(dist, 2, log = TRUE)
dist_multivariate_normal(mu = 0, sigma = diag(1))
mu |
A list of numeric vectors for the distribution's mean. |
sigma |
A list of matrices for the distribution's variance-covariance matrix. |
mvtnorm::dmvnorm, mvtnorm::qmvnorm
dist <- dist_multivariate_normal(mu = list(c(1,2)), sigma = list(matrix(c(4,2,2,3), ncol=2)))
dimnames(dist) <- c("x", "y")
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, cbind(2, 1))
density(dist, cbind(2, 1), log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
quantile(dist, 0.7, type = "marginal")
A generalization of the geometric distribution. It is the number of failures in a sequence of i.i.d. Bernoulli trials before a specified number of successes (size) occur. The probability of success in each trial is given by prob.
dist_negative_binomial(size, prob)
size |
target for number of successful trials, or dispersion parameter (the shape parameter of the gamma mixing distribution). Must be strictly positive, need not be integer. |
prob |
probability of success in each trial. |
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
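In the following, let $X$ be a Negative Binomial random variable with success probability prob = $p$ and the number of successes size = $r$.
Support: $\{0, 1, 2, 3, ...\}$
Mean: $\frac{r (1 - p)}{p}$
Variance: $\frac{r (1 - p)}{p^2}$
Probability mass function (p.m.f): $P(X = k) = \binom{k + r - 1}{k} p^r (1 - p)^k$
Cumulative distribution function (c.d.f):
Too nasty, omitted.
Moment generating function (m.g.f): $E(e^{tX}) = \left(\frac{p}{1 - (1 - p) e^t}\right)^r$, for $t < -\log(1 - p)$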
dist <- dist_negative_binomial(size = 10, prob = 0.5)
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
The Normal distribution is ubiquitous in statistics, partially because of the central limit theorem, which states that suitably standardised sums of i.i.d. random variables with finite variance converge in distribution to a Normal. Linear transformations of Normal random variables result in new random variables that are also Normal. If you are taking an intro stats course, you'll likely use the Normal distribution for Z-tests and in simple linear regression. Under regularity conditions, maximum likelihood estimators are asymptotically Normal. The Normal distribution is also called the Gaussian distribution.
dist_normal(mu = 0, sigma = 1, mean = mu, sd = sigma)
mu , mean
|
The mean (location parameter) of the distribution. Can be any real number. |
sigma , sd
|
The standard deviation (scale parameter) of the distribution. Can be any positive number. If you would like a Normal distribution with variance $\sigma^2$, pass its square root $\sigma$ here. |
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
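In the following, let $X$ be a Normal random variable with mean mu = $\mu$ and standard deviation sigma = $\sigma$.
Support: $\mathbb{R}$, the set of all real numbers
Mean: $\mu$
Variance: $\sigma^2$
Probability density function (p.d.f): $f(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-(x - \mu)^2 / (2 \sigma^2)}$
Cumulative distribution function (c.d.f):
The cumulative distribution function has the form $F(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-(x - \mu)^2 / (2 \sigma^2)} \, dx$, but this integral does not have a closed form solution and must be approximated numerically. The c.d.f. of a standard Normal is closely related to the error function: $\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z / \sqrt{2})\right]$. The notation $\Phi(t)$ stands for the c.d.f. of a standard Normal evaluated at $t$. Z-tables list the value of $\Phi(t)$ for various $t$.
Moment generating function (m.g.f): $E(e^{tX}) = e^{\mu t + \sigma^2 t^2 / 2}$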
dist <- dist_normal(mu = 1:5, sigma = 3)
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_pareto(shape, scale)
shape , scale
|
parameters. Must be strictly positive. |
dist <- dist_pareto(shape = c(10, 3, 2, 1), scale = rep(1, 4))
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_percentile(x, percentile)
x |
A list of values |
percentile |
A list of percentiles |
dist <- dist_normal()
percentiles <- seq(0.01, 0.99, by = 0.01)
x <- vapply(percentiles, quantile, double(1L), x = dist)
dist_percentile(list(x), list(percentiles*100))
Poisson distributions are frequently used to model counts.
dist_poisson(lambda)
lambda |
vector of (non-negative) means. |
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
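In the following, let $X$ be a Poisson random variable with parameter lambda = $\lambda$.
Support: $\{0, 1, 2, 3, ...\}$
Mean: $\lambda$
Variance: $\lambda$
Probability mass function (p.m.f): $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
Cumulative distribution function (c.d.f): $P(X \le k) = e^{-\lambda} \sum_{i = 0}^{\lfloor k \rfloor} \frac{\lambda^i}{i!}$
Moment generating function (m.g.f): $E(e^{tX}) = e^{\lambda (e^t - 1)}$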
dist <- dist_poisson(lambda = c(1, 4, 10))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
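As a quick sanity check, the standard Poisson p.m.f. $\lambda^k e^{-\lambda} / k!$ and its running sum can be compared with base R's dpois() and ppois(). This is a minimal sketch that uses only the stats package, not distributional itself; the variable names are illustrative.

```r
# Assumption: lambda = 4 is an arbitrary illustrative rate.
lambda <- 4
k <- 0:10

# p.m.f. written out from the formula lambda^k * exp(-lambda) / k!
pmf_manual <- lambda^k * exp(-lambda) / factorial(k)

# c.d.f. as the cumulative sum of the p.m.f. over 0, 1, ..., k
cdf_manual <- cumsum(pmf_manual)

# Largest disagreement with the built-in implementations
max_pmf_diff <- max(abs(pmf_manual - dpois(k, lambda)))
max_cdf_diff <- max(abs(cdf_manual - ppois(k, lambda)))
```

Both differences should be at the level of floating point rounding.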
dist_poisson_inverse_gaussian(mean, shape)
mean , shape
|
parameters. Must be strictly positive. Infinite values are supported. |
actuar::PoissonInverseGaussian
dist <- dist_poisson_inverse_gaussian(mean = rep(0.1, 3), shape = c(0.4, 0.8, 1))
dist
mean(dist)
variance(dist)
support(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
dist_sample(x)
x |
A list of sampled values. |
# Univariate numeric samples
dist <- dist_sample(x = list(rnorm(100), rnorm(100, 10)))
dist
mean(dist)
variance(dist)
skewness(dist)
generate(dist, 10)
density(dist, 1)
# Multivariate numeric samples
dist <- dist_sample(x = list(cbind(rnorm(100), rnorm(100, 10))))
dimnames(dist) <- c("x", "y")
dist
mean(dist)
variance(dist)
generate(dist, 10)
quantile(dist, 0.4) # Returns the marginal quantiles
cdf(dist, matrix(c(0.3,9), nrow = 1))
The Student's T distribution is closely related to the Normal distribution, but has heavier tails. As the degrees of freedom $\nu$ increase to $\infty$, the Student's T converges to a Normal. The T distribution appears repeatedly throughout classic frequentist hypothesis testing when comparing group means.
dist_student_t(df, mu = 0, sigma = 1, ncp = NULL)
df |
degrees of freedom ( |
mu |
The location parameter of the distribution.
If |
sigma |
The scale parameter of the distribution. |
ncp |
non-centrality parameter |
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
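In the following, let $X$ be a central Student's T random variable with df = $\nu$.
Support: $\mathbb{R}$, the set of all real numbers
Mean: Undefined unless $\nu > 1$, in which case the mean is zero.
Variance: $\frac{\nu}{\nu - 2}$ when $\nu > 2$. Undefined if $\nu \le 1$, infinite when $1 < \nu \le 2$.
Probability density function (p.d.f): $f(x) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu \pi} \, \Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu + 1}{2}}$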
dist <- dist_student_t(df = c(1,2,5), mu = c(0,1,2), sigma = c(1,2,3))
dist
mean(dist)
variance(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
Tukey's studentized range distribution, used for Tukey's honestly significant differences test in ANOVA.
dist_studentized_range(nmeans, df, nranges)
nmeans |
sample size for range (same for each group). |
df |
degrees of freedom for |
nranges |
number of groups whose maximum range is considered. |
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
Support: $\mathbb{R}^+$, the set of positive real numbers.
Other properties of Tukey's Studentized Range Distribution are omitted, largely because the distribution is not fun to work with.
dist <- dist_studentized_range(nmeans = c(6, 2), df = c(5, 4), nranges = c(1, 1))
dist
cdf(dist, 4)
quantile(dist, 0.7)
The density(), mean(), and variance() methods are approximate as they are based on numerical derivatives.
dist_transformed(dist, transform, inverse)
dist |
A univariate distribution vector. |
transform |
A function used to transform the distribution. This transformation should be monotonic over the appropriate domain. |
inverse |
The inverse of the transform function. |
# Create a log normal distribution
dist <- dist_transformed(dist_normal(0, 0.5), exp, log)
density(dist, 1) # dlnorm(1, 0, 0.5)
cdf(dist, 4) # plnorm(4, 0, 0.5)
quantile(dist, 0.1) # qlnorm(0.1, 0, 0.5)
generate(dist, 10) # rlnorm(10, 0, 0.5)
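To see why a density based on numerical derivatives is only approximate, the change-of-variables rule can be sketched in base R. For a monotonic transform with inverse $g^{-1}$, the transformed density is $f_Y(y) = f_X(g^{-1}(y)) \, |(g^{-1})'(y)|$. The helper transformed_density() below is hypothetical and is not the package's implementation; it estimates the derivative with a central difference of assumed step h.

```r
# Hypothetical helper (not part of distributional): density of Y = g(X)
# via change of variables, with a numerical derivative of the inverse.
transformed_density <- function(y, inverse, d_base, h = 1e-6) {
  # Central-difference estimate of d/dy inverse(y)
  jacobian <- abs((inverse(y + h) - inverse(y - h)) / (2 * h))
  d_base(inverse(y)) * jacobian
}

# Log-normal as exp() of a Normal(0, 0.5): the inverse transform is log()
approx_density <- transformed_density(
  y = 1,
  inverse = log,
  d_base = function(x) dnorm(x, mean = 0, sd = 0.5)
)

# Exact reference value from the stats package
exact_density <- dlnorm(1, meanlog = 0, sdlog = 0.5)
```

The two values agree closely here, but the numerical derivative can lose accuracy where the inverse changes rapidly.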
Note that the samples are generated using inverse transform sampling, and the means and variances are estimated from samples.
dist_truncated(dist, lower = -Inf, upper = Inf)
dist |
The distribution(s) to truncate. |
lower , upper
|
The range of values to keep from a distribution. |
dist <- dist_truncated(dist_normal(2,1), lower = 0)
dist
mean(dist)
variance(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
if(requireNamespace("ggdist")) {
  library(ggplot2)
  ggplot() +
    ggdist::stat_dist_halfeye(
      aes(y = c("Normal", "Truncated"),
          dist = c(dist_normal(2,1), dist_truncated(dist_normal(2,1), lower = 0)))
    )
}
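Inverse transform sampling for a truncated distribution can be sketched in base R as follows. This is an illustration under stated assumptions (a Normal(2, 1) truncated below at 0, matching the example above), not the package's code: uniform draws are restricted to the probability range that the kept region occupies, then mapped back through the quantile function.

```r
set.seed(1)

# Assumed parameters, mirroring the example above
mu <- 2
sigma <- 1
lower <- 0
upper <- Inf

# Probability mass below each truncation point
p_lower <- pnorm(lower, mu, sigma)
p_upper <- pnorm(upper, mu, sigma)

# Draw uniforms inside (p_lower, p_upper), then invert the c.d.f.
u <- runif(1000, min = p_lower, max = p_upper)
draws <- qnorm(u, mu, sigma)

# Truncated density: the original density rescaled by the retained mass
truncated_density_at_2 <- dnorm(2, mu, sigma) / (p_upper - p_lower)

# Sample-based estimate of the mean
mean_estimate <- mean(draws)
```

Every draw falls inside the kept range by construction, and the truncated density is never smaller than the untruncated one inside that range.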
A distribution with constant density on an interval.
dist_uniform(min, max)
min , max
|
lower and upper limits of the distribution. Must be finite. |
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
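In the following, let $X$ be a Uniform random variable with parameters min = $a$ and max = $b$.
Support: $[a, b]$
Mean: $\frac{a + b}{2}$
Variance: $\frac{(b - a)^2}{12}$
Probability density function (p.d.f): $f(x) = \frac{1}{b - a}$ for $x \in [a, b]$, and $0$ otherwise
Cumulative distribution function (c.d.f): $F(x) = \frac{x - a}{b - a}$ for $x \in [a, b]$, with $F(x) = 0$ for $x < a$ and $F(x) = 1$ for $x > b$
Moment generating function (m.g.f): $E(e^{tX}) = \frac{e^{tb} - e^{ta}}{t (b - a)}$ for $t \neq 0$, and $1$ for $t = 0$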
dist <- dist_uniform(min = c(3, -2), max = c(5, 4))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
Generalization of the exponential distribution. Often used in survival and time-to-event analyses.
dist_weibull(shape, scale)
shape , scale
|
shape and scale parameters. |
We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.
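In the following, let $X$ be a Weibull random variable with shape parameter shape = $k$ and scale parameter scale = $\lambda$.
Support: $\mathbb{R}^+$ and zero.
Mean: $\lambda \Gamma(1 + 1/k)$, where $\Gamma$ is the gamma function.
Variance: $\lambda^2 \left[\Gamma\left(1 + \frac{2}{k}\right) - \left(\Gamma\left(1 + \frac{1}{k}\right)\right)^2\right]$
Probability density function (p.d.f): $f(x) = \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k - 1} e^{-(x / \lambda)^k}$, $x \ge 0$
Cumulative distribution function (c.d.f): $F(x) = 1 - e^{-(x / \lambda)^k}$, $x \ge 0$
Moment generating function (m.g.f): $E(e^{tX}) = \sum_{n = 0}^{\infty} \frac{t^n \lambda^n}{n!} \Gamma\left(1 + \frac{n}{k}\right)$, $k \ge 1$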
dist <- dist_weibull(shape = c(0.5, 1, 1.5, 5), scale = rep(1, 4))
dist
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
cdf(dist, 4)
quantile(dist, 0.7)
If a distribution is not yet supported, you can vectorise p/d/q/r functions using this function. dist_wrap() stores the distribution's parameters, and provides wrappers which call the appropriate p/d/q/r functions.
Using this function to wrap a distribution should only be done if the distribution is not yet available in this package. If you need a distribution which isn't in the package yet, consider making a request at https://github.com/mitchelloharawild/distributional/issues.
dist_wrap(dist, ..., package = NULL)
dist |
The name of the distribution used in the functions (name that is prefixed by p/d/q/r) |
... |
Named arguments used to parameterise the distribution. |
package |
The package from which the distribution is provided. If NULL, the calling environment's search path is used to find the distribution functions. Alternatively, an arbitrary environment can also be provided here. |
dist <- dist_wrap("norm", mean = 1:3, sd = c(3, 9, 2))
density(dist, 1) # dnorm()
cdf(dist, 4) # pnorm()
quantile(dist, 0.975) # qnorm()
generate(dist, 10) # rnorm()
library(actuar)
dist <- dist_wrap("invparalogis", package = "actuar", shape = 2, rate = 2)
density(dist, 1) # actuar::dinvparalogis()
cdf(dist, 4) # actuar::pinvparalogis()
quantile(dist, 0.975) # actuar::qinvparalogis()
generate(dist, 10) # actuar::rinvparalogis()
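The idea of resolving p/d/q/r functions from a distribution name can be sketched in base R. The helper lookup_pdqr() below is hypothetical and only illustrates the name-prefix convention; it is not how dist_wrap() is implemented. It assumes that a NULL package means "search from the current environment" and that a package is otherwise given by name.

```r
# Hypothetical helper (not part of distributional): find the d/p/q/r
# functions for a distribution name such as "norm".
lookup_pdqr <- function(dist, package = NULL) {
  find_one <- function(prefix) {
    fn_name <- paste0(prefix, dist)
    if (is.null(package)) {
      # Search the calling environment's search path
      get(fn_name, mode = "function")
    } else {
      # Take the exported function from the named package
      getExportedValue(package, fn_name)
    }
  }
  list(d = find_one("d"), p = find_one("p"),
       q = find_one("q"), r = find_one("r"))
}

fns <- lookup_pdqr("norm", package = "stats")

# Stored parameters are passed through to the resolved function
cdf_value <- fns$p(4, mean = 1, sd = 3)
quantile_value <- fns$q(0.975, mean = 1, sd = 3)
```

Here fns$p is simply stats::pnorm, so the results match direct calls to pnorm() and qnorm().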
## S3 method for class 'distribution'
family(object, ...)
object |
The distribution(s). |
... |
Additional arguments used by methods. |
dist <- c(
  dist_normal(1:2),
  dist_poisson(3),
  dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))
)
family(dist)
Generate random samples from probability distributions.
## S3 method for class 'distribution'
generate(x, times, ...)
x |
The distribution(s). |
times |
The number of samples. |
... |
Additional arguments used by methods. |
Used to extract highest density regions at a particular coverage level from a distribution.
hdr(x, ...)
x |
Object to create hdr from. |
... |
Additional arguments used by methods. |
This function is highly experimental and will change in the future. In particular, improved functionality for object classes and visualisation tools will be added in a future release.
Computes highest density regions, which are minimally sized probability intervals.
## S3 method for class 'distribution'
hdr(x, size = 95, n = 512, ...)
x |
The distribution(s). |
size |
The size of the interval (between 0 and 100). |
n |
The resolution used to estimate the distribution's density. |
... |
Additional arguments used by methods. |
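One way to picture a highest density region is a grid approximation in base R: evaluate the density at n points, keep the highest-density points until they hold the requested share of mass, and report the range they span. The helper hdr_grid() is hypothetical and is only a sketch of the general technique; it is not the package's algorithm, it assumes the density is negligible outside the supplied bounds, and reporting a single range is valid only for unimodal densities.

```r
# Hypothetical helper (not part of distributional): grid-based HDR
# for a unimodal density d() evaluated on [lower, upper].
hdr_grid <- function(d, lower, upper, size = 95, n = 512) {
  x <- seq(lower, upper, length.out = n)   # evaluation grid
  dens <- d(x)                             # density at each grid point
  ord <- order(dens, decreasing = TRUE)    # highest density first
  mass <- cumsum(dens[ord]) / sum(dens)    # share of total mass covered
  n_keep <- which(mass >= size / 100)[1]   # smallest set reaching coverage
  range(x[ord[seq_len(n_keep)]])           # single interval (unimodal case)
}

region <- hdr_grid(dnorm, lower = -6, upper = 6, size = 95)
```

For a standard Normal the result is close to the familiar central 95% interval, with accuracy limited by the grid spacing.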
Used to extract a specified prediction interval at a particular confidence level from a distribution.
The numeric lower and upper bounds can be extracted from the interval using <hilo>$lower and <hilo>$upper as shown in the examples below.
hilo(x, ...)
x |
Object to create hilo from. |
... |
Additional arguments used by methods. |
# 95% interval from a standard normal distribution
interval <- hilo(dist_normal(0, 1), 95)
interval
# Extract the individual quantities with `$lower`, `$upper`, and `$level`
interval$lower
interval$upper
interval$level
Returns a hilo central probability interval with probability coverage of size. By default, the distribution's quantile() will be used to compute the lower and upper bound for a centered interval.
## S3 method for class 'distribution'
hilo(x, size = 95, ...)
x |
The distribution(s). |
size |
The size of the interval (between 0 and 100). |
... |
Additional arguments used by methods. |
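A centered interval built from quantiles can be sketched in base R. The helper central_interval() is hypothetical and only illustrates the arithmetic: a coverage of size percent leaves (100 - size) / 2 percent in each tail.

```r
# Hypothetical helper (not part of distributional): central interval
# from a quantile function q() and a coverage given in percent.
central_interval <- function(q, size = 95) {
  alpha <- 1 - size / 100          # total probability left in the tails
  c(lower = q(alpha / 2),          # lower tail cut-off
    upper = q(1 - alpha / 2))      # upper tail cut-off
}

# Standard Normal, 95% coverage
interval_95 <- central_interval(qnorm, size = 95)

# Other distributions work through a closure over their parameters
interval_pois <- central_interval(function(p) qpois(p, lambda = 4), size = 80)
```

For the standard Normal this gives the usual bounds of roughly -1.96 and 1.96.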
This function returns TRUE for distributions and FALSE for all other objects.
is_distribution(x)
x |
An object. |
TRUE if the object inherits from the distribution class.
dist <- dist_normal()
is_distribution(dist)
is_distribution("distributional")
Is the object a hdr?
is_hdr(x)
x |
An object. |
Is the object a hilo?
is_hilo(x)
x |
An object. |
kurtosis(x, ...)
## S3 method for class 'distribution'
kurtosis(x, ...)
x |
The distribution(s). |
... |
Additional arguments used by methods. |
likelihood(x, ...)
## S3 method for class 'distribution'
likelihood(x, sample, ..., log = FALSE)
log_likelihood(x, ...)
x |
The distribution(s). |
... |
Additional arguments used by methods. |
sample |
A list of sampled values to compare to distribution(s). |
log |
If |
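The relationship between a likelihood and a log-likelihood can be sketched in base R for an independent sample: the likelihood is the product of the densities at the observed values, and the log-likelihood is the sum of the log densities. This illustrates the general definition with assumed Normal(1, 2) data; it does not call the package's likelihood() method.

```r
set.seed(42)

# Assumed data: 20 draws from a Normal with mean 1 and sd 2
obs <- rnorm(20, mean = 1, sd = 2)

# Likelihood: product of densities at the observed values
lik <- prod(dnorm(obs, mean = 1, sd = 2))

# Log-likelihood: sum of log densities (numerically safer for large samples)
log_lik <- sum(dnorm(obs, mean = 1, sd = 2, log = TRUE))
```

Working on the log scale avoids the underflow that products of many small densities eventually cause.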
Returns the empirical mean of the probability distribution. If the method does not exist, the mean of a random sample will be returned.
## S3 method for class 'distribution'
mean(x, ...)
x |
The distribution(s). |
... |
Additional arguments used by methods. |
Returns the median (50th percentile) of a probability distribution. This is equivalent to quantile(x, p = 0.5).
## S3 method for class 'distribution'
median(x, na.rm = FALSE, ...)
x |
The distribution(s). |
na.rm |
Unused, included for consistency with the generic function. |
... |
Additional arguments used by methods. |
Allows extension package developers to define a new distribution class compatible with the distributional package.
new_dist(..., class = NULL, dimnames = NULL)
... |
Parameters of the distribution (named). |
class |
The class of the distribution for S3 dispatch. |
dimnames |
The names of the variables in the distribution (optional). |
Construct hdr intervals
new_hdr(
  lower = list_of(.ptype = double()),
  upper = list_of(.ptype = double()),
  size = double()
)
lower , upper
|
A list of numeric vectors specifying the region's lower and upper bounds. |
size |
A numeric vector specifying the coverage size of the region. |
A "hdr" vector
Mitchell O'Hara-Wild
new_hdr(lower = list(1, c(3,6)), upper = list(10, c(5, 8)), size = c(80, 95))
Class constructor function to help with manually creating hilo interval objects.
new_hilo(lower = double(), upper = double(), size = double())
lower , upper
|
A numeric vector of values for lower and upper limits. |
size |
Size of the interval, a value in [0, 100]. |
A "hilo" vector
Earo Wang & Mitchell O'Hara-Wild
new_hilo(lower = rnorm(10), upper = rnorm(10) + 5, size = 95)
Create a new support region vector
new_support_region(x = numeric(), limits = list(), closed = list())
x |
A list of prototype vectors defining the distribution type. |
limits |
A list of value limits for the distribution. |
closed |
A list of logical(2L) indicating whether the limits are closed. |
parameters(x, ...)
## S3 method for class 'distribution'
parameters(x, ...)
x |
The distribution(s). |
... |
Additional arguments used by methods. |
dist <- c(
  dist_normal(1:2),
  dist_poisson(3),
  dist_multinomial(size = c(4, 3), prob = list(c(0.3, 0.5, 0.2), c(0.1, 0.5, 0.4)))
)
parameters(dist)
Computes the quantiles of a distribution.
## S3 method for class 'distribution'
quantile(x, p, ..., log = FALSE)
x |
The distribution(s). |
p |
The probability of the quantile. |
... |
Additional arguments passed to methods. |
log |
If |
skewness(x, ...)
## S3 method for class 'distribution'
skewness(x, ...)
x |
The distribution(s). |
... |
Additional arguments used by methods. |
support(x, ...)
## S3 method for class 'distribution'
support(x, ...)
x |
The distribution(s). |
... |
Additional arguments used by methods. |
A generic function for computing the variance of an object.
variance(x, ...)
## S3 method for class 'numeric'
variance(x, ...)
## S3 method for class 'matrix'
variance(x, ...)
## S3 method for class 'numeric'
covariance(x, ...)
x |
An object. |
... |
Additional arguments used by methods. |
The implementation of variance() for numeric variables coerces the input to a vector then uses stats::var() to compute the variance. This means that, unlike stats::var(), if variance() is passed a matrix or a 2-dimensional array, it will still return the variance (stats::var() returns the covariance matrix in that case).
variance.distribution(), covariance()
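The difference between a covariance matrix and the variance of the values coerced to a vector can be seen with base R alone. This sketch calls only stats::var() and does not call the package's variance(); the matrix m is made-up illustrative data.

```r
# Illustrative 4 x 2 matrix (assumed data)
m <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)

# stats::var() on a matrix returns the 2 x 2 covariance matrix of its columns
cov_matrix <- stats::var(m)

# Coercing to a vector first gives a single variance over all eight values
pooled_variance <- stats::var(as.vector(m))
```

The first result has one row and column per matrix column, while the second is a single number.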
Returns the empirical variance of the probability distribution. If the method does not exist, the variance of a random sample will be returned.
## S3 method for class 'distribution'
variance(x, ...)
x |
The distribution(s). |
... |
Additional arguments used by methods. |