**Introduction**

If you have been reading up on Machine Learning, you have probably come across the term probability a lot. Probability is the study of the likelihood of events occurring. Let’s explore probability a little before we get into probability distributions and their types.

Probability, in layman’s terms, is the chance of one or more events occurring out of the total possible outcomes. As a straightforward example, given a fair coin, the probability of getting a head when you flip it is 1/2 = 0.5, since there are only 2 equally likely outcomes and one of them is heads. Bear in mind that the probability of any event is always between 0 and 1.

So, **what is a probability distribution**?

A probability distribution is a listing of all possible outcomes along with their individual probabilities. This is usually represented as a table with the possible outcomes listed alongside their calculated probability values.

A quick probability distribution for a fair coin flip will be similar to the one shown below.

| Event | Probability |
| ----- | ----------- |
| Heads | 0.5         |
| Tails | 0.5         |


## Probability Distribution definition

A probability distribution is used to show the spread of probability across all possible events. There are only 2 outcomes in our example, and since the coin is fair, the probability is equally divided between the 2 events.
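To make this concrete, here is a minimal Python sketch that represents the fair-coin distribution as a mapping from outcomes to probabilities (the variable and outcome names are our own labels):

```python
# A probability distribution maps each possible outcome to its probability.
fair_coin = {"heads": 0.5, "tails": 0.5}

# The probabilities across all outcomes must sum to 1.
total = sum(fair_coin.values())
print(total)  # 1.0
```

Because the coin is fair, the total probability of 1 is split evenly across the two events.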

A more formal definition from Wikipedia reads: ‘A probability distribution is a mathematical description of the probabilities of events, subsets of the sample space.’

Typically, probability distributions are associated with random variables. A random variable is one whose value depends on the outcome of a random experiment, so it does not follow a predetermined pattern. Different functions are used for discrete random variables and continuous random variables.

Now, let’s move on to the types of probability distribution functions.

## Types of Probability Distribution functions

Let’s take a look at the various probability distribution functions.

For discrete random variables, the popular probability distribution functions are:

### Bernoulli Distribution

The Bernoulli distribution describes experiments with only 2 outcomes. A pass/fail, yes/no, or 0/1 kind of scenario is where the Bernoulli distribution applies. In a Bernoulli trial, the random variable takes the value 1 (yes/pass) with probability p and the value 0 (no/fail) with the remaining probability q = 1 − p.

Our previous example of tossing a coin is a classic example of a Bernoulli distribution, with p = 0.5.

The PMF (Probability Mass Function) of a Bernoulli distribution is expressed as:

f(k; p) = p if k = 1, and f(k; p) = q = 1 − p if k = 0.

This can also be expressed as,

f(k; p) = p^k (1 − p)^(1−k) for k ∈ {0, 1}.
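The compact form above translates directly into Python; this is a minimal sketch (the function name `bernoulli_pmf` is our own):

```python
def bernoulli_pmf(k, p):
    """Probability that a Bernoulli(p) variable equals k, where k is 0 or 1."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    # p^k * (1 - p)^(1 - k) picks out p when k = 1 and 1 - p when k = 0.
    return p**k * (1 - p) ** (1 - k)

# Fair coin: both outcomes have probability 0.5.
print(bernoulli_pmf(1, 0.5), bernoulli_pmf(0, 0.5))  # 0.5 0.5
```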

### Binomial Distribution

The Binomial distribution builds on the Bernoulli distribution by counting the number of successes in a fixed number of consecutive Bernoulli trials. As an example, let’s expand on the coin flip: flip the coin several times and count the number of heads; the Binomial distribution function gives the probability of each possible count.

Any single success/failure experiment is a Bernoulli trial. The Binomial distribution models the number of successes in n such independent trials, each with the same success probability p.
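As a sketch (the function name is our own), the Binomial PMF multiplies the probability of one particular arrangement of k successes by the number of ways to arrange them among n trials:

```python
from math import comb  # comb(n, k) counts the ways to choose k successes out of n trials

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Two fair coin flips: P(exactly one head) = 2 * 0.5 * 0.5 = 0.5.
print(binomial_pmf(1, 2, 0.5))  # 0.5
```

Summing the PMF over all k from 0 to n gives 1, as any valid distribution must.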

### Negative Binomial Distribution

The Negative Binomial distribution reverses the setup of the Binomial distribution: instead of fixing the number of trials, it fixes the number of successes. Trials continue until a predetermined number of successes is reached, so the number of trials is not bounded in advance. In the Binomial distribution, by contrast, the number of trials is determined before the start of the experiment.

So, a Negative Binomial distribution is focused on how many trials (or failures) it takes to reach a fixed number of successes.
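A sketch of the Negative Binomial PMF, parameterized here by the number of failures k observed before the r-th success (this is one common convention; the function name is our own):

```python
from math import comb

def neg_binomial_pmf(k, r, p):
    """Probability of observing k failures before the r-th success,
    with success probability p on each independent trial."""
    # The last trial must be the r-th success; the other r-1 successes
    # can fall anywhere among the first k + r - 1 trials.
    return comb(k + r - 1, k) * p**r * (1 - p) ** k

# With r = 1 this reduces to the geometric case:
# P(0 failures before the first success) = p.
print(neg_binomial_pmf(0, 1, 0.5))  # 0.5
```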

### Geometric Distribution

The Geometric distribution is a special case of the Negative Binomial distribution, where the desired number of successes is just one.

As soon as the first success is achieved, the trials stop. The Geometric distribution gives the probability that the first success occurs on a given trial, where the number of trials is not bounded in advance.
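Under the convention that k is the trial on which the first success occurs (k = 1, 2, …), this can be sketched as (function name is our own):

```python
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, ...)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # k - 1 failures followed by one success.
    return (1 - p) ** (k - 1) * p

# Fair coin: first head on flip 1, 2, 3 has probability 0.5, 0.25, 0.125.
print(geometric_pmf(1, 0.5), geometric_pmf(2, 0.5), geometric_pmf(3, 0.5))
```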

There are also other discrete probability distributions, such as the Poisson and Hypergeometric distributions.

As far as continuous random variables are concerned, there are a couple of continuous probability distributions worth mentioning.

Probabilities associated with continuous random variables are defined by Probability Density Functions (PDFs), unlike discrete ones, which have a Probability Mass Function (PMF).

### Uniform Distribution

The uniform distribution describes a variable whose values are equally likely anywhere within a fixed interval, so its density is constant, giving a flat, uniform look to the graph plotted from the function. A simple example is a random variable representing the waiting time for a facility that runs in regular cycles, like a shuttle: if the shuttle arrives every 10 minutes and you show up at a random moment, your wait is uniformly distributed between 0 and 10 minutes.
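A sketch of the uniform density on an interval [a, b] (the shuttle numbers below are our own illustration):

```python
def uniform_pdf(x, a, b):
    """Density of a Uniform(a, b) variable: constant 1/(b - a) inside [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0

# Shuttle every 10 minutes: waiting time is Uniform(0, 10),
# so the density is 0.1 everywhere inside the interval and 0 outside it.
print(uniform_pdf(4, 0, 10))   # 0.1
print(uniform_pdf(12, 0, 10))  # 0.0
```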

### Exponential Distribution

The exponential distribution, which is fairly common in practice, models the waiting time for an event that occurs at a constant average rate. Imagine that an event occurs at an average rate of λ occurrences per unit time, and you arrive at the scene and wait a time period t for that event to occur.

This waiting period is a random variable that follows an exponential distribution over repeated observations. An example you can quote is the amount of time a car battery lasts before failure; the average of those lifetimes is referred to as the mean time between failures (MTBF).

The probability density function of the exponential distribution can be expressed as f(x; λ) = λe^(−λx) for x ≥ 0, where λ is the average rate and e ≈ 2.71828 is Euler’s number, the base of the natural logarithm.
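The density above can be sketched as (names are our own; `lam` stands for the rate λ):

```python
from math import exp

def exponential_pdf(x, lam):
    """Density of an Exponential(lam) waiting time: lam * e^(-lam * x) for x >= 0."""
    return lam * exp(-lam * x) if x >= 0 else 0.0

# At x = 0 the density equals the rate itself, then decays exponentially.
print(exponential_pdf(0, 2.0))  # 2.0
```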

## Conclusion

If you are interested in Machine Learning or Data Science, consider a PG diploma in Data Science to advance your potential career. You can gain more knowledge through a post-graduate diploma in Data Science from Jigsaw Academy.