Take a Chance on Me!

Data Science, Data Scientist, Probability, Statistics

October 4, 2023

When we enter the world of data science, it is easy to focus on the complex algorithms and forget that the fundamental or "basic" maths is still useful, and that it is what the more complex maths is built upon.

In this blog post I'll be covering some of the basics of probability, which are all too easily forgotten!

Theoretical

Probability is the likelihood of an event occurring.

Well, first up, what even is an event?

An event is a specific outcome or combination of outcomes.

Even these definitions expect you to sort of know what's going on in the context. So let's rewind and give a simple (and much used!) example...

Say you want to know how likely it is to get heads if you flip a fair coin. There are two sides the coin can land on: heads or tails. Each of heads and tails is an outcome, a possible result of the coin flip. Since the question asks how likely it is to get heads, heads is the favourable outcome. Together, all the possible outcomes are referred to as the sample space. Based on this, we could rewrite the definition of probability to be something like...

The likelihood of a specific outcome or combination of outcomes occurring when you do something with different possible end results.

If we want to put it into equation form...

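P(A) = Number of Favourable Outcomes / Total Number of Outcomes in the Sample Space

* P(A) is shorthand for the probability of event A. Counting outcomes like this works when every outcome in the sample space is equally likely, as with a fair coin, where P(heads) = 1 / 2.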
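To make the counting concrete, here is a minimal Python sketch of the same formula. The function name `theoretical_probability` and the set-based representation of the sample space are my own choices for illustration, and the sketch assumes every outcome is equally likely.

```python
from fractions import Fraction


def theoretical_probability(favourable, sample_space):
    """Favourable outcomes divided by all outcomes in the sample space.

    Assumes every outcome in the sample space is equally likely.
    """
    space = set(sample_space)
    # Only count favourable outcomes that can actually happen.
    wanted = set(favourable) & space
    return Fraction(len(wanted), len(space))


coin = {"heads", "tails"}
p_heads = theoretical_probability({"heads"}, coin)
print(p_heads)  # 1/2
```

Using `Fraction` keeps the answer exact, so it reads the same way as the equation rather than as a rounded decimal.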

If, however, you have two fair coins to flip and you want to know the likelihood of both coins landing on heads, you need to combine the probabilities of each coin landing on heads. The two coin flips are independent of each other, meaning that the result of one coin flip does not influence the result of the other. In this case you multiply the probabilities of getting heads on each coin together:

P(heads and heads) = P(heads on coin 1) . P(heads on coin 2) = 1/2 . 1/2 = 1/4

* . is standard multiplication, often written as x in everyday maths
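One way to convince yourself that multiplying is the right thing to do is to list every outcome of flipping two coins and count. The sketch below does that with Python's `itertools.product` and compares the result with the multiplication rule; the variable names are just illustrative.

```python
from fractions import Fraction
from itertools import product

sides = ("heads", "tails")

# Sample space for two coins: every pairing of the two sides (4 outcomes).
two_coin_space = list(product(sides, repeat=2))

# Count the outcomes where both coins show heads.
favourable = [o for o in two_coin_space if o == ("heads", "heads")]
p_by_counting = Fraction(len(favourable), len(two_coin_space))

# Multiplication rule for independent events.
p_single = Fraction(1, 2)
p_by_multiplying = p_single * p_single

print(p_by_counting, p_by_multiplying)  # 1/4 1/4
```

Both routes give the same answer here because the flips are independent; the shortcut of multiplying would not be valid if one flip influenced the other.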

Experimental

When we can't calculate the probability of a certain outcome from theory, we can conduct multiple trials, which together make up an experiment, to give us an estimate of what the probability would be:

The proportion of trials in which the outcome we are interested in occurs.

This proportion is known as an experimental probability, because we are using an experiment to determine the probability rather than theory. If you run enough trials in the experiment, you can get a pretty good approximation of the theoretical probability. In equation form, this looks something like...

P(A) = Number of Successful Trials / Total Number of Trials

This is basically the experimental version of the theoretical equation above: a successful trial is one where you get the outcome you want.
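A quick simulation shows the idea. This is only a sketch: it uses Python's built-in `random` module as a stand-in for physically flipping a fair coin, and the function name and `seed` parameter are my own additions so the run can be repeated.

```python
import random


def experimental_probability(n_trials, seed=None):
    """Estimate P(heads) as successful trials / all trials."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    successes = sum(
        rng.choice(("heads", "tails")) == "heads" for _ in range(n_trials)
    )
    return successes / n_trials


# More trials generally bring the estimate closer to the theoretical 0.5.
for n in (10, 100, 10_000):
    print(n, experimental_probability(n, seed=42))
```

With only a handful of trials the estimate can land quite far from 0.5, which is why the number of trials matters.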

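If we want to know how many times we expect an outcome to occur when we run an experiment, i.e. the expected number of occurrences (a simple case of an expected value, the average result we expect if we repeat the experiment many times), we multiply the probability of the outcome by the number of times the trial is run.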

E(A) = P(A) . n

* where n is the number of trials
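As a small worked example of that equation, assuming a fair coin and a helper name (`expected_count`) that I have made up for illustration:

```python
def expected_count(probability, n_trials):
    """E(A) = P(A) . n, the expected number of times A occurs in n trials."""
    return probability * n_trials


# Flip a fair coin 100 times: we expect heads about 50 times.
print(expected_count(0.5, 100))  # 50.0

# Flip two fair coins 200 times: we expect two heads about 50 times.
print(expected_count(0.25, 200))  # 50.0
```

This is an average over many repeats of the experiment, so any single run of 100 flips may well give 47 or 54 heads rather than exactly 50.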

This is just an introduction to some basic probability terminology. A lot of these terms are sometimes used interchangeably, but they have distinct meanings, and it's worth knowing and understanding them well as you progress with data and statistics!

Shruti Turner.
