The set of all possible outcomes is often called Ω
An event 𝐴 can be seen as the set of all outcomes that make the event true
For example, \[\textrm{Fever}=\{\textrm{Temp}>37.5°\textrm{C}\}\]
An event will become either true or false after an experiment
For example, a die roll either shows a 4 or it does not
We want to give a value to our rational belief that the event will become true after the experiment
This numeric value is called the probability of the event
It is useful to think of the probability of an event as its area in the drawing
The total area of Ω is 1
Usually we do not know the shape of 𝐴
Our rational beliefs depend on our knowledge
If we represent our knowledge (or hypothesis) by 𝑍, then the probability of an event 𝐴 is written as \[ℙ(A|Z)\] We read “the probability of event 𝐴, given that we know 𝑍”
For example, “the probability that we get a 4, given that the die is symmetrical”
The order is relevant \[ℙ(A|Z)≠ℙ(Z|A)\] There are two events, 𝐴 and 𝑍
The one written after | is what we assume to be true
The one written before | is what we are asking for
One we know, the other we do not
Now outcomes are limited only to the 𝑍 region
\(ℙ(A|Z)\) is the area of the part of 𝐴 inside 𝑍, measured with respect to the area of 𝑍 instead of Ω
The shape of 𝑍 is often unknown
If, given our knowledge 𝑍, the event 𝐵 is more plausible than the event 𝐴, then \[ℙ(A|Z)≤ℙ(B|Z)\]
For example, the probability that we get either 4, 5 or 6 is greater than the probability that we get a 4, given that the die is symmetrical \[ℙ(\{4\}|Z)≤ℙ(\{4,5,6\}|Z)\]
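The area picture can be checked numerically for the die example. This is a minimal sketch, assuming that the knowledge 𝑍 ("the die is symmetrical") makes all six faces equally likely; the helper name `prob` is hypothetical and not part of the original text.

```python
from fractions import Fraction

# Assumption: Z = "the die is symmetrical", so every face has the same weight.
OMEGA = frozenset(range(1, 7))

def prob(event, given=OMEGA):
    """Share of the region `given` that is covered by `event`."""
    event, given = set(event), set(given)
    return Fraction(len(event & given), len(given))

p_four = prob({4})          # P({4} | Z)
p_high = prob({4, 5, 6})    # P({4,5,6} | Z)
print(p_four, p_high)       # 1/6 1/2
print(p_four <= p_high)     # True
```

Exact fractions are used instead of floats so that the comparison is not affected by rounding.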
On the other hand, if we get new information, the probabilities may change
The same event 𝐴 may be more plausible under a new hypothesis 𝑌 than under the initial hypothesis 𝑍
Then \[ℙ(A|Z)≤ℙ(A|Y)\]
It has been proven that probabilities must obey the following rules
A probability is a number between 0 and 1 inclusive \[ℙ(A) ≥ 0\textrm{ and }ℙ(A)≤1\]
The probability of a sure event is 1 \[ℙ(\textrm{True}) = 1\]
The probability of an impossible event is 0 \[ℙ(\textrm{False}) = 0\]
We are interested in non-trivial events, which are usually combinations of smaller events
For example, we may ask “what is the probability that, in a group of 𝑛 people, at least two people have the same birthday?”
Fortunately, any complex event can be decomposed into simpler events, combined with the connectors and, or, and not
Exercise: decompose the birthday event into simpler ones
If the event 𝐴 becomes more and more plausible, then the opposite event not 𝐴 becomes less and less plausible
It can be shown that we always have \[ℙ(\textrm{not }A) = 1-ℙ(A)\]
The probability of 𝐴 and 𝐵 happening simultaneously must be connected to the probability of each one
It can be shown that there are only two ways to calculate it \[ℙ(A,B)=ℙ(A)⋅ℙ(B|A)\] \[ℙ(A,B)=ℙ(B)⋅ℙ(A|B)\]
It can be proven that the only way to combine \(ℙ(A)\) and \(ℙ(B|A)\) to get \(ℙ(A,B)\) is to multiply them.
Both are true, since \(ℙ(A,B)=ℙ(B,A).\) The order in which we write them is irrelevant.
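The multiplication rule can be tried out on a small example. The sketch below assumes a fair six-sided die and two made-up events (an even result, and a result of at least 4); the helper `prob` and the event choices are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumption: fair six-sided die

def prob(event, given=OMEGA):
    """Share of the region `given` that is covered by `event`."""
    event, given = set(event), set(given)
    return Fraction(len(event & given), len(given))

A = {2, 4, 6}   # hypothetical event: the result is even
B = {4, 5, 6}   # hypothetical event: the result is at least 4

joint = prob(A & B)                  # P(A,B), measured directly
via_a = prob(A) * prob(B, given=A)   # P(A) * P(B|A)
via_b = prob(B) * prob(A, given=B)   # P(B) * P(A|B)
print(joint, via_a, via_b)           # 1/3 1/3 1/3
```

Both factorizations give the same value as the direct measurement, which is what the symmetry of "and" requires.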
We know how to calculate \(ℙ(A\textrm{ and }B)\) and \(ℙ(\textrm{not }A)\)
We also know De Morgan’s law, which swaps ANDs with ORs
\[\textrm{not }(A \textrm{ or }B) = (\textrm{not }A) \textrm{ and }(\textrm{not }B)\]
Therefore we can write
\[ \begin{aligned} ℙ(A \textrm{ or }B) & = 1 - ℙ(\textrm{not }(A \textrm{ or }B))\\ & = 1-ℙ( (\textrm{not }A) \textrm{ and }(\textrm{not }B)) \end{aligned} \]
\[ \begin{aligned} ℙ(A \textrm{ or }B) & = 1-ℙ( (\textrm{not }A) \textrm{ and }(\textrm{not }B)) \\ & = 1-ℙ(\textrm{not }A)⋅ℙ(\textrm{not }B|\textrm{not }A) \end{aligned} \]
Using the negation rule \[ \begin{aligned} ℙ(A \textrm{ or }B) & = 1-ℙ(\textrm{not }A)⋅(1- ℙ(B|\textrm{not }A)) \\ & = 1-ℙ(\textrm{not }A) + ℙ(\textrm{not }A)⋅ℙ(B|\textrm{not }A) \end{aligned} \]
\[ \begin{aligned} ℙ(A \textrm{ or }B) & = 1 -ℙ(\textrm{not }A) + ℙ(\textrm{not }A,B) \\ ℙ(A \textrm{ or }B) & = 1 -(1-ℙ(A)) + ℙ(\textrm{not }A|B)ℙ(B) \\ ℙ(A \textrm{ or }B) & = ℙ(A) + (1-ℙ(A|B))ℙ(B) \\ ℙ(A \textrm{ or }B) & = ℙ(A) + ℙ(B)-ℙ(A|B)ℙ(B) \\ ℙ(A \textrm{ or }B) & = ℙ(A) + ℙ(B)-ℙ(A,B) \end{aligned} \] You need to remember only the last line
The previous lines justify why the last one is always true
If A and B can happen at the same time, then \(ℙ(A) + ℙ(B)\) counts the intersection twice
So we have to take out the intersection \(ℙ(A,B)\) \[ℙ(A \textrm{ or }B) = ℙ(A) + ℙ(B)-ℙ(A,B)\]
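The double counting is easy to see on a concrete case. This is a small sketch, assuming a fair six-sided die and two illustrative overlapping events; the names `prob`, `A` and `B` are hypothetical.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumption: fair six-sided die

def prob(event):
    """Share of OMEGA that is covered by `event`."""
    return Fraction(len(set(event) & OMEGA), len(OMEGA))

A = {2, 4, 6}   # hypothetical event: the result is even
B = {4, 5, 6}   # hypothetical event: the result is at least 4

direct = prob(A | B)               # area of the union, measured directly
naive = prob(A) + prob(B)          # counts the outcomes 4 and 6 twice
corrected = naive - prob(A & B)    # removes the intersection once
print(direct, naive, corrected)    # 2/3 1 2/3
```

The naive sum overshoots by exactly the probability of the intersection.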
If there are three compatible events, things get messy
\[\begin{aligned} & ℙ(A \textrm{ or }B \textrm{ or }C) \\ & = ℙ(A) + ℙ(B \textrm{ or }C)-ℙ(A,(B \textrm{ or }C)) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - ℙ((A,B) \textrm{ or }(A,C)) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - (ℙ(A,B) + ℙ(A,C) - ℙ(A,B,C)) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - ℙ(A,B) - ℙ(A,C) + ℙ(A,B,C) \end{aligned} \]
It gets worse with more events
Using De Morgan’s rule
\[\begin{aligned} & ℙ(A \textrm{ or }B \textrm{ or }C) \\ & = 1 - ℙ((\textrm{not }A) \textrm{ and }(\textrm{not }B) \textrm{ and }(\textrm{not }C))\\ & = 1 - ℙ(\textrm{not }A)⋅ℙ(\textrm{not }B | \textrm{not }A)⋅ℙ(\textrm{not }C | \textrm{not }A, \textrm{not }B)\\ & = 1 - (1-ℙ(A))⋅(1-ℙ(B | \textrm{not }A))⋅(1-ℙ(C | \textrm{not }A, \textrm{not }B)) \end{aligned} \]
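Both three-event formulas can be compared against a direct measurement. The sketch below assumes a fair six-sided die and three made-up events; the helper `prob` and the events are illustrative assumptions only.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumption: fair six-sided die

def prob(event, given=OMEGA):
    """Share of the region `given` that is covered by `event`."""
    event, given = set(event), set(given)
    return Fraction(len(event & given), len(given))

# Hypothetical events, chosen so that some pairs overlap.
A, B, C = {1, 2}, {2, 3}, {3, 4, 5}
not_a, not_b = OMEGA - A, OMEGA - B

direct = prob(A | B | C)

# Long form: add singles, subtract pairs, add the triple back.
inclusion_exclusion = (
    prob(A) + prob(B) + prob(C)
    - prob(B & C) - prob(A & B) - prob(A & C)
    + prob(A & B & C)
)

# De Morgan form: one minus a product of conditional "not" probabilities.
chain = 1 - (
    (1 - prob(A))
    * (1 - prob(B, given=not_a))
    * (1 - prob(C, given=not_a & not_b))
)

print(direct, inclusion_exclusion, chain)  # 5/6 5/6 5/6
```

The De Morgan form needs only three factors, however many pairwise overlaps there are.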
This is often easier to calculate
Let’s say we have three people, with birthdays \(x_1, x_2\) and \(x_3.\)
The probability that there are at least two people with the same birthday is \[ℙ(x_2=x_1 \textrm{ or }x_3=x_2 \textrm{ or }x_3=x_1)\] which can be rewritten as \[1-ℙ(x_2≠x_1 \textrm{ and }x_3≠x_2 \textrm{ and }x_3≠x_1)\]
We want to calculate \[1-ℙ(x_2≠x_1 \textrm{ and }x_3≠x_2 \textrm{ and }x_3≠x_1)\] We can separate it like this (only the first and) \[1-ℙ(x_2≠x_1)⋅ℙ(x_3≠x_2 \textrm{ and }x_3≠x_1|x_2≠x_1)\] Assuming 365 equally likely birthdays, we have \[1-\frac{364}{365}⋅\frac{363}{365}\]
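The three-person result can be evaluated exactly and cross-checked by counting. This is a sketch under the same assumption of equally likely birthdays; the function names are hypothetical, and the counting check uses a 7-day "year" only because enumerating all triples of 365 days would be slow.

```python
from fractions import Fraction
from itertools import product

def shared_birthday_3(days=365):
    """1 - P(x2 != x1) * P(x3 != x2 and x3 != x1 | x2 != x1)."""
    return 1 - Fraction(days - 1, days) * Fraction(days - 2, days)

def shared_birthday_3_by_counting(days):
    """Count the triples with a repeated value (small `days` only)."""
    triples = list(product(range(days), repeat=3))
    hits = sum(1 for t in triples if len(set(t)) < 3)
    return Fraction(hits, len(triples))

exact = shared_birthday_3()
print(exact)          # 1093/133225
print(float(exact))   # a bit under 1%
print(shared_birthday_3(7) == shared_birthday_3_by_counting(7))  # True
```

The code is deliberately limited to three people, so the general case remains an exercise.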
What is the probability that, in a group of N people, at least two of them share the same birthday?
How many people do we need to have at least a 50% probability of at least two of them sharing the same birthday?
If A and B cannot happen at the same time, then \((A \textrm{ and }B)\) is impossible, therefore \(ℙ(A,B)=0\)
In that case (and only in that case) \[ℙ(A \textrm{ or }B) = ℙ(A) + ℙ(B)\]
In particular we have \[ℙ(A) = ℙ(A\textrm{ and }(B \textrm{ or }\textrm{not }B)) = ℙ(A,B) + ℙ(A, \textrm{not }B)\] because \((A,B)\) and \((A,\textrm{not }B)\) cannot happen at the same time
If we partition Ω into 𝑛 subsets \(A_i\), such that they cover all Ω \[\Omega=A_1 ∪ A_2 ∪ … ∪ A_n\] and each pair of events is mutually incompatible \[A_i ∩ A_j=\emptyset\] then we have \[ℙ(\Omega)=ℙ(A_1) + ℙ(A_2) + … + ℙ(A_n)=1\]
One kind of event is the set containing a single outcome
If \(a_i ∈ Ω\) is an outcome, then \(A_i=\{a_i\}\) is an event
“The experiment outcome is exactly \(a_i\)”
It is easy to see that these events are mutually incompatible and cover all Ω
Thus, \[ℙ(\{a_1\}) + ℙ(\{a_2\}) + … + ℙ(\{a_n\})=1\]
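The partition rule does not depend on the outcomes being equally likely. The sketch below uses a made-up loaded die to illustrate this; the weights, the events and the helper `prob` are all hypothetical.

```python
from fractions import Fraction

# Hypothetical loaded die: the weights are invented for illustration.
WEIGHTS = {face: Fraction(3, 20) for face in range(1, 6)}
WEIGHTS[6] = Fraction(1, 4)

def prob(event):
    """Sum of the weights of the outcomes in `event`."""
    return sum((WEIGHTS[outcome] for outcome in event), Fraction(0))

# Single-outcome events and a coarser partition both cover all of Omega.
singletons = [{face} for face in WEIGHTS]
partition = [{1, 2}, {3}, {4, 5, 6}]
total_singletons = sum(prob(e) for e in singletons)
total_partition = sum(prob(e) for e in partition)
print(total_singletons, total_partition)   # 1 1

# Splitting A into its parts inside B and outside B.
A, B = {2, 4, 6}, {4, 5, 6}
not_b = set(WEIGHTS) - B
print(prob(A), prob(A & B) + prob(A & not_b))   # 11/20 11/20
```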