Definition of conditional probability:
Bayes' Theorem :
Now, suppose we give a diagnostic test for a disease. Let's suppose that the percentage of the population with the disease, written as a fraction, is P(d). This fraction is also the probability of having the disease of a person randomly selected from the population. We will refer to it as the prevalence.
Next, we define:
Sensitivity of the test - rate of true positives of the test. That is to say, if the person tested has the disease, then the test result will be positive.
Specificity of the test - rate of true negatives of the test. That is to say, if the person tested does not have the disease, then the test result will be negative.
In general, there will be false positives and false negatives. In developing the test, the developer must try to estimate all four rates, sensitivity, specificity, false positives, false negatives.
Now the question: I get tested for a disease and the result is positive. What is the probability that I actually have the disease?
In probabilistic notation:
Sensitivity = P(+∣d) where P denotes probability, + means test is positive and d means you have the disease.
Specificity = P(-∣˜d) where P is probability, - means test is negative and ˜d means you do not have the disease.
Prevalence = P(d) expressed as a probability, this is a number to estimate from epidemiological data.
In mathematical notation, the question is: What is P(d∣+). That is to say, what is the probability of having the disease given that the test result is positive.
Now, the information we have (sensitivity and specificity) is "turned around' with respect to the question. We have P(B∣A) and we want P(A∣B), so we use Bayes' Theorem:
P(d∣+) = sensitivity x prevalence / P(+).
We just have to calculate P(+). Here, we say that the event "+" happens when :
either (+ and d) or (+ and ~d), or in mathematical set notation as above:
So, since the probability of the union of two disjoint sets is the sum of the probabilities (more to be said here):
P(+) = P(+∣d) P(d) + P(+∣˜d) P(˜d)Noting that the probability of "not A" is one minus the probability of A, we can replace the "+" and the ~ in the last term of the last equation by one minus the opposite event, giving us:
P(+) = P(+|d) P(d) + (1 - P(-|~d)) (1 - P(d)).Plugging the right-hand side of this equation into equation 1 above we get the equation in words:
Probability of being ill is:"sensitivity*prevalence" divided by "sensitivity*prevalence + (1-specificity)*(1-prevalence)", or
Probability of being ill after testing positive is:
Review of and comments on the video:
Important!!!! Our video finds that, for your own personal interest, testing for the Corona virus will not necessarily give you a useful conclusion if there are not many victims so far, or if you do not have any other information (like symptoms). So, if the government asks you to be tested, what do you do?
In this case, your friendly neighborhood epidemiologists are trying to find out crucial information about the actual number of cases, so GET TESTED. Just remember, you are helping the community understand the disease. We would discourage getting tested just because you are curious.
Get the lecture notes and r-programmes from the video here.
Contact me at: firstname.lastname@example.org