What is Bayes' Theorem? How is it useful in a machine learning context?
Bayes' Theorem gives you the posterior probability of an event: its probability after prior knowledge has been updated with new evidence.
Mathematically, for a condition A and a piece of evidence B (such as a positive test), it's expressed as

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)}$$

In words: the true positive rate times the prior probability of the condition, divided by the sum of that same term and the false positive rate times the prior probability of not having the condition. Say a flu test comes back positive for 60% of people who have the flu, but it also comes back positive for 50% of people who do not have the flu, and only 5% of the overall population has the flu. Would you actually have a 60% chance of having the flu after a positive test?
Bayes' Theorem says no. It says that you have a

$$\frac{0.6 \times 0.05}{(0.6 \times 0.05) + (0.5 \times 0.95)} = \frac{0.03}{0.505} \approx 0.0594$$

or 5.94%, chance of actually having the flu. Here 0.6 × 0.05 is the true positive rate weighted by the share of the population with the flu, and 0.5 × 0.95 is the false positive rate weighted by the share of the population without it.
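The flu calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior` and its parameter names are chosen here for illustration and do not come from any library.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Return P(condition | positive test) using Bayes' Theorem.

    prior               -- P(condition), e.g. share of the population with the flu
    true_positive_rate  -- P(positive test | condition)
    false_positive_rate -- P(positive test | no condition)
    """
    # Probability mass of people who have the condition and test positive
    true_pos = true_positive_rate * prior
    # Probability mass of people who do not have the condition but test positive
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Numbers from the flu example: 5% prevalence, 60% true positives, 50% false positives
flu_given_positive = posterior(prior=0.05,
                               true_positive_rate=0.60,
                               false_positive_rate=0.50)
print(f"{flu_given_positive:.4f}")  # 0.0594
```

Because the prior is so low, the false positives from the 95% of people without the flu swamp the true positives, which is why the posterior ends up far below 60%.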
Bayes' Theorem is the basis behind a branch of machine learning that most notably includes the Naive Bayes classifier. That's something important to consider when you're faced with machine learning interview questions.
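To make the link to Naive Bayes concrete, here is a small from-scratch sketch of the idea. It assumes word-like features, add-one (Laplace) smoothing, and a tiny made-up symptom dataset that exists purely for illustration. The function names `train` and `predict` are hypothetical, and a real project would more likely use an existing library implementation.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count labels and per-label feature occurrences.

    samples -- list of (list_of_features, label) pairs
    """
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for features, label in samples:
        label_counts[label] += 1
        for f in features:
            feature_counts[label][f] += 1
            vocab.add(f)
    return label_counts, feature_counts, vocab


def predict(model, features):
    """Pick the label with the highest (log) posterior score."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        # Bayes' Theorem in log space: log prior + sum of log likelihoods.
        # The "naive" part is treating features as independent given the label.
        score = math.log(n / total)
        denom = sum(feature_counts[label].values()) + len(vocab)
        for f in features:
            # Add-one smoothing so unseen features do not zero out the score
            score += math.log((feature_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented only to exercise the code
data = [
    (["fever", "cough", "aches"], "flu"),
    (["fever", "aches"], "flu"),
    (["sneezing", "cough"], "cold"),
    (["sneezing", "runny_nose"], "cold"),
]
model = train(data)
prediction = predict(model, ["fever", "cough"])
print(prediction)  # flu
```

The shared denominator of Bayes' Theorem is dropped because it is the same for every label and does not change which label scores highest.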
Learn More:
Data Science
- What features would you use to predict the Uber ETA for ride requests?
- How would you evaluate the predictions of an Uber ETA model?
- Describe how you would build a model to predict Uber ETAs after a rider requests a ride.
- Suppose you're working as a data scientist at Facebook. How would you measure the success of private stories on Instagram, where only certain chosen friends can see the story?
- Precision vs accuracy vs recall?
- Error vs variance vs bias?
- False negatives vs false positives? When is either one worse than the other?
- Describe your data science process start to finish?
- Data science vs machine learning vs AI?
- How would you find correlation between a categorical variable and a continuous variable?
- How do you treat null/missing values? Name 3 methodologies.
- How can outlier values be treated?
- What is data normalization? Name 2 normalization methodologies.
- What is the role/importance of data cleaning?
- What are success metrics vs tracking metrics?
- What kind of metric would you make to measure success of a program (marketing) and how do you define them?
- Let's say an app was getting a redesign. How do you know if the redesign was successful?
- We noticed a steep decline in users in a certain area of the world. How would you address/assess it?
- What are the two methods used for calibration in supervised learning?
- Which method is frequently used to prevent overfitting?
- What is the difference between heuristics for rule learning and heuristics for decision trees?
- What is a perceptron in machine learning?
- Explain the two components of a Bayesian logic program.
- What are Bayesian Networks (BN)?
- Why is an instance-based learning algorithm sometimes referred to as a lazy learning algorithm?