Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Essential Statistics for Non-STEM Data Analysts

You're reading from   Essential Statistics for Non-STEM Data Analysts Get to grips with the statistics and math knowledge needed to enter the world of data science with Python

Arrow left icon
Product type Paperback
Published in Nov 2020
Publisher Packt
ISBN-13 9781838984847
Length 392 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Rongpeng Li Rongpeng Li
Author Profile Icon Rongpeng Li
Rongpeng Li
Arrow right icon
View More author details
Toc

Table of Contents (19) Chapters Close

Preface 1. Section 1: Getting Started with Statistics for Data Science
2. Chapter 1: Fundamentals of Data Collection, Cleaning, and Preprocessing FREE CHAPTER 3. Chapter 2: Essential Statistics for Data Assessment 4. Chapter 3: Visualization with Statistical Graphs 5. Section 2: Essentials of Statistical Analysis
6. Chapter 4: Sampling and Inferential Statistics 7. Chapter 5: Common Probability Distributions 8. Chapter 6: Parametric Estimation 9. Chapter 7: Statistical Hypothesis Testing 10. Section 3: Statistics for Machine Learning
11. Chapter 8: Statistics for Regression 12. Chapter 9: Statistics for Classification 13. Chapter 10: Statistics for Tree-Based Methods 14. Chapter 11: Statistics for Ensemble Methods 15. Section 4: Appendix
16. Chapter 12: A Collection of Best Practices 17. Chapter 13: Exercises and Projects 18. Other Books You May Enjoy

Building a naïve Bayes classifier from scratch

In this section, we will study one of the most classic and important classification algorithms, the naïve Bayes classification. We covered Bayes' theorem in previous chapters several times, but now is a good time to revisit its form.

Suppose A and B are two random events; the following relationship holds as long as P(B) ≠ 0:

Some terminologies to review: P(A|B) is called the posterior probability as it is the probability of event A after knowing the outcome of event B. P(A), on another hand, is called the prior probability because it contains no information about event B.

Simply put, the idea of the Bayes classifier is to set the classification category variable as our A and the features (there can be many of them) as our B. We predict the classification results as posterior probabilities.

Then why the naïve Bayes classifier? The naïve Bayes classifier assumes that different features are...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image