Bayesian Analysis with Python: A practical guide to probabilistic modeling, Third Edition

eBook: €20.98 (list price €29.99)
Paperback: €37.99

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

Bayesian Analysis with Python

Chapter 2
Programming Probabilistically

Our golems rarely have a physical form, but they too are often made of clay, living in silicon as computer code. – Richard McElreath

Now that we have a very basic understanding of probability theory and Bayesian statistics, we are going to learn how to build probabilistic models using computational tools. Specifically, we are going to learn about probabilistic programming with PyMC [Abril-Pla et al., 2023]. The basic idea is that we use code to specify statistical models and then PyMC solves those models for us. We will not need to write Bayes’ theorem in explicit form. This is a good strategy for two reasons. First, many models do not lead to an analytic closed form, and thus we can only solve them using numerical techniques. Second, modern Bayesian statistics is mainly done by writing code. We will be able to see that probabilistic programming offers an effective way to build and solve complex models and...

2.1 Probabilistic programming

Bayesian statistics is conceptually very simple. We have the knowns and the unknowns, and we use Bayes’ theorem to condition the latter on the former. If we are lucky, this process will reduce the uncertainty about the unknowns. Generally, we refer to the knowns as data and treat them as constants, and to the unknowns as parameters and treat them as random variables.

Although conceptually simple, fully probabilistic models often lead to analytically intractable expressions. For many years, this was a real problem and one of the main issues that hindered the adoption of Bayesian methods beyond some niche applications. The arrival of the computational era and the development of numerical methods that, at least in principle, can be used to solve any inference problem have dramatically transformed the practice of Bayesian data analysis. We can think of these numerical methods as universal inference engines. The possibility of automating the inference process...
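
To make this concrete, here is a minimal sketch of the kind of model this chapter builds: the coin-flip model from Chapter 1, written in PyMC. The prior parameters, the number of draws, and the synthetic data are illustrative assumptions, not the book's exact code.

import pymc as pm

# Known: the observed flips (synthetic data, assumed for illustration)
data = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0]

with pm.Model() as our_first_model:
    # Unknown: the probability of heads, treated as a random variable
    theta = pm.Beta("theta", alpha=1.0, beta=1.0)
    # The likelihood: the observed data conditioned on theta
    y = pm.Bernoulli("y", p=theta, observed=data)
    # Let the universal inference engine draw posterior samples
    idata = pm.sample(1000)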

2.2 Summarizing the posterior

Generally, the first task we will perform after sampling from the posterior is to check what the results look like. The plot_trace function from ArviZ is ideally suited to this task:

Code 2.3

import arviz as az  # assumed to have been imported earlier in the chapter

az.plot_trace(idata)

Figure 2.1: A trace plot for the posterior of our_first_model

Figure 2.1 shows the default result of calling az.plot_trace; we get two subplots for each unobserved variable. The only unobserved variable in our model is θ. Notice that y is an observed variable representing the data; we do not need to sample it because we already know those values. Thus, we only get two subplots. On the left, we have a Kernel Density Estimation (KDE) plot; this is like a smoothed version of a histogram. Ideally, we want all chains to have a very similar KDE, as in Figure 2.1. On the right, we get the individual values at each sampling step; we get as many lines as chains. Ideally, we want it to be something that looks noisy, with no clear...

2.3 Posterior-based decisions

Sometimes, describing the posterior is not enough. We may need to make decisions based on our inferences and reduce a continuous estimate to a dichotomous one: yes-no, healthy-sick, contaminated-safe, and so on. For instance, is the coin fair? A fair coin is one with a θ value of exactly 0.5. We can compare the value of 0.5 against the HDI. From Figure 2.3, we can see that the HDI goes from 0.03 to 0.7, and hence 0.5 is included in the HDI. We can interpret this as an indication that the coin may be tail-biased, but we cannot completely rule out the possibility that the coin is actually fair. If we want a sharper decision, we will need to collect more data to reduce the spread of the posterior, or maybe we need to find out how to define a more informative prior.
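
In code, this comparison can be made with ArviZ's hdi function. A minimal sketch, assuming the posterior samples for θ are stored in idata under the name theta:

import arviz as az

# 94% HDI for theta (0.94 is ArviZ's default hdi_prob)
hdi = az.hdi(idata, var_names=["theta"], hdi_prob=0.94)
lower, upper = hdi["theta"].values

# Is the fair-coin value inside the interval?
print(lower <= 0.5 <= upper)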

2.3.1 Savage-Dickey density ratio

One way to evaluate how much support the posterior provides for a given value is to compare the ratio of the posterior and prior densities at...
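
ArviZ exposes this ratio through plot_bf, which estimates a Bayes factor using the Savage-Dickey method. A sketch, not the book's exact code, assuming idata also contains prior samples (added here with pm.sample_prior_predictive):

import arviz as az
import pymc as pm

# The Savage-Dickey ratio compares prior and posterior densities
# at a point, so we need prior samples as well
with our_first_model:
    idata.extend(pm.sample_prior_predictive(1000))

# Bayes factor for theta == 0.5 from the density ratio at that value
az.plot_bf(idata, var_name="theta", ref_val=0.5)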

2.4 Gaussians all the way down

Gaussians are very appealing from a mathematical point of view. Working with them is relatively easy, and many operations applied to Gaussians return another Gaussian. Additionally, many natural phenomena can be nicely approximated using Gaussians; essentially, almost every time that we measure the average of something, using a big enough sample size, that average will be distributed as a Gaussian. The details of when this is true, when this is not true, and when this is more or less true, are elaborated in the central limit theorem (CLT); you may want to stop reading now and read about this really central statistical concept (terrible pun intended).
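
A quick numerical illustration of the CLT, with assumed numbers: averages of draws from a clearly non-Gaussian distribution already look Gaussian for moderate sample sizes.

import numpy as np

rng = np.random.default_rng(42)

# 10,000 means, each over 50 draws from a skewed Exponential(1) distribution
means = rng.exponential(scale=1.0, size=(10_000, 50)).mean(axis=1)

# The CLT predicts the means cluster near 1 with spread about 1/sqrt(50) ≈ 0.14
print(means.mean(), means.std())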

Well, we were saying that many phenomena are indeed averages. Just to follow a cliché, the height (and almost any other trait of a person, for that matter) is the result of many environmental factors and many genetic factors, and hence we get a nice Gaussian distribution for the height of adult people...

2.5 Posterior predictive checks

One of the nice elements of the Bayesian toolkit is that once we have a posterior p(θ | Y), it is possible to use it to generate predictions p(Ỹ | Y). Mathematically, this can be done by computing:

p(Ỹ | Y) = ∫ p(Ỹ | θ) p(θ | Y) dθ

This distribution is known as the posterior predictive distribution. It is predictive because it is used to make predictions, and posterior because it is computed using the posterior distribution. So we can think of this as the distribution of future data given the model and the observed data.

With PyMC, it is easy to get posterior predictive samples; we don’t need to compute any integral. We just need to call the sample_posterior_predictive function and pass the InferenceData object as the first argument. We also need to pass the model object, and we can use the extend_inferencedata argument to add the posterior predictive samples to the InferenceData object. The code is:

Code 2.14

pm.sample_posterior_predictive(idata_g, model=model_g, extend_inferencedata=True)
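
Once the posterior predictive samples are in idata_g, a common next step (a sketch, not a snippet from the book) is to compare them against the observed data with ArviZ:

import arviz as az

# Overlay posterior predictive samples on the observed data
az.plot_ppc(idata_g, num_pp_samples=100)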

2.6 Robust inferences

One objection we may have to model_g is that we are assuming a Normal distribution, but we have two data points away from the bulk of the data. By using a Normal distribution for the likelihood, we are indirectly assuming that we are not expecting to see a lot of data points far away from the bulk. Figure 2.13 shows the result of combining these assumptions with the data. Since the tails of the Normal distribution fall quickly as we move away from the mean, the Normal distribution (at least an anthropomorphized one) is surprised by seeing those two points and reacts in two ways: moving its mean towards those points and increasing its standard deviation. Another intuitive way of interpreting this is to say that those points have an excessive weight in determining the parameters of the Normal distribution.

So, what can we do? One option is to check for errors in the data. If we retrace our steps we may find an error in the code while cleaning or preprocessing...
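
If the data checks out, one common remedy, sketched here with assumed priors and a hypothetical observations array named data, is to swap the Normal likelihood for a heavier-tailed Student's t distribution, so points far from the bulk carry less weight:

import pymc as pm

# data: the observations, e.g. a NumPy array (assumed to exist)
with pm.Model() as model_t:
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)    # assumed prior
    sigma = pm.HalfNormal("sigma", sigma=10.0)  # assumed prior
    # Degrees of freedom: small values of nu mean heavier tails
    nu = pm.Exponential("nu", lam=1 / 30)
    y = pm.StudentT("y", mu=mu, sigma=sigma, nu=nu, observed=data)
    idata_t = pm.sample()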

2.7 InferenceData

InferenceData is a rich container for the results of Bayesian inference. A modern Bayesian analysis potentially generates many sets of data including posterior samples and posterior predictive samples. But we also have observed data, samples from the prior, and even statistics generated by the sampler. All this data, and more, can be stored in an InferenceData object. To help keep all this information organized, each one of these sets of data has its own group. For instance, the posterior samples are stored in the posterior group. The observed data is stored in the observed_data group.

Figure 2.18 shows an HTML representation of the InferenceData for model_g. We can see 4 groups: posterior, posterior_predictive, sample_stats, and observed_data. All of them are collapsed except for the posterior group. We can see that we have two coordinates, chain and draw, with dimensions 4 and 1000, respectively. We also have two variables, μ and σ.


Figure 2.18: InferenceData...
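
The same information can be inspected programmatically. A sketch, assuming the two variables are named mu and sigma in code:

# Groups stored in the InferenceData object
print(idata_g.groups())  # ['posterior', 'posterior_predictive', 'sample_stats', 'observed_data']

# The posterior group is an xarray Dataset
post = idata_g.posterior
print(post.sizes)  # {'chain': 4, 'draw': 1000}
print(post["mu"].mean(dim=("chain", "draw")).values)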

2.8 Groups comparison

One pretty common statistical analysis is group comparison. We may be interested in how well patients respond to a certain drug, the reduction of car accidents by the introduction of new traffic regulations, student performance under different teaching approaches, and so on. Sometimes, this type of question is framed under the hypothesis testing scenario and the goal is to declare a result statistically significant. Relying only on statistical significance can be problematic for many reasons: on the one hand, statistical significance is not equivalent to practical significance; on the other hand, a really small effect can be declared significant just by collecting enough data.

The idea of hypothesis testing is connected to the concept of p-values. This is not a fundamental connection but a cultural one; people are used to thinking that way mostly because that’s what they learn in most introductory statistical courses. There is a long record of studies and...

2.9 Summary

Although Bayesian statistics is conceptually simple, fully probabilistic models often lead to analytically intractable expressions. For many years, this was a huge barrier, hindering the wide adoption of Bayesian methods. Fortunately, maths, statistics, physics, and computer science came to the rescue in the form of numerical methods that are capable—at least in principle—of solving any inference problem. The possibility of automating the inference process has led to the development of probabilistic programming languages, allowing a clear separation between model definition and inference. PyMC is a Python library for probabilistic programming with a very simple, intuitive, and easy-to-read syntax that is also very close to the statistical syntax used to describe probabilistic models.

We introduced the PyMC library by revisiting the coin-flip model from Chapter 1, this time without analytically deriving the posterior. PyMC models are defined inside a context manager...

2.10 Exercises

  1. Using PyMC, change the parameters of the prior Beta distribution in our_first_model to match those of the previous chapter. Compare the results to the previous chapter.

  2. Compare the model our_first_model with prior θ ∼ Beta(1, 1) with a model with prior θ ∼ Uniform(0, 1). Are the posteriors similar or different? Is the sampling slower, faster, or the same? What about using a Uniform over a different interval, such as [-1, 2]? Does the model run? What errors do you get?

  3. PyMC has a function pm.model_to_graphviz that can be used to visualize models. Use it to visualize the model our_first_model. Compare the result with the Kruschke diagram. Use pm.model_to_graphviz to visualize the model comparing_groups.

  4. Read about the coal mining disaster model that is part of the PyMC documentation ( https://shorturl.at/hyCX2). Try to implement and run this model by yourself.

  5. Modify model_g, change the prior for the mean to a Gaussian distribution centered at the...

Join our community Discord space

Join our Discord community to meet like-minded people and learn alongside more than 5000 members at: https://packt.link/bayesian

Key benefits

  • Conduct Bayesian data analysis with step-by-step guidance
  • Gain insight into a modern, practical, and computational approach to Bayesian statistical modeling
  • Enhance your learning with best practices through sample problems and practice exercises
  • Purchase of the print or Kindle book includes a free PDF eBook.

Description

The third edition of Bayesian Analysis with Python serves as an introduction to the main concepts of applied Bayesian modeling using PyMC, a state-of-the-art probabilistic programming library, and other libraries that support and facilitate modeling, like ArviZ, for exploratory analysis of Bayesian models; Bambi, for flexible and easy hierarchical linear modeling; PreliZ, for prior elicitation; PyMC-BART, for flexible non-parametric regression; and Kulprit, for variable selection.

This updated edition adds a brief, conceptual introduction to probability theory and new topics like Bayesian additive regression trees (BART), with updated examples throughout. Refined explanations, informed by feedback and experience from previous editions, underscore the book's emphasis on Bayesian statistics. You will explore various models, including hierarchical models, generalized linear models for regression and classification, mixture models, Gaussian processes, and BART, using synthetic and real datasets.

By the end of this book, you will possess a functional understanding of probabilistic modeling, enabling you to design and implement Bayesian models for your data science challenges. You'll be well-prepared to delve into more advanced material or specialized statistical modeling if the need arises.

Who is this book for?

If you are a student, data scientist, researcher, or developer looking to get started with Bayesian data analysis and probabilistic programming, this book is for you. The book is introductory, so no previous statistical knowledge is required, although some experience in using Python and scientific libraries like NumPy is expected.

What you will learn

  • Build probabilistic models using PyMC and Bambi
  • Analyze and interpret probabilistic models with ArviZ
  • Acquire the skills to sanity-check models and modify them if necessary
  • Build better models with prior and posterior predictive checks
  • Learn the advantages and caveats of hierarchical models
  • Compare models and choose between alternative ones
  • Interpret results and apply your knowledge to real-world problems
  • Explore common models from a unified probabilistic perspective
  • Apply the Bayesian framework's flexibility for probabilistic thinking

Product Details

Publication date: Jan 31, 2024
Length: 394 pages
Edition: 3rd
Language: English
ISBN-13: 9781805125419

Frequently bought together

  • Bayesian Analysis with Python: €37.99
  • Mastering NLP from Foundations to LLMs: €39.99
  • Transformers for Natural Language Processing and Computer Vision: €41.99

Total: €119.97

Table of Contents

14 Chapters

Chapter 1: Thinking Probabilistically
Chapter 2: Programming Probabilistically
Chapter 3: Hierarchical Models
Chapter 4: Modeling with Lines
Chapter 5: Comparing Models
Chapter 6: Modeling with Bambi
Chapter 7: Mixture Models
Chapter 8: Gaussian Processes
Chapter 9: Bayesian Additive Regression Trees
Chapter 10: Inference Engines
Chapter 11: Where to Go Next
Bibliography
Other Books You May Enjoy
Index

Customer reviews

Rating distribution: 4.8 out of 5 (21 ratings)

5 star: 81%
4 star: 19%
3 star: 0%
2 star: 0%
1 star: 0%
Jon Barley, Nov 08, 2024 (5 stars, Feefo verified review)

An excellent introduction into practical Bayesian analysis with many illuminating examples.

RP, Aug 13, 2024 (5 stars, Amazon verified review)

If you had to buy just one book on Bayesian analysis, this is the one to get. It takes a lot of skill to write a concise, readable book on such a complicated topic.

ben, Jun 18, 2024 (5 stars, Amazon verified review)

Osvaldo Martin’s “Bayesian Analysis with Python” is an exceptional resource for anyone looking to delve into the world of Bayesian inference using Python. The book is tailored for readers who possess a basic understanding of Python but may not have extensive knowledge of statistics or Bayesian methods. This accessibility makes it an ideal starting point for beginners while still offering depth for more experienced readers.

One of the book’s standout features is its practical approach. Each chapter concludes with exercises that reinforce the concepts covered, and there is even a dedicated Discord space provided by the publisher for further discussion and learning. The introductory chapters lay a strong foundation in both theoretical and computational aspects of Bayesian inference, with “Thinking Probabilistically” and “Programming Probabilistically” offering a seamless blend of theory and hands-on coding with PyMC, one of the leading probabilistic programming languages.

For those new to the subject, reading the first two chapters in tandem can be particularly beneficial, combining conceptual understanding with computational implementation. Subsequent chapters delve into specific modeling approaches, such as hierarchical models, generalized linear models, mixture models, Gaussian processes, and Bayesian adaptive regression trees (BART). Each of these chapters provides valuable insights and practical knowledge that can be directly applied to real-world problems.

The book also covers essential topics in practice, including model comparison and evaluation, which are crucial for any data scientist. The chapter on Bambi is especially noteworthy, demonstrating how formula syntax can be used to efficiently build PyMC models, accompanied by clear visual representations of the models using Graphviz.

Additionally, the chapter on inference engines serves as a comprehensive reference for understanding the mechanics behind Bayesian samplers and inference methods, making it a valuable resource for both teaching and practical application.

Overall, “Bayesian Analysis with Python” is an excellent book for anyone interested in Bayesian inference. It successfully bridges the gap between core concepts and practical implementation, and it does so using the robust Bayesian “tech stack” of PyMC, ArviZ, and Bambi. The book also provides an excellent list of further resources, including books, code repositories, paid courses, and open-source community hangouts, making it a well-rounded and highly recommended read for aspiring Bayesian analysts.

Banachan, Mar 06, 2024 (5 stars, Amazon verified review)

This is a great practical guide to probabilistic modeling using Python, especially for those interested in or working with Bayesian data analysis. It covers a broad range of topics, from basic concepts to more advanced modeling techniques, so I can see it being an invaluable resource to both practitioners and students.

The book provides a thorough exploration, specifically on addressing Bayesian analysis techniques. It's quite an enjoyable and comprehensive guide for both beginners and advanced practitioners. Good emphasis on fundamentals, then transitions to more complex concepts and application. I thought it provided a good balance between theory and practical skills.

Summary of pros are that it encourages hands-on learning, with wide coverage on the subject. Doesn't hurt that it's an updated edition with up-to-date approaches. Some of the cons are that it might be complex in some sections for pure beginners. Code examples are mostly in Python (so I guess this could be a con or pro depending on how look at it).

I still like that it has a lot of examples, code snippets, and real-world scenarios included, with good explanations. Covers model construction, prior selection, model comparison, to predictive analysis. I surmise that this allows for a variety of learning needs for most folks. Author style is both authoritative and accessible. I would recommend this book for academia, professional development, or just for personal interest in data science.

Nicole M Radziwill, May 07, 2024 (5 stars, Amazon verified review)

While Python is my go-to language for things like NLP, I usually use R for everything else. After spending a solid long weekend with Martin's new book "Bayesian Analysis with Python" I can confirm that this book will be just what ONE audience needs, but may disappoint others. As a gentle introduction to Bayesian approaches for people who are well versed in intro statistics and have a solid foundation in Python, it's perfect. But if you're missing that mathematical statistics background (or if you're rusty on Python) this book may present a struggle.

As a result, this is five stars for the target audience and four for the other audiences.

The writing is clear and easy to follow, but sometimes encourages you to "review the code for understanding" where the text could have explained each of the lines of code in sequence. The book also assumes that the reader has a fundamental understanding of distributions and mathematical notation, which may not be the case for all programmers or data analysts.

As a professor this would have been a great book to use for an introductory Bayesian methods course for juniors or seniors in STEM with at least one or two semesters of Python. For this group, the book is particularly strong, because it takes a computation-first approach but fills in the gaps with just enough theory.

Highlights include:
  • There is a simple discussion on ROPE and loss functions that is valuable
  • There is a good discussion about how to do linear regression the Bayesian way (hint: all parameters treated as priors)
  • Some interesting mixture models using the Palmer Penguins dataset
  • The best part was the MCMC with Metropolis-Hastings to calculate the value of pi

DO buy this book if you have a solid foundation in Python (and a Python environment already set up) and want to spend a few weeks (or a couple months) expanding your understanding into building and running simple Bayesian models. If you have the time to spend, this will deepen your understanding.

DO NOT buy this book if you are a programmer who needs to start building Bayesian models at work within the next couple days! It's not going to help you work that next ticket in the queue.

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment can be made using Credit Card, Debit Card, or PayPal).
Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book, go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats do Packt support?

Our eBooks are currently available in a variety of formats such as PDF and EPUB. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not in the Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower priced than print
  • They save resources and space
What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.