Building Data Science Solutions with Anaconda
Building Data Science Solutions with Anaconda: A comprehensive starter guide to building robust and complete models

Chapter 1: Understanding the AI/ML landscape

In this opening chapter, we'll give you some context for the why behind AI and machine learning (ML). The only data we have comes from the past, and we use it to help predict the future. We'll take a look at the massive amount of data that is coming into the world today and try to get a sense of the scale of what we have to work with.

The main goal of any type of software or algorithm is to solve business and real-world problems, so we'll also take a look at how the applications take shape. If we use a food analogy, data would be the ingredients, the algorithm would be the chef, and the meal created would be the model. You'll learn about the most commonly used types of models within the broader landscape and how to know what to use.

There are a huge number of tools that you could use as a data scientist, so we will also touch on how solutions such as those provided by Anaconda let you do the actual work you want to do and take action as your models grow stale (which they will). By the end of this chapter, you'll have an understanding of the value and landscape of AI and be able to jumpstart any project that you want to build.

AI is the most exciting technology of our age and, throughout this first chapter, these topics will give you the solid foundation that we'll build upon through the rest of the book. These are all key concepts that will be commonplace in your day-to-day journey, and which you'll find to be invaluable in accomplishing what you need to.

In this chapter, we're going to cover the following main topics:

  • Understanding the current state of AI and ML
  • Understanding the massive generation of new data
  • How to create business value with AI
  • Understanding the main types of ML models
  • Dealing with out-of-date models
  • Installing packages with Anaconda

Introducing Artificial Intelligence (AI)

AI is moving fast. It has become so commonplace that systems are now expected to be intelligent. For example, not too long ago, technology that could compete against a human mind in chess was a groundbreaking piece of AI to be marveled at. Now we don't even give it a second thought. Millions of tactical and strategic calculations a second are now just a simple game that can be found on any computer or played on hundreds of websites.

That seemingly was intelligence… that was artificial. Simple, right? With spam blockers, recommendation engines, and delivery route optimization, the goalposts keep shifting so much that all of what was once thought of as AI is now simply regarded as everyday tools.

What was once considered AI is now thought of as simply software. It seems that AI just means problems that are still unsolved. As those become normal, day-to-day operations, they fade away from what we generally think of as AI. This is known as Tesler's Theorem, after Larry Tesler, and is commonly quoted as "Artificial intelligence is whatever hasn't been done yet."

For example, if you asked someone what AI is, they would probably talk about autonomous driving, drone delivery, and robots that can perform very complex actions. All of these examples are very much in the realm of unsolved problems, and as (or if) they become solved, they may no longer be thought of as AI as the newer, harder problems take their place.

Before we dive any further, let's make sure we are aligned on a few terms that will be a focal point for the rest of the book.

Defining AI

It's important to call out the fact that there is no universal label as to what AI is, but for the purpose of this book, we will use the following definition:

"Artificial Intelligence (AI) is the development of computer systems to allow them to perform tasks that mimic the intelligence of humans. This can use vision, text, reading comprehension, complex problem solving, labeling, or other forms of input."

Defining a data scientist

Along with the definition of AI, defining what a data scientist is can also lead you to many different descriptions. As with AI, the field of data science is a very broad category. Josh Wills tweeted that a data scientist is the following:

"A person who is better at statistics than any software engineer and better at software engineering than any statistician."

While there may be some truth to that, we'll use the following definition instead:

"A data scientist is someone who gains insight and knowledge from data by analyzing, applying statistics, and implementing an AI approach in order to be able to answer questions and solve problems."

If you are reading this, then you probably fall into that category. There are many tools that a data scientist needs to be able to utilize to work toward the end goal, and we'll learn about many of those in this book.

Now that we've set a base level of understanding of what AI is, let's take a look at where the state of the world is regarding AI, and also learn about where ML fits into the picture.

Understanding the current state of AI and ML

The past is the only place where we can gather data to make predictions about the future. This is one of the core value propositions of AI and ML, and it is just as true of the field itself. I'll spare you too much of a history lesson, but know that the techniques and approaches used today aren't new. In fact, neural networks have been around for over 60 years! Knowing this, keep in mind on your data science journey that a white paper or approach that you deem old or out of date might simply be waiting for technology or data to catch up to it.

AI systems allow for much greater scalability, distribution, and speed than if we had humans perform the same tasks. We will dive more into specific problem types later in the chapter.

Currently, one of the most well-known approaches to creating AI is neural networks, in which data scientists drew inspiration from how the human brain works. Neural networks were only a genuinely viable path when two things happened:

  • We made the connection in 2012 that, just like our brain, we could get vastly better results if we created multiple layers.
  • GPUs became fast enough to be able to train models in a reasonable timeframe.

This huge leap in AI techniques would not have been possible if we had not come back to the ideas of the past with fresh eyes and newer hardware.

Before more advanced GPUs were used, it simply took too long to train a model, so the approach wasn't practical. Think about an assembly line making a car. If it moved along at one meter a day, it would still produce the right end result, but it would take an extremely long time to produce a car (Henry Ford's 1914 assembly line moved at two meters a minute). Similarly, 4K (and 8K) TVs were largely useless until streaming or Blu-ray formats gave us content that could actually be shown in 4K. Sometimes, other supporting technology needs to improve before the applications can be fully realized.

The massive increase in computational power in the last decade has unlocked the ability for the tensor computations to really shine and has taken us a long way from the Cornell report on The Perceptron (https://bit.ly/perceptron-cornell), the first paper to mention the ideas that would become the neural networks we use today. GPU power has increased at a rate such that the massive number of training runs can be done in hours, not years.

Tensors themselves are a common occurrence in physics and other forms of engineering and are an example of how data science has a heavy influence from other fields and has an especially strong relationship with mathematics. Now they are a staple tool in training deep learning models using neural networks.

Tensors

A tensor is a mathematical term for a data structure that is commonly used in neural networks. It can refer to vectors, matrices, and any n-dimensional array, but in the context of neural networks it is mostly used to describe the last of these. It is where TensorFlow, the popular Google library, gets its name.
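To make the idea of vectors, matrices, and n-dimensional arrays concrete, here is a minimal sketch in plain Python that uses nested lists as stand-ins for tensors. The helper name `shape` and the sample values are illustrative assumptions, not something taken from the book; real projects would normally rely on a library such as NumPy or TensorFlow instead.

```python
def shape(t):
    """Return the shape of a regularly nested list as a tuple of sizes.

    Hypothetical helper for illustration: it assumes every sub-list at a
    given depth has the same length, as a real tensor would.
    """
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0] if t else None  # descend into the first element
    return tuple(dims)


scalar = 7.0                                   # rank 0: a single number
vector = [1.0, 2.0, 3.0]                       # rank 1: a list of numbers
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # rank 2: rows and columns
# rank 3: for example, 2 images of 3 rows by 4 columns each
tensor3 = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor3", tensor3)]:
    dims = shape(value)
    print(name, "rank:", len(dims), "shape:", dims)
```

The number of dimensions (the rank) is just the nesting depth, which is why a vector and a matrix can both be described as tensors.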

Deep learning is a technique in the field of AI and, more specifically, ML, but aren't they the same thing? The answer is no. Understanding the difference will help you focus on particular subsets and ensure that you have a clear picture of what is out there. Let's take a more in-depth look next.

Knowing the difference between AI and ML

Machine Learning (ML) is simply a machine being able to infer things based on input data without having to be explicitly told what things are. It learns and deduces patterns and tries its best to fit new data into those patterns. ML is, in fact, a subset of the larger AI field, and since both terms are so widely used, it's valuable to look at some brief examples of different types of AI and how the subsets fit into the broader term.
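As a rough illustration of learning a pattern from data rather than being told the rule, the sketch below fits a straight line to a handful of points with ordinary least squares and then applies it to a new input. The function name `fit_line` and the toy data are assumptions made for this example; the book itself uses scikit-learn for real models.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept
    from example points, using closed-form ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy training data (hypothetical). The rule that produced it is never
# given to the code; only the input/output examples are.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)

# Fit a new, unseen input into the learned pattern.
prediction = slope * 5.0 + intercept
print("slope:", slope, "intercept:", intercept, "prediction:", prediction)
```

Nothing in the code states the relationship between inputs and outputs; it is recovered from the examples, which is the distinction from the hand-coded systems discussed later in this section.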

Let's look at a simple Venn diagram that shows the relationship between AI, ML, and deep learning. You'll see that AI is the broader concept, with ML and deep learning being specific subsets:

Figure 1.1 – Hierarchy of AI, ML, and deep learning

An example of AI that isn't ML is an expert system. This is a rule-based system that is designed for a very specific case, and in some ways can come down to if-else statements. It is following a hand-coded system behind the scenes, but that can be very powerful. A traffic light that switches to green if there is more than x number of cars in the North/South lane, but fewer than y cars in the East/West lane, would be an example.
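The traffic light rule above can be sketched as a single hand-coded condition. The function name and the default thresholds for x and y are hypothetical values chosen only for illustration; the point is that nothing here is learned from data.

```python
def north_south_gets_green(north_south_cars, east_west_cars, x=5, y=3):
    """Rule-based decision: give North/South a green light when it has
    more than x waiting cars and East/West has fewer than y.

    The thresholds x and y are assumed values picked by a human expert.
    """
    if north_south_cars > x and east_west_cars < y:
        return True
    return False


print(north_south_gets_green(8, 1))  # busy N/S lane, quiet E/W lane
print(north_south_gets_green(8, 4))  # E/W lane is too busy to interrupt
print(north_south_gets_green(2, 0))  # N/S lane is not busy enough
```

Changing the system's behavior means editing the rule or its thresholds by hand, which is what separates an expert system like this from an ML model.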

These expert systems have been around for a long time, and the chess game was an example of that. The famous Deep Thought from Carnegie Mellon searched about 500 million possible outcomes per move to hunt down the best one. It was enough to put even the best chess players on the ropes. It later gave way to Deep Blue, which started to look like something closer to ML as it used a Bayesian structure to achieve its world conquest.

"That's not AI!" you might say. In an odd twist… IBM agreed with you, at least in the late 90s, as they actually claimed that it wasn't AI. This was likely due to the term having negative connotations associated with it. However, this mentality has changed in modern times. Many of the early promises of AI have come to fruition, solving many issues we knew we wanted to solve, and creating whole new sectors such as chatbots.

AI can be complex image detection, such as for self-driving cars, and voice recognition systems, such as Amazon's Alexa, but it can also be a system made up of relatively simple instructions. Think about how many simple tasks you carry out based on incredibly simple patterns. I'm hungry, I should eat. Those clothes are red, those are white, so they belong in different bins. Pretty simple, right? The fact is that AI is a massive term that can include much more than what it's given credit for.

Much of what AI has become in the last 10 years is due to the large amount of data that it has access to. In the next section, we'll take a look in a little more detail at what that looks like.

Key benefits

  • Learn from an AI patent-holding engineering manager with deep experience in Anaconda tools and OSS
  • Get to grips with critical aspects of data science such as bias in datasets and interpretability of models
  • Gain a deeper understanding of the AI/ML landscape through real-world examples and practical analogies

Description

You might already know that there's a wealth of data science and machine learning resources available on the market, but what you might not know is how much is left out by most of these AI resources. This book not only covers everything you need to know about algorithm families but also ensures that you become an expert in everything from the critical aspects of avoiding bias in data to model interpretability, which have now become must-have skills. In this book, you'll learn how using Anaconda as the easy button can give you a complete view of the capabilities of tools such as conda, including how to specify new channels to pull in any package you want and how to discover new open source tools at your disposal. You’ll also get a clear picture of how to evaluate which model to train and identify when models have become unusable due to drift. Finally, you’ll learn about the powerful yet simple techniques that you can use to explain how your model works. By the end of this book, you’ll feel confident using conda and Anaconda Navigator to manage dependencies and gain a thorough understanding of the end-to-end data science workflow.

Who is this book for?

If you’re a data analyst or data science professional looking to make the most of Anaconda’s capabilities and deepen your understanding of data science workflows, then this book is for you. You don’t need any prior experience with Anaconda, but a working knowledge of Python and data science basics is a must.

What you will learn

  • Install packages and create virtual environments using conda
  • Understand the landscape of open source software and assess new tools
  • Use scikit-learn to train and evaluate model approaches
  • Detect types of bias in your data and learn what you can do to prevent them
  • Grow your skillset with tools such as NumPy, pandas, and Jupyter Notebooks
  • Solve common dataset issues, such as imbalanced and missing data
  • Use LIME and SHAP to interpret and explain black-box models

Product Details

Publication date: May 27, 2022
Length: 330 pages
Edition: 1st
Language: English
ISBN-13: 9781800561564

Table of Contents

15 Chapters
Part 1: The Data Science Landscape – Open Source to the Rescue
Chapter 1: Understanding the AI/ML landscape
Chapter 2: Analyzing Open Source Software
Chapter 3: Using the Anaconda Distribution to Manage Packages
Chapter 4: Working with Jupyter Notebooks and NumPy
Part 2: Data Is the New Oil, Models Are the New Refineries
Chapter 5: Cleaning and Visualizing Data
Chapter 6: Overcoming Bias in AI/ML
Chapter 7: Choosing the Best AI Algorithm
Chapter 8: Dealing with Common Data Problems
Part 3: Practical Examples and Applications
Chapter 9: Building a Regression Model with scikit-learn
Chapter 10: Explainable AI - Using LIME and SHAP
Chapter 11: Tuning Hyperparameters and Versioning Your Model
Other Books You May Enjoy

Customer reviews

Rating: 5 out of 5 (12 ratings)

Paul Burnett Dec 04, 2023
You books and videos are intelligent and cover key concepts. I tend to bounce around multiple authors on relevant ai themes. I look at the library modules tools and the power of the information your site gives me. 5 star to all you team. It been a pleasure learning with you. Paul burnett Biomedical eng. and data software programmer in ai.
Feefo Verified review Feefo
Yiqiao Yin Jul 28, 2022
It's great reading this project. I feel like I come from a unique place because I did not come from a place where I find downloading packages and create environments helpful. Hence, this book is really helping me to reshape some of my views. I also found it valuable that this book is able to provide some of the fundamental building block in conda. In conda, I typically create an environment I like and fire up a jupyter lab. Then I do my dev work in there. For me personally, this is a pretty efficient workflow. Hence, it's really a overview for me for the first 6 chapters to review some of these concepts. However, if you are intro level, this book is a great start. It depends on your level really. In addition, as my YouTube suggested. The book goes above and beyond to introduce something on top of machine learning. Coming from statistics background, I really appreciate that the author discusses biases and variances. Moreover, the later chapters discussed shap and lime value which is also something I investigated during my graduate program. Overall, I really enjoyed reading this book and I recommend to all others to read this book too!
Amazon Verified review Amazon
Jerimiah Jun 21, 2022
Whether you are someone wanting to get started with Data Science, or an experienced practitioner who has been away from the field for a while (like me), this book provides you everything you need to know. Each chapter builds on the last, but they are also relatively independent of each other, so if you need to quickly brush up on a specific subject/method (like versioning your ML models), it's easy to do so. The metaphors used throughout the book are vivid and memorable, to help the reader get a better intuition for complicated concepts. The projects and examples use realistic scenarios that do a great job of walking the reader through the code and steps to building an ML model in the same way they would do them in real life. I'll definitely be keeping this book handy as a reference for when I ever need to work on an ML project in the future!
Amazon Verified review Amazon
Karl Weinmeister Jul 03, 2022
Dan's book covers an end-to-end path from setting up your environment to building a model. Readers will see how to apply popular open-source packages for data science, in particular pandas and scikit-learn. The book also introduces conceptual topics such as how to select an appropriate ML model type. I would highly recommend this book for emerging data scientists who want to get up-to-speed quickly on common concepts and tools.
Amazon Verified review Amazon
Jamie Vernon May 27, 2022
The authors really knocked it out of the park with this one. As someone with little experience with AI or Am, this was a great read. Can’t wait to get deeper into the space.. 5 stars all day
Amazon Verified review Amazon