Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
Mastering Numerical Computing with NumPy

You're reading from   Mastering Numerical Computing with NumPy Master scientific computing and perform complex operations with ease

Arrow left icon
Product type Paperback
Published in Jun 2018
Publisher Packt
ISBN-13 9781788993357
Length 248 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Authors (3):
Arrow left icon
Tiago Antao Tiago Antao
Author Profile Icon Tiago Antao
Tiago Antao
Mert Cuhadaroglu Mert Cuhadaroglu
Author Profile Icon Mert Cuhadaroglu
Mert Cuhadaroglu
Umit Mert Cakmak Umit Mert Cakmak
Author Profile Icon Umit Mert Cakmak
Umit Mert Cakmak
Arrow right icon
View More author details
Toc

Table of Contents (11) Chapters Close

Preface 1. Working with NumPy Arrays FREE CHAPTER 2. Linear Algebra with NumPy 3. Exploratory Data Analysis of Boston Housing Data with NumPy Statistics 4. Predicting Housing Prices Using Linear Regression 5. Clustering Clients of a Wholesale Distributor Using NumPy 6. NumPy, SciPy, Pandas, and Scikit-Learn 7. Advanced Numpy 8. Overview of High-Performance Numerical Computing Libraries 9. Performance Benchmarks 10. Other Books You May Enjoy

Trimmed statistics

As you will have noticed in the previous section, the distributions of our features are very dispersed. Handling the outliers in your model is a very important part of your analysis. It is also very crucial when you look at descriptive statistics. You can be easily confused and misinterpret the distribution because of these extreme values. SciPy has very extensive statistical functions for calculating your descriptive statistics in regards to trimming your data. The main idea of using the trimmed statistics is to remove the outliers (tails) in order to reduce their effect on statistical calculations. Let's see how we can use these functions and how they will affect our feature distribution:

In [58]: np.set_printoptions(suppress= True, linewidth= 125)
samples = dataset.data
CRIM = samples[:,0:1]
minimum = np.round(np.amin(CRIM), decimals...
lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image