What do you get with Print?

Instant access to your digital copy whilst your Print order is Shipped

Paperback book shipped to your preferred address

Redeem a companion digital copy on all Print orders

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

Python Data Analysis

Chapter 2. NumPy Arrays

After installing NumPy and other key Python-programming libraries and getting some code to work, it's time to move on to NumPy arrays. This chapter acquaints you with the fundamentals of NumPy and arrays. By the end of this chapter, you will have a basic understanding of NumPy arrays and their related functions.

The topics we will address in this chapter are as follows:

Data types
Array types
Type conversions
Creating arrays
Indexing
Fancy indexing
Slicing
Manipulating shapes

NumPy numerical types

Python has an integer type, a float type, and a complex type; however, these are not sufficient for scientific calculations. In practice, we need more data types with varying precision and, consequently, different storage sizes. For this reason, NumPy has many more data types. The names of most NumPy numerical types end with a number, which gives the number of bits used to store the type. The following table (adapted from the NumPy user guide) presents an overview of NumPy numerical types:

Type	Description
`bool`	Boolean (`True` or `False`) stored as a byte
`int_`	Platform integer (normally either `int32` or `int64`)
`int8`	Byte (-128 to 127)
`int16`	Integer (-32768 to 32767)
`int32`	Integer (-2 ** 31 to 2 ** 31 - 1)
`int64`	Integer (-2 ** 63 to 2 ** 63 - 1)
`uint8`	Unsigned integer (0 to 255)
`uint16`	Unsigned integer (0 to 65535)
`uint32`	Unsigned integer (0 to 2 ** 32 - 1)
`uint64`	Unsigned integer (0 to 2 ** 64 - 1)
`float16...`
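
The ranges and storage sizes in the table can be checked directly. The following is a minimal sketch, assuming NumPy is installed and imported as `np`; the variable names (`small`, `large`, `bool_bytes`) are illustrative and do not come from the book.

```python
import numpy as np

# For the fixed-width integer types, the numeric suffix is the size in bits.
for name in ("int8", "int16", "int32", "int64",
             "uint8", "uint16", "uint32", "uint64"):
    dt = np.dtype(name)
    info = np.iinfo(dt)          # integer limits for this type
    bits = dt.itemsize * 8       # itemsize is reported in bytes
    print(f"{name:>6}: {bits:2d} bits, range {info.min} to {info.max}")

# NumPy stores each Boolean element in one full byte.
bool_bytes = np.dtype("bool").itemsize

# The same five values take less memory in a narrower type.
small = np.arange(5, dtype=np.int8)
large = np.arange(5, dtype=np.int64)
print(small.nbytes, large.nbytes)
```

Choosing the narrowest type that can hold the data reduces memory use, at the cost of a smaller range of representable values.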

Description

This book is for programmers, scientists, and engineers who have knowledge of the Python language and know the basics of data science. It is for those who wish to learn different data analysis methods using Python and its libraries. This book contains all the basic ingredients you need to become an expert data analyst.

What you will learn

Install open source Python modules on various platforms

Get to know about the fundamentals of NumPy including arrays

Manipulate data with pandas

Retrieve, process, store, and visualize data

Understand signal processing and time series data analysis

Work with relational and NoSQL databases

Discover more about data modeling and machine learning

Get to grips with interoperability and cloud computing

Frequently bought together

Python Data Analysis

Oct 2014 348 pages

3.9 (16)

eBook

₱1571.99 ~~₱2245.99~~

R for Data Science

Dec 2014 364 pages

3.4 (5)

eBook

₱1571.99 ~~₱2245.99~~

Python Data Science Essentials

Apr 2015 258 pages

4.2 (6)

eBook

₱1256.99 ~~₱1796.99~~

Customer reviews

Miss Fatty Nov 23, 2014

A really good source for analysts/scientists using the Python language. The author does a great job introducing the reader to core Python libraries; I found the information to be digestible even with my limited knowledge of the subject matter. I must say it is very well written: it covers concrete practical examples to show you the capabilities of a suite of tools. I am very happy with the purchase. This book is helping me move my skill set beyond R and into Python.

Amazon Verified review

Mikel Viera Nov 25, 2014

If you're looking for a book that discusses Python data analysis in a broad practical sense, this is the book. The author conveys his data analysis experience into the text really well. The chapters are genuinely helpful with well written tutorials on Python's excellent libraries: NumPy, SciPy, Pandas, IPython, Matplotlib etc. One of the better books I've read on the subject. Highly recommended!

Harris Nov 23, 2014

Really happy with it so far. This is very nicely written and a must have for data analysts using Python. The author covers examples with all the main libraries including NumPy, SciPy and IPython in a clear, concise manner. Well structured and focused throughout. Nice to see scikit-learn coverage too! (Looking forward to the machine learning chapters.)

Daniel Lee Feb 22, 2015

As an experienced Python developer, I really enjoyed the book. That being said, it has a targeted audience and if you aren't in it, you'll probably be happier not picking it up. The book is aimed at people who:

1. Already know Python
2. Already know data analysis
3. Want a broad overview rather than in-depth explanations

The book got interesting for me in chapter 2, where NumPy arrays were explained, as well as a lot of the operations that can be done on them. I have limited experience with NumPy, and I found the section just in-depth enough to be interesting and just shallow enough to allow me to use it as a springboard to find what I needed to in other areas. As NumPy is a dependency for just about anything that does any number crunching in Python, it was good to have these basics.

What really opened my eyes, though, was chapter 4 - the pandas primer. I'm a heavy R user for research and have often wished that I could do more data analysis in Python. pandas allows you to do this as comfortably, if not more comfortably, than in R, and because the package is in Python it's a lot faster and a lot easier to embed in other software. I'm in the middle of a project at the moment and had been struggling with R - the implementation of an analysis milestone was just too slow. After reading this chapter, I implemented it in Python and was able to notice amazing speed improvements, not to mention the maintainability advantages. Data analysis is fun again!

The rest of the book is more of a quick blowthrough of what's possible in Python. Although the author doesn't discuss any topic in depth, he does refer you to other books and websites. The purpose was to show readers what's possible. In my opinion, this is basically a longer preview of several data analysis possibilities. If you know roughly what you want to do in the following areas:

* Processing common formats like CSV, HDF or XML
* Interfacing with databases
* Interfacing with other languages
* Visualizing data
* Signal processing and machine learning
* Text analysis
* Profiling, performance optimization and parallelization

then you'll find some good recommendations here with short descriptions and examples that will help you decide which of the packages introduced here are interesting. If you want more, you'll have to dig on your own, but the author gives you good starting points and with such a variety of topics I don't know how one could realistically expect anything else.

All in all, a great read, especially for data analysts whose main question is "Can I do this in Python, and if yes, how?"

Michael Bright Feb 17, 2015

This book provides a very broad overview of toolkits and methods available for performing Data Analysis with Python - and it does an excellent job at that. Whilst breadth of coverage will always be a trade off against depth, the book provides good balance by providing runnable code and data examples across the broad spectrum of tools and techniques referred to. It also provides many internet references to allow you to dig deeper into particular subjects.

Even in chapters where reference is made to interfacing to non-Python environments (Matlab/Octave, R, Java, SWIG, Boost, Fortran), external cloud environments (GAE, PythonAnywhere, Wakari) or Performance tools (Profiling, Cython, JobLib, JUG, MPI) or many Databases/tools, code examples are given.

Each chapter finishes with a summary as well as a preface of the following chapter - there are many useful forward and backward references in the book, making it easy to digest and find information.

The book starts with setting up the environment (either building from source or installing on various operating systems), followed by a brief introduction to NumPy before demonstrating how much faster NumPy is than Python native arrays whilst benefiting from array abstractions.

Chapters 2, 3 and 4 cover NumPy in detail (data types, slicing and shaping arrays, boolean indexing to select subsets of data), some Statistics and Linear Algebra (probability distributions, SciPy, removing extreme values, plotting images and graphs), and Pandas (DataFrames and Series, concatenating frames, basic analysis and aggregation of data).

Chapter 5 covers retrieving of data from a wide variety of sources/formats (CSV, .npy Pickled files, HDF5, REST/json, RSS, HTML/BeautifulSoup, Excel) complete with working examples.

Chapter 6 delves into Data Visualization using matplotlib, or Pandas.plots, for a variety of types of plots (histogram, scatter, bubble, lag, log, autocorrelation plots).

Chapter 7 covers Signal Processing and Time series. It shows the statsmodels subpackage and discusses moving averages, windows (boxcar, triangle, hamming), co-integration/autocorrelation, auto-regression and Fourier transforms.

Chapter 8 provides information on interfacing with Databases and briefly covers many ways to do this using Sqlite, SQLAlchemy, Pony, Dataset, MongoDB, Redis or Cassandra, again with usable examples.

Chapters 9 and 10 go deeper into analysis, covering analysis of textual data and social media data through natural language processing (NLTK), Bayes classification and sentiment analysis, followed by predictive analysis using scikit-learn, support vector machines, ElasticNetCV, neural nets, decision trees and clustering.

Chapter 11 covers interfacing to external environments through the integration with other toolkits such as Matlab/Octave, languages such as R, Fortran, Java and C, and use of some Python supporting cloud platforms (Google's GAE, PythonAnywhere, Wakari).

The final chapter 12 addresses performance profiling and concurrency - covering tools such as ****PROFILER****, Cython, several process pool and parallel processing tools and JUG for MapReduce.

Overall this book, and indeed each chapter, provides excellent coverage of its stated subject with many examples and links to external information. It is well worth buying for someone relatively new to Data Analysis in Python who wants an introduction to the broad panoply of tools and techniques currently available.

A few minor points follow which I feel could improve the book.

Just a few of the examples given were not particularly interesting (e.g. grouping of foods/pricing/weather) but given the diversity of examples in the book this is understandable.

Although there are many diverse examples, using diverse toolkits but also data scenarios, I felt that these exercises often lacked the final step of providing some analysis. Examples would finish with a set of figures, or a plot which may be considered to be self evident, but I felt they lacked a conclusion of the form "notice how there is a cluster of values around ...", "in this graph we see that there is a correlation ...", or "and so this result demonstrates the validity of our assumption that ...".

The first Appendix is called "Key Concepts" - I'd have called this a Glossary including an entry for all the tools touched on in the book, and I'd have made reference to this glossary early on in the book. I'd also have liked to see a section comparing the main tools so that it is clear in which situation to use different tools. That said, the appendix is useful and it is followed by other useful appendices: "Useful Functions", summarizing the most useful functions of NumPy, Pandas etc, and "Online Resources", which provides an extensive list of resources (tools, datasources, informative sites) referred to throughout the book, and finally a very complete index.

In various chapters we see module 'subpackage lists'; whilst it's interesting to be able to produce such lists, they're not actually very readable in the book.

Once again, a very excellent introduction.

Python Data Analysis: Learn how to apply powerful data analysis techniques with popular open source Python modules

Python Data Analysis

Chapter 2. NumPy Arrays

The NumPy array object

The advantages of NumPy arrays

Creating a multidimensional array

Selecting NumPy array elements

NumPy numerical types

One-dimensional slicing and indexing

Manipulating array shapes

Creating array views and copies
