You're reading from Data Analysis with Python A Modern Approach

Product type Paperback

Published in Dec 2018

Publisher Packt

ISBN-13 9781789950069

Length 490 pages

Edition 1st Edition

Languages

Python

Tools

Jupyter

Concepts

Data Analysis

Author (1):

David Taieb

Table of Contents (14) Chapters

Preface

1. Programming and Data Science – A New Toolset

2. Python and Jupyter Notebooks to Power your Data Analysis

3. Accelerate your Data Analysis with Python Libraries

4. Publish your Data Analysis to the Web - the PixieApp Tool

5. Python and PixieDust Best Practices and Advanced Concepts

6. Analytics Study: AI and Image Recognition with TensorFlow

7. Analytics Study: NLP and Big Data with Twitter Sentiment Analysis

8. Analytics Study: Prediction - Financial Time Series Analysis and Forecasting

9. Analytics Study: Graph Algorithms - US Domestic Flight Data Analysis

10. The Future of Data Analysis and Where to Develop your Skills

A. PixieApp Quick-Reference

Other Books You May Enjoy

Leave a review – let other readers know what you think

Index

Part 1 – Acquiring the data with Spark Structured Streaming

To acquire the data, we use Tweepy, which provides an elegant Python client library for accessing the Twitter APIs. Tweepy covers a very large part of those APIs, and describing them in detail is beyond the scope of this book, but you can find the full documentation on the official Tweepy website: http://tweepy.readthedocs.io/en/v3.6.0/cursor_tutorial.html.

You can install the Tweepy library directly from PyPI using the pip install command. The following command shows how to install it from a Notebook using the ! directive, which runs a shell command from a notebook cell:

!pip install tweepy

Note: The Tweepy version used here is 3.6.0. Do not forget to restart the kernel after installing the library.
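After restarting the kernel, it can be useful to confirm which Tweepy version the Notebook actually sees. The following is a minimal sketch, not code from the book: the helper names installed_version and describe_install are hypothetical, and it assumes Python 3.8 or later, where importlib.metadata is part of the standard library.

```python
from importlib.metadata import PackageNotFoundError, version

# Version stated in the note above; adjust if you install a different one.
EXPECTED_TWEEPY = "3.6.0"


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


def describe_install(found, expected=EXPECTED_TWEEPY):
    """Build a short, human-readable status message for the Tweepy install."""
    if found is None:
        return "tweepy is not installed; run: pip install tweepy"
    if found != expected:
        return f"tweepy {found} is installed; the text assumes {expected}"
    return f"tweepy {found} matches the version used in the text"


print(describe_install(installed_version("tweepy")))
```

If the reported version differs, pip accepts a pinned requirement such as tweepy==3.6.0 in place of the bare package name.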

Architecture diagram for the data pipeline

Before diving into each component of the data pipeline, let's look at its overall architecture and understand the computation flow.

As shown in the following diagram, we start by creating a Tweepy stream that...
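As a rough illustration of what creating a Tweepy stream can look like with the 3.6.0 API, here is a minimal sketch. It is not the book's pipeline code: the environment variable names, the load_credentials and start_stream helpers, and the PrintListener class are all hypothetical, and the listener only prints tweet text. It relies on tweepy.OAuthHandler, tweepy.StreamListener, and tweepy.Stream, which exist in Tweepy 3.x but were changed in later major versions.

```python
import os

# Hypothetical environment variable names for the four Twitter credentials.
CREDENTIAL_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=os.environ):
    """Read the four credentials from a mapping, failing early if any is missing."""
    missing = [key for key in CREDENTIAL_KEYS if key not in env]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {key: env[key] for key in CREDENTIAL_KEYS}


def start_stream(credentials, keywords):
    """Open a filtered Twitter stream and print each tweet (blocks until stopped)."""
    import tweepy  # imported lazily so the helper above works without Tweepy

    class PrintListener(tweepy.StreamListener):
        def on_status(self, status):
            # Called for every tweet that matches the filter.
            print(status.text)

        def on_error(self, status_code):
            # Returning False disconnects; 420 means the client is rate limited.
            if status_code == 420:
                return False

    auth = tweepy.OAuthHandler(
        credentials["TWITTER_CONSUMER_KEY"],
        credentials["TWITTER_CONSUMER_SECRET"],
    )
    auth.set_access_token(
        credentials["TWITTER_ACCESS_TOKEN"],
        credentials["TWITTER_ACCESS_TOKEN_SECRET"],
    )
    stream = tweepy.Stream(auth=auth, listener=PrintListener())
    stream.filter(track=keywords)
```

With valid credentials exported in the environment, a call such as start_stream(load_credentials(), ["python"]) would start printing matching tweets until the kernel is interrupted.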
