You're reading from Hands-On Big Data Modeling: Effective database design techniques for data architects and business intelligence professionals

Product type Paperback

Published in Nov 2018

Publisher Packt

ISBN-13 9781788620901

Length 306 pages

Edition 1st Edition

Languages

Python

Tools

Bitcoin

Concepts

Big Data

Authors (3):

James Lee

Tao Wei

Suresh Kumar Mukhiya

Table of Contents (17) Chapters

Preface

1. Introduction to Big Data and Data Management

2. Data Modeling and Management Platforms

3. Defining Data Models

4. Categorizing Data Models

5. Structures of Data Models

6. Modeling Structured Data

7. Modeling with Unstructured Data

8. Modeling with Streaming Data

9. Streaming Sensor Data

10. Concept and Approaches of Big-Data Management

11. DBMS to BDMS

12. Modeling Bitcoin Data Points with Python

13. Modeling Twitter Feeds Using Python

14. Modeling Weather Data Points with Python

15. Modeling IMDb Data Points with Python

16. Other Books You May Enjoy

Importing Twitter feed data

In this chapter, we use a Twitter dataset that can be downloaded from Kaggle (https://www.kaggle.com/kingburrito666/better-donald-trump-tweets). The same dataset is also available in this book's GitHub repository, inside the CH13 folder, as a file called Donald-Tweets!.csv.
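As a quick check that the download worked, the file can be read into a pandas DataFrame. The following is a minimal sketch, not the chapter's own code: the helper name load_tweets and the relative path CH13/Donald-Tweets!.csv are assumptions about where the file sits relative to the notebook, and no column names are assumed.

```python
from pathlib import Path

import pandas as pd

# Assumed location: the CH13 folder from the book's repository,
# placed next to the notebook. Adjust to wherever the file was saved.
DATA_PATH = Path("CH13") / "Donald-Tweets!.csv"


def load_tweets(path):
    """Read the tweet CSV into a DataFrame (hypothetical helper).

    Fails early with a clear message if the file is not where we expect it.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found at {path}")
    return pd.read_csv(path)


if DATA_PATH.is_file():
    tweets = load_tweets(DATA_PATH)
    # Inspect the size and the column names before doing any modeling.
    print(tweets.shape)
    print(tweets.columns.tolist())
```

Printing the shape and the column list first makes it easy to see which fields the file actually contains before any of them are used.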

Let's dive into the exercise by starting the Jupyter notebook we've been using throughout this book. Once the notebook is up and running, import the required packages. Most of these Python packages should be familiar by now:

from textblob import TextBlob
import math
import pandas as pd
import numpy as np
import os
from pandas import DataFrame
from sklearn.cluster import KMeans
from sklearn import preprocessing
import matplotlib.pyplot as plt
from matplotlib import style
import matplotlib.pyplot as...

Authors (3)

James Lee is a passionate software wizard working at one of the top Silicon Valley-based start-ups specializing in big data analysis. He has also worked at Google and Amazon. In his day job, he works with big data technologies, including Cassandra and Elasticsearch, and is an absolute Docker geek and IntelliJ IDEA lover. Apart from his career as a software engineer, he is keen on sharing his knowledge with others and guiding them, especially in relation to start-ups and programming. He has been teaching courses and conducting workshops on Java programming / IntelliJ IDEA since he was 21. James also enjoys skiing and swimming, and is a passionate traveler.

Tao Wei is a passionate software engineer who works in a leading Silicon Valley-based big data analysis company. Previously, Tao worked in big IT companies, including IBM and Cisco. He has extensive experience in designing and building distributed, large-scale systems with proven high availability and reliability. Tao has an MS degree in computer science from McGill University and many years' experience as a teaching assistant in a variety of computer science classes. In his spare time, he enjoys reading and swimming, and is a passionate photographer.

Suresh Kumar Mukhiya is a PhD candidate, currently affiliated with the Western Norway University of Applied Sciences (HVL). He is a big data enthusiast, specializing in information systems, model-driven software engineering, big data analysis, artificial intelligence, and frontend development. He completed a master's degree in information systems at the Norwegian University of Science and Technology (NTNU, Norway), with a thesis in process mining. He also holds a bachelor's degree in computer science and information technology (BSc.CSIT) from Tribhuvan University, Nepal, where he was decorated with the Vice-Chancellor's Award for obtaining the highest score. He is a passionate photographer and a resilient traveler.
