Mastering Social Media Mining with Python

Unearth deeper insight from your social media data with advanced Python techniques for acquisition and analysis

Product type: Paperback
Published in: Jul 2016
Publisher: Packt
ISBN-13: 9781783552016
Length: 338 pages
Edition: 1st Edition
Author: Marco Bonzanini
Table of Contents

Preface
1. Social Media, Social Data, and Python
2. #MiningTwitter – Hashtags, Topics, and Time Series
3. Users, Followers, and Communities on Twitter
4. Posts, Pages, and User Interactions on Facebook
5. Topic Analysis on Google+
6. Questions and Answers on Stack Exchange
7. Blogs, RSS, Wikipedia, and Natural Language Processing
8. Mining All the Data!
9. Linked Data and the Semantic Web

Processing data in Python

After introducing some of the most important Python packages for data analytics, we take a small step back to describe some of the tools for loading and manipulating data in different formats with Python.

Most social media APIs provide data in JSON or XML. Python is well equipped in this respect: packages that support both formats are part of the standard library.

For convenience, we will focus on JSON, as this format maps nicely onto Python dictionaries and is easier to read and understand. The interface of the json library is pretty straightforward: you can either load data, converting JSON into Python dictionaries, or dump data, converting Python dictionaries into JSON.
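As a brief illustration of the standard-library support for both formats, the following minimal sketch reads the same record from XML and from JSON. The two payloads are hypothetical examples written for this comparison, not responses from a real API.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads describing the same user in both formats
user_xml = '<user><user_id>1</user_id><name>Marco</name></user>'
user_json = '{"user_id": "1", "name": "Marco"}'

# XML: parse the string into an element tree, then look up a child by tag
root = ET.fromstring(user_xml)
name_from_xml = root.find('name').text

# JSON: parse the string directly into a dictionary
name_from_json = json.loads(user_json)['name']

print(name_from_xml, name_from_json)
```

Both lookups yield the same value; the JSON version simply needs one step less, because the parsed result is already a dictionary.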

Let's consider the following snippet:

# Chap01/demo_json.py
import json

if __name__ == '__main__':
    # Parse a JSON string into a Python dictionary
    user_json = '{"user_id": "1", "name": "Marco"}'
    user_data = json.loads(user_json)
    print(user_data['name'])
    # Marco

    # Add a field, then serialize the dictionary back to a JSON string
    user_data['likes'] = ['Python', 'Data Mining']
    user_json = json.dumps(user_data, indent=4)
    print(user_json)
    # {
    #     "user_id": "1",
    #     "name": "Marco",
    #     "likes": [
    #         "Python",
    #         "Data Mining"
    #     ]
    # }

The json.loads() and json.dumps() functions manage the conversion from JSON strings to Python dictionaries and back. Two counterparts, json.load() and json.dump(), operate on file objects instead of strings, for when you want to load JSON data from a file or save it to one.
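The file-based counterparts can be sketched as follows. This is a minimal example that assumes a hypothetical file name, user.json, created inside a temporary directory so that nothing is left on disk afterwards.

```python
import json
import os
import tempfile

user_data = {'user_id': '1', 'name': 'Marco', 'likes': ['Python', 'Data Mining']}

# Hypothetical file location inside a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'user.json')

    # json.dump() serializes the dictionary to an open file object
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(user_data, f, indent=4)

    # json.load() parses JSON from an open file object
    with open(path, encoding='utf-8') as f:
        loaded = json.load(f)

# The round trip through the file preserves the data
print(loaded == user_data)
```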

The json.dumps() function also accepts an optional keyword argument, indent, which sets the number of spaces used for each level of indentation; this is useful for pretty printing.
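To see the effect of indent, the short sketch below serializes the same dictionary with and without it. The sort_keys argument is not discussed in the text; it is included here only as a related option of json.dumps() that can make pretty-printed output easier to scan.

```python
import json

user_data = {'user_id': '1', 'name': 'Marco'}

# Without indent, the output is a single compact line
compact = json.dumps(user_data)

# With indent=2, each nesting level is indented by two spaces
pretty = json.dumps(user_data, indent=2)

# sort_keys=True additionally orders the keys alphabetically
by_key = json.dumps(user_data, indent=2, sort_keys=True)

print(compact)
print(pretty)
print(by_key)
```

All three strings describe the same data, so passing any of them to json.loads() gives back an equal dictionary.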

When manually inspecting more complex JSON files, it is often convenient to use an external JSON viewer that pretty-prints the document in the browser and lets you collapse and expand parts of the structure as you wish.

There are several free tools for this. Some of them are web-based services, such as JSON Viewer (http://jsonviewer.stack.hu). The user simply needs to paste a piece of JSON, or pass a URL that serves a piece of JSON, and the viewer will load it and display it in a user-friendly format.

The following image shows how the JSON document from the previous example is shown in JSON Viewer:

Figure 1.8: An example of pretty-printed JSON on JSON Viewer

As we can see in Figure 1.8, the likes field is a list that can be collapsed to hide its elements and make the document easier to read. While this example is minimal, this feature becomes extremely handy when inspecting complex documents with several nested layers.
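A rough approximation of this collapsing behaviour can also be obtained in plain Python. The sketch below is not how JSON Viewer works; summarize() is a hypothetical helper, written for illustration, that replaces every dictionary or list nested deeper than max_depth with a short placeholder string before pretty printing.

```python
import json


def summarize(value, max_depth=1, depth=0):
    """Return a copy of value with containers at max_depth or deeper collapsed."""
    if isinstance(value, dict):
        if depth >= max_depth:
            return '{{...}} ({} keys)'.format(len(value))
        return {k: summarize(v, max_depth, depth + 1) for k, v in value.items()}
    if isinstance(value, list):
        if depth >= max_depth:
            return '[...] ({} items)'.format(len(value))
        return [summarize(v, max_depth, depth + 1) for v in value]
    # Strings, numbers, booleans and None are returned unchanged
    return value


user_data = {'user_id': '1', 'name': 'Marco', 'likes': ['Python', 'Data Mining']}

# Keep the top-level keys, collapse anything nested below them
collapsed = summarize(user_data, max_depth=1)
print(json.dumps(collapsed, indent=4))
```

With max_depth=1 the likes list is shown only as a placeholder, while a larger max_depth expands more of the document.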

Tip

When using a web-based service or browser extension, loading large JSON documents for pretty printing can clog up your browser and slow your system down.
