Summary
This chapter introduced the different aspects of a multifaceted topic: data mining applied to social media using Python. We have gone through some of the challenges and opportunities that make this topic interesting to study and valuable to businesses that want to gather meaningful insights from social media data.
After introducing the topic, we discussed the overall process of social media mining, including aspects such as authentication with OAuth. We also looked at the Python tools that should be part of the toolbox of any data mining practitioner. Depending on the social media platform we're analyzing, and the type of insights we are interested in, Python offers robust and well-established packages for machine learning, NLP, and SNA.
We recommend that you set up a Python development environment with virtualenv, as described in the pip and virtualenv section of this chapter, as this keeps the global development environment clean.
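As a quick sanity check before installing packages, the following minimal sketch reports whether the current interpreter appears to be running inside a virtual environment. The function name in_virtual_environment is our own choice for illustration, and the check relies on the sys.prefix and sys.base_prefix attributes (plus sys.real_prefix, which older virtualenv releases set), so treat it as a heuristic rather than a guarantee.

```python
import sys


def in_virtual_environment():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases record the original prefix in sys.real_prefix.
    # venv and newer virtualenv releases instead make sys.base_prefix differ
    # from sys.prefix. Fall back to sys.prefix if neither attribute exists.
    base_prefix = getattr(sys, "real_prefix", None) or getattr(
        sys, "base_prefix", sys.prefix
    )
    return base_prefix != sys.prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_environment())
```

If the script reports False, the environment has probably not been activated, and any package installed with pip would end up in the global environment.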
The next chapter will focus on Twitter, discussing how to get access to Twitter data via the Twitter API and how to slice and dice such data in order to produce interesting information.