Data pull and pre-processing
In the previous step, we obtained two DataFrames:
- Our own pins through the Pinterest API
- Search results from the scraping tool
Now we will create different graph structures to analyze the relationships between users and topics.
Pinterest API data
One may wonder how we can build a meaningful graph structure from a user's own pins. At first glance, the only field that connects pins to one another is the board name. However, the Title and Description fields contain much richer relationships, which we can extract and use to build a graph.
For this purpose, we will extract bigrams and treat them as topics, and then check how strongly these bigrams are linked to one another.
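To make the idea of link strength concrete, here is a minimal sketch that assumes two bigrams are linked whenever they appear in the same pin text, with the number of shared pins as the edge weight. This is only one possible measure and may differ from the one used later in the chapter. The sample texts and the names pin_texts, bigrams and edge_weights are hypothetical and exist only for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample of pin texts (title + description); not real data.
pin_texts = [
    "Machine learning tutorial: deep learning basics",
    "Deep learning with Python, a machine learning guide",
    "Data science project ideas using machine learning",
]


def bigrams(text):
    """Lowercase the text, keep alphabetic tokens, return the set of adjacent word pairs."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return set(zip(tokens, tokens[1:]))


# Assumption: the link strength between two bigrams is the number of pins
# in which both of them occur.
edge_weights = Counter()
for text in pin_texts:
    # Sorting gives each unordered pair a single canonical key.
    for first, second in combinations(sorted(bigrams(text)), 2):
        edge_weights[(first, second)] += 1

# Show the strongest links as "bigram <-> bigram weight".
for (first, second), weight in edge_weights.most_common(3):
    print(" ".join(first), "<->", " ".join(second), weight)
```

Each key of edge_weights is a pair of bigrams and each value is a candidate edge weight, so the counter can be passed to a graph library as a weighted edge list.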
Bigram extraction
First, we reuse the code presented in previous chapters to find the most relevant bigrams in our dataset.
We import all the necessary libraries:
import nltk
from nltk.collocations import *
from nltk.corpus import stopwords
import re
We define a function which will perform data cleaning...