Application usage overview
The home web page is as follows:
The user can type in a movie name to see the sentiment and relevance of its reviews. For example, we look for Batman vs Superman Dawn of Justice in the following screenshot:
The application collects 18 reviews from the Bing search engine, scrapes them using the Scrapy library, and analyzes their sentiment (15 positive and 3 negative). All data is stored in Django models, ready to be used to calculate the relevance of each page using the PageRank algorithm (the links at the bottom of the page, as seen in the preceding screenshot). In this case, using the PageRank algorithm, we have the following:
This is a list of the pages most relevant to our movie review search, with the depth parameter of the scraping crawler set to 2 (refer to the following section for further details). Note that to get good page relevance results, you have to crawl thousands of pages (the preceding screenshot shows results for around 50 crawled pages...
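As a rough illustration of how page relevance can be computed from crawled links, the following is a minimal sketch of PageRank by power iteration in plain Python. It is not the application's own code: the function name pagerank, the damping value 0.85, and the four-page link graph are assumptions made for this example only.

```python
def pagerank(links, damping=0.85, tol=1e-8, max_iter=100):
    """Return a dict of page -> rank for a link graph.

    links maps each page to the list of pages it links to
    (a hypothetical in-memory structure, not the Django models).
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}

        # Each page passes an equal share of its rank to every target.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share

        # Stop once the ranks no longer change noticeably.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Assumed toy graph: page "c" receives links from "a", "b", and "d".
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example_links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph, the page with the most incoming links ends up with the highest rank. With only a handful of pages the scores are dominated by the few links available, which is consistent with the remark above that meaningful relevance results need a much larger crawl.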