Search engine choice and the application code
Since scraping directly from the most relevant search engines such as Google, Bing, Yahoo, and others is against their term of service, we need to take initial review pages from their REST API (using scraping services such as Crawlera, http://crawlera.com/, is also possible). We decided to use the Bing service, which allows 5,000 queries per month for free.
In order to do that, we register to the Microsoft Service to obtain the key needed to allow the search. Briefly, we followed these steps:
Register online on https://datamarket.azure.com.
In My Account, take the Primary Account Key.
Register a new application (under DEVELOPERS | REGISTER; put Redire ct URI:
https://www.
bing.com
)
After that, we can write a function that retrieves as many URLs relevant to our query as we want:
num_reviews = 30 def bing_api(query): keyBing = API_KEY # get Bing key from: https://datamarket.azure.com/account/keys credentialBing = 'Basic ' + (':%s' % keyBing...