For this script to work, we'll need the script created in the Getting screenshots of a website with QtWebKit recipe. This should be saved in the Pythonxx/Lib folder and named something clear and memorable. Here, we've named that script screenshot.py. The name matters because we reference it in an import declaration.
This is the script that we will be using:
The script starts with its import declarations. We use the screenshot script we created before and also the requests library. The requests library lets us check the status of a request before trying to convert the page to an image; we don't want to waste time trying to convert sites that don't exist.
With that in mind, we import our libraries:
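Since both imports have to resolve before anything else can run, it can be worth checking for them up front. The following is a minimal sketch rather than the recipe's own listing: it is written for Python 3, missing_modules is a hypothetical helper, and it assumes the earlier recipe's script was saved as screenshot.py somewhere on Python's import path.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that Python cannot find on its import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "screenshot" is the script from the earlier recipe; "requests" is third-party.
missing = missing_modules(["requests", "screenshot"])
if missing:
    print("Cannot import: " + ", ".join(missing))
else:
    import requests
    import screenshot
```

If screenshot is reported as missing, the script was most likely saved under a different name or outside the Lib folder.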
The next step sets up the list of common port numbers that we will be iterating over. We also set up a string with the IP address we will be using:
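As a rough illustration of this step, the setup could look like the following. The port numbers and the address are placeholder values chosen for the example, and the name IP is an assumption; only portList is a name the text itself uses.

```python
# Illustrative values only: a handful of ports that web servers and
# control panels commonly listen on, and a placeholder target address.
portList = [80, 443, 2082, 2083, 2086, 2087, 2095, 2096, 8080, 8443, 8880]
IP = "127.0.0.1"
```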
Next, we create strings to hold the protocol part of the URL that we will be building later; this just makes the code later on a little bit neater:
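A minimal sketch of the protocol strings follows. The variable names http and https are assumptions, and the address and port in the example URL are placeholders.

```python
http = "http://"
https = "https://"

# Example of how the pieces combine later; address and port are placeholders.
example_url = https + "127.0.0.1" + ":" + str(8443)
print(example_url)
```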
Next, we create our method, which will do the work of building the URL string. After we've created the URL, we check whether we get a 200 response code back for our GET request. If the request is successful, we convert the returned web page to an image and save it, using the successful port number as the filename. The code is wrapped in a try block because the request will throw an error if the site doesn't exist:
Now that our method is ready, we simply iterate over each port in the port list and call our method. We do this once for the HTTP protocol and then with HTTPS:
And that's it. Simply run the script and it will save the images to the same location as the script.
You might notice that the script takes a while to run. This is because it has to check each port in turn. In practice, you would probably want to make this a multithreaded script so that it can check multiple URLs at the same time. Let's take a quick look at how we can modify the code to achieve this.
First, we'll need a couple more import declarations:
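The two modules needed are most likely threading and the standard library's queue module. As a hedged sketch, the following import works on both Python 3, where the module is called queue, and Python 2, where it is called Queue:

```python
import threading

try:
    from queue import Queue  # Python 3 module name
except ImportError:
    from Queue import Queue  # Python 2 module name
```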
Next, we need to create a new function that we will call threader. This new function will handle putting our testAndSave functions into the queue:
Now that we have our new function, we just need to set up a new Queue object and make a few threading calls. We will take the testAndSave calls out of our for loop over the portList variable and replace them with this code:
So, our new script in total now looks like this:
If we run this now, we will get a much quicker execution of our code as the web requests are now being executed in parallel with each other.
You could try to further expand the script to work on a range of IP addresses too; this can be handy when you're testing an internal network range.
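As a starting point for that extension, the standard library's ipaddress module can enumerate the hosts in a network range. This is only a sketch: build_targets is a hypothetical helper, and the network and ports below are placeholders.

```python
import ipaddress


def build_targets(network, ports, protocols=("http://", "https://")):
    """Return every protocol/host/port URL for the usable hosts in network."""
    targets = []
    for host in ipaddress.ip_network(network).hosts():
        for port in ports:
            for protocol in protocols:
                targets.append(protocol + str(host) + ":" + str(port))
    return targets


# Example: a tiny placeholder range with two usable host addresses.
for url in build_targets("192.168.1.0/30", [80, 8080]):
    print(url)
```

Each URL produced this way could then be handed to the checking function in place of the single fixed address.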